Symantec™ Data Loss Prevention Administration Guide

Version 15.5

Last updated: 19 August 2019


Symantec Data Loss Prevention Administration Guide
Documentation version: 15.5d

Legal Notice
Copyright © 2019 Symantec Corporation. All rights reserved.

Symantec, CloudSOC, Blue Coat, the Symantec Logo, the Checkmark Logo, the Blue Coat logo, and the
Shield Logo are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S.
and other countries. Other names may be trademarks of their respective owners.

This Symantec product may contain third party software for which Symantec is required to provide attribution
to the third party (“Third Party Programs”). Some of the Third Party Programs are available under open
source or free software licenses. The License Agreement accompanying the Software does not alter any
rights or obligations you may have under those open source or free software licenses. Please see the
Third Party Legal Notice Appendix to this Documentation or TPIP ReadMe File accompanying this Symantec
product for more information on the Third Party Programs.

The product described in this document is distributed under licenses restricting its use, copying, distribution,
and decompilation/reverse engineering. No part of this document may be reproduced in any form by any
means without prior written authorization of Symantec Corporation and its licensors, if any.

THE DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS,
REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE
DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY
INVALID. SYMANTEC CORPORATION SHALL NOT BE LIABLE FOR INCIDENTAL OR CONSEQUENTIAL
DAMAGES IN CONNECTION WITH THE FURNISHING, PERFORMANCE, OR USE OF THIS
DOCUMENTATION. THE INFORMATION CONTAINED IN THIS DOCUMENTATION IS SUBJECT TO
CHANGE WITHOUT NOTICE.

The Licensed Software and Documentation are deemed to be commercial computer software as defined
in FAR 12.212 and subject to restricted rights as defined in FAR Section 52.227-19 "Commercial Computer
Software - Restricted Rights" and DFARS 227.7202, et seq. "Commercial Computer Software and
Commercial Computer Software Documentation," as applicable, and any successor regulations, whether
delivered by Symantec as on premises or hosted services. Any use, modification, reproduction release,
performance, display or disclosure of the Licensed Software and Documentation by the U.S. Government
shall be solely in accordance with the terms of this Agreement.
Symantec Corporation
350 Ellis Street
Mountain View, CA 94043

https://www.symantec.com
Symantec Support
All support services will be delivered in accordance with your support agreement and the
then-current Enterprise Technical Support policy.

Knowledge Base Articles and Symantec Connect


Before you contact Technical Support, you can find free content in our online Knowledge Base,
which includes troubleshooting articles, how-to articles, alerts, and product manuals. In the
search box of the following URL, type the name of your product:
https://support.symantec.com
Access our blogs and online forums to engage with other customers, partners, and Symantec
employees on a wide range of topics at the following URL:
https://www.symantec.com/connect

Technical Support and Enterprise Customer Support


Symantec Support maintains support centers globally 24 hours a day, 7 days a week. Technical
Support’s primary role is to respond to specific queries about product features and functionality.
Enterprise Customer Support assists with non-technical questions, such as license activation,
software version upgrades, product access, and renewals.
For Symantec Support terms, conditions, policies, and other support information, see:
https://entced.symantec.com/default/ent/supportref
To contact Symantec Support, see:
https://support.symantec.com/en_US/contact-support.html
Contents

Symantec Support .............................................................................................. 4

Section 1 Getting started .............................................................. 71


Chapter 1 Introducing Symantec Data Loss Prevention ................ 72
About updates to the Symantec Data Loss Prevention Administration
Guide ................................................................................... 72
About Symantec Data Loss Prevention ............................................. 75
About the Enforce Server platform ................................................... 77
About Network Monitor and Prevent ................................................. 78
About Network Discover/Cloud Storage Discover ................................ 78
About Network Protect ................................................................... 79
About Endpoint Discover ................................................................ 80
About Endpoint Prevent ................................................................. 80

Chapter 2 Getting started administering Symantec Data Loss
Prevention ...................................................................... 82
About Symantec Data Loss Prevention administration .......................... 82
About the Enforce Server administration console ................................ 83
Logging on and off the Enforce Server administration console ............... 84
About the administrator account ...................................................... 85
Performing initial setup tasks .......................................................... 85
Changing the administrator password ............................................... 86
Adding an administrator email account .............................................. 86
Editing a user profile ..................................................................... 87
Changing your password ............................................................... 89

Chapter 3 Working with languages and locales ............................... 91


About support for character sets, languages, and locales ...................... 91
Supported languages for detection ................................................... 92
Working with international characters ............................................... 93
About Symantec Data Loss Prevention language packs ....................... 94
About locales ............................................................................... 95
Using a non-English language on the Enforce Server administration
console ................................................................................ 95
Using the Language Pack Utility ...................................................... 96

Section 2 Managing the Enforce Server
platform ..................................................................... 100
Chapter 4 Managing Enforce Server services and settings ......... 101
About Symantec Data Loss Prevention services ................................ 101
About starting and stopping services on Windows ............................. 102
Starting an Enforce Server on Windows .................................... 103
Stopping an Enforce Server on Windows ................................... 103
Starting a detection server on Windows .................................... 104
Stopping a detection server on Windows ................................... 104
Starting services on single-tier Windows installations ................... 104
Stopping services on single-tier Windows installations ................. 105
About starting and stopping services on Linux .................................. 105
Starting an Enforce Server on Linux ......................................... 105
Stopping an Enforce Server on Linux ........................................ 106
Starting a detection server on Linux ......................................... 106
Stopping a detection server on Linux ........................................ 107
Starting services on single-tier Linux installations ........................ 107
Stopping services on single-tier Linux installations ...................... 107

Chapter 5 Managing roles and users ............................................... 109


About role-based access control .................................................... 109
About configuring roles and users .................................................. 110
About recommended roles for your organization ................................ 111
Roles included with solution packs ................................................. 112
Configuring roles ........................................................................ 114
Configuring user accounts ............................................................ 121
Configuring password enforcement settings ..................................... 124
Resetting the Administrator password ............................................. 125
Manage and add roles ................................................................. 126
Manage and add users ................................................................ 126
About authenticating users ........................................................... 127
Configuring user authentication ..................................................... 131
About SAML authentication .................................................... 131
Setting up authentication ........................................................ 132
Administrator Bypass URL ..................................................... 133
Set up and configure the authentication method .......................... 133
Set up the SAML authentication configuration ............................ 135


Generate or download Enforce (service providers) SAML
metadata ...................................................................... 135
Configure the Enforce Server as a SAML service provider with
the IdP (Create an application in your identity provider) .......... 136
Export the IdP metadata to DLP .............................................. 136
Configuring Active Directory authentication ................................ 136
Configuring forms-based authentication .................................... 137
Configuring certificate authentication ........................................ 137
Integrating Active Directory for user authentication ............................ 137
Creating the configuration file for Active Directory
integration ..................................................................... 138
Verifying the Active Directory connection ................................... 140
Configuring the Enforce Server for Active Directory
authentication ................................................................ 141
About certificate authentication configuration .................................... 142
Configuring certificate authentication for the Enforce Server
administration console ..................................................... 144
Adding certificate authority (CA) certificates to the Tomcat trust
store ............................................................................ 146
Mapping Common Name (CN) values to Symantec Data Loss
Prevention user accounts ................................................. 149
About certificate revocation checks .......................................... 150
Troubleshooting certificate authentication .................................. 153
Disabling password authentication and forms-based logon ............ 154

Chapter 6 Connecting to group directories ..................................... 155


Creating connections to LDAP servers ............................................ 155
Configuring directory server connections ......................................... 156
Scheduling directory server indexing .............................................. 158

Chapter 7 Managing stored credentials .......................................... 160


About the credential store ............................................................. 160
Adding new credentials to the credential store .................................. 161
Configuring endpoint credentials .................................................... 161
Managing credentials in the credential store ..................................... 162
Managing stored credentials ......................................................... 162

Chapter 8 Managing system events and messages ...................... 164


About system events ................................................................... 164
System events reports ................................................................. 165
Working with saved system reports ................................................ 168


Server and Detectors event detail .................................................. 169
Configuring event thresholds and triggers ........................................ 170
About system event responses ...................................................... 172
Enabling a syslog server .............................................................. 174
About system alerts ..................................................................... 175
Configuring the Enforce Server to send email alerts ........................... 176
Configuring system alerts ............................................................. 177
About log review ......................................................................... 179
System event codes and messages ................................................ 180

Chapter 9 Managing the Symantec Data Loss Prevention
database ........................................................................ 206
Working with Symantec Data Loss Prevention database diagnostic
tools ................................................................................... 206
Viewing tablespaces and data file allocations ................................... 207
Adjusting warning thresholds for tablespace usage in large
databases ..................................................................... 208
Generating a database report ................................................. 208
Viewing table details .................................................................... 209
Checking the database update readiness ........................................ 210
Preparing to run the Update Readiness tool ............................... 211
Creating the Update Readiness tool database account ................. 213
Running the Update Readiness tool from the Enforce Server
administration console ..................................................... 214
Running the Update Readiness tool at the command line ............. 215
Reviewing update readiness results ......................................... 218

Chapter 10 Working with Symantec Information Centric
Encryption ..................................................................... 219
About Symantec Information Centric Encryption ................................ 219
About the Symantec ICE Utility ...................................................... 220
Overview of implementing Information Centric Encryption
capabilities .......................................................................... 222
Configuring the Enforce Server to connect to the Symantec ICE
Cloud ................................................................................. 224

Chapter 11 Working with Symantec Information Centric
Tagging .......................................................................... 226
About integrating Information Centric Tagging with Data Loss
Prevention ........................................................................... 226
Overview of steps to tie Information Centric Tagging to Data Loss
Prevention ........................................................................... 228
Integrating the ICT server with the Enforce Server ............................. 229
About automatic and static imports of the ICT classification
taxonomy ...................................................................... 229
Using the ICT Web Service for scheduled classification taxonomy
imports ......................................................................... 230
Using an XML file for static classification taxonomy imports ........... 231
Importing the ICT classification taxonomy ........................................ 231
Supported file types for ICT-Data Loss Prevention integration .............. 232

Chapter 12 Adding a new product module ........................................ 234


Installing a new license file ........................................................... 234
About system upgrades ............................................................... 235

Chapter 13 Applying a Maintenance Pack ......................................... 236


Applying a Symantec Data Loss Prevention Maintenance Pack ............ 236
Steps to apply a maintenance pack on Windows servers .............. 236
Steps to apply a maintenance pack on Linux servers ................... 240

Section 3 Managing detection servers ................................ 245


Chapter 14 Installing and managing detection servers and
cloud detectors ............................................................ 246
About managing Symantec Data Loss Prevention servers ................... 247
Preparing for Microsoft Rights Management file monitoring ................. 247
Enabling Microsoft Rights Management file monitoring ................. 248
Enabling Advanced Process Control ............................................... 250
Server controls ........................................................................... 251
Server configuration—basic .......................................................... 253
Network Monitor Server—basic configuration ............................. 254
Network Prevent for Email Server—basic configuration ................ 256
Network Prevent for Web Server—basic configuration ................. 259
Network Discover/Cloud Storage Discover Server and Network
Protect—basic configuration ............................................. 261
Endpoint Server—basic configuration ....................................... 262
Single Tier Monitor — basic configuration .................................. 263


Editing a detector ........................................................................ 272
Server and detector configuration—advanced .................................. 273
Adding a detection server ............................................................. 273
Adding a cloud detector ............................................................... 275
Removing a server ...................................................................... 277
Importing SSL certificates to Enforce or Discover servers .................... 277
About the Overview screen ........................................................... 278
Configuring the Enforce Server to use a proxy to connect to cloud
services .............................................................................. 279
Server and detector status overview ............................................... 280
Recent error and warning events list ............................................... 282
Server/Detector Detail screen ........................................................ 283
Advanced server settings ............................................................. 285
Advanced detector settings ........................................................... 326
About using load balancers in an endpoint deployment ....................... 330

Chapter 15 Managing log files ............................................................. 333


About log files ............................................................................ 333
Operational log files .............................................................. 334
Debug log files ..................................................................... 337
Log collection and configuration screen ........................................... 343
Configuring server logging behavior ............................................... 343
Collecting server logs and configuration files .................................... 347
About log event codes ................................................................. 350
Network Prevent for Web operational log files and event
codes ........................................................................... 351
Network Prevent for Web access log files and fields .................... 352
Network Prevent for Web protocol debug log files ....................... 354
Network Prevent for Email log levels ........................................ 355
Network Prevent for Email operational log codes ........................ 355
Network Prevent for Email originated responses and codes .......... 359

Chapter 16 Using Symantec Data Loss Prevention utilities .......... 362


About Symantec Data Loss Prevention utilities ................................. 362
About Endpoint utilities ................................................................ 363
About DBPasswordChanger ......................................................... 364
DBPasswordChanger syntax .................................................. 364
Example of using DBPasswordChanger .................................... 365

Section 4 Authoring policies ..................................................... 366


Chapter 17 Introduction to policies .................................................... 368
About Data Loss Prevention policies ............................................... 368
Policy components ...................................................................... 370
Policy templates ......................................................................... 371
Solution packs ........................................................................... 372
Policy groups ............................................................................. 372
Policy deployment ....................................................................... 373
Policy severity ............................................................................ 374
Policy authoring privileges ............................................................ 375
Data Profiles .............................................................................. 375
User Groups .............................................................................. 376
Policy template import and export .................................................. 377
Workflow for implementing policies ................................................. 378
Viewing, printing, and downloading policy details ............................... 379

Chapter 18 Overview of policy detection ........................................... 381


Detecting data loss ..................................................................... 381
Content that can be detected .................................................. 382
Files that can be detected ...................................................... 382
Protocols that can be monitored .............................................. 382
Endpoint events that can be detected ....................................... 383
Identities that can be detected ................................................. 383
Languages that can be detected .............................................. 383
Data Loss Prevention policy detection technologies ........................... 383
Policy matching conditions ............................................................ 386
Content matching conditions ................................................... 387
File property matching conditions ............................................. 388
Protocol matching condition for network .................................... 389
Endpoint matching conditions ................................................. 389
Groups (identity) matching conditions ....................................... 390
Detection messages and message components ................................ 391
Exception conditions ................................................................... 393
Compound conditions .................................................................. 394
Policy detection execution ............................................................ 394
Two-tier detection for DLP Agents .................................................. 395

Chapter 19 Creating policies from templates ................................... 397


Creating a policy from a template ................................................... 397
US Regulatory Enforcement policy templates ................................... 400
General Data Protection Regulation (GDPR) policy templates .............. 402


International Regulatory Enforcement policy templates ....................... 403
Customer and Employee Data Protection policy templates .................. 404
Confidential or Classified Data Protection policy templates .................. 405
Network Security Enforcement policy templates ................................ 406
Acceptable Use Enforcement policy templates .................................. 407
Columbia Personal Data Regulatory Enforcement policy template ........ 408
Choosing an Exact Data Profile ..................................................... 409
Choosing an Indexed Document Profile ........................................... 411

Chapter 20 Configuring policies .......................................................... 412


Adding a new policy or policy template ............................................ 412
Configuring policies ..................................................................... 413
Adding a rule to a policy ............................................................... 415
Configuring policy rules ................................................................ 417
Defining rule severity ................................................................... 420
Configuring match counting .......................................................... 421
Selecting components to match on ................................................. 423
Adding an exception to a policy ..................................................... 424
Configuring policy exceptions ........................................................ 426
Configuring compound match conditions ......................................... 429
Input character limits for policy configuration .................................... 431

Chapter 21 Administering policies ...................................................... 432


Manage and add policies ............................................................. 432
Manage and add policy groups ...................................................... 435
Creating and modifying policy groups ............................................. 436
Importing policies ........................................................................ 437
About importing policies ......................................................... 437
About policy references ......................................................... 438
Exporting policies ....................................................................... 439
About policy export ............................................................... 439
Cloning policies .......................................................................... 440
Importing policy templates ............................................................ 441
Exporting policy detection as a template .......................................... 442
Adding an automated response rule to a policy ................................. 442
Removing policies and policy groups .............................................. 443
Viewing and printing policy details .................................................. 444
Downloading policy details ........................................................... 444
Troubleshooting policies ............................................................... 445
Updating EDM and IDM profiles to the latest version .......................... 446
Updating policies after upgrading to the latest version ........................ 447

Chapter 22 Best practices for authoring policies ............................ 449


Best practices for authoring policies ................................................ 449
Develop a policy strategy that supports your data security
objectives ........................................................................... 451
Use a limited number of policies to get started .................................. 451
Use policy templates but modify them to meet your requirements ......... 452
Use the appropriate match condition for your data loss prevention
objectives ........................................................................... 452
Test and tune policies to improve match accuracy ............................. 453
Start with high match thresholds to reduce false positives ................... 454
Use a limited number of exceptions to narrow detection scope ............. 455
Use compound conditions to improve match accuracy ........................ 455
Author policies to limit the potential effect of two-tier detection ............. 456
Use policy groups to manage policy lifecycle .................................... 457
Follow detection-specific best practices ........................................... 457

Chapter 23 Increasing the Inspection Content Size ........................ 459


Increasing the inspection content size ............................................. 459

Chapter 24 Installing remote indexers ............................................... 463


About installing remote indexers .................................................... 589
Installing a remote indexer on Windows ........................................... 464
Installing a remote indexer on Linux ................................................ 466
Configuring a remote indexer on Linux ............................................ 466

Chapter 25 Detecting content using Exact Match Data
Identifiers (EMDI) ........................................................ 468
Introducing Exact Match Data Identifiers (EMDI) ............................... 468
About using EMDI to protect content ........................................ 469
About EMDI and key columns ................................................. 470
About EMDI policy features .................................................... 470
EMDI compared to EDM ........................................................ 471
About the Exact Match Data Identifier profile and index ................ 473
About the Exact Match Data Identifier source file ........................ 473
About cleansing the Exact Match Data Identifier source file ........... 474
About EMDI index scheduling ................................................. 475
Configuring Exact Match Data Identifier profiles ................................ 476
Creating the Exact Match Data Identifier source file ..................... 477
Preparing the Exact Match Data Identifier source for
indexing ....................................................................... 478
Uploading the Exact Match Data Identifier source files to the
Enforce Server ............................................................... 480
Adding Exact Match Data Identifier Profiles ................................ 482
Creating and modifying the Exact Match Data Identifier
profiles ......................................................................... 483
Scheduling EMDI profile indexing ............................................ 485
Associating data identifiers with your data source (EMDI) ............. 486
Adding an EMDI check to a built-in or custom data identifier
condition in a policy ........................................................ 487
Using multi-token matching with EMDI ............................................ 488
Characteristics of multi-token cells for EMDI .............................. 489
Multi-token with spaces for EMDI ............................................. 490
Multi-token with mixed language characters for EMDI .................. 490
Multi-token with punctuation for EMDI ....................................... 491
Additional examples for multi-token cells with punctuation for
EMDI ........................................................................... 492
Multi-token punctuation characters for EMDI .............................. 495
Proximity matching example for EMDI ...................................... 496
Memory requirements for EMDI ..................................................... 498
EMDI memory configuration and limitations ............................... 499
Overview of configuring memory and indexing the data source for
EMDI ........................................................................... 500
Determining requirements for both local indexers and remote
indexers for EMDI ........................................................... 500
Detection server memory requirements for EMDI ........................ 501
Increasing the memory for the detection server (File Reader) for
EMDI ........................................................................... 503
Profile size limitations on the DLP Agent for EMDI ...................... 504
Remote EMDI indexing ................................................................ 504
About the Remote EMDI Indexer ............................................. 505
About the SQL Preindexer and EMDI ....................................... 505
System requirements for remote EMDI indexing ......................... 505
Workflow for remote EMDI indexing ......................................... 506
About installing the Remote EMDI indexer ................................. 507
Creating an EMDI profile template for remote indexing ................. 508
Downloading and copying the EMDI profile file to a remote
system ......................................................................... 509
Generating remote index files for EMDI ..................................... 509
Remote EMDI indexing examples using data source file ............... 510
Remote EMDI Indexer command options .................................. 511
Remote EMDI indexing examples using the SQL Preindexer ......... 512
Copying and loading EMDI remote index files to the Enforce
Server .......................................................................... 513
Troubleshooting EMDI preindexing errors .................................. 514
Properties file settings for EMDI ..................................................... 515
Best practices for using EMDI ....................................................... 517
Never use a personal identifier as an optional column in
EMDI ........................................................................... 519
Use three or more columns in a match for EMDI ......................... 519
Don’t use EMDI validators as both optional and required for a
given data identifier in a policy .......................................... 519
Use additional validators with EMDI where possible ..................... 519
Limit the required number of columns to two or three for
EMDI ........................................................................... 519
When matching with only a single optional column, avoid adding
low-variability values as optional columns with EMDI ............. 519
Use full disk encryption on EMDI endpoint deployments ............... 519
Cleanse the EMDI data source file of blank columns and duplicate
rows ............................................................................ 519
Remove ambiguous character types from the EMDI data source
file ............................................................................... 520
Clean up your EMDI data source for multi-token matching ............ 521
Do not use the comma delimiter if the EMDI data source has
number fields ................................................................. 521
Ensure that the EMDI data source is clean for indexing ................ 522
Include column headers as the first row of the EMDI data source
file ............................................................................... 522
Check the EMDI system alerts to tune profile accuracy ................ 522
Use scheduled indexing to automate EMDI profile updates ........... 523
EMDI Troubleshooting ................................................................. 523
The EMDI index doesn’t get published to the Endpoint
Agent ........................................................................... 523
The EMDI index doesn’t get published to the Endpoint Agent and
the EnabledOnAgents setting is true ................................... 523
A key column that is in an EMDI index doesn’t generate an incident
................................................................................... 524
EMDI generates an unexpectedly high number of false
positives ....................................................................... 524

Chapter 26 Detecting content using Exact Data Matching (EDM) ............ 525

Introducing Exact Data Matching (EDM) .......................................... 525
About using EDM to protect content ......................................... 526
EDM policy features .............................................................. 527
About the Exact Data Profile and index ..................................... 528
About the exact data source file ............................................... 529
About cleansing the exact data source file for EDM ..................... 530
About using System Fields for data source validation with
EDM ............................................................................ 530
About index scheduling for EDM .............................................. 531
About the Content Matches Exact Data From condition for
EDM ............................................................................ 532
About Data Owner Exception for EDM ...................................... 532
About profiled Directory Group Matching (DGM) for EDM ............. 533
About two-tier detection for EDM on the endpoint ........................ 533
About upgrading EDM deployments ......................................... 534
Configuring Exact Data profiles for EDM .......................................... 534
Creating the exact data source file for EDM ............................... 535
Creating the exact data source file for Data Owner Exception for
EDM ............................................................................ 536
Creating the exact data source file for profiled DGM for
EDM ............................................................................ 537
Preparing the exact data source file for indexing for EDM ............. 537
Uploading exact data source files for EDM to the Enforce
Server .......................................................................... 539
Creating and modifying Exact Data Profiles for EDM .................... 541
Mapping Exact Data Profile fields for EDM ................................. 545
Using system-provided pattern validators for EDM profiles ............ 547
Scheduling Exact Data Profile indexing for EDM ......................... 548
Managing and adding Exact Data Profiles for EDM ...................... 550
Configuring EDM policies ............................................................. 551
Configuring the Content Matches Exact Data policy condition for
EDM ............................................................................ 551
Configuring Data Owner Exception for EDM policy
conditions ..................................................................... 554
Configuring the Sender/User based on a Profiled Directory policy
condition for EDM ........................................................... 554
Configuring the Recipient based on a Profiled Directory policy
condition for EDM ........................................................... 555
About configuring natural language processing for Chinese,
Japanese, and Korean for EDM policies .............................. 556
Configuring Advanced Settings for EDM policies ......................... 557
Using multi-token matching with EDM ............................................. 560
Characteristics of multi-token cells (EDM) .................................. 560
Multi-token with spaces (EDM) ................................................ 561
Multi-token with stopwords (EDM) ............................................ 562
Multi-token with mixed language characters (EDM) ..................... 562
Multi-token with punctuation (EDM) .......................................... 563
Additional examples for multi-token cells with punctuation
(EDM) .......................................................................... 564
Some special use cases for system-recognized data patterns
(EDM) .......................................................................... 567
Multi-token punctuation characters (EDM) ................................. 569
Match count variant examples (EDM) ....................................... 570
Proximity matching example for EDM ....................................... 572
Updating EDM indexes to the latest version ..................................... 574
Update process using the Remote EDM Indexer ......................... 574
Update process using the Enforce Server for EDM ...................... 576
EDM index out-of-date error codes ........................................... 578
Memory requirements for EDM ...................................................... 579
About memory requirements for EDM ....................................... 579
Overview of configuring memory and indexing the data source for
EDM ............................................................................ 580
Determining requirements for both local and remote indexers for
EDM ............................................................................ 580
Detection server memory requirements for EDM ......................... 582
Increasing the memory for the detection server (File Reader) for
EDM ............................................................................ 584
Using the EDM Memory Requirements Spreadsheet ................... 585
Remote EDM indexing ................................................................. 585
About the Remote EDM Indexer .............................................. 586
About the SQL Preindexer for EDM .......................................... 586
System requirements for remote EDM indexing .......................... 587
Workflow for remote EDM indexing .......................................... 587
Installing the Remote EDM Indexer .......................................... 588
Creating an EDM profile template for remote indexing .................. 589
Downloading and copying the EDM profile file to a remote
system ......................................................................... 591
Generating remote index files for EDM ...................................... 591
Remote indexing examples using data source file (EDM) .............. 592
Remote indexing examples using SQL Preindexer (EDM) ............. 593
Copying and loading remote EDM index files to the Enforce
Server .......................................................................... 594
SQL Preindexer command options (EDM) ................................. 595
Remote EDM Indexer command options ................................... 597
Troubleshooting preindexing errors for EDM .............................. 598
Troubleshooting remote indexing errors for EDM ......................... 599
Best practices for using EDM ........................................................ 601
Ensure data source has at least one column of unique data
(EDM) .......................................................................... 602
Cleanse the data source file of blank columns and duplicate rows
(EDM) .......................................................................... 603
Remove ambiguous character types from the data source file
(EDM) .......................................................................... 604
Understand how multi-token cell matching functions (EDM) ........... 604
Do not use the comma delimiter if the data source has number
fields (EDM) .................................................................. 605
Map data source column to system fields to leverage validation
(EDM) .......................................................................... 605
Ensure that the data source is clean for indexing (EDM) ............... 605
Leverage EDM policy templates when possible .......................... 606
Include column headers as the first row of the data source file
(EDM) .......................................................................... 606
Check the system alerts to tune profile accuracy (EDM) ............... 607
Use stopwords to exclude common words from detection
(EDM) .......................................................................... 607
Use scheduled indexing to automate profile updates (EDM) .......... 607
Match on 3 columns in an EDM condition to increase detection
accuracy ....................................................................... 608
Leverage exception tuples to avoid false positives (EDM) ............. 609
Use a WHERE clause to detect records that meet specific criteria
(EDM) .......................................................................... 609
Use the minimum matches field to fine tune EDM rules ................ 610
Combine Data Identifiers with EDM rules to limit the impact of
two-tier detection ............................................................ 610
Include an email address field in the Exact Data Profile for profiled
DGM (EDM) .................................................................. 610
Use profiled DGM for Network Prevent for Web identity detection
(EDM) .......................................................................... 611

Chapter 27 Detecting content using Indexed Document Matching (IDM) ............ 612

Introducing Indexed Document Matching (IDM) ................................. 612
About using IDM .................................................................. 613
Supported forms of matching for IDM ....................................... 613
Types of IDM detection .......................................................... 614
About the Indexed Document Profile ........................................ 615
About the document data source ............................................. 616
About the indexing process .................................................... 616
About indexing remote documents ........................................... 617
About the server index files and the agent index files ................... 618
About index deployment and logging ........................................ 619
Using IDM to detect exact files ................................................ 620
Using IDM to detect exact and partial file contents ....................... 621
About using the Content Matches Document Signature policy
condition ....................................................................... 623
About white listing partial file contents ....................................... 624
Configuring IDM profiles and policy conditions .................................. 625
Preparing the document data source for indexing ........................ 625
White listing file contents to exclude from partial matching ............ 627
Manage and add Indexed Document Profiles ............................. 628
Creating and modifying Indexed Document Profiles ..................... 629
Configure endpoint partial content matching ............................... 632
Uploading a document archive to the Enforce Server ................... 633
Referencing a document archive on the Enforce Server ............... 634
Using local path on Enforce Server .......................................... 636
Using the remote SMB share option to index file shares ............... 637
Using the remote SMB share option to index SharePoint
documents .................................................................... 637
Filtering documents by file name ............................................. 640
Filtering documents by file size ................................................ 642
Scheduling document profile indexing ....................................... 643
Changing the default indexer properties .................................... 644
Enabling Agent IDM .............................................................. 645
Estimating endpoint memory use for agent IDM .......................... 646
Configuring the Content Matches Document Signature policy
condition ....................................................................... 646
Best practices for using IDM ......................................................... 648
Reindex IDM profiles after upgrade .......................................... 649
Do not compress files in the document source ............................ 649
Do not index empty documents ............................................... 649
Prefer partial matching over exact matching on the DLP
Agent ........................................................................... 650
Understand limitations of exact matching ................................... 650
Use white listing to exclude non-sensitive content from partial
matching ...................................................................... 651
Filter documents from indexing to reduce false positives ............... 652
Distinguish IDM exceptions from white listing and filtering ............. 652
Create separate profiles to index large document sources ............ 653
Use WebDAV or CIFS to index remote document data
sources ........................................................................ 653
Use scheduled indexing to keep profiles up to date ..................... 653
Use parallel IDM rules to tune match thresholds ......................... 654
Remote IDM indexing .................................................................. 655
About the Remote IDM Indexer ............................................... 655
Installing the Remote IDM Indexer .......................................... 656
Indexing the document data source using the GUI edition
(Windows only) .............................................................. 656
Scheduling remote indexing with the Remote IDM Indexer app
for Windows .................................................................. 659
Incremental indexing ............................................................. 661
Logging and troubleshooting ................................................... 662
Copying the preindex file to the Enforce Server host .................... 662
Loading the remote index file into the Enforce Server ................... 663

Chapter 28 Detecting content using Vector Machine Learning (VML) ............ 664

Introducing Vector Machine Learning (VML) ..................................... 664
About the Vector Machine Learning Profile ................................ 665
About the content you train ..................................................... 665
About the base accuracy from training percentage rates ............... 666
About the Similarity Threshold and Similarity Score ..................... 667
About using unaccepted VML profiles in policies ......................... 667
Configuring VML profiles and policy conditions ................................. 668
Creating new VML profiles ..................................................... 669
Working with the Current Profile and Temporary Workspace
tabs ............................................................................. 670
Uploading example documents for training ................................ 671
Training VML profiles ............................................................ 672
Adjusting the memory allocation .............................................. 675
Managing training set documents ............................................ 676
Managing VML profiles .......................................................... 677
Changing names and descriptions for VML profiles ..................... 679
Configuring the Detect using Vector Machine Learning Profile
condition ....................................................................... 679
Configuring VML policy exceptions ........................................... 680
Adjusting the Similarity Threshold ............................................ 681
Testing and tuning VML profiles ............................................... 682
Properties for configuring training ............................................ 683
Log files for troubleshooting VML training and policy
detection ...................................................................... 686
Best practices for using VML ......................................................... 687
When to use VML ................................................................. 688
When not to use VML ............................................................ 689
Recommendations for training set definition ............................... 689
Guidelines for training set sizing .............................................. 690
Recommendations for uploading documents for training ............... 691
Guidelines for profile sizing ..................................................... 691
Recommendations for accepting or rejecting a profile .................. 692
Guidelines for accepting or rejecting training results .................... 693
Recommendations for deploying profiles ................................... 694

Chapter 29 Detecting content using Form Recognition - Sensitive Image Recognition ..... 695

About Form Recognition detection .................................................. 695
How Form Recognition works ................................................. 696
Configuring Form Recognition detection .......................................... 696
Preparing a Form Recognition Gallery Archive ........................... 697
Configuring a Form Recognition profile ..................................... 698
Configuring the Form Recognition detection rule ......................... 699
Configuring the Form Recognition exception rule ........................ 700
Managing Form Recognition profiles ............................................... 700
Advanced server settings for Form Recognition ................................ 702
Viewing a Form Recognition incident .............................................. 703

Chapter 30 Detecting Content using OCR - Sensitive Image Recognition ............ 704

About content detection with OCR Sensitive Image Recognition ........... 705
Detection types supported for OCR extraction ............................ 705
File types supported for OCR extraction .................................... 705
About extracting images from Microsoft Office documents for OCR
and Form Recognition ..................................................... 706
OCR Server system requirements .................................................. 706
Using diagnostics for sizing OCR Server deployments ........................ 706
Creating a null policy to assist in OCR diagnostics for Discover
Servers ............................................................................... 708
Using the OCR Server Sizing Estimator spreadsheet ......................... 710
Setting up OCR Servers ............................................................... 710
Installing an OCR Sensitive Image Recognition license ...................... 711
Creating an OCR configuration ...................................................... 711
Using the OCR engine ................................................................. 713
More about languages and Dictionaries ........................................... 713
Specialized Dictionaries available for OCR content
extraction ...................................................................... 714
Languages supported for OCR extraction .................................. 714
Viewing OCR incidents in reports ................................................... 715
Advanced Server settings and Troubleshooting for Sensitive Image
Recognition content extraction ................................................. 715

Chapter 31 Detecting content using data identifiers ...................... 717


Introducing data identifiers ............................................................ 717
System-defined data identifiers ............................................... 718
Extending and customizing data identifiers ................................ 731
About data identifier configuration ............................................ 731
About data identifier breadths ................................................. 731
About optional validators for data identifiers ............................... 732
About data identifier patterns .................................................. 732
About pattern validators ......................................................... 733
About data normalizers .......................................................... 733
About cross-component matching ............................................ 733
About unique match counting .................................................. 734
Configuring data identifier policy conditions ...................................... 734
Workflow for configuring data identifier policies ........................... 734
Managing and adding data identifiers ....................................... 735
Editing data identifiers ........................................................... 736
Configuring the Content Matches data identifier condition ............. 737
Using data identifier breadths .................................................. 738
Selecting a data identifier breadth ............................................ 739
Using optional validators ........................................................ 762
Configuring optional validators ................................................ 763
Acceptable characters for optional validators .............................. 764
Using unique match counting .................................................. 775
Configuring unique match counting .......................................... 775
Modifying system data identifiers ................................................... 776
Cloning a system data identifier before modifying it ..................... 777
Editing pattern validator input .................................................. 778
List of pattern validators that accept input data ........................... 778
Editing keywords for international PII data identifiers .................... 779
List of keywords for international system data identifiers ............... 780
Updating policies to use the Randomized US SSN data
identifier ....................................................................... 810
Creating custom data identifiers ..................................................... 811
Workflow for creating custom data identifiers .............................. 812
Custom data identifier configuration ......................................... 814
Using the data identifier pattern language .................................. 814
Writing data identifier patterns to match data .............................. 817
Using pattern validators ......................................................... 818
Selecting pattern validators .................................................... 829
Selecting a data normalizer .................................................... 830
Creating custom script validators ............................................. 831
Configuring pre- and post-validators ......................................... 831
Best practices for using data identifiers ........................................... 833
Use data identifiers instead of regular expressions to improve
accuracy ....................................................................... 834
Clone system-defined data identifiers before modifying to preserve
original state .................................................................. 835
Modify data identifier definitions when you want tuning to apply
globally ........................................................................ 835
Consider using multiple breadths in parallel to detect different
severities of confidential data ............................................ 836
Avoid matching on the Envelope over HTTP to reduce false
positives ....................................................................... 836
Use the Randomized US SSN data identifier to detect SSNs ......... 836
Use unique match counting to improve accuracy and ease
remediation ................................................................... 837

Chapter 32 Detecting content using keyword matching ................ 838


Introducing keyword matching ....................................................... 838
About keyword matching for Chinese, Japanese, and Korean
(CJK) languages ............................................................ 839
About keyword proximity ........................................................ 840
Keyword matching syntax ...................................................... 840
Keyword matching examples .................................................. 841
Keyword matching examples for CJK languages ......................... 842
About updates to the Drug, Disease, and Treatment keyword
lists ............................................................................. 843
Configuring keyword matching ...................................................... 844
Configuring the Content Matches Keyword condition ................... 844
Enabling and using CJK token verification for server keyword
matching ...................................................................... 847
Updating the Drug, Disease, and Treatment keyword lists for your
HIPAA and Caldicott policies ............................................. 848
Best practices for using keyword matching ....................................... 849
Enable token verification on the server to reduce false positives
for CJK keyword detection ................................................ 850
Keep the keyword lists for your HIPAA and Caldicott policies up
to date ......................................................................... 850
Tune keywords lists for data identifiers to improve match
accuracy ....................................................................... 851
Use keyword matching to detect document metadata ................... 851
Use VML to generate and maintain large keyword
dictionaries ................................................................... 851

Chapter 33 Detecting content using regular expressions .............. 852


Introducing regular expression matching ......................................... 852
About the updated regular expression engine ................................... 853
About writing regular expressions ................................................... 853
Configuring the Content Matches Regular Expression condition ........... 854
Best practices for using regular expression matching ......................... 855
When to use regular expression matching ................................. 856
Use look ahead and look behind characters to improve regular
expression accuracy ....................................................... 856
Use regular expressions sparingly to support efficient
performance .................................................................. 857
Test regular expressions before deployment to improve
accuracy ....................................................................... 857

Chapter 34 Detecting content using classification matching ......... 858

Introducing classification matching ................................................. 858


Supported file types .................................................................... 859
How tag matching works .............................................................. 860
Configuring the Content Matches Classification condition .................... 863

Chapter 35 Detecting international language content ................... 866


Detecting non-English language content .......................................... 866
Best practices for detecting non-English language content .................. 867
Use international policy templates for policy creation ................... 867
Use custom keywords for system data identifiers ........................ 869
Enable token validation to match Chinese, Japanese, and Korean
keywords on the server .................................................... 899

Chapter 36 Detecting file properties .................................................. 900


Introducing file property detection ................................................... 900
About file type matching ......................................................... 900
About file format support for file type matching ........................... 901
About custom file type identification .......................................... 901
About file size matching ......................................................... 902
About file name matching ....................................................... 903
Configuring file property matching .................................................. 903
Configuring the Message Attachment or File Type Match
condition ....................................................................... 904

Configuring the Message Attachment or File Size Match
condition ....................................................................... 905
Configuring the Message Attachment or File Name Match
condition ....................................................................... 906
File name matching syntax ..................................................... 907
File name matching examples ................................................. 907
Enabling the Custom File Type Signature condition in the policy
console ........................................................................ 908
Configuring the Custom File Type Signature condition .................. 908
Best practices for using file property matching .................................. 909
Use compound file property rules to protect design and multimedia
files ............................................................................. 909
Do not use file type matching to detect content ........................... 910
Calculate file size properly to improve match accuracy ................. 910
Use expression patterns to match file names ............................. 910
Use scripts and plugins to detect custom file types ...................... 910

Chapter 37 Detecting network incidents ........................................... 912


Introducing protocol monitoring for network ...................................... 912
Configuring the Protocol Monitoring condition for network
detection ............................................................................. 913
Best practices for using network protocol matching ............................ 914
Use separate policies for specific protocols ................................ 914
Consider detection server network placement to support IP
address matching ........................................................... 914

Chapter 38 Detecting endpoint events .............................................. 915


Introducing endpoint event detection .............................................. 915
About endpoint protocol monitoring .......................................... 915
About endpoint destination monitoring ...................................... 916
About endpoint global application monitoring .............................. 916
About endpoint location detection ............................................ 917
About endpoint device detection .............................................. 917
Configuring endpoint event detection conditions ................................ 917
Configuring the Endpoint Monitoring condition ............................ 918
Configuring the Endpoint Location condition ............................... 919
Configuring the Endpoint Device Class or ID condition ................. 920
Gathering endpoint device IDs for removable devices .................. 921
Creating and modifying endpoint device configurations ................ 922
Best practices for using endpoint detection ...................................... 923

Chapter 39 Detecting described identities ........................................ 925


Introducing described identity matching ........................................... 925
Described identity matching examples ............................................ 925
Configuring described identity matching policy conditions .................... 926
About Reusable Sender/Recipient Patterns ............................... 927
Configuring the Sender/User Matches Pattern condition ............... 927
Configuring a Reusable Sender Pattern .................................... 929
Configuring the Recipient Matches Pattern condition ................... 930
Configuring a Reusable Recipient Pattern ................................. 931
Best practices for using described identity matching ........................... 932
Define precise identity patterns to match users ........................... 932
Specify email addresses exactly to improve accuracy .................. 933
Match domains instead of IP addresses to improve
accuracy ....................................................................... 933

Chapter 40 Detecting synchronized identities ................................. 935


Introducing synchronized Directory Group Matching (DGM) ................. 935
About two-tier detection for synchronized DGM ................................. 936
Configuring User Groups .............................................................. 936
Configuring synchronized DGM policy conditions .............................. 938
Configuring the Sender/User based on a Directory Server Group
condition ....................................................................... 939
Configuring the Recipient based on a Directory Server Group
condition ....................................................................... 940
Best practices for using synchronized DGM ..................................... 941
Refresh the directory on initial save of the User Group ................. 941
Distinguish synchronized DGM from other types of endpoint
detection ...................................................................... 941

Chapter 41 Detecting profiled identities ........................................... 942


Introducing profiled Directory Group Matching (DGM) ......................... 942
About two-tier detection for profiled DGM ......................................... 942
Configuring Exact Data profiles for DGM ......................................... 943
Configuring profiled DGM policy conditions ...................................... 944
Configuring the Sender/User based on a Profiled Directory
condition ....................................................................... 944
Configuring the Recipient based on a Profiled Directory
condition ....................................................................... 945
Best practices for using profiled DGM ............................................. 946
Follow EDM best practices when implementing profiled
DGM ............................................................................ 946

Include an email address field in the Exact Data Profile for profiled
DGM ............................................................................ 946
Use profiled DGM for Network Prevent for Web identity
detection ...................................................................... 947

Chapter 42 Using contextual attributes for Application Detection ..... 948
Introducing contextual attributes for cloud applications ....................... 948
Configuring contextual attribute conditions ....................................... 948
Contextual attribute categories ................................................ 949

Chapter 43 Supported file formats for detection ............................ 962


Overview of detection file format support ......................................... 962
Supported formats for file type identification ..................................... 964
Supported formats for content extraction ......................................... 980
Supported word-processing formats for content extraction ............ 980
Supported presentation formats for content extraction .................. 982
Supported spreadsheet formats for content extraction .................. 983
Supported text and markup formats for content extraction ............. 984
Supported email formats for content extraction ........................... 985
Supported CAD formats for content extraction ............................ 985
Supported graphics formats for content extraction ....................... 986
Supported database formats for content extraction ...................... 986
Other file formats supported for content extraction ....................... 986
Supported encapsulation formats for subfile extraction ....................... 987
Supported file formats for metadata extraction .................................. 989
About document metadata detection ........................................ 989
Enabling server metadata detection ......................................... 990
Enabling endpoint metadata detection ...................................... 990
Best practices for using metadata detection ............................... 991

Chapter 44 Supported Office Open XML formats for high-performance content extraction ..... 996
About high-performance content extraction for Office Open XML
formats ............................................................................... 996
Enabling high-performance content extraction for Office Open XML
files .................................................................................... 998
About metadata extraction for Office Open XML files .......................... 999
About subfile extraction for Office Open XML files ............................ 1000

Chapter 45 Library of system data identifiers ................................ 1004


Library of system data identifiers .................................................. 1013
ABA Routing Number ................................................................. 1013
ABA Routing Number wide breadth ........................................ 1013
ABA Routing Number medium breadth .................................... 1014
ABA Routing Number narrow breadth ..................................... 1014
Argentina Tax Identification Number .............................................. 1015
Argentina Tax Identification Number wide breadth ..................... 1016
Argentina Tax Identification Number medium breadth ................. 1016
Argentina Tax Identification Number narrow breadth .................. 1017
Australia Driver's License Number ................................................ 1018
Australia Driver's License Number wide breadth ........................ 1018
Australia Driver's License Number narrow breadth ..................... 1019
Australian Business Number ....................................................... 1020
Australian Business Number wide breadth ............................... 1020
Australian Business Number medium breadth ........................... 1021
Australian Business Number narrow breadth ............................ 1021
Australian Company Number ....................................................... 1022
Australian Company Number wide breadth .............................. 1023
Australian Company Number medium breadth .......................... 1023
Australian Company Number narrow breadth ........................... 1023
Australian Medicare Number ....................................................... 1024
Australian Medicare Number wide breadth ............................... 1024
Australian Medicare Number medium breadth .......................... 1025
Australian Medicare Number narrow breadth ............................ 1026
Australian Passport Number ........................................................ 1027
Australian Passport Number wide breadth ............................... 1027
Australian Passport Number narrow breadth ............................ 1028
Australian Tax File Number ......................................................... 1029
Australian Tax File Number wide breadth ................................. 1029
Australian Tax File Number narrow breadth .............................. 1029
Austria Passport Number ............................................................ 1030
Austria Passport Number wide breadth ................................... 1030
Austria Passport Number narrow breadth ................................ 1031
Austria Tax Identification Number ................................................. 1031
Austria Tax Identification Number wide breadth ......................... 1032
Austria Tax Identification Number narrow breadth ...................... 1032
Austria Value Added Tax (VAT) Number ......................................... 1033
Austria Value Added Tax (VAT) Number wide breadth ................. 1033
Austria Value Added Tax (VAT) Number medium breadth ............ 1034
Austria Value Added Tax (VAT) Number narrow breadth .............. 1035
Austrian Social Security Number .................................................. 1036

Austrian Social Security Number wide breadth .......................... 1036


Austrian Social Security Number medium breadth ..................... 1037
Austrian Social Security Number narrow breadth ....................... 1037
Belgian National Number ............................................................ 1039
Belgian National Number wide breadth ................................... 1040
Belgian National Number medium breadth ............................... 1040
Belgian National Number narrow breadth ................................. 1041
Belgium Driver's Licence Number ................................................. 1042
Belgium Driver's Licence Number wide breadth ........................ 1042
Belgium Driver's Licence Number narrow breadth ..................... 1043
Belgium Passport Number .......................................................... 1044
Belgium Passport Number wide breadth .................................. 1044
Belgium Passport Number narrow breadth ............................... 1044
Belgium Tax Identification Number ................................................ 1045
Belgium Tax Identification Number wide breadth ....................... 1045
Belgium Tax Identification Number narrow breadth .................... 1046
Belgium Value Added Tax (VAT) Number ....................................... 1047
Belgium Value Added Tax (VAT) Number wide breadth ............... 1048
Belgium Value Added Tax (VAT) Number medium breadth ........... 1048
Belgium Value Added Tax (VAT) Number narrow breadth ............ 1049
Brazilian Election Identification Number ......................................... 1049
Brazilian Election Identification Number wide breadth ................. 1050
Brazilian Election Identification Number medium breadth ............ 1051
Brazilian Election Identification Number narrow breadth .............. 1052
Brazilian National Registry of Legal Entities Number ........................ 1053
Brazilian National Registry of Legal Entities Number wide
breadth ....................................................................... 1054
Brazilian National Registry of Legal Entities Number medium
breadth ....................................................................... 1054
Brazilian National Registry of Legal Entities Number narrow
breadth ....................................................................... 1055
Brazilian Natural Person Registry Number (CPF) ............................. 1055
Brazilian Natural Person Registry Number wide breadth ............. 1056
Brazilian Natural Person Registry Number medium breadth ......... 1056
Brazilian Natural Person Registry Number narrow breadth ......... 1057
British Columbia Personal Healthcare Number ................................ 1058
British Columbia Personal Healthcare Number wide breadth .... 1058
British Columbia Personal Healthcare Number medium
breadth ....................................................................... 1058
British Columbia Personal Healthcare Number narrow
breadth ....................................................................... 1059
Bulgaria Value Added Tax (VAT) Number ....................................... 1060

Bulgaria Value Added Tax (VAT) Number wide breadth ............... 1061
Bulgaria Value Added Tax (VAT) Number medium breadth .......... 1061
Bulgaria Value Added Tax (VAT) Number narrow breadth ............ 1062
Bulgarian Uniform Civil Number - EGN .......................................... 1063
Bulgarian Uniform Civil Number - EGN wide breadth .................. 1063
Bulgarian Uniform Civil Number - EGN medium breadth ............. 1064
Bulgarian Uniform Civil Number - EGN narrow breadth ............... 1065
Burgerservicenummer ................................................................ 1066
Burgerservicenummer wide breadth ....................................... 1066
Burgerservicenummer narrow breadth .................................... 1066
Canada Driver's License Number ................................................. 1067
Canada Driver's License Number wide breadth ......................... 1067
Canada Driver's License Number medium breadth .................... 1068
Canada Driver's License Number narrow breadth ...................... 1069
Canada Passport Number .......................................................... 1070
Canada Passport Number wide breadth .................................. 1071
Canada Passport Number narrow breadth ............................... 1071
Canada Permanent Residence (PR) Number .................................. 1072
Canada Permanent Residence (PR) Number wide breadth ......... 1072
Canada Permanent Residence (PR) Number narrow
breadth ....................................................................... 1073
Canadian Social Insurance Number .............................................. 1074
Canadian Social Insurance Number wide breadth ...................... 1075
Canadian Social Insurance Number medium breadth ................. 1075
Canadian Social Insurance Number narrow breadth ................... 1076
Chilean National Identification Number .......................................... 1077
Chilean National Identification Number wide breadth .................. 1077
Chilean National Identification Number medium breadth ............. 1078
Chilean National Identification Number narrow breadth ............... 1078
China Passport Number ............................................................. 1079
China Passport Number wide breadth ..................................... 1080
China Passport Number narrow breadth .................................. 1080
Codice Fiscale .......................................................................... 1081
Codice Fiscale wide breadth ................................................. 1081
Codice Fiscale narrow breadth .............................................. 1082
Colombian Addresses ................................................................ 1082
Colombian Addresses wide breadth ........................................ 1083
Colombian Addresses narrow breadth ..................................... 1084
Colombian Cell Phone Number .................................................... 1085
Colombian Cell Phone Number wide breadth ............................ 1085
Colombian Cell Phone Number narrow breadth ......................... 1086
Colombian Personal Identification Number ..................................... 1088
Colombian Personal Identification Number wide breadth ............. 1088

Colombian Personal Identification Number narrow breadth .......... 1089


Colombian Tax Identification Number ............................................ 1090
Colombian Tax Identification Number wide breadth .................... 1090
Colombian Tax Identification Number narrow breadth ................. 1091
Credit Card Magnetic Stripe Data ................................................. 1092
Credit Card Number .................................................................. 1095
Credit Card Number wide breadth .......................................... 1095
Credit Card Number medium breadth ...................................... 1096
Credit Card Number narrow breadth ....................................... 1100
Croatia National Identification Number ........................................... 1104
Croatia National Identification Number wide breadth .................. 1105
Croatia National Identification Number medium breadth .............. 1105
Croatia National Identification Number narrow breadth ............... 1105
CUSIP Number ......................................................................... 1106
CUSIP Number wide breadth ................................................ 1107
CUSIP Number medium breadth ............................................ 1107
CUSIP Number narrow breadth ............................................. 1108
Cyprus Tax Identification Number ................................................. 1109
Cyprus Tax Identification Number wide breadth ......................... 1109
Cyprus Tax Identification Number medium breadth .................... 1109
Cyprus Tax Identification Number narrow breadth ...................... 1110
Cyprus Value Added Tax (VAT) Number ......................................... 1111
Cyprus Value Added Tax (VAT) Number wide breadth ................ 1111
Cyprus Value Added Tax (VAT) Number medium breadth ............ 1111
Cyprus Value Added Tax (VAT) Number narrow breadth ............. 1112
Czech Republic Driver's Licence Number ....................................... 1112
Czech Republic Driver's License Number wide breadth .............. 1113
Czech Republic Driver's License Number narrow breadth ........... 1113
Czech Republic Personal Identification Number .............................. 1114
Czech Republic Personal Identification Number wide
breadth ....................................................................... 1115
Czech Republic Personal Identification Number medium
breadth ....................................................................... 1115
Czech Republic Personal Identification Number narrow
breadth ....................................................................... 1116
Czech Republic Tax Identification Number ...................................... 1117
Czech Republic Tax Identification Number wide breadth ............. 1118
Czech Republic Tax Identification Number medium breadth ......... 1119
Czech Republic Tax Identification Number narrow breadth .......... 1120
Czech Republic Value Added Tax (VAT) Number ............................. 1121
Czech Republic Value Added Tax (VAT) Number wide
breadth ....................................................................... 1122

Czech Republic Value Added Tax (VAT) Number medium
breadth ....................................................................... 1123
Czech Republic Value Added Tax (VAT) Number narrow
breadth ....................................................................... 1124
Denmark Personal Identification Number ....................................... 1126
Denmark Personal Identification Number wide breadth ............... 1126
Denmark Personal Identification Number medium breadth .......... 1126
Denmark Personal Identification Number narrow breadth ............ 1127
Denmark Tax Identification Number .............................................. 1128
Denmark Tax Identification Number wide breadth ...................... 1128
Denmark Tax Identification Number medium breadth .................. 1129
Denmark Tax Identification Number narrow breadth ................... 1129
Denmark Value Added Tax (VAT) Number ...................................... 1130
Denmark Value Added Tax (VAT) Number wide breadth .............. 1131
Denmark Value Added Tax (VAT) Number medium breadth ......... 1131
Denmark Value Added Tax (VAT) Number narrow breadth ........... 1132
Driver's License Number – CA State ............................................ 1133
Driver's License Number – CA State wide breadth ..................... 1133
Driver's License Number – CA State medium breadth ................ 1134
Driver's License Number - FL, MI, MN States .................................. 1134
Driver's License Number - FL, MI, MN States wide breadth ......... 1135
Driver's License Number - FL, MI, MN States medium
breadth ....................................................................... 1135
Driver's License Number - IL State ............................................... 1136
Driver's License Number - IL State wide breadth ....................... 1136
Driver's License Number - IL State medium breadth ................... 1137
Driver's License Number - NJ State .............................................. 1138
Driver's License Number - NJ State wide breadth ...................... 1138
Driver's License Number - NJ State medium breadth ................. 1138
Driver's License Number - NY State .............................................. 1139
Driver's License Number - NY State wide breadth ..................... 1139
Driver's License Number - NY State medium breadth ................. 1140
Driver's License Number - WA State ............................................. 1140
Driver's License Number - WA State wide breadth ..................... 1141
Driver's License Number - WA State medium breadth ................ 1141
Driver's License Number - WA State narrow breadth .................. 1142
Driver's License Number - WI State .............................................. 1142
Driver's License Number - WI State wide breadth ...................... 1143
Driver's License Number - WI State medium breadth .................. 1143
Driver's License Number - WI State narrow breadth ................... 1144
Drug Enforcement Agency (DEA) Number ...................................... 1145
Drug Enforcement Agency (DEA) Number wide breadth ............. 1145
Drug Enforcement Agency (DEA) Number medium breadth ......... 1146

Drug Enforcement Agency (DEA) Number narrow breadth .......... 1146


Estonia Driver's Licence Number .................................................. 1147
Estonia Driver's Licence Number wide breadth ......................... 1147
Estonia Driver's Licence Number narrow breadth ...................... 1148
Estonia Passport Number ........................................................... 1149
Estonia Passport Number wide breadth ................................... 1149
Estonia Passport Number narrow breadth ................................ 1150
Estonia Personal Identification Code ............................................. 1151
Estonia Personal Identification Code wide breadth ..................... 1152
Estonia Personal Identification Code medium breadth ................ 1152
Estonia Personal Identification Code narrow breadth .................. 1153
Estonia Value Added Tax (VAT) Number ........................................ 1153
Estonia Value Added Tax (VAT) Number wide breadth ................ 1154
Estonia Value Added Tax (VAT) Number medium breadth ........... 1154
Estonia Value Added Tax (VAT) Number narrow breadth ............. 1155
European Health Insurance Card Number ...................................... 1156
European Health Insurance Card Number wide breadth .............. 1156
European Health Insurance Card Number narrow breadth ........... 1160
Finland Driver's Licence Number .................................................. 1165
Finland Driver's Licence Number wide breadth ......................... 1166
Finland Driver's Licence Number medium breadth ..................... 1166
Finland Driver's Licence Number narrow breadth ...................... 1166
Finland European Health Insurance Number .................................. 1167
Finland European Health Insurance Number wide breadth .......... 1168
Finland European Health Insurance Number narrow breadth ....... 1168
Finland Passport Number ........................................................... 1169
Finland Passport Number wide breadth ................................... 1170
Finland Passport Number narrow breadth ................................ 1170
Finland Tax Identification Number ................................................. 1171
Finland Tax Identification Number wide breadth ........................ 1171
Finland Tax Identification Number medium breadth .................... 1172
Finland Tax Identification Number narrow breadth ..................... 1172
Finland Value Added Tax (VAT) Number ........................................ 1173
Finland Value Added Tax (VAT) Number wide breadth ................ 1173
Finland Value Added Tax (VAT) Number medium breadth ............ 1174
Finland Value Added Tax (VAT) Number narrow breadth ............. 1175
Finnish Personal Identification Number .......................................... 1175
Finnish Personal Identification Number wide breadth ................. 1176
Finnish Personal Identification Number medium breadth ............. 1176
Finnish Personal Identification Number narrow breadth .............. 1176
France Driver's License Number .................................................. 1177
France Driver's License Number wide breadth .......................... 1178
France Driver's License Number narrow breadth ....................... 1178

France Health Insurance Number ................................................. 1179


France Health Insurance Number wide breadth ......................... 1179
France Health Insurance Number narrow breadth ...................... 1180
France Tax Identification Number ................................................. 1181
France Tax Identification Number wide breadth ......................... 1181
France Tax Identification Number narrow breadth ...................... 1181
France Value Added Tax (VAT) Number ......................................... 1182
France Value Added Tax (VAT) Number wide breadth ................. 1182
France Value Added Tax (VAT) Number medium breadth ............ 1183
France Value Added Tax (VAT) Number narrow breadth .............. 1184
French INSEE Code .................................................................. 1185
French INSEE Code wide breadth .......................................... 1185
French INSEE Code narrow breadth ....................................... 1186
French Passport Number ............................................................ 1187
French Passport Number wide breadth ................................... 1187
French Passport Number narrow breadth ................................ 1187
French Social Security Number .................................................... 1188
French Social Security Number wide breadth ........................... 1188
French Social Security Number medium breadth ....................... 1189
French Social Security Number narrow breadth ........................ 1189
German Passport Number .......................................................... 1190
German Passport Number wide breadth .................................. 1190
German Passport Number medium breadth ............................. 1191
German Passport Number narrow breadth ............................... 1191
German Personal ID Number ...................................................... 1192
German Personal ID Number wide breadth .............................. 1192
German Personal ID Number medium breadth .......................... 1193
German Personal ID Number narrow breadth ........................... 1193
Germany Driver's License Number ............................................... 1194
Germany Driver's License Number wide breadth ....................... 1194
Germany Driver's License Number narrow breadth .................... 1195
Germany Value Added Tax (VAT) Number ...................................... 1196
Germany Value Added Tax (VAT) Number wide breadth .............. 1196
Germany Value Added Tax (VAT) Number medium breadth ......... 1196
Germany Value Added Tax (VAT) Number narrow breadth ........... 1197
Germany Tax Identification Number .............................................. 1198
Germany Tax Identification Number wide breadth ...................... 1198
Germany Tax Identification Number medium breadth ................. 1199
Germany Tax Identification Number narrow breadth ................... 1199
Greece Passport Number ........................................................... 1200
Greece Passport Number wide breadth ................................... 1201
Greece Passport Number narrow breadth ................................ 1201
Greece Social Security Number (AMKA) ........................................ 1202
Greece Social Security Number (AMKA) wide breadth ................ 1202
Greece Social Security Number (AMKA) medium breadth ........... 1203
Greece Social Security Number (AMKA) narrow breadth ............. 1203
Greek Tax Identification Number .................................................. 1204
Greek Tax Identification Number wide breadth .......................... 1204
Greek Tax Identification Number medium breadth ...................... 1205
Greek Tax Identification Number narrow breadth ....................... 1205
Greece Value Added Tax (VAT) Number ........................................ 1206
Greece Value Added Tax (VAT) Number wide breadth ................ 1207
Greece Value Added Tax (VAT) Number medium breadth ............ 1207
Greece Value Added Tax (VAT) Number narrow breadth ............. 1208
Healthcare Common Procedure Coding System (HCPCS CPT
Code) ............................................................................... 1208
Healthcare Common Procedure Coding System (HCPCS CPT
Code) medium breadth .................................................. 1209
Healthcare Common Procedure Coding System (HCPCS CPT
Code) narrow breadth .................................................... 1210
Health Insurance Claim Number ................................................... 1212
Health Insurance Claim Number wide breadth .......................... 1212
Health Insurance Claim Number medium breadth ...................... 1213
Health Insurance Claim Number narrow breadth ....................... 1214
Hong Kong ID .......................................................................... 1215
Hong Kong ID wide breadth .................................................. 1216
Hong Kong ID narrow breadth ............................................... 1216
Hungary Driver's Licence Number ................................................ 1217
Hungary Driver's Licence Number wide breadth ........................ 1218
Hungary Driver's Licence Number narrow breadth ..................... 1218
Hungary Passport Number .......................................................... 1219
Hungary Passport Number wide breadth ................................. 1220
Hungary Passport Number medium breadth ............................. 1220
Hungary Passport Number narrow breadth .............................. 1220
Hungarian Social Security Number ............................................... 1221
Hungarian Social Security Number wide breadth ....................... 1222
Hungarian Social Security Number medium breadth .................. 1222
Hungarian Social Security Number narrow breadth .................... 1222
Hungarian Tax Identification Number ............................................. 1223
Hungarian Tax Identification Number wide breadth .................... 1224
Hungarian Tax Identification Number medium breadth ................ 1224
Hungarian Tax Identification Number narrow breadth ................. 1224
Hungarian VAT Number .............................................................. 1225
Hungarian VAT Number wide breadth ..................................... 1226
Hungarian VAT Number medium breadth ................................. 1226
Hungarian VAT Number narrow breadth .................................. 1226
IBAN Central ............................................................................ 1227
IBAN Central wide breadth ................................................... 1228
IBAN Central narrow breadth ................................................ 1229
IBAN East ............................................................................... 1231
IBAN East wide breadth ....................................................... 1232
IBAN East narrow-breadth .................................................... 1234
IBAN West ............................................................................... 1237
IBAN West wide breadth ...................................................... 1237
IBAN West narrow-breadth ................................................... 1239
Iceland National Identification Number ........................................... 1241
Iceland National Identification Number wide breadth .................. 1242
Iceland National Identification Number medium breadth .............. 1243
Iceland National Identification Number narrow breadth ............... 1244
Iceland Passport Number ........................................................... 1245
Iceland Passport Number wide breadth ................................... 1245
Iceland Passport Number narrow breadth ................................ 1246
Iceland Value Added Tax (VAT) Number ......................................... 1247
Iceland Value Added Tax (VAT) Number wide breadth ................ 1247
Iceland Value Added Tax (VAT) Number narrow breadth ............. 1248
Indian Aadhaar Card Number ...................................................... 1249
Indian Aadhaar Card Number wide breadth .............................. 1249
Indian Aadhaar Card Number medium breadth ......................... 1249
Indian Aadhaar Card Number narrow breadth ........................... 1250
Indian Permanent Account Number .............................................. 1251
Indian Permanent Account Number wide breadth ...................... 1251
Indian Permanent Account Number narrow breadth ................... 1252
India RuPay Card Number .......................................................... 1252
India RuPay Card Number wide breadth .................................. 1253
India RuPay Card Number medium breadth ............................. 1253
India RuPay Card Number narrow breadth ............................... 1254
Indonesian Identity Card Number ................................................. 1255
Indonesian Identity Card Number wide breadth ......................... 1256
Indonesian Identity Card Number medium breadth .................... 1256
Indonesian Identity Card Number narrow breadth ...................... 1256
International Mobile Equipment Identity Number .............................. 1257
International Mobile Equipment Identity Number wide
breadth ....................................................................... 1258
International Mobile Equipment Identity Number medium
breadth ....................................................................... 1258
International Mobile Equipment Identity Number narrow
breadth ....................................................................... 1259
International Securities Identification Number .................................. 1259
International Securities Identification Number wide breadth ......... 1260
International Securities Identification Number medium
breadth ....................................................................... 1260
International Securities Identification Number narrow
breadth ....................................................................... 1260
IP Address ............................................................................... 1261
IP Address wide breadth ...................................................... 1261
IP Address medium breadth .................................................. 1262
IP Address narrow breadth ................................................... 1263
IPv6 Address ........................................................................... 1263
IPv6 Address wide breadth ................................................... 1264
IPv6 Address medium breadth ............................................... 1264
IPv6 Address narrow breadth ................................................ 1265
Ireland Passport Number ............................................................ 1266
Ireland Passport Number wide breadth .................................... 1266
Ireland Passport Number narrow breadth ................................. 1267
Ireland Tax Identification Number ................................................. 1268
Ireland Tax Identification Number wide breadth ......................... 1268
Ireland Tax Identification Number medium breadth ..................... 1269
Ireland Tax Identification Number narrow breadth ...................... 1270
Ireland Value Added Tax (VAT) Number ......................................... 1271
Ireland Value Added Tax (VAT) Number wide breadth ................. 1272
Ireland Value Added Tax (VAT) Number medium breadth ............ 1273
Ireland Value Added Tax (VAT) Number narrow breadth .............. 1273
Irish Personal Public Service Number ............................................ 1274
Irish Personal Public Service Number wide breadth ................... 1275
Irish Personal Public Service Number medium breadth ............... 1275
Irish Personal Public Service Number narrow breadth ................ 1276
Israel Personal Identification Number ............................................ 1276
Israel Personal Identification Number wide breadth .................... 1277
Israel Personal Identification Number medium breadth ............... 1277
Israel Personal Identification Number narrow breadth ................. 1277
Italy Driver's Licence Number ...................................................... 1278
Italy Driver's Licence Number wide breadth .............................. 1279
Italy Driver's Licence Number narrow breadth ........................... 1279
Italy Health Insurance Number ..................................................... 1280
Italy Health Insurance Number wide breadth ............................ 1280
Italy Health Insurance Number narrow breadth ......................... 1281
Italy Passport Number ................................................................ 1282
Italy Passport Number wide breadth ....................................... 1282
Italy Passport Number narrow breadth .................................... 1282
Italy Value Added Tax (VAT) Number ............................................. 1283
Italy Value Added Tax (VAT) Number wide breadth .................... 1283
Italy Value Added Tax (VAT) Number medium breadth ................ 1284
Italy Value Added Tax (VAT) Number narrow breadth ................. 1285
Japan Driver's License Number ................................................... 1285
Japan Driver's License Number wide breadth ........................... 1286
Japan Driver's License Number medium breadth ....................... 1286
Japan Driver's License Number narrow breadth ........................ 1286
Japan Passport Number ............................................................. 1287
Japan Passport Number wide breadth ..................................... 1287
Japan Passport Number narrow breadth .................................. 1288
Japanese Juki-Net Identification Number ....................................... 1289
Japanese Juki-Net Identification Number wide breadth ............... 1289
Japanese Juki-Net Identification Number medium breadth .......... 1290
Japanese Juki-Net Identification Number narrow breadth ............ 1290
Japanese My Number - Corporate ................................................ 1291
Japanese My Number - Corporate wide breadth ........................ 1291
Japanese My Number - Corporate narrow breadth ..................... 1292
Japanese My Number - Personal ................................................. 1292
Japanese My Number - Personal wide breadth ......................... 1293
Japanese My Number - Personal medium breadth ..................... 1293
Japanese My Number - Personal narrow breadth ...................... 1294
Kazakhstan Passport Number ..................................................... 1295
Kazakhstan Passport Number wide breadth ............................. 1295
Kazakhstan Passport Number narrow breadth .......................... 1296
Korea Passport Number ............................................................. 1296
Korea Passport Number wide breadth ..................................... 1297
Korea Passport Number narrow breadth .................................. 1297
Korea Residence Registration Number for Foreigners ...................... 1298
Korea Residence Registration Number for Foreigners wide
breadth ....................................................................... 1298
Korea Residence Registration Number for Foreigners medium
breadth ....................................................................... 1299
Korea Residence Registration Number for Foreigners narrow
breadth ....................................................................... 1299
Korea Residence Registration Number for Korean ........................... 1300
Korea Residence Registration Number for Korean wide
breadth ....................................................................... 1301
Korea Residence Registration Number for Korean medium
breadth ....................................................................... 1301
Korea Residence Registration Number for Korean narrow
breadth ....................................................................... 1302
Latvia Driver's Licence Number .................................................... 1303
Latvia Driver's Licence Number wide breadth ........................... 1303
Latvia Driver's Licence Number narrow breadth ........................ 1304
Latvia Passport Number ............................................................. 1305
Latvia Passport Number wide breadth ..................................... 1305
Latvia Passport Number narrow breadth .................................. 1305
Latvia Personal Identification Number ........................................... 1306
Latvia Personal Identification Number wide breadth ................... 1307
Latvia Personal Identification Number medium breadth ............... 1307
Latvia Personal Identification Number narrow breadth ................ 1307
Latvia Value Added Tax (VAT) Number .......................................... 1308
Latvia Value Added Tax (VAT) Number wide breadth .................. 1309
Latvia Value Added Tax (VAT) Number medium breadth ............. 1309
Latvia Value Added Tax (VAT) Number narrow breadth ............... 1310
Liechtenstein Passport Number ................................................... 1311
Liechtenstein Passport Number wide breadth ........................... 1311
Liechtenstein Passport Number narrow breadth ........................ 1312
Lithuania Personal Identification Number ....................................... 1312
Lithuania Personal Identification Number wide breadth ............... 1313
Lithuania Personal Identification Number medium breadth .......... 1314
Lithuania Personal Identification Number narrow breadth ............ 1314
Lithuania Tax Identification Number .............................................. 1315
Lithuania Tax Identification Number wide breadth ...................... 1315
Lithuania Tax Identification Number medium breadth .................. 1316
Lithuania Tax Identification Number narrow breadth ................... 1316
Lithuania Value Added Tax (VAT) Number ...................................... 1317
Lithuania Value Added Tax (VAT) Number wide breadth .............. 1318
Lithuania Value Added Tax (VAT) Number medium breadth ......... 1318
Lithuania Value Added Tax (VAT) Number narrow breadth ........... 1319
Luxembourg National Register of Individuals Number ....................... 1320
Luxembourg National Register of Individuals Number wide
breadth ....................................................................... 1320
Luxembourg National Register of Individuals Number medium
breadth ....................................................................... 1321
Luxembourg National Register of Individuals Number narrow
breadth ....................................................................... 1321
Luxembourg Passport Number .................................................... 1322
Luxembourg Passport Number wide breadth ............................ 1322
Luxembourg Passport Number narrow breadth ......................... 1323
Luxembourg Tax Identification Number .......................................... 1324
Luxembourg Tax Identification Number wide breadth .................. 1324
Luxembourg Tax Identification Number medium breadth ............. 1325
Luxembourg Tax Identification Number narrow breadth ............... 1326
Luxembourg Value Added Tax (VAT) Number .................................. 1327
Luxembourg Value Added Tax (VAT) Number wide breadth ......... 1328
Luxembourg Value Added Tax (VAT) Number medium
breadth ....................................................................... 1329
Luxembourg Value Added Tax (VAT) Number narrow
breadth ....................................................................... 1329
Macau National Identification Number ........................................... 1331
Macau National Identification Number wide breadth ................... 1331
Macau National Identification Number narrow breadth ................ 1332
Malaysia Passport Number ......................................................... 1333
Malaysia Passport Number wide breadth ................................. 1333
Malaysia Passport Number narrow breadth .............................. 1334
Malaysian MyKad Number (MyKad) .............................................. 1335
Malaysian MyKad Number (MyKad) wide breadth ...................... 1335
Malaysian MyKad Number (MyKad) medium breadth ................. 1336
Malaysian MyKad Number (MyKad) narrow breadth ................... 1336
Malta National Identification Number ............................................. 1337
Malta National Identification Number wide breadth ..................... 1338
Malta National Identification Number narrow breadth .................. 1338
Malta Tax Identification Number ................................................... 1339
Malta Tax Identification Number wide breadth ........................... 1339
Malta Tax Identification Number narrow breadth ........................ 1340
Malta Value Added Tax (VAT) Number ........................................... 1342
Malta Value Added Tax (VAT) Number wide breadth ................... 1342
Malta Value Added Tax (VAT) Number medium breadth .............. 1343
Malta Value Added Tax (VAT) Number narrow breadth ................ 1343
Medicare Beneficiary Identifier ..................................................... 1344
Medicare Beneficiary Identifier wide breadth ............................. 1345
Medicare Beneficiary Identifier medium breadth ........................ 1345
Medicare Beneficiary Identifier narrow breadth .......................... 1345
Mexican Personal Registration and Identification Number .................. 1346
Mexican Personal Registration and Identification Number wide
breadth ....................................................................... 1347
Mexican Personal Registration and Identification Number medium
breadth ....................................................................... 1347
Mexican Personal Registration and Identification Number narrow
breadth ....................................................................... 1348
Mexican Tax Identification Number ............................................... 1349
Mexican Tax Identification Number wide breadth ....................... 1349
Mexican Tax Identification Number medium breadth ................... 1350
Mexican Tax Identification Number narrow breadth .................... 1350
Mexican Unique Population Registry Code ..................................... 1351
Mexican Unique Population Registry Code wide breadth ............. 1352
Mexican Unique Population Registry Code medium breadth .... 1352
Mexican Unique Population Registry Code narrow breadth .......... 1352
Mexico CLABE Number .............................................................. 1353
Mexico CLABE Number wide breadth ..................................... 1353
Mexico CLABE Number medium breadth ................................. 1354
Mexico CLABE Number narrow breadth .................................. 1354
National Drug Code (NDC) .......................................................... 1355
National Drug Code (NDC) wide breadth ................................. 1355
National Drug Code (NDC) medium breadth ............................. 1356
National Drug Code (NDC) narrow breadth .............................. 1356
National Provider Identifier Number .............................................. 1357
National Provider Identifier Number wide breadth ...................... 1357
National Provider Identifier Number medium breadth .................. 1358
National Provider Identifier Number narrow breadth ................... 1358
Netherlands Bank Account Number .............................................. 1359
Netherlands Bank Account Number wide breadth ...................... 1360
Netherlands Bank Account Number medium breadth ................. 1360
Netherlands Bank Account Number narrow breadth ................... 1361
Netherlands Driver's License Number ........................................... 1362
Netherlands Driver's License Number wide breadth ................... 1362
Netherlands Driver's License Number narrow breadth ................ 1362
Netherlands Passport Number ..................................................... 1363
Netherlands Passport Number wide breadth ............................. 1363
Netherlands Passport Number narrow breadth .......................... 1364
Netherlands Tax Identification Number .......................................... 1364
Netherlands Tax Identification Number wide breadth .................. 1365
Netherlands Tax Identification Number medium breadth .............. 1365
Netherlands Tax Identification Number narrow breadth ............... 1366
Netherlands Value Added Tax (VAT) Number .................................. 1367
Netherlands Value Added Tax (VAT) Number wide breadth .......... 1368
Netherlands Value Added Tax (VAT) Number medium
breadth ....................................................................... 1368
Netherlands Value Added Tax (VAT) Number narrow
breadth ....................................................................... 1369
New Zealand Driver's Licence Number .......................................... 1370
New Zealand Driver's Licence Number wide breadth .................. 1370
New Zealand Driver's Licence Number narrow breadth ............... 1370
New Zealand National Health Index Number ................................... 1371
New Zealand National Health Index Number wide breadth .......... 1372
New Zealand National Health Index Number medium
breadth ....................................................................... 1372
New Zealand National Health Index Number narrow breadth ....... 1372
New Zealand Passport Number ................................................... 1373
New Zealand Passport Number wide breadth ........................... 1373
New Zealand Passport Number narrow breadth ........................ 1374
Norway Driver's Licence Number ................................................. 1375
Norway Driver's Licence Number wide breadth ......................... 1376
Norway Driver's Licence Number narrow breadth ...................... 1376
Norway National Identification Number .......................................... 1377
Norway National Identification Number wide breadth .................. 1377
Norway National Identification Number medium breadth ............. 1378
Norway National Identification Number narrow breadth ............... 1379
Norway Value Added Tax Number ................................................ 1379
Norway Value Added Tax Number wide breadth ........................ 1380
Norway Value Added Tax Number medium breadth ................... 1381
Norway Value Added Tax Number narrow breadth ..................... 1381
Norwegian Birth Number ............................................................ 1382
Norwegian Birth Number wide breadth .................................... 1382
Norwegian Birth Number medium breadth ................................ 1383
Norwegian Birth Number narrow breadth ................................. 1383
People's Republic of China ID ..................................................... 1384
People's Republic of China ID wide breadth ............................. 1385
People's Republic of China ID narrow breadth .......................... 1385
Poland Driver's Licence Number .................................................. 1386
Poland Driver's Licence Number wide breadth .......................... 1386
Poland Driver's Licence Number narrow breadth ....................... 1387
Poland European Health Insurance Number ................................... 1387
Poland European Health Insurance Number wide breadth ........... 1388
Poland European Health Insurance Number narrow breadth ........ 1388
Poland Passport Number ............................................................ 1389
Poland Passport Number wide breadth ................................... 1390
Poland Passport Number narrow breadth ................................ 1390
Poland Value Added Tax (VAT) Number ......................................... 1391
Poland Value Added Tax (VAT) Number wide breadth ................. 1392
Poland Value Added Tax (VAT) Number medium breadth ............ 1392
Poland Value Added Tax (VAT) Number narrow breadth .............. 1393
Polish Identification Number ........................................................ 1394
Polish Identification Number wide breadth ................................ 1394
Polish Identification Number medium breadth ........................... 1395
Polish Identification Number narrow breadth ............................. 1395
Polish REGON Number .............................................................. 1396
Polish REGON Number wide breadth ..................................... 1396
Polish REGON Number medium breadth ................................. 1397
Polish REGON Number narrow breadth .................................. 1397
Polish Social Security Number (PESEL) ........................................ 1398
Polish Social Security Number (PESEL) wide breadth ................ 1399
Polish Social Security Number (PESEL) medium breadth ............ 1399
Polish Social Security Number (PESEL) narrow breadth ............. 1399
Polish Tax Identification Number .................................................. 1400
Polish Tax Identification Number wide breadth .......................... 1401
Polish Tax Identification Number medium breadth ...................... 1401
Polish Tax Identification Number narrow breadth ....................... 1401
Portugal Driver's Licence Number ................................................ 1402
Portugal Driver's Licence Number wide breadth ........................ 1403
Portugal Driver's Licence Number narrow breadth ..................... 1403
Portugal National Identification Number ......................................... 1404
Portugal National Identification Number wide breadth ................. 1405
Portugal National Identification Number medium breadth ............ 1405
Portugal National Identification Number narrow breadth .............. 1406
Portugal Passport Number .......................................................... 1407
Portugal Passport Number wide breadth .................................. 1408
Portugal Passport Number narrow breadth ............................... 1408
Portugal Tax Identification Number ............................................... 1408
Portugal Tax Identification Number wide breadth ....................... 1409
Portugal Tax Identification Number medium breadth ................... 1409
Portugal Tax Identification Number narrow breadth .................... 1410
Portugal Value Added Tax (VAT) Number ....................................... 1411
Portugal Value Added Tax (VAT) Number wide breadth ............... 1412
Portugal Value Added Tax (VAT) Number medium breadth .......... 1412
Portugal Value Added Tax (VAT) Number narrow breadth ............ 1413
Randomized US Social Security Number (SSN) .............................. 1414
Randomized US Social Security Number (SSN) medium
breadth ....................................................................... 1415
Randomized US Social Security Number (SSN) narrow
breadth ....................................................................... 1415
Romania Driver's Licence Number ................................................ 1416
Romania Driver's Licence Number wide breadth ....................... 1417
Romania Driver's Licence Number narrow breadth .................... 1418
Romania National Identification Number ........................................ 1419
Romania National Identification Number wide breadth ................ 1419
Romania National Identification Number medium breadth ........... 1419
Romania National Identification Number narrow breadth ............. 1420
Romania Value Added Tax (VAT) Number ...................................... 1420
Romania Value Added Tax (VAT) Number wide breadth .............. 1421
Romania Value Added Tax (VAT) Number medium breadth ......... 1422
Romania Value Added Tax (VAT) Number narrow breadth ........... 1423
Romanian Numerical Personal Code ............................................ 1425
Romanian Numerical Personal Code wide breadth .................... 1425
Romanian Numerical Personal Code medium breadth ................ 1425
Romanian Numerical Personal Code narrow breadth ................. 1426
Russian Passport Identification Number ......................................... 1427
Russian Passport Identification Number wide breadth ................ 1427
Russian Passport Identification Number narrow breadth ............. 1427
Russian Taxpayer Identification Number ........................................ 1428
Russian Taxpayer Identification Number wide breadth ................ 1429
Russian Taxpayer Identification Number medium breadth ........... 1429
Russian Taxpayer Identification Number narrow breadth ............. 1429
SEPA Creditor Identifier Number North .......................................... 1430
SEPA Creditor Identifier Number North wide breadth .................. 1431
SEPA Creditor Identifier Number North medium breadth ............. 1433
SEPA Creditor Identifier Number North narrow breadth ............... 1435
SEPA Creditor Identifier Number South ......................................... 1437
SEPA Creditor Identifier Number South wide breadth ................. 1438
SEPA Creditor Identifier Number South medium breadth ............. 1439
SEPA Creditor Identifier Number South narrow breadth .............. 1439
SEPA Creditor Identifier Number West .......................................... 1441
SEPA Creditor Identifier Number West wide breadth .................. 1442
SEPA Creditor Identifier Number West medium breadth .............. 1443
SEPA Creditor Identifier Number West narrow breadth ............... 1443
Serbia Unique Master Citizen Number ........................................... 1445
Serbia Unique Master Citizen Number wide breadth .................. 1446
Serbia Unique Master Citizen Number medium breadth .............. 1446
Serbia Unique Master Citizen Number narrow breadth ............... 1447
Serbia Value Added Tax (VAT) Number .......................................... 1448
Serbia Value Added Tax (VAT) Number wide breadth ................. 1449
Serbia Value Added Tax (VAT) Number medium breadth ............. 1449
Serbia Value Added Tax (VAT) Number narrow breadth .............. 1450
Singapore NRIC data identifier ..................................................... 1451
Slovakia Driver's Licence Number ................................................ 1451
Slovakia Driver's Licence Number wide breadth ........................ 1452
Slovakia Driver's Licence Number narrow breadth ..................... 1452
Slovakia National Identification Number ......................................... 1453
Slovakia National Identification Number wide breadth ................. 1454
Slovakia National Identification Number medium breadth ............ 1455
Slovakia National Identification Number narrow breadth .............. 1455
Slovakia Passport Number .......................................................... 1457
Slovakia Passport Number wide breadth ................................. 1458
Slovakia Passport Number narrow breadth .............................. 1458
Slovakia Value Added Tax (VAT) Number ....................................... 1459
Slovakia Value Added Tax (VAT) Number wide breadth ............... 1460
Slovakia Value Added Tax (VAT) Number medium breadth .......... 1460
Slovakia Value Added Tax (VAT) Number narrow breadth ............ 1460
Slovenia Passport Number .......................................................... 1461
Slovenia Passport Number wide breadth ................................. 1462
Slovenia Passport Number narrow breadth .............................. 1462
Slovenia Tax Identification Number ............................................... 1463
Slovenia Tax Identification Number wide breadth ....................... 1463
Slovenia Tax Identification Number medium breadth .................. 1464
Slovenia Tax Identification Number narrow breadth .................... 1464
Slovenia Unique Master Citizen Number ........................................ 1465
Slovenia Unique Master Citizen Number wide breadth ................ 1465
Slovenia Unique Master Citizen Number medium breadth ........... 1466
Slovenia Unique Master Citizen Number narrow breadth ............. 1466
Slovenia Value Added Tax (VAT) Number ....................................... 1467
Slovenia Value Added Tax (VAT) Number wide breadth .............. 1468
Slovenia Value Added Tax (VAT) Number medium breadth .......... 1468
Slovenia Value Added Tax (VAT) Number narrow breadth ........... 1469
South African Personal Identification Number ................................. 1469
South African Personal Identification Number wide breadth ......... 1470
South African Personal Identification Number medium breadth ....... 1470
South African Personal Identification Number narrow breadth ....... 1471
South Korea Resident Registration Number .................................... 1471
South Korea Resident Registration Number wide breadth ........... 1472
South Korea Resident Registration Number medium breadth ........... 1472
South Korea Resident Registration Number narrow breadth ........ 1473
Spain Value Added Tax (VAT) Number ........................................... 1474
Spain Value Added Tax (VAT) Number wide breadth .................. 1474
Spain Value Added Tax (VAT) Number medium breadth .............. 1475
Spain Value Added Tax (VAT) Number narrow breadth ............... 1476
Spain Driver's Licence Number .................................................... 1477
Spain Driver's Licence Number wide breadth ............................ 1477
Spain Driver's Licence Number narrow breadth ......................... 1478
Spanish Customer Account Number ............................................. 1479
Spanish Customer Account Number wide breadth ..................... 1480
Spanish Customer Account Number medium breadth ................. 1480
Spanish Customer Account Number narrow breadth .................. 1481
Spanish DNI ID ......................................................................... 1481
Spanish DNI ID wide breadth ................................................ 1482
Spanish DNI ID narrow breadth ............................................. 1482
Spanish Passport Number .......................................................... 1483
Spanish Passport Number wide breadth .................................. 1483
Spanish Passport Number narrow breadth ............................... 1484
Spanish Social Security Number ................................................. 1485
Spanish Social Security Number wide breadth .......................... 1485
Spanish Social Security Number medium breadth ..................... 1486
Spanish Social Security Number narrow breadth ....................... 1486
Spanish Tax Identification (CIF) .................................................... 1487
Spanish Tax Identification (CIF) wide breadth ........................... 1487
Spanish Tax Identification (CIF) medium breadth ....................... 1488
Spanish Tax Identification (CIF) narrow breadth ........................ 1489
Sri Lanka National Identity Number ............................................... 1490
Sri Lanka National Identity Number wide breadth ...................... 1490
Sri Lanka National Identity Number medium breadth .................. 1491
Sri Lanka National Identity Number narrow breadth ................... 1491
Sweden Driver's Licence Number ................................................. 1492
Sweden Driver's Licence Number wide breadth ........................ 1493
Sweden Driver's Licence Number medium breadth .................... 1493
Sweden Driver's Licence Number narrow breadth ..................... 1493
Sweden Tax Identification Number ................................................ 1494
Sweden Tax Identification Number wide breadth ....................... 1495
Sweden Tax Identification Number medium breadth ................... 1495
Sweden Tax Identification Number narrow breadth .................... 1496
Sweden Value Added Tax (VAT) Number ....................................... 1496
Sweden Value Added Tax (VAT) Number wide breadth ............... 1497
Sweden Value Added Tax (VAT) Number medium breadth ........... 1497
Sweden Value Added Tax (VAT) Number narrow breadth ............ 1498
Swedish Passport Number .......................................................... 1499
Swedish Passport Number wide breadth ................................. 1499
Swedish Passport Number narrow breadth .............................. 1500
Sweden Personal Identification Number ......................................... 1501
Sweden Personal Identification Number wide breadth ................ 1501
Sweden Personal Identification Number medium breadth ........... 1502
Sweden Personal Identification Number narrow breadth ............. 1502
SWIFT Code ........................................................................... 1503
SWIFT Code wide breadth .................................................... 1503
SWIFT Code narrow breadth ................................................. 1504
Swiss AHV Number ................................................................... 1505
Swiss AHV Number wide breadth ........................................... 1506
Swiss AHV Number narrow breadth ........................................ 1506
Swiss Social Security Number (AHV) ............................................ 1507
Swiss Social Security Number (AHV) wide breadth .................... 1507
Swiss Social Security Number (AHV) medium breadth ............... 1508
Swiss Social Security Number (AHV) narrow breadth ................. 1508
Switzerland Health Insurance Card Number ................................... 1509
Switzerland Health Insurance Card Number wide breadth ........... 1510
Switzerland Health Insurance Card Number narrow breadth ........ 1510
Switzerland Passport Number ...................................................... 1511
Switzerland Passport Number wide breadth ............................. 1512
Switzerland Passport Number narrow breadth .......................... 1512
Switzerland Value Added Tax (VAT) Number ................................... 1513
Switzerland Value Added Tax (VAT) Number wide breadth .......... 1514
Switzerland Value Added Tax (VAT) Number medium breadth ........... 1514
Switzerland Value Added Tax (VAT) Number narrow breadth ........... 1515
Taiwan ROC ID ......................................................................... 1515
Taiwan ROC ID wide breadth ................................................ 1516
Taiwan ROC ID narrow breadth ............................................. 1516
Thailand Passport Number .......................................................... 1517
Thailand Passport Number wide breadth ................................. 1517
Thailand Passport Number narrow breadth .............................. 1518
Thailand Personal Identification Number ........................................ 1519
Thailand Personal Identification Number wide breadth ................ 1519
Thailand Personal Identification Number medium breadth ........... 1520
Thailand Personal Identification Number narrow breadth ............. 1520
Turkish Identification Number ...................................................... 1521
Turkish Identification Number wide breadth .............................. 1521
Turkish Identification Number medium breadth .......................... 1522
Turkish Identification Number narrow breadth ........................... 1522
UK Bank Account Number Sort Code ............................................ 1523
UK Bank Account Number Sort Code wide breadth .................... 1523
UK Bank Account Number Sort Code medium breadth ............... 1524
UK Bank Account Number Sort Code narrow breadth ................. 1524
UK Drivers Licence Number ........................................................ 1525
UK Drivers Licence Number wide breadth ................................ 1525
UK Drivers Licence Number medium breadth ........................... 1526
UK Drivers Licence Number narrow breadth ............................. 1526
UK Electoral Roll Number ........................................................... 1527
UK National Health Service (NHS) Number .................................... 1528
UK National Health Service (NHS) Number medium breadth ....... 1529
UK National Health Service (NHS) Number narrow breadth ......... 1529
UK National Insurance Number .................................................... 1530
UK National Insurance Number wide breadth ........................... 1531
UK National Insurance Number medium breadth ....................... 1531
UK National Insurance Number narrow breadth ........................ 1531
UK Passport Number ................................................................. 1532
UK Passport Number wide breadth ......................................... 1532
UK Passport Number medium breadth .................................... 1533
UK Passport Number narrow breadth ...................................... 1533
UK Tax ID Number .................................................................... 1534
UK Tax ID Number wide breadth ............................................ 1534
UK Tax ID Number medium breadth ....................................... 1535
UK Tax ID Number narrow breadth ......................................... 1535
UK Value Added Tax (VAT) Number .............................................. 1536
UK Value Added Tax (VAT) Number wide breadth ...................... 1536
UK Value Added Tax (VAT) Number medium breadth ................. 1537
UK Value Added Tax (VAT) Number narrow breadth ................... 1538
Ukraine Identity Card ................................................................. 1539
Ukraine Identity Card wide breadth ......................................... 1540
Ukraine Identity Card medium breadth .................................... 1540
Ukraine Identity Card narrow breadth ...................................... 1541
Ukraine Passport (Domestic) ....................................................... 1541
Ukraine Passport (Domestic) wide breadth ............................... 1542
Ukraine Passport (Domestic) narrow breadth ............................ 1542
Ukraine Passport (International) ................................................... 1543
Ukraine Passport (International) wide breadth ........................... 1543
Ukraine Passport (International) narrow breadth ........................ 1544
United Arab Emirates Personal Number ......................................... 1544
United Arab Emirates Personal Number wide breadth ............... 1545
United Arab Emirates Personal Number medium breadth ............ 1545
United Arab Emirates Personal Number narrow breadth ............. 1545
US Individual Tax Identification Number (ITIN) ................................ 1546
US Individual Tax Identification Number (ITIN) wide breadth ........ 1547
US Individual Tax Identification Number (ITIN) medium breadth ..... 1547
US Individual Tax Identification Number (ITIN) narrow breadth ..... 1548
US Passport Number ................................................................. 1548
US Passport Number wide breadth ......................................... 1549
US Passport Number narrow breadth ...................................... 1549
US Social Security Number (SSN) ................................................ 1550
US Social Security Number (SSN) wide breadth ........................ 1551
US Social Security Number (SSN) medium breadth ................... 1551
US Social Security Number (SSN) narrow breadth ..................... 1552
US ZIP+4 Postal Codes ............................................................. 1553
US ZIP+4 Postal Codes wide breadth ..................................... 1553
US ZIP+4 Postal Codes medium breadth ................................. 1554
US ZIP+4 Postal Codes narrow breadth .................................. 1554
Venezuela National Identification Number ...................................... 1555
Venezuela National Identification Number wide breadth .............. 1556
Venezuela National Identification Number medium breadth ......... 1556
Venezuela National Identification Number narrow breadth ........... 1556

Chapter 46 Library of policy templates ............................................ 1558


Caldicott Report policy template ................................................... 1561
Canadian Social Insurance Numbers policy template ........................ 1562
CAN-SPAM Act policy template .................................................... 1563
Colombian Personal Data Protection Law 1581 policy template .......... 1564
Common Spyware Upload Sites policy template .............................. 1564
Competitor Communications policy template ................................... 1565
Confidential Documents policy template ......................................... 1565
Credit Card Numbers policy template ............................................ 1566
Customer Data Protection policy template ...................................... 1567
Data Protection Act 1998 policy template ....................................... 1568
Data Protection Directives (EU) policy template ............................... 1570
Defense Message System (DMS) GENSER Classification policy template ... 1572
Design Documents policy template ............................................... 1573
Employee Data Protection policy template ...................................... 1574
Encrypted Data policy template .................................................... 1575
Export Administration Regulations (EAR) policy template .................. 1576
FACTA 2003 (Red Flag Rules) policy template ................................ 1577
Financial Information policy template ............................................. 1581
Forbidden Websites policy template .............................................. 1581
Gambling policy template ............................................................ 1582
General Data Protection Regulation (Banking and Finance) ............... 1583
General Data Protection Regulation (Digital Identity) ........................ 1617
General Data Protection Regulation (Government Identification) ......... 1618
General Data Protection Regulation (Healthcare and Insurance) ......... 1656
General Data Protection Regulation (Personal Profile) ...................... 1672
General Data Protection Regulation (Travel) ................................... 1675
Gramm-Leach-Bliley policy template ............................................. 1688
HIPAA and HITECH (including PHI) policy template ......................... 1690
Human Rights Act 1998 policy template ......................................... 1694
Illegal Drugs policy template ........................................................ 1695
Individual Taxpayer Identification Numbers (ITIN) policy template ........ 1695
International Traffic in Arms Regulations (ITAR) policy template .......... 1696
Media Files policy template ......................................................... 1697
Medicare and Medicaid (including PHI) .......................................... 1698
Merger and Acquisition Agreements policy template ......................... 1699
NASD Rule 2711 and NYSE Rules 351 and 472 policy template ......... 1700
NASD Rule 3010 and NYSE Rule 342 policy template ...................... 1702
NERC Security Guidelines for Electric Utilities policy template ............ 1703
Network Diagrams policy template ................................................ 1704
Network Security policy template .................................................. 1705
Offensive Language policy template .............................................. 1705
Office of Foreign Assets Control (OFAC) policy template ................... 1706
OMB Memo 06-16 and FIPS 199 Regulations policy template ............ 1707
Password Files policy template .................................................... 1709
Payment Card Industry (PCI) Data Security Standard policy template ... 1709
PIPEDA policy template ............................................................. 1711
Price Information policy template .................................................. 1713
Project Data policy template ........................................................ 1713
Proprietary Media Files policy template .......................................... 1713
Publishing Documents policy template ........................................... 1714
Racist Language policy template .................................................. 1715
Restricted Files policy template .................................................... 1715
Restricted Recipients policy template ............................................ 1715
Resumes policy template ............................................................ 1716
Sarbanes-Oxley policy template ................................................... 1716
SEC Fair Disclosure Regulation policy template .............................. 1719
Sexually Explicit Language policy template ..................................... 1721
Source Code policy template ....................................................... 1722
State Data Privacy policy template ................................................ 1723
SWIFT Codes policy template ...................................................... 1726
Symantec DLP Awareness and Avoidance policy template ................ 1726
UK Drivers License Numbers policy template .................................. 1727
UK Electoral Roll Numbers policy template ..................................... 1727
UK National Health Service (NHS) Number policy template ............... 1728
UK National Insurance Numbers policy template ............................. 1728
UK Passport Numbers policy template ........................................... 1728
UK Tax ID Numbers policy template .............................................. 1729
US Intelligence Control Markings (CAPCO) and DCID 1/7 policy template ... 1729
US Social Security Numbers policy template ................................... 1730
Violence and Weapons policy template .......................................... 1731
Webmail policy template ............................................................. 1731
Yahoo Message Board Activity policy template ................................ 1732
Yahoo and MSN Messengers on Port 80 policy template ................... 1733

Section 5 Configuring policy response rules ................. 1736


Chapter 47 Responding to policy violations .................................... 1737
About response rules ................................................................. 1738
About response rule actions ........................................................ 1738
Response rule actions for all detection servers ................................ 1739
Response rule actions for endpoint detection .................................. 1740
Response rule actions for Network Prevent detection ....................... 1741
Response rule actions for Network Protect detection ........................ 1742
Response rule actions for Cloud Storage detection .......................... 1743
Response rule actions for Cloud Applications and API appliance detectors ... 1744
About response rule execution types ............................................. 1750
About Automated Response rules ................................................ 1751
About Smart Response rules ....................................................... 1751
About response rule conditions .................................................... 1752
About response rule action execution priority .................................. 1753
About response rule authoring privileges ....................................... 1757
Implementing response rules ....................................................... 1758
Response rule best practices ...................................................... 1759

Chapter 48 Configuring and managing response rules ................ 1761


Manage response rules .............................................................. 1761
Adding a new response rule ........................................................ 1762
Configuring response rules ......................................................... 1763
About configuring Smart Response rules ....................................... 1764
Configuring response rule conditions ............................................ 1764
Configuring response rule actions ................................................ 1765
Modifying response rule ordering ................................................. 1769
About removing response rules .................................................... 1770

Chapter 49 Response rule conditions ............................................... 1771


Configuring the Endpoint Location response condition ...................... 1771
Configuring the Endpoint Device response condition ........................ 1772
Configuring the Incident Type response condition ............................ 1773
Configuring the Incident Match Count response condition .................. 1774
Configuring the Protocol or Endpoint Monitoring response condition ... 1775
Configuring the SEP Intensity Level response condition .................... 1777
Configuring the Severity response condition ................................... 1778

Chapter 50 Response rule actions ..................................................... 1780


Configuring the Add Note action ................................................... 1782
Configuring the Encrypt Smart Response action .............................. 1783
Configuring the Limit Incident Data Retention action ......................... 1783
Retaining data for endpoint incidents ...................................... 1784
Discarding data for network incidents ...................................... 1785
Configuring the Log to a Syslog Server action ................................. 1785
Configuring the Send Email Notification action ................................ 1786
Configuring the Server FlexResponse action .................................. 1788
Configuring the Set Attribute action ............................................... 1789
Configuring the Set Status action ................................................. 1790
Configuring the Cloud Storage: Add Visual Tag action ...................... 1791
Configuring the Cloud Storage: Quarantine action ............................ 1791
Configuring the Quarantine Smart Response action ......................... 1793
Configuring the Network Protect: SharePoint Quarantine smart response action ... 1793
Configuring the Network Protect: SharePoint Release from Quarantine smart response action ... 1795
Configuring the Remove Collaborator Access Smart Response action ... 1797
Configuring the Remove Shared Links Smart Response action ........... 1797
Configuring the Restore File Smart Response action ........................ 1797
Configuring the Custom Action on Data-at-Rest action ...................... 1798
Configuring the Delete Data-at-Rest action ..................................... 1799
Configuring the Encrypt Data-at-Rest action ................................... 1799
Configuring the Perform DRM on Data-at-Rest action ....................... 1800
Configuring the Quarantine Data-at-Rest action ............................... 1801
Configuring the Remove Shared Links in Data-at-Rest action ............. 1802
Configuring the Tag Data-at-Rest action ......................................... 1802
Configuring the Prevent download, copy, print action ........................ 1803
Configuring the Remove Collaborator Access action ........................ 1804
Configuring the Set Collaborator Access to 'Edit' action ..................... 1804
Configuring the Set Collaborator Access to 'Preview' action ............... 1805
Configuring the Set Collaborator Access to 'Read' action ................... 1805
Configuring the Set File Access to 'All Read' action .......................... 1806
Configuring the Set File Access to 'Internal Edit' .............................. 1806
Configuring the Set File Access to 'Internal Read' action ................... 1807
Configuring the Add two-factor authentication action ........................ 1808
Configuring the Block Data-in-Motion action ................................... 1808
Configuring the Custom Action on Data-in-Motion action ................... 1809
Configuring the Encrypt Data-in-Motion action ................................. 1810
Configuring the Perform DRM on Data-in-Motion action .................... 1810
Configuring the Quarantine Data-in-Motion action ............................ 1811
Configuring the Redact Data-in-Motion action ................................. 1812
Configuring the Endpoint: FlexResponse action ............................... 1813
Configuring the Endpoint: ICT Classification And Tagging action ......... 1814
Configuring the Endpoint Discover: Quarantine File action ................. 1815
Configuring the Endpoint Prevent: Block action ............................... 1817
Configuring the Endpoint Prevent: Encrypt action ............................ 1821
Configuring the Endpoint Prevent: Notify action ............................... 1825
Configuring the Endpoint Prevent: User Cancel action ...................... 1828
Configuring the Network Prevent for Web: Block FTP Request action ... 1831
Configuring the Network Prevent for Web: Block HTTP/S action ......... 1831
Configuring the Network Prevent: Block SMTP Message action .......... 1832
Configuring the Network Prevent: Modify SMTP Message action ........ 1833
Configuring the Network Prevent for Web: Remove HTTP/S Content action ... 1835
Configuring the Network Protect: Copy File action ............................ 1836
Configuring the Network Protect: Quarantine File action .................... 1837
Configuring the Network Protect: Encrypt File action ........................ 1838

Section 6 Remediating and managing incidents ......... 1840


Chapter 51 Remediating incidents .................................................... 1841
About incident remediation .......................................................... 1841
Remediating incidents ................................................................ 1844
Executing Smart response rules ................................................... 1845
Incident remediation action commands .......................................... 1845
Response action variables .......................................................... 1847
General incident variables .................................................... 1847
Network Monitor and Network Prevent incident variables ............ 1848
Discover incident variables ................................................... 1848
Endpoint incident variables ................................................... 1849
Application incident variables ................................................ 1849

Chapter 52 Remediating Network incidents ................................... 1851


Network incident list ................................................................... 1851
Network incident list—Actions ...................................................... 1854
Network incident list—Columns .................................................... 1856
Network incident snapshot .......................................................... 1857
Network incident snapshot—Heading and navigation ....................... 1857
Network incident snapshot—General information ............................. 1858
Network incident snapshot—Matches ............................................ 1860
Network incident snapshot—Attributes .......................................... 1861
Network summary report ............................................................ 1861

Chapter 53 Remediating Endpoint incidents .................................. 1863


About endpoint incident lists ........................................................ 1863
Endpoint incident snapshot ......................................................... 1866
Reporting on Endpoint Prevent response rules ................................ 1871
Endpoint incident destination or protocol-specific information ............. 1872
Endpoint incident summary reports ............................................... 1874

Chapter 54 Remediating Discover incidents ................................... 1876


About reports for Network Discover ............................................... 1876
About incident reports for Network Discover/Cloud Storage
Discover ........................................................................... 1877
Discover incident reports ............................................................ 1878
Discover incident lists ................................................................ 1879
Discover incident actions ............................................................ 1879
Discover incident entries ............................................................. 1880
Discover incident snapshot ......................................................... 1882
Discover summary reports .......................................................... 1885

Chapter 55 Working with Application incidents ............................. 1887


About Applications incident reports ............................................... 1887
Applications incident list ............................................................. 1889
Applications incident entries ........................................................ 1889
Applications incident actions ........................................................ 1891
Applications incident snapshot ..................................................... 1892
Applications summary reports ...................................................... 1896

Chapter 56 Managing and reporting incidents ............................... 1897


About Symantec Data Loss Prevention reports ................................ 1899
About strategies for using reports ................................................. 1900
Setting report preferences ........................................................... 1901
About incident reports ................................................................ 1902
About dashboard reports and executive summaries ......................... 1903
Viewing dashboards .................................................................. 1905
Creating dashboard reports ......................................................... 1906
Configuring dashboard reports ..................................................... 1907
Choosing reports to include in a dashboard .................................... 1909
About summary reports .............................................................. 1909
Viewing summary reports ........................................................... 1909
Creating summary reports ........................................................... 1910
Viewing incidents ...................................................................... 1911
About custom reports and dashboards .......................................... 1912
Using IT Analytics to manage incidents .......................................... 1913
Filtering reports ........................................................................ 1914
Saving custom incident reports .................................................... 1914
Scheduling custom incident reports ............................................... 1915
Delivery schedule options for incident and system reports ................. 1917
Delivery schedule options for dashboard reports ............................. 1919
Using the date widget to schedule reports ...................................... 1921
Editing custom dashboards and reports ......................................... 1921
Exporting incident reports ........................................................... 1921
Exported fields for Network Monitor .............................................. 1922
Exported fields for Network Discover/Cloud Storage Discover ............ 1923
Exported fields for Endpoint Discover ............................................ 1924
Deleting incidents ..................................................................... 1925
About the incident deletion process ........................................ 1926
Configuring the incident deletion job schedule .......................... 1927
Starting and stopping incident deletion jobs .............................. 1927
Working with the deletion jobs history ..................................... 1928
About automatically flagging incidents for deletion ..................... 1929
About creating incident reports for automatic incident deletion
flagging ...................................................................... 1930
Configuring automatic incident deletion flagging ........................ 1931
Managing automatic incident deletion flagging .......................... 1931
Troubleshooting automatic incident deletion flagging .................. 1932
Deleting custom dashboards and reports ....................................... 1932
Common incident report features .................................................. 1933
Page navigation in incident reports ............................................... 1934
Incident report filter and summary options ...................................... 1934
Sending incident reports by email ................................................. 1935
Printing incident reports .............................................................. 1936
Incident snapshot history tab ....................................................... 1936
Incident snapshot notes tab ......................................................... 1937
Incident snapshot attributes section .............................................. 1937
Incident snapshot correlations tab ................................................ 1937
Incident snapshot policy section ................................................... 1938
Incident snapshot matches section ............................................... 1938
Incident snapshot access information section .................................. 1938
Customizing incident snapshot pages ........................................... 1939
About filters and summary options for reports ................................. 1940
General filters for reports ............................................................ 1941
Summary options for incident reports ............................................ 1944
Advanced filter options for reports ................................................ 1949

Chapter 57 Hiding incidents ............................................................... 1958


About incident hiding ................................................................. 1958
Hiding incidents ....................................................................... 1959
Unhiding hidden incidents ........................................................... 1959
Preventing incidents from being hidden ......................................... 1960
Deleting hidden incidents ............................................................ 1961

Chapter 58 Working with incident data ........................................... 1962


About incident status attributes .................................................... 1962
Configuring status attributes and values ......................................... 1964
Configuring status groups ........................................................... 1965
Export web archive .................................................................... 1966
Export web archive—Create Archive ............................................. 1966
Export web archive—All Recent Events ......................................... 1967
About custom attributes .............................................................. 1968
About using custom attributes ...................................................... 1969
How custom attributes are populated ............................................ 1970
Configuring custom attributes ...................................................... 1970
Setting custom attributes ............................................................ 1971
Setting the values of custom attributes manually .............................. 1972

Chapter 59 Working with user risk .................................................... 1973


About user risk ......................................................................... 1973
About user data sources ............................................................. 1975
Defining custom attributes for user data ................................... 1976
Bringing in user data ........................................................... 1977
About identifying users in web incidents ........................................ 1981
Enabling user identification and configuring the mapping
schedule ..................................................................... 1982
Checking the status of the domain controllers ........................... 1983
Viewing the user list ................................................................... 1983
Viewing user details ................................................................... 1984
Working with the user risk summary .............................................. 1984

Chapter 60 Implementing lookup plug-ins ...................................... 1986


About lookup plug-ins ................................................................ 1986
Types of lookup plug-ins ....................................................... 1987
About lookup parameters ..................................................... 1990
About plug-in deployment ..................................................... 1991
About plug-in chaining ......................................................... 1991
About upgrading lookup plug-ins ............................................ 1991
Implementing and testing lookup plug-ins ....................................... 1992
Managing and configuring lookup plug-ins ............................... 1994
Creating new lookup plug-ins ................................................ 1995
Selecting lookup parameters ................................................. 1996
Enabling lookup plug-ins ...................................................... 2001
Chaining lookup plug-ins ...................................................... 2002
Reloading lookup plug-ins .................................................... 2002
Troubleshooting lookup plug-ins ............................................. 2003
Configuring detailed logging for lookup plug-ins ........................ 2004
Configuring advanced plug-in properties .................................. 2005
Configuring the CSV Lookup Plug-In ............................................. 2006
Requirements for creating the CSV file .................................... 2008
Specifying the CSV file path .................................................. 2009
Choosing the CSV file delimiter ............................................. 2009
Selecting the CSV file character set ........................................ 2009
Mapping attributes and parameter keys to CSV fields ................. 2010
CSV attribute mapping example ............................................. 2011
Testing and troubleshooting the CSV Lookup Plug-In ................ 2012
CSV Lookup Plug-In tutorial .................................................. 2013
Configuring LDAP Lookup Plug-Ins ............................................... 2015
Requirements for LDAP server connections ............................. 2016
Mapping attributes to LDAP data ............................................ 2017
Attribute mapping examples for LDAP ..................................... 2018
Testing and troubleshooting LDAP Lookup Plug-ins ................... 2018
LDAP Lookup Plug-In tutorial ................................................ 2019
Configuring Script Lookup Plug-Ins ............................................... 2020
Writing scripts for Script Lookup Plug-Ins ................................. 2021
Specifying the Script Command ............................................. 2022
Specifying the Arguments ..................................................... 2023
Enabling the stdin and stdout options ...................................... 2023
Enabling incident protocol filtering for scripts ............................ 2024
Enabling and encrypting script credentials ............................... 2025
Chaining multiple Script Lookup Plug-Ins ................................. 2027
Script Lookup Plug-In tutorial ................................................ 2027
Example script ................................................................... 2029
Configuring migrated Custom (Legacy) Lookup Plug-Ins ................... 2031

Section 7 Monitoring and preventing data loss in the network ......... 2033
Chapter 61 Implementing Network Monitor ................................... 2034
Implementing Network Monitor ..................................................... 2034
About IPv6 support for Network Monitor ......................................... 2036
Choosing a network packet capture method ................................... 2037
About packet capture software installation and configuration .............. 2038
Installing WinPcap on a Windows platform ............................... 2038
Updating the Endace card driver ............................................ 2039
Installing and updating the Napatech network adapter and driver
software ...................................................................... 2039
Configuring the Network Monitor Server ......................................... 2045
Enabling GET processing with Network Monitor .............................. 2046
Creating a policy for Network Monitor ............................................ 2047
Testing Network Monitor ............................................................. 2048

Chapter 62 Implementing Network Prevent for Email .................. 2049


Implementing Network Prevent for Email ........................................ 2049
About Mail Transfer Agent (MTA) integration ................................... 2051
Configuring Network Prevent for Email Server for reflecting or
forwarding mode ................................................................. 2051
Configuring Linux IP tables to reroute traffic from a restricted
port ............................................................................ 2056
Specifying one or more upstream mail transfer agents (MTAs) ........... 2057
Creating a policy for Network Prevent for Email ............................... 2058
About policy violation data headers ............................................... 2059
Enabling policy violation data headers ........................................... 2060
Testing Network Prevent for Email ................................................ 2061

Chapter 63 Implementing Network Prevent for Web .................... 2062


Implementing Network Prevent for Web ......................................... 2062
Configuring Network Prevent for Web Server .................................. 2064
Configuring a secure ICAP keystore for Network Prevent for Web ....... 2067
About proxy server configuration .................................................. 2070
Configuring request and response mode services ...................... 2070
Specifying one or more proxy servers ............................................ 2071
Enabling GET processing for Network Prevent for Web ..................... 2072
Creating policies for Network Prevent for Web ................................ 2073
Testing Network Prevent for Web ................................................. 2074
Troubleshooting information for Network Prevent for Web Server ........ 2075

Section 8 Discovering where confidential data is stored ......... 2076
Chapter 64 About Network Discover ................................................ 2078

About Network Discover/Cloud Storage Discover ............................. 2078


How Network Discover/Cloud Storage Discover works ...................... 2079

Chapter 65 Setting up and configuring Network Discover ........... 2082


Setting up and configuring Network Discover/Cloud Storage
Discover ........................................................................... 2082
Modifying the Network Discover/Cloud Storage Discover Server
configuration ...................................................................... 2083
Configuring Network Discover to use a proxy to connect to the
Symantec ICE Cloud for file share scans ................................. 2085
Adding a new Network Discover/Cloud Storage Discover target .......... 2086
Editing an existing Network Discover/Cloud Storage Discover
target ................................................................................ 2088

Chapter 66 Network Discover scan target configuration options ......... 2090

Network Discover/Cloud Storage Discover scan target configuration
options ............................................................................. 2090
Configuring the required fields for Network Discover targets ............... 2092
Scheduling Network Discover/Cloud Storage Discover scans ............. 2093
Providing the password authentication for Network Discover scanned
content ............................................................................. 2095
Managing cloud storage authorizations .......................................... 2096
Providing Box cloud storage authorization credentials ................ 2097
Encrypting passwords in configuration files ..................................... 2100
Setting up Network Discover/Cloud Storage Discover filters to include
or exclude items from the scan .............................................. 2100
Filtering Discover targets by item size ........................................... 2103
Filtering Discover targets by date last accessed or modified ............... 2103
Optimizing resources with Network Discover/Cloud Storage Discover
scan throttling ..................................................................... 2106
Creating an inventory of the locations of unprotected sensitive
data ................................................................................. 2107

Chapter 67 Managing Network Discover target scans .................. 2110


Managing Network Discover/Cloud Storage Discover target
scans ............................................................................... 2111
Managing Network Discover/Cloud Storage Discover targets ............. 2111
About the Network Discover/Cloud Storage Discover scan target
list ............................................................................. 2111
Working with Network Discover/Cloud Storage Discover scan
targets ........................................................................ 2113
Removing Network Discover/Cloud Storage Discover scan
targets ........................................................................ 2113
Managing Network Discover/Cloud Storage Discover scan histories
....................................................................................... 2114
About Discover and Endpoint Discover scan histories ................ 2114
Working with Network Discover/Cloud Storage Discover scan
histories ...................................................................... 2116
Deleting Network Discover/Cloud Storage Discover scans .......... 2116
About Discover scan details .................................................. 2117
Working with Network Discover/Cloud Storage Discover scan
details ........................................................................ 2120
Managing Network Discover/Cloud Storage Discover Servers ............ 2121
Viewing Network Discover/Cloud Storage Discover server
status ......................................................................... 2121
About Network Discover/Cloud Storage Discover scan
optimization ....................................................................... 2122
About the difference between incremental scans and differential
scans ............................................................................... 2124
About incremental scans ............................................................ 2125
Scanning new or modified items with incremental scans .................... 2126
About managing incremental scans .............................................. 2127
Scanning new or modified items with differential scans ..................... 2128
Configuring parallel scanning of Network Discover/Cloud Storage
Discover targets .................................................................. 2128
About grid scanning ................................................................... 2130
Configuring grid scanning ........................................................... 2132
Renewing grid communication certificates for Discover detection
servers ............................................................................. 2134
Migrating a Discover scan from a single server to a grid .................... 2136
Grid scanning performance guidelines ........................................... 2136
Troubleshooting grid scans ......................................................... 2138

Chapter 68 Using Server FlexResponse plug-ins to remediate incidents ......... 2140
About the Server FlexResponse platform ....................................... 2140
Using Server FlexResponse custom plug-ins to remediate
incidents ........................................................................... 2142
Deploying a Server FlexResponse plug-in ...................................... 2143
Adding a Server FlexResponse plug-in to the plug-ins properties
file ............................................................................. 2143
Creating a properties file to configure a Server FlexResponse
plug-in ........................................................................ 2145
Locating incidents for manual remediation ...................................... 2148
Using the action of a Server FlexResponse plug-in to remediate an
incident manually ................................................................ 2149
Verifying the results of an incident response action .......................... 2150
Troubleshooting a Server FlexResponse plug-in .............................. 2151

Chapter 69 Setting up scans of Box cloud storage using an on-premises detection server ......... 2153
Setting up scans of Box cloud storage targets using an on-premises
detection server .................................................................. 2153
Configuring scans of Box cloud storage targets ............................... 2154
Optimizing Box cloud storage scanning ......................................... 2156
Configuring remediation options for Box cloud storage targets ............ 2157

Chapter 70 Setting up scans of file shares ...................................... 2159


Setting up server scans of file systems .......................................... 2159
Supported file system targets ...................................................... 2160
Automatically discovering servers and shares before configuring a file
system target ..................................................................... 2161
Working with Content Root Enumeration scans ......................... 2162
Troubleshooting Content Root Enumeration scans ..................... 2164
Automatically discovering open file shares ..................................... 2165
About automatically tracking incident remediation status ................... 2166
Troubleshooting automated incident remediation tracking ............ 2167
Configuration options for Automated Incident Remediation
Tracking ...................................................................... 2168
Excluding internal DFS folders ..................................................... 2171
Configuring scans of Microsoft Outlook Personal Folders (.pst
files) ................................................................................. 2171
Configuring scans of file systems ................................................. 2172
Optimizing file system target scanning ........................................... 2176
Configuring Network Protect for file shares ..................................... 2177
Priority of write-access credentials for file shares ............................. 2179

Chapter 71 Setting up scans of Lotus Notes databases ............... 2181


Setting up server scans of IBM (Lotus) Notes databases ................... 2181
Supported IBM (Lotus) Notes targets ............................................ 2182
Configuring and running IBM (Lotus) Notes scans ............................ 2182
Configuring IBM (Lotus) Notes DIIOP mode configuration scan
options ............................................................................. 2185

Chapter 72 Setting up scans of SQL databases .............................. 2187


Setting up server scans of SQL databases ..................................... 2187
Supported SQL database targets ................................................. 2188
Configuring and running SQL database scans ................................ 2188
Installing the JDBC driver for SQL database targets ......................... 2192
SQL database scan configuration properties ................................... 2192

Chapter 73 Setting up scans of SharePoint servers ...................... 2195


Setting up server scans of SharePoint servers ................................ 2195
About scans of SharePoint servers ............................................... 2196
Supported SharePoint server targets ............................................. 2198
Access privileges for SharePoint scans ......................................... 2198
About Alternate Access Mapping Collections .................................. 2198
Configuring and running SharePoint server scans ............................ 2198
Configuring Network Protect for SharePoint servers ......................... 2203
Installing the SharePoint solution on the Web Front Ends in a
farm ................................................................................. 2205
Enabling SharePoint scanning without installing the SharePoint
solution ............................................................................. 2207
Setting up SharePoint scans to use Kerberos authentication .............. 2208
Troubleshooting SharePoint scans ............................................... 2209

Chapter 74 Setting up scans of Exchange servers ......................... 2211


Setting up server scans of Exchange repositories ............................ 2211
About scans of Exchange servers ................................................. 2212
Supported Exchange Server targets .............................................. 2213
Configuring Exchange Server scans ............................................. 2214
Setting up Exchange scans to use Kerberos authentication ............... 2217
Example configurations and use cases for Exchange scans ............... 2218
Troubleshooting Exchange scans ................................................. 2219

Chapter 75 About Network Discover scanners ............................... 2220


How Network Discover scanners work ........................................... 2220
Troubleshooting scanners ........................................................... 2221
Scanner processes ................................................................... 2222
Scanner installation directory structure .......................................... 2223
Scanner configuration files .......................................................... 2224
Scanner controller configuration options ........................................ 2225

Chapter 76 Setting up scanning of Documentum repositories ......... 2227
Setting up remote scanning of Documentum repositories .................. 2227
Supported Documentum (scanner) targets ..................................... 2228
Installing Documentum scanners .................................................. 2228
Starting Documentum scans ........................................................ 2230
Configuration options for Documentum scanners ............................. 2231
Example configuration for scanning all documents in a Documentum
repository .......................................................................... 2233

Chapter 77 Setting up scanning of file systems ............................. 2235


Setting up remote scanning of file systems ..................................... 2236
Supported file system scanner targets ........................................... 2237
Installing file system scanners ..................................................... 2237
Starting file system scans ........................................................... 2239
Installing file system scanners silently from the command line ............ 2241
Configuration options for file system scanners ................................. 2242
Example configuration for scanning the C drive on a Windows
computer ........................................................................... 2243
Example configuration for scanning the /usr directory on UNIX .......... 2243
Example configuration for scanning with include filters ...................... 2243
Example configuration for scanning with exclude filters ..................... 2244
Example configuration for scanning with include and exclude filters
....................................................................................... 2244
Example configuration for scanning with date filtering ...................... 2245
Example configuration for scanning with file size filtering ................... 2245
Example configuration for scanning that skips symbolic links on UNIX
systems ............................................................................ 2246

Chapter 78 Setting up scanning of OpenText (Livelink) targets ......... 2247
Setting up remote scanning of OpenText (Livelink) repositories ........... 2247
Supported OpenText (Livelink) scanner targets ............................... 2248
Creating an ODBC data source for SQL Server ............................... 2248
Installing Livelink scanners ......................................................... 2249
Starting OpenText (Livelink) scans ................................................ 2251
Configuration options for Livelink scanners ..................................... 2253
Example configuration for scanning a Livelink database .................... 2254

Chapter 79 Setting up scanning of Web servers ............................ 2255


Setting up remote scanning of web servers .................................... 2255
Supported web server (scanner) targets ........................................ 2256
Installing web server scanners ..................................................... 2256
Starting web server scans ........................................................... 2258
Configuration options for web server scanners ................................ 2260
Example configuration for a web site scan with no authentication ........ 2262
Example configuration for a web site scan with basic
authentication .................................................................... 2262
Example configuration for a web site scan with form-based
authentication .................................................................... 2263
Example configuration for a web site scan with NTLM ....................... 2263
Example of URL filtering for a web site scan ................................... 2264
Example of date filtering for a web site scan ................................... 2265

Chapter 80 Setting up Web Services for custom scan targets ............ 2266
Setting up Web Services for custom scan targets ............................ 2266
About setting up the Web Services Definition Language (WSDL) ........ 2267
Example of a Web Services Java client ......................................... 2267
Sample Java code for the Web Services example ............................ 2268

Section 9 Discovering and preventing data loss on endpoints ........... 2272
Chapter 81 Overview of Symantec Data Loss Prevention for
endpoints ..................................................................... 2273
About discovering and preventing data loss on endpoints .................. 2273
Guidelines for authoring Endpoint policies ...................................... 2275

Chapter 82 Summary of DLP Agent for Mac support .................... 2277


About DLP Agent feature-level support .......................................... 2277
Mac agent installation and tools feature details ................................ 2278
Mac agent installation support ............................................... 2278
Mac endpoint tools features .................................................. 2279
Mac agent management features ................................................. 2279
Mac agent endpoint location ................................................. 2280
Mac agent groups features ................................................... 2280
Overview of Mac agent detection technologies and policy authoring
features ............................................................................ 2280
Mac agent detection technologies .......................................... 2281
Mac agent policy response rule features .................................. 2284
Mac agent monitoring support ...................................................... 2287
Mac agent removable storage features .................................... 2287
Clipboard features supported on Mac agents ............................ 2288
Mac agent Email features ..................................................... 2289
Mac agent browser features .................................................. 2290
Mac agent Application Monitoring features ............................... 2291
Mac agent copy to network share features ............................... 2292
Mac agent filter by file properties features ................................ 2292
Mac agent filter by network properties features ......................... 2293
Endpoint Prevent for Mac agent advanced agent settings
features ............................................................................ 2293
Endpoint Discover for Mac targets features .................................... 2294
Endpoint Discover for Mac file system support ................................ 2295
Endpoint Discover for Mac advanced agent settings support .............. 2295

Chapter 83 Using Endpoint Prevent .................................................. 2296


About Endpoint Prevent monitoring ............................................... 2296
About removable storage monitoring ....................................... 2297
About endpoint network monitoring ......................................... 2299
About CD/DVD monitoring .................................................... 2300
About print/fax monitoring ..................................................... 2301
About network share monitoring ............................................. 2302
About clipboard monitoring ................................................... 2303
About global application monitoring ........................................ 2303
About group-specific application monitoring: using overrides ........ 2304
About cloud storage application monitoring .............................. 2305
About virtual desktop support with Endpoint Prevent .................. 2306
About rules results caching (RRC) .......................................... 2309
About policy creation for Endpoint Prevent ..................................... 2309

About monitoring policies with response rules for Endpoint Servers ....... 2310
How to implement Endpoint Prevent ............................................. 2312
Setting the endpoint location ................................................. 2313
About Endpoint Prevent response rules in different locales .......... 2314

Chapter 84 Using Endpoint Discover ................................................ 2316


How Endpoint Discover works ..................................................... 2316
About Endpoint Discover scanning ............................................... 2316
About scanning targeted endpoints ........................................ 2317
About Endpoint Discover full scanning .................................... 2318
About Endpoint Discover incremental scanning ......................... 2318
About Endpoint Discover classification scanning ....................... 2320
About parallel scans on targeted endpoints .............................. 2321
Optimizing the scan for endpoint performance .......................... 2322
Preparing to set up Endpoint Discover ........................................... 2322
Creating a policy group for Endpoint Discover ........................... 2323
Creating a policy for Endpoint Discover ................................... 2324
Adding a rule for Endpoint Discover ........................................ 2324
Setting up and configuring Endpoint Discover ................................. 2325
Creating an Endpoint Discover scan ............................................. 2326
Creating a new Endpoint Discover target ................................. 2327
About Endpoint Discover filters .............................................. 2334
Configuring Endpoint Discover scan timeout settings ................. 2341
Managing Endpoint Discover target scans ...................................... 2342
About managing Endpoint Discover scans ............................... 2342
About Endpoint Discover targeted endpoints scan details ............ 2343
About remediating Endpoint Discover incidents ......................... 2345
About Endpoint reports ........................................................ 2345

Chapter 85 Working with agent configurations .............................. 2347


About agent configurations ......................................................... 2347
About cloning agent configurations ......................................... 2348
Adding and editing agent configurations ........................................ 2348
Channel settings ................................................................. 2349
Channel Filters settings ........................................................ 2353
Application Monitoring settings .............................................. 2362
Device Control settings ........................................................ 2364
Agent settings .................................................................... 2364
Advanced agent settings ...................................................... 2372
Setting specific channels to monitor based on the endpoint
location ....................................................................... 2411

Applying agent configurations to an agent group ............................. 2412
Configuring the agent connection status ........................................ 2412

Chapter 86 Working with Agent Groups ........................................... 2414


About agent groups ................................................................... 2414
Developing a strategy for deploying Agent Groups ........................... 2415
Overview of the agent group deployment process ............................ 2416
Creating and managing agent attributes ........................................ 2417
Creating a new agent attribute ............................................... 2418
Defining a search filter for creating user-defined attributes ........... 2419
Verifying attribute queries with the Attribute Query Resolver
tool ............................................................................ 2419
Applying a new attribute or changed attribute to agents .............. 2420
Undoing changes to agent attributes ....................................... 2421
Editing user-defined agent attributes ....................................... 2421
Viewing and managing agent groups ............................................. 2421
Agent group conditions ........................................................ 2422
Creating a new agent group .................................................. 2423
Updating outdated agent configurations ................................... 2423
Assigning configurations to deploy groups ............................... 2424
Verify that group assignments are correct ................................ 2424
Viewing group conflicts ............................................................... 2425
Changing groups ...................................................................... 2425

Chapter 87 Managing Symantec DLP Agents .................................. 2427


About Symantec DLP Agent administration .................................... 2427
Agent Overview screen ........................................................ 2428
About agent events ............................................................. 2446
About Symantec DLP Agent removal ...................................... 2454
About DLP Agent logs ................................................................ 2457
Setting the log levels for an Endpoint Agent ............................. 2457
About agent password management ............................................. 2458
Create a new agent uninstall or Endpoint tools password ............ 2459
Change an existing agent uninstall or Endpoint tools
password .................................................................... 2460
Retain existing agent uninstall or Endpoint tools passwords ......... 2460

Chapter 88 Using application monitoring ........................................ 2461


About global application monitoring .............................................. 2461
Changing global application monitoring settings ........................ 2462

Monitoring instant messenger applications on Mac endpoints ............ 2465
List of CD/DVD applications .................................................. 2466
About adding applications ........................................................... 2467
Adding a Windows application ..................................................... 2468
Using the GetAppInfo tool ..................................................... 2471
Adding a macOS application ....................................................... 2472
Defining macOS application binary names ............................... 2475
Ignoring macOS applications ....................................................... 2475
About Application File Access monitoring ....................................... 2476
Implementing Application File Access monitoring ............................. 2477

Chapter 89 Working with Endpoint FlexResponse ......................... 2479


About Endpoint FlexResponse ..................................................... 2479
Deploying Endpoint FlexResponse ............................................... 2481
About deploying Endpoint FlexResponse plug-ins on endpoints .......... 2481
Deploying Endpoint FlexResponse plug-ins using a silent installation
process ............................................................................ 2482
About the Endpoint FlexResponse utility ........................................ 2483
Deploying an Endpoint FlexResponse plug-in using the Endpoint
FlexResponse utility ............................................................ 2485
Enabling Endpoint FlexResponse on the Enforce Server ................... 2486
Uninstalling an Endpoint FlexResponse plug-in using the Endpoint
FlexResponse utility ............................................................ 2486
Retrieving an Endpoint FlexResponse plug-in from a specific
endpoint ............................................................................ 2487
Retrieving a list of Endpoint FlexResponse plug-ins from an
endpoint ............................................................................ 2488

Chapter 90 Using Endpoint tools ....................................................... 2489


About Endpoint tools .................................................................. 2489
Using Endpoint tools with Windows 7/8.1/10 ............................. 2491
Shutting down the agent and the watchdog services on Windows
endpoints .................................................................... 2492
Using Endpoint tools with macOS .......................................... 2492
Shutting down the agent service on Mac endpoints .................... 2493
Inspecting the database files accessed by the agent .................. 2493
Viewing extended log files .................................................... 2494
About the Device ID utilities .................................................. 2496
Starting DLP Agents that run on Mac endpoints ........................ 2499

Chapter 91 Using SEP Intensive Protection .................................... 2501


About the SEP Intensive Protection file reputation service ................. 2501
Enabling SEP Intensive Protection ................................................ 2502
Setting the SEP Intensity Level .................................................... 2503
Adding a SEP Intensive Protection response rule ............................ 2503

Section 10 Monitoring data loss in cloud applications ................... 2505
Chapter 92 Working with Application Detection ............................ 2506
About Application Detection ........................................................ 2506
Managing Application Detection ................................................... 2507

Chapter 93 Working with Cloud Service for Email ......................... 2513


About Cloud Service for Email ..................................................... 2513
About updating email domains in the Enforce Server administration
console ............................................................................. 2514
Viewing Cloud Service for Email detector details ....................... 2514
Adding the unique TXT record to your DNS settings ................... 2516
Updating email domains ....................................................... 2516
Update override by the Symantec Cloud Service ....................... 2518
Encrypting cloud email with Symantec Information Centric
Encryption ......................................................................... 2518
Implementing ICE with Cloud Service for Email ......................... 2519
Configuring the Enforce Server to communicate with the ICE
service ....................................................................... 2519
Creating encryption response rules for ICE encryption ................ 2520
About decrypting ICE encrypted email ..................................... 2522
Viewing details about ICE incidents ........................................ 2522

Section 11 Monitoring data loss using DLP Appliances .................... 2526
Chapter 94 Implementing and working with DLP
Appliances ................................................................... 2527
About DLP Appliances ............................................................... 2527
About obtaining the appliance activation file and licenses .................. 2528

Obtaining activation and license files for the virtual appliance ......... 2528
Obtaining license files for the DLP S500-10 Hardware
Appliance .................................................................... 2530
About the Command Line Interface (CLI) ....................................... 2531
About performance tuning and sizing for appliances ......................... 2531

Chapter 95 Deploying DLP Appliances ............................................. 2532


Deployment overview for the virtual appliance ................................. 2532
Setting up the virtual appliance .................................................... 2534
Deployment overview for the DLP-S500 hardware appliance .............. 2536
Setting up the DLP-S500 Appliance .............................................. 2537
Adding an appliance .................................................................. 2539
Configuring the API Detection for Developer Apps Appliance ............. 2540

Chapter 96 Post-deployment tasks ................................................... 2542


Unbinding or resetting a DLP appliance ......................................... 2542
Updating appliance software ....................................................... 2543
Log files and logging for appliances .............................................. 2544

Index ................................................................................................................. 2545


Section 1
Getting started

■ Chapter 1. Introducing Symantec Data Loss Prevention

■ Chapter 2. Getting started administering Symantec Data Loss Prevention

■ Chapter 3. Working with languages and locales


Chapter 1
Introducing Symantec Data
Loss Prevention
This chapter includes the following topics:

■ About updates to the Symantec Data Loss Prevention Administration Guide

■ About Symantec Data Loss Prevention

■ About the Enforce Server platform

■ About Network Monitor and Prevent

■ About Network Discover/Cloud Storage Discover

■ About Network Protect

■ About Endpoint Discover

■ About Endpoint Prevent

About updates to the Symantec Data Loss Prevention Administration Guide
This guide is occasionally updated as new information becomes available. You can find the
latest version of the Symantec Data Loss Prevention Administration Guide at the following link
to the Symantec Support Center article: http://www.symantec.com/docs/DOC9261.
Subscribe to the article at the Support Center to be notified when there are updates.
The following table provides the history of updates to this version of the Symantec Data Loss
Prevention Administration Guide:

Table 1-1 Change history for the Symantec Data Loss Prevention Administration Guide

Date Description

19 August 2019 Corrected the path to the index data files on Windows to read
ProgramData\Symantec\DataLossPrevention\ServerPlatformCommon
\15.5\Protect\datafiles.

Updated the Windows and Linux pathnames for the Indexer.properties file.

Added a table of channel-specific limits to the Increasing the inspection
content size chapter.

Changed the name of SmtpPrevent0.log to SmtpPrevent_operational0.log.

Corrected name of the log file for SMTP Prevent. The filename was
SmtpPrevent0.log; now it is RequestProcessor0.log.

Clarified that you must use dedicated hardware or VMs with dedicated
resources for OCR Servers.

Corrected the description of the Limit Incident Data Retention response rule
to indicate that the rule is only supported for Endpoint Prevent.

Corrected the default value for the advanced agent setting
"NetworkMonitor.NUM_OF_LISTENER_THREADS."

14 June 2019 Minor editorial updates.



11 June 2019 Updated Secure ICAP content to reflect that both self-signed and CA-issued
certificates are supported.

Deleted text that indicated document source files uploaded to the Enforce
Server are deleted after indexing. IDM source files are not deleted from the
Enforce Server after indexing.

Added that for OCR, only load balancers without persistence enabled are
supported.

Added support for OCR and Cloud Prevent for Office 365 on Azure.

Updated default value for Lexer.MaximumNumberOfTokens to 30000.

Fixed squished text in the "Advanced settings for OCR and FR image
extraction" table.

Added detailed information about pre- and post-validator characters for custom
data identifiers.

Added information about high-performance content extraction for Office Open
XML files.

Added information about whitelisting the Titanium server for SEP Intensive
Protection.

26 March 2019 Updated cross reference to locale settings to one for JDK 8 and JRE 8:
Changed to
https://www.oracle.com/technetwork/java/javase/java8locales-2095355.html.

Updated "About updating email domains in the Enforce Server administration
console" to remove "new" and add references to 15.5.

Fixed broken link to "Detecting content using data identifiers" in "Introducing
Exact Match Data Identifiers" topic to point to "About data identifiers."

Fixed broken xref and properties name in Profile size limitations on the DLP
Agent for EMDI.

Removed the “Only available with Network Prevent for Email” text from two
response rule topics: "Network Prevent: Modify SMTP Message" and "Network
Prevent: Block SMTP Message".

6 March 2019 Reinstated procedure for starting an Enforce Server (inadvertently dropped
in previous release).

11 February 2019 Minor fixes to layout for readability.



1 February 2019 ■ Revised entire "Detecting content using Exact Match Data Identifiers"
chapter.
■ Made minor updates to "Using diagnostics for OCR Server deployments"
section.
■ Added new "Creating a null policy to assist in OCR diagnostics for Discover
Servers" section.
■ Fixed formatting in table 69-2.
■ Added Cloud Applications and API Appliance lookup parameters.
■ Corrected data exposure detail description for "Document is Exposed."

About Symantec Data Loss Prevention


Symantec Data Loss Prevention enables you to:
■ Discover and locate confidential information in cloud storage repositories, on file and web
servers, in databases, and on endpoints (desktop and laptop systems)
■ Protect confidential information through quarantine
■ Monitor network traffic for transmission of confidential data
■ Monitor the use of sensitive data on endpoints
■ Prevent transmission of confidential data to outside locations
■ Automatically enforce data security and encryption policies
Symantec Data Loss Prevention includes the following components:
■ Enforce Server
See “About the Enforce Server platform” on page 77.
See “About Symantec Data Loss Prevention administration” on page 82.
See “About the Enforce Server administration console” on page 83.
■ Network Discover/Cloud Storage Discover
See “About Network Discover/Cloud Storage Discover” on page 78.
■ Network Protect
See “About Network Protect” on page 79.
■ Network Monitor
See “About Network Monitor and Prevent” on page 78.
■ Network Prevent
See “About Network Monitor and Prevent” on page 78.
■ Endpoint Discover
See “About Endpoint Discover” on page 80.
■ Endpoint Prevent
See “About Endpoint Prevent” on page 80.
The Discover, Protect, Monitor, and Prevent modules can be deployed as stand-alone products
or in combination. Regardless of which stand-alone products you deploy, the Enforce Server
is always provided for central management. Note that the Network Protect module requires
the Network Discover/Cloud Storage Discover module.
Associated with each product module are corresponding detection servers and cloud detectors:
■ Network Discover/Cloud Storage Discover Server locates the exposed confidential data
on a broad range of enterprise data repositories including:
■ Box cloud storage
■ File servers
■ Databases
■ Microsoft SharePoint
■ IBM/Lotus Notes
■ EMC Documentum
■ Livelink
■ Microsoft Exchange
■ Web servers
■ Other data repositories
If you are licensed for Network Protect, this server also copies and quarantines sensitive
data on file servers and in Box cloud storage, as specified in your policies.
See “About Network Discover/Cloud Storage Discover” on page 78.
■ Network Monitor Server monitors the traffic on your network.
See “About Network Monitor and Prevent” on page 78.
■ Network Prevent for Email Server blocks emails that contain sensitive data.
See “About Network Monitor and Prevent” on page 78.
■ Network Prevent for Web Server blocks HTTP postings and FTP transfers that contain
sensitive data.
See “About Network Monitor and Prevent” on page 78.
■ Endpoint Server monitors and prevents the misuse of confidential data on endpoints.
See “About Endpoint Discover” on page 80.
See “About Endpoint Prevent” on page 80.

The distributed architecture of Symantec Data Loss Prevention allows organizations to:
■ Perform centralized management and reporting.
■ Centrally manage data security policies once and deploy immediately across the entire
Symantec Data Loss Prevention suite.
■ Scale data loss prevention according to the size of your organization.

About the Enforce Server platform


The Symantec Data Loss Prevention Enforce Server is the central management platform that
enables you to define, deploy, and enforce data loss prevention and security policies. The
Enforce Server administration console provides a centralized, web-based interface for deploying
detection servers, authoring policies, remediating incidents, and managing the system.
See “About Symantec Data Loss Prevention” on page 75.
The Enforce platform provides you with the following capabilities:
■ Build and deploy accurate data loss prevention policies. You can choose among various
detection technologies, define rules, and specify actions to include in your data loss
prevention policies. Using provided regulatory and best-practice policy templates, you can
meet your regulatory compliance, data protection and acceptable-use requirements, and
address specific security threats.
See “About Data Loss Prevention policies” on page 368.
See “Detecting data loss” on page 381.
■ Automatically deploy and enforce data loss prevention policies. You can automate policy
enforcement options for notification, remediation workflow, blocking, and encryption.
■ Measure risk reduction and demonstrate compliance. The reporting features of the Enforce
Server enable you to create actionable reports identifying risk reduction trends over time.
You can also create compliance reports to address conformance with regulatory
requirements.
See “About Symantec Data Loss Prevention reports” on page 1899.
See “About incident reports” on page 1902.
■ Empower rapid remediation. Based on incident severity, you can automate the entire
remediation process using detailed incident reporting and workflow automation. Role-based
access controls empower individual business units and departments to review and remediate
those incidents that are relevant to their business or employees.
See “About incident remediation” on page 1841.
See “Remediating incidents” on page 1844.
■ Safeguard employee privacy. You can use the Enforce Server to review incidents without
revealing the sender identity or message content. In this way, multi-national companies
can meet legal requirements on monitoring European Union employees and transferring
personal data across national boundaries.
See “About role-based access control” on page 109.

About Network Monitor and Prevent


The Symantec Data Loss Prevention network data monitoring and prevention products include:
■ Network Monitor
Network Monitor captures and analyzes traffic on your network. It detects confidential data
and significant traffic metadata over the protocols that you specify, such as SMTP,
FTP, HTTP, and various IM protocols. You can configure a Network Monitor Server to
monitor custom protocols and to use a variety of filters (per protocol) to filter out low-risk
traffic.
■ Network Prevent for Email
Network Prevent for Email integrates with standard MTAs and hosted email services to
provide in-line active SMTP email management. Policies that are deployed on in-line
Network Prevent for Email Server direct the next-hop mail server to block, reroute, or tag
email messages. These actions are based on specific content and other message attributes.
Communication between MTAs and Network Prevent for Email Server can be secured as
necessary using TLS.
Implement Network Monitor, review the incidents it captures, and refine your policies
accordingly before you implement Network Prevent for Email.
See the Symantec Data Loss Prevention MTA Integration Guide for Network Prevent for
Email.
■ Network Prevent for Web
For in-line active web request management, Network Prevent for Web integrates with an
HTTP, HTTPS, or FTP proxy server. This integration uses the Internet Content Adaptation
Protocol (ICAP). The Network Prevent for Web Server detects confidential data in HTTP,
HTTPS, or FTP content. When it does, it causes the proxy to reject requests or remove
HTML content as specified by the governing policies.
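
As a rough illustration of the ICAP integration mentioned above, the following Python sketch builds the kind of REQMOD message that a proxy might send to an ICAP service for an HTTP request without a body. It follows the general message layout of RFC 3507 and is not taken from the product. The host name, the service path, and the function name are hypothetical.

```python
def build_reqmod(icap_host: str, service: str, http_request_headers: str) -> bytes:
    """Build a minimal ICAP REQMOD message for an HTTP request with no body."""
    # The encapsulated HTTP header block must end with a blank line (CRLF CRLF).
    http_part = http_request_headers.encode("ascii")
    if not http_part.endswith(b"\r\n\r\n"):
        raise ValueError("HTTP headers must end with a blank line")

    icap_headers = (
        f"REQMOD icap://{icap_host}/{service} ICAP/1.0\r\n"
        f"Host: {icap_host}\r\n"
        # Offsets are relative to the start of the encapsulated section.
        f"Encapsulated: req-hdr=0, null-body={len(http_part)}\r\n"
        "\r\n"
    ).encode("ascii")
    return icap_headers + http_part


# Hypothetical values, for illustration only.
message = build_reqmod(
    "dlp-icap.example.com",
    "reqmod",
    "GET /upload HTTP/1.1\r\nHost: www.example.org\r\n\r\n",
)
print(message.decode("ascii"))
```

The sketch only constructs the message. How your proxy is configured to reach the Network Prevent for Web Server depends on the proxy product.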

About Network Discover/Cloud Storage Discover


Network Discover/Cloud Storage Discover scans cloud storage repositories, networked file
shares, web content servers, databases, document repositories, and endpoint systems at high
speeds to detect exposed data and documents. Network Discover/Cloud Storage Discover
enables companies to understand exactly where confidential data is exposed and helps
significantly reduce the risk of data loss.
Network Discover/Cloud Storage Discover gives organizations the following capabilities:


■ Pinpoint unprotected confidential data. Network Discover/Cloud Storage Discover helps
organizations accurately locate at-risk data that is stored on their networks. You can then
notify shared file server owners so that they can protect the data.
■ Reduce proliferation of confidential data. Network Discover/Cloud Storage Discover helps
organizations to detect the spread of sensitive information throughout the company and
reduce the risk of data loss.
■ Automate investigations and audits. Network Discover/Cloud Storage Discover streamlines
data security investigations and compliance audits. It accomplishes this task by enabling
users to scan for confidential data automatically, as well as review access control and
encryption policies.
■ During incident remediation, Veritas Data Insight helps organizations identify data owners
and responsible parties when metadata or tracking information is incomplete or
inaccurate.
See the Symantec Data Loss Prevention Data Insight Implementation Guide.
■ To provide additional flexibility in remediating Network Discover/Cloud Storage Discover
incidents, use the FlexResponse application programming interface (API), or the
FlexResponse plug-ins that are available.
See the Symantec Data Loss Prevention FlexResponse Platform Developers Guide, or
contact Symantec Professional Services for a list of plug-ins.
See “About Symantec Data Loss Prevention” on page 75.

About Network Protect


Network Protect reduces your risk by removing exposed confidential data, intellectual property,
and classified information from open file shares on network servers or desktop computers.
Note that there is no separate Network Protect server; the Network Protect product module
adds protection functionality to the Network Discover Server.
Network Protect gives organizations the following capabilities:
■ Apply visual tags to content in Box cloud storage. Network Protect can apply a text tag to
files that violate policies and that are stored in Box cloud storage.
■ Quarantine exposed files. Network Protect can automatically move those files that violate
policies to a quarantine area that re-creates the source file structure for easy location.
Optionally, Symantec Data Loss Prevention can place a marker text file in the original
location of the offending file. The marker file can explain why and where the original file
was quarantined.
■ Copy exposed or suspicious files. Network Protect can automatically copy those files that
violate policies to a quarantine area. The quarantine area can re-create the source file
structure for easy location, and leave the original file in place.
■ Quarantine file restoration. Network Protect can easily restore quarantined files to their
original or a new location.
■ Enforce access control and encryption policies. Network Protect proactively ensures
workforce compliance with existing access control and encryption policies.
See “About Symantec Data Loss Prevention” on page 75.
See “Configuring Network Protect for file shares” on page 2177.
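
The quarantine behavior described above (move the file, re-create the source folder structure, and leave a marker file behind) can be pictured with the following Python sketch. It is a conceptual illustration only, not the Network Protect implementation. The function name, the marker file suffix, and the sample paths are assumptions.

```python
import shutil
import tempfile
from pathlib import Path


def quarantine_file(source_root: Path, quarantine_root: Path, file_path: Path) -> Path:
    """Move file_path into quarantine_root, mirroring its path under source_root."""
    relative = file_path.relative_to(source_root)
    target = quarantine_root / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(file_path), str(target))

    # Leave a marker text file where the original file used to be.
    # The ".quarantined.txt" suffix is a hypothetical naming choice.
    marker = file_path.with_name(file_path.name + ".quarantined.txt")
    marker.write_text(
        f"This file violated a policy and was moved to: {target}\n",
        encoding="utf-8",
    )
    return target


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    share = Path(tmp) / "share"
    quarantine = Path(tmp) / "quarantine"
    original = share / "finance" / "q3" / "payroll.csv"
    original.parent.mkdir(parents=True)
    original.write_text("name,salary\n", encoding="utf-8")

    moved_to = quarantine_file(share, quarantine, original)
    moved_relative = moved_to.relative_to(quarantine).as_posix()
    original_still_exists = original.exists()
    marker_exists = (original.parent / "payroll.csv.quarantined.txt").exists()
    print(moved_relative)
```

Because the relative path is preserved under the quarantine root, a quarantined file can be located by the same folder path it had on the share.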

About Endpoint Discover


Endpoint Discover detects sensitive data on your desktop or your laptop endpoints. It consists
of at least one Endpoint Server and at least one Symantec DLP Agent that runs on an endpoint.
You can have many Symantec DLP Agents connected to a single Endpoint Server. Symantec
DLP Agents:
■ Detect sensitive data in the endpoint file system.
■ Collect data on that activity.
■ Send incidents to the Endpoint Server.
■ Send the data to the associated Endpoint Server for analysis, if necessary.
See “About Endpoint Prevent” on page 80.
See “About Symantec Data Loss Prevention” on page 75.

About Endpoint Prevent


Endpoint Prevent detects sensitive data and prevents it from leaving your desktop or your
laptop endpoints. It consists of at least one Endpoint Server and all the Symantec DLP Agents
running on the endpoint systems that are connected to it. You can have many Symantec DLP
Agents connected to a single Endpoint Server. Endpoint Prevent detects on the following data
transfers:
■ Application monitoring
■ CD/DVD
■ Clipboard
■ Email/SMTP
■ eSATA removable drives
■ FTP
■ HTTP/HTTPS
■ IM
■ Network shares
■ Print/Fax
■ USB removable media devices
See “About Endpoint Discover” on page 80.
See “About Symantec Data Loss Prevention” on page 75.
Chapter 2
Getting started administering Symantec Data Loss Prevention
This chapter includes the following topics:

■ About Symantec Data Loss Prevention administration

■ About the Enforce Server administration console

■ Logging on and off the Enforce Server administration console

■ About the administrator account

■ Performing initial setup tasks

■ Changing the administrator password

■ Adding an administrator email account

■ Editing a user profile

■ Changing your password

About Symantec Data Loss Prevention administration


The Symantec Data Loss Prevention system consists of one Enforce Server and one or more
detection servers.
The Enforce Server stores all system configuration, policies, saved reports, and other Symantec
Data Loss Prevention information and manages all activities.
System administration is performed from the Enforce Server administration console, which is
accessed by a Firefox or Internet Explorer Web browser. The Enforce console is displayed
after you log on.
See “About the Enforce Server administration console” on page 83.
After completing the installation steps in the Symantec Data Loss Prevention Installation Guide,
you must perform initial configuration tasks to get Symantec Data Loss Prevention up and
running for the first time. These are essential tasks that you must perform before the system
can begin monitoring data on your network.
See “Performing initial setup tasks” on page 85.

About the Enforce Server administration console


You administer the Symantec Data Loss Prevention system through the Enforce Server
administration console.
The Administrator user can see and access all parts of the administration console. Other users
can see only the parts to which their roles grant them access. The user account under which
you are currently logged on appears at the top right of the screen.
When you first log on to the administration console, the default Home page is displayed. You
and your users can change the default Home page using the Home page selection button.
See Table 2-1 on page 83.
To navigate through the system, select items from one of the four menu clusters (Home,
Incidents, Manage, and System).
Located in the upper-right portion of the administration console are the following navigation
and operation icons:

Table 2-1 Administration console navigation and operation icons

Icon Description

Help. Click this icon to access the context-sensitive online help for your current page.

Select this page as your Home page. If the current screen cannot be selected as
your Home page, this icon is unavailable.

Back to previous screen. Symantec recommends using this Back button rather than
your browser Back button. Use of your browser Back button may lead to
unpredictable behavior and is not recommended.

Screen refresh. Symantec recommends using this Refresh button rather than your
browser Reload or Refresh button. Use of your browser buttons may lead to
unpredictable behavior and is not recommended.

Print the current report. If the current screen contents cannot be sent to the printer,
this icon is unavailable.

Email the current report to one or more recipients. If the current screen contents
cannot be sent as an email, this icon is unavailable.

See “Logging on and off the Enforce Server administration console” on page 84.

Logging on and off the Enforce Server administration console


If you are assigned more than one role, you can only log on under one role at a time. You must
specify the role name and user name at logon.
To log on to the Enforce Server
1 On the Enforce Server host, open a browser and point it to the URL for your server (as
provided by the Symantec Data Loss Prevention administrator).
2 On the Symantec Data Loss Prevention logon screen, enter your user name in the
Username field. For the administrator role, this user name is always Administrator.
Users with multiple roles should specify the role name and the user name in the format
role\user (for example, ReportViewer\bsmith). If they do not, Symantec Data Loss
Prevention assigns the user a role upon logon.
See “Configuring roles” on page 114.
3 In the Password field, type the password. For the administrator at first logon, this password
is the password you created during the installation.
For installation details, see the appropriate Symantec Data Loss Prevention Installation
Guide.
4 Click login.
The Enforce Server administration console appears. The administrator can access all
parts of the administration console, but another user can see only those parts that are
authorized for that particular role.
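
The role\user logon format from step 2 can be pictured with a short Python sketch. This is an illustration of the naming convention only. The function name is hypothetical, and the sketch does not reflect how the Enforce Server itself parses logon names.

```python
from typing import Optional, Tuple


def split_logon_name(logon_name: str) -> Tuple[Optional[str], str]:
    """Split a logon name of the form role\\user into (role, user)."""
    role, separator, user = logon_name.partition("\\")
    if not separator:
        # No role prefix: the whole string is the user name.
        return None, logon_name
    return role, user


print(split_logon_name("ReportViewer\\bsmith"))
print(split_logon_name("Administrator"))
```

When no role prefix is present, the sketch returns no role, which corresponds to the case in which Symantec Data Loss Prevention assigns a role at logon.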
To log out of the Enforce Server
1 Click logout at the top right of the screen.
2 Click OK to confirm.
Symantec Data Loss Prevention displays a message confirming the logout was successful.
See “Editing a user profile” on page 87.

About the administrator account


The Symantec Data Loss Prevention system is preconfigured with a permanent administrator
account. The account name, Administrator, is case sensitive and cannot be changed. You
configured a password for the administrator account during installation.
Refer to the Symantec Data Loss Prevention Installation Guide for more information.
Only the administrator can see or modify the administrator account. Role options do not appear
on the administrator configure screen, because the administrator always has access to
every part of the system.
See “Changing the administrator password” on page 86.
See “Adding an administrator email account” on page 86.

Performing initial setup tasks


After completing the installation steps in the Symantec Data Loss Prevention Installation Guide,
you must perform initial configuration tasks to get Symantec Data Loss Prevention up and
running for the first time. These are essential tasks that you must perform before the system
can begin monitoring data on your network.
■ Change the Administrator's password to a unique password only you know, and add an
email address for the Administrator user account so you can be notified of various system
events.
See “About the administrator account” on page 85.
■ Add and configure your detection servers.
See “Adding a detection server” on page 273.
See “Server configuration—basic” on page 253.
■ Add any user accounts you need in addition to those supplied by your Symantec Data Loss
Prevention solution pack.
■ Review the policy templates provided with your Symantec Data Loss Prevention solution
pack to familiarize yourself with their content and data requirements. Revise the policies or
create new ones as needed.
■ Add the data profiles that you plan to associate with policies.
Data profiles are not always required. This step is necessary only if you are licensed for
data profiles and if you intend to use them in policies.

Changing the administrator password


During installation, you created a generic administrator password. When you log on for the
first time, you should change this password to a unique, secret password.
See the Symantec Data Loss Prevention Installation Guide for more information.
Passwords are case-sensitive and they must contain at least eight characters.
Note that you can configure Symantec Data Loss Prevention to require strong passwords.
Strong passwords are passwords specifically designed to be difficult to break. Password policy
is configured from the System > Settings > General > Configure screen.
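
As a minimal sketch of the two rules stated above (passwords are case-sensitive and must contain at least eight characters), the following Python functions check a candidate password before you type it into the console. The function names are hypothetical. Strong-password requirements are not modeled here, because their details depend on how your system is configured.

```python
MIN_PASSWORD_LENGTH = 8  # From the text: at least eight characters.


def meets_minimum_length(password: str) -> bool:
    """Return True if the password has at least eight characters."""
    return len(password) >= MIN_PASSWORD_LENGTH


def confirmation_matches(new_password: str, confirmation: str) -> bool:
    """Return True if both entries are identical, including letter case."""
    # Plain string comparison is case-sensitive.
    return new_password == confirmation


print(meets_minimum_length("Tr1cky-pw"))
print(confirmation_matches("Tr1cky-pw", "tr1cky-pw"))
```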
When your password expires, Symantec Data Loss Prevention displays the Password Renewal
window at the next logon. When the Password Renewal window appears, type your old
password, and then type your new password and confirm it.
See “Configuring user accounts” on page 121.
To change the administrator password
1 Log on as administrator.
2 Click Profile in the upper-right corner of the administration console.
3 On the Edit Profile screen:
■ Enter your new password in the New Password field.
■ Re-enter your new password in the Re-enter New Password field. The two new
passwords must be identical.
Note that passwords are case-sensitive.
4 Click Save.
See “About the administrator account” on page 85.
See “About the Enforce Server administration console” on page 83.
See “About the Overview screen” on page 278.

Adding an administrator email account


You can specify an email address to receive administrator account related messages.
To add or change an administrator email account


1 Click Profile in the upper-right corner of the administration console.
2 Type the new (or changed) administrator email address in the Email Address field.
The email address must include a fully qualified domain name. For example:
[email protected].

3 Click Save.
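
The requirement in step 2 that the address include a fully qualified domain name can be approximated with the following Python sketch. It is a simple plausibility check under the assumption that "fully qualified" means a domain with at least two non-empty labels. It is not the validation that the administration console performs, and the function name and sample addresses are hypothetical.

```python
def has_fully_qualified_domain(address: str) -> bool:
    """Return True if the address has a local part and a dotted domain."""
    local, separator, domain = address.rpartition("@")
    if not separator or not local:
        return False
    labels = domain.split(".")
    # Require at least two labels, none of them empty.
    return len(labels) >= 2 and all(labels)


print(has_fully_qualified_domain("jsmith@mail.example.com"))
print(has_fully_qualified_domain("jsmith@localhost"))
```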
See “About the administrator account” on page 85.
See “About the Enforce Server administration console” on page 83.
See “About the Overview screen” on page 278.

Editing a user profile


System users can use the Profile screen to configure their profile passwords, email addresses,
and languages.
Users can also specify their report preferences at the Profile screen.
To display the Profile screen, click the drop-down list at the top-right of the Enforce Server
administration console, then select Profile.
The Profile screen is divided into the following sections:
■ Authentication. Use this section to change your password, or select certificate
authentication, if available.
■ General. Use this section to specify your email address, choose a language preference,
and view your selected home page.
■ Report Preferences. Use this section to specify your preferred text encoding, CSV delimiter,
and XML export preferences.
■ Roles. This section displays your role. Note that this section is not displayed for the
administrator because the administrator is authorized to perform all roles.

The Authentication section:


To change your password
1 Enter your new password in the New Password field.
2 Re-enter your new password in the Re-enter New Password field.
3 Click Save.
To use certificate authentication


1 If certificate authentication is available to you, select Use Certificate authentication.
2 Enter your LDAP common name (CN) in the Common Name (CN) field.
3 Click Save.

If you changed your password, you must use the new password the next time you log on.
See “Changing your password” on page 89.

The General section:

To specify a new personal email address
1 In the Email Address field enter your personal email address.
2 Click Save.
Individual Symantec Data Loss Prevention users can choose which of the available languages
and locales they want to use.
To choose a language for individual use
1 Click the option next to your language choice.
2 Click Save.
The Enforce Server administration console is re-displayed in the new language.
Choosing a language profile has no effect on the detection of policy violations. Detection is
performed on all content that is written in any supported language regardless of the language
you choose for your profile.
See “About support for character sets, languages, and locales” on page 91.
The languages available to you are determined when the product is installed and the later
addition of language packs for Symantec Data Loss Prevention. The effect of choosing a
different language varies as follows:
■ Locale only. If the language you choose has the notice Translations not available, dates
and numbers are displayed in formats appropriate for the language. Reports and lists are
sorted in accordance with that language. But the administration console menus, labels,
screens, and Help system are not translated and remain in English.
See “About locales” on page 95.
■ Translated. The language you choose may not display the notice Translations not available.
In this case, in addition to the number and date format, and sort order, the administration
console menus, labels, screens, and in some cases the Help system, are translated into
the chosen language.
See “About Symantec Data Loss Prevention language packs” on page 94.

The Report Preferences section:


To select your text encoding
1 Select a text encoding option:
■ Use browser default encoding. Check this box to specify that text files use the same
encoding as your browser.
■ Pull-down menu. Click an encoding option in the pull-down menu to select it.

2 Click Save.
The new text encoding is applied to CSV exported files. Select the text encoding that
your CSV applications expect.
To select a CSV delimiter
1 Choose one of the delimiters from the pull-down menu.
2 Click Save.
The new delimiter is applied to the next comma-separated values (CSV) list that you
export.
See “About incident reports” on page 1902.
See “Exporting incident reports” on page 1921.
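
The text encoding and CSV delimiter preferences matter when another program reads the exported file. The following Python sketch shows a reader that must be told both values. The column names, the sample data, and the function name are made up for illustration and do not represent the actual layout of an exported report.

```python
import csv
import io


def read_exported_csv(raw_bytes: bytes, encoding: str, delimiter: str) -> list:
    """Decode exported CSV bytes and split them into rows of fields."""
    text = raw_bytes.decode(encoding)
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    return list(reader)


# Hypothetical export that uses ISO-8859-1 encoding and a semicolon delimiter.
sample = "Policy;Severity\nCrédit Card;High\n".encode("iso-8859-1")
rows = read_exported_csv(sample, encoding="iso-8859-1", delimiter=";")
print(rows)
```

If the reader assumes a different encoding or delimiter than the one selected on the Profile screen, accented characters may be garbled or each row may arrive as a single field.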
To select XML export details
1 Include Incident Violations in XML Export. If this box is checked, reports exported to
XML include the highlighted matches on each incident snapshot.
2 Include Incident History in XML Export. If this box is checked, reports exported to XML
include the incident history data that is contained in the History tab of each incident
snapshot.
3 Click Save.
Your selections are applied to the next report you export to XML.
If neither box is checked, the exported XML report contains only the basic incident information.
See “About incident reports” on page 1902.
See “Exporting incident reports” on page 1921.

Changing your password


When your password expires, Symantec Data Loss Prevention displays the Password Renewal
window the next time you attempt to log on. You must specify a new password in this
window before you can continue.
To change your password from the Password Renewal window
1 Enter your old password in the Old password field of the Password Renewal window.
2 Enter your new password in the New Password field of the Password Renewal window.
3 Re-enter your new password in the Re-enter New Password field of the Password
Renewal window.
The next time you log on, you must use your new password.
You can also change your password at any time from the Profile screen.
See “Editing a user profile” on page 87.
See “About the administrator account” on page 85.
See “Logging on and off the Enforce Server administration console” on page 84.
Chapter 3
Working with languages and locales
This chapter includes the following topics:

■ About support for character sets, languages, and locales

■ Supported languages for detection

■ Working with international characters

■ About Symantec Data Loss Prevention language packs

■ About locales

■ Using a non-English language on the Enforce Server administration console

■ Using the Language Pack Utility

About support for character sets, languages, and locales


Symantec Data Loss Prevention fully supports international deployments by offering a large
number of languages and localization options:
■ Policy creation and violation detection across many languages.
The supported languages can be used in keywords, data identifiers, regular expressions,
exact data profiles (EDM) and document profiles (IDM).
See “Supported languages for detection” on page 92.
■ Operation on localized and Multilingual User Interface (MUI) versions of Windows operating
systems.
■ International character sets. To view and work with international character sets, the system
on which you are viewing the Enforce Server administration console must have the
appropriate capabilities.
See “Working with international characters” on page 93.
■ Locale-based date and number formats, as well as sort orders for lists and reports.
See “About locales” on page 95.
■ Localized user interface (UI) and Help system. Language packs for Symantec Data Loss
Prevention provide language-specific versions of the Enforce Server administration console.
They may also provide language-specific versions of the online Help system.

Note: These language packs are added separately following initial product installation.

■ Localized product documentation.


■ Language-specific notification pop-ups. Endpoint notification pop-ups appear in the display
language that is selected on the endpoint instead of the system locale language. For
example, if the system locale is set to English and the user sets the display language to
German, the notification pop-up appears in German.

Note: A mixed language notification pop-up displays if the user locale language does not
match the language used in the response rule.

Supported languages for detection


■ Arabic
■ Brazilian Portuguese
■ Chinese (traditional)
■ Chinese (simplified)
■ Czech
■ Danish
■ Dutch
■ English
■ Finnish
■ French
■ German
■ Greek
■ Hebrew
■ Hungarian
■ Italian
■ Japanese
■ Korean
■ Norwegian
■ Polish
■ Portuguese
■ Romanian
■ Russian
■ Spanish
■ Swedish
See “About support for character sets, languages, and locales” on page 91.

Working with international characters


You can use a variety of languages in Symantec Data Loss Prevention, based on:
■ The operating system-based character set installed on the computer from which you view
the Enforce Server administration console
■ The capabilities of your browser
For example, an incident report on a scan of Russian-language data would contain Cyrillic
characters. To view that report, the computer and browser you use to access the Enforce
Server administration console must be capable of displaying these characters. Here are some
general guidelines:
■ If the computer you use to access the Enforce Server administration console has an
operating system localized for a particular language, you should be able to view and use
a character set that supports that language.
■ If the operating system of the computer you use to access the administration console is
not localized for a particular language, you may need to add supplemental language support.
This supplemental language support is added to the computer you use to access the
administration console, not on the Enforce Server.
■ On a Windows system, add supplemental language support, including fonts for some
character sets, from Control Panel > Regional and Language Options > Languages
(tab) - Supplemental Language Support.

■ It may also be necessary to set your browser to accommodate the characters you want to
view and enter.

Note: The Enforce Server administration console supports UTF-8 encoded data.

■ On a Windows system, it may also be necessary to use the Languages – Supplemental
Language Support tab under Control Panel > Regional and Language Options to add
fonts for some character sets.
See the Symantec Data Loss Prevention Release Notes for known issues regarding specific
languages.
See “About support for character sets, languages, and locales” on page 91.

About Symantec Data Loss Prevention language packs


Language packs for Symantec Data Loss Prevention localize the product for a particular
language on Windows-based systems. After a language pack is added to Symantec Data Loss
Prevention, administrators can specify it as the system-wide default. If administrators make
multiple language packs available for use, individual users can choose the language they want
to work in.
See “Using a non-English language on the Enforce Server administration console” on page 95.
Language packs provide the following:
■ The locale of the selected language becomes available to administrators and end users on
the Enforce Server Configuration screen.
■ Enforce Server screens, menu items, commands, and messages appear in the language.
■ The Symantec Data Loss Prevention online Help system may be displayed in the language.
Language packs for Symantec Data Loss Prevention are available from Symantec File Connect.

Caution: When you install a new version of Symantec Data Loss Prevention, any language
packs you have installed are deleted. For a new, localized version of Symantec Data Loss
Prevention, you must upgrade to a new version of the language pack.

See “About locales” on page 95.


See “About support for character sets, languages, and locales” on page 91.

About locales
Locales are installed as part of a language pack.
A locale provides the following:
■ Displays dates and numbers in formats appropriate for that locale.
■ Sorts lists and reports based on text columns, such as "policy name" or "file owner,"
alphabetically according to the rules of the locale.
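
To show how a locale changes the display of dates and numbers, the following Python sketch formats the same values for two locales. The conventions table is hand-written for illustration and covers only two locales. A real system, including Symantec Data Loss Prevention, takes these rules from its locale data instead. The function names are hypothetical.

```python
from datetime import date

# Illustrative conventions only; real systems take these from locale data.
LOCALE_FORMATS = {
    "en_US": {"date": "%m/%d/%Y", "thousands": ",", "decimal": "."},
    "de_DE": {"date": "%d.%m.%Y", "thousands": ".", "decimal": ","},
}


def format_date(value: date, locale_code: str) -> str:
    """Format a date using the pattern for the given locale."""
    return value.strftime(LOCALE_FORMATS[locale_code]["date"])


def format_number(value: float, locale_code: str) -> str:
    """Format a number with the separators for the given locale."""
    conventions = LOCALE_FORMATS[locale_code]
    # Format with US separators first, then swap both separators in one pass.
    us_style = f"{value:,.2f}"
    table = str.maketrans(
        {",": conventions["thousands"], ".": conventions["decimal"]}
    )
    return us_style.translate(table)


print(format_date(date(2019, 8, 19), "en_US"), format_number(1234567.891, "en_US"))
print(format_date(date(2019, 8, 19), "de_DE"), format_number(1234567.891, "de_DE"))
```

Locale-aware sorting is not shown, because it depends on collation rules that a small table cannot reasonably capture.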
An administrator can also configure an additional locale for use by individual users. This
additional locale need only be supported by the required version of Java.
For a list of these locales, see
https://www.oracle.com/technetwork/java/javase/java8locales-2095355.html.
The locale can be specified at product installation time, as described in the Symantec Data
Loss Prevention Installation Guide. It can also be configured at a later time using the Language
Pack Utility.
See “Using a non-English language on the Enforce Server administration console” on page 95.
See “About support for character sets, languages, and locales” on page 91.

Using a non-English language on the Enforce Server administration console


The use of locales and languages is specified through the Enforce Server administration
console by the following roles:
■ Symantec Data Loss Prevention administrator. Specifies that one of the available languages
be the default system-wide language and sets the locale.
■ Individual Symantec Data Loss Prevention user. Chooses which of the available locales
to use.

Note: The addition of multiple language packs could slightly affect Enforce Server performance,
depending on the number of languages and customizations present. This occurs because an
additional set of indexes has to be built and maintained for each language.

Warning: Do not modify the Oracle database NLS_LANGUAGE and NLS_TERRITORY settings.

See “About Symantec Data Loss Prevention language packs” on page 94.
See “About locales” on page 95.
A Symantec Data Loss Prevention administrator specifies which of the available languages
is the default system-wide language.
To choose the default language for all users
1 On the Enforce Server, go to System > Settings > General and click Configure.
The Edit General Settings screen is displayed.
2 Scroll to the Language section of the Edit General Settings screen, and click the button
next to the language you want to use as the system-wide default.
3 Click Save.
Individual Symantec Data Loss Prevention users can choose which of the available languages
and locales they want to use by updating their profiles.
See “Editing a user profile” on page 87.
Administrators can use the Language Pack Utility to update the available languages.
See “Using the Language Pack Utility” on page 96.
See “About support for character sets, languages, and locales” on page 91.

Note: If the Enforce Server runs on a Linux host, you must install language fonts on the host
machine using the Linux Package Manager application. Language font packages begin with
fonts-<language_name>. For example, fonts-japanese-0.20061016-4.el5.noarch

Using the Language Pack Utility


To make a specific locale available for Symantec Data Loss Prevention, you add language
packs through the Language Pack Utility.
You run the Language Pack Utility from the command line. Its executable,
LanguagePackUtility.exe, resides in the \Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\bin directory on
Windows and /opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/bin on
Linux.
To use the Language Pack Utility, you must have Read, Write, and Execute permissions on
all of the \Program Files\Symantec\DataLossPrevention\EnforceServer\15.5 (Windows)
or /opt/Symantec/DataLossPrevention/EnforceServer/15.5 (Linux) folders and subfolders.
If you are running the utility on Linux, you must be a root user.
To display help for the utility, such as the list of valid options and their flags, enter
LanguagePackUtility without any flags.

Note: Running the Language Pack Utility causes the SymantecDLPManagerService and
SymantecDLPIncidentPersisterService services to stop for as long as 20 seconds. Any
users who are logged on to the Enforce Server administration console will be logged out
automatically. When finished making its updates, the utility restarts the services automatically,
and users can log back on to the administration console.

Language packs for Symantec Data Loss Prevention can be obtained from Symantec File
Connect.
To add a language pack (Windows)
1 Advise other users that anyone currently using the Enforce Server administration console
must save their work and log off.
2 Run the Language Pack Utility with the -a flag followed by the name of the ZIP file for
that language pack. Enter:

LanguagePackUtility -a filename

where filename is the fully qualified path and name of the language pack ZIP file.
For example, if the Japanese language pack ZIP file is stored in c:\temp, add it by entering:

LanguagePackUtility -a c:\temp\Symantec_DLP_15.5_Japanese.zip

To add multiple language packs during the same session, specify multiple file names,
separated by spaces, for example:

LanguagePackUtility -a
c:\temp\Symantec_DLP_15.5_Japanese.zip
Symantec_DLP_15.5_Chinese.zip

3 Log on to the Enforce Server administration console and confirm that the new language
option is available on the Edit General Settings screen. To do this, go to System >
Settings > General > Configure > Edit General Settings.
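The -a invocation from step 2 can also be assembled programmatically. The following Python sketch is illustrative only and is not part of the product. The helper name build_add_command is hypothetical, and its checks only enforce the requirement stated above that each file name is a fully qualified path to a language pack ZIP file.

```python
from pathlib import PureWindowsPath


def build_add_command(zip_paths):
    """Return the argument list for LanguagePackUtility -a (hypothetical helper)."""
    if not zip_paths:
        raise ValueError("specify at least one language pack ZIP file")
    for raw in zip_paths:
        path = PureWindowsPath(raw)
        # The procedure asks for the fully qualified path and file name.
        if not path.is_absolute():
            raise ValueError(f"not a fully qualified path: {raw}")
        if path.suffix.lower() != ".zip":
            raise ValueError(f"not a ZIP file: {raw}")
    # Several language packs are passed as separate, space-separated arguments.
    return ["LanguagePackUtility", "-a", *zip_paths]


if __name__ == "__main__":
    command = build_add_command([r"c:\temp\Symantec_DLP_15.5_Japanese.zip"])
    print(" ".join(command))
```

Building the command as a list, rather than as one string, keeps each path a single argument even if it contains spaces.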
To add a language pack (Linux)
1 Advise other users that anyone currently using the Enforce Server administration console
must save their work and log off.
2 Open a terminal session to the Enforce Server host and switch to the DLP_system_account
by running the following command:
su - DLP_system_account

3 Run the following command:


DLP_home/Protect/bin/LanguagePackUtility -a <path to language pack zip file>

4 Log on to the Enforce Server administration console and confirm that the new language
option is available on the Edit General Settings screen. To do this, go to System >
Settings > General > Configure > Edit General Settings.
To remove a language pack
1 Advise users that anyone currently using the Enforce Server administration console must
save their work and log off.
2 Run the Language Pack Utility with the -r flag followed by the Java locale code of the
language pack you want to remove. Enter:

LanguagePackUtility -r locale

where locale is a valid Java locale code corresponding to a Symantec Data Loss Prevention
language pack.
For example, to remove the French language pack enter:

LanguagePackUtility -r fr_FR

To remove multiple language packs during the same session, specify multiple locale codes,
separated by spaces.
3 Log on to the Enforce Server administration console and confirm that the language pack
is no longer available on the Edit General Settings screen. To do this, go to System >
Settings > General > Configure > Edit General Settings.
Removing a language pack has the following effects:
■ Users can no longer select the locale of the removed language pack for individual use.

Note: If the locale of the language pack is supported by the version of Java required for
running Symantec Data Loss Prevention, the administrator can later specify it as an alternate
locale for any users who need it.

■ The locale reverts to the system-wide default configured by the administrator.


■ If the removed language was the system-wide default locale, the system locale reverts to
English.

To change or add a locale


1 Advise users that anyone currently using the Enforce Server administration console must
save their work and log off.
2 Run the Language Pack Utility using the -c flag followed by the Java locale code for the
locale that you want to change or add. Enter:

LanguagePackUtility -c locale

where locale is a valid locale code recognized by Java, such as pt_PT for Portuguese.
For example, to change the locale to Brazilian Portuguese enter:

LanguagePackUtility -c pt_BR

3 Log on to the Enforce Server administration console and confirm that the new alternate
locale is now available on the Edit General Settings screen. To do this, go to System >
Settings > General > Configure > Edit General Settings.
If you specify a locale for which there is no language pack, "Translations not available"
appears next to the locale name. This means that formatting and sort order are appropriate
for the locale, but the Enforce Server administration console screens and online Help are
not translated.

Note: Administrators can only make one additional locale available for users that is not based
on a previously installed Symantec Data Loss Prevention language pack.
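The -r and -c invocations described in this section both take Java locale codes such as fr_FR or pt_BR. The following Python sketch shows one way to check a code before building the command. It is an illustration under stated assumptions: the helper names are hypothetical, the pattern is a simplified approximation of Java locale codes rather than the full grammar, and passing several codes to -r in one call is assumed rather than confirmed.

```python
import re

# Simplified pattern (an assumption): a two- or three-letter lowercase language
# code, optionally followed by an underscore and a two-letter uppercase country.
_LOCALE = re.compile(r"[a-z]{2,3}(_[A-Z]{2})?")


def _check(locale):
    """Return the locale unchanged, or raise if it does not fit the pattern."""
    if not _LOCALE.fullmatch(locale):
        raise ValueError(f"does not look like a Java locale code: {locale}")
    return locale


def build_remove_command(locales):
    """Argument list for removing language packs with the -r flag."""
    if not locales:
        raise ValueError("specify at least one locale code")
    return ["LanguagePackUtility", "-r", *[_check(code) for code in locales]]


def build_change_command(locale):
    """Argument list for changing or adding one locale with the -c flag."""
    return ["LanguagePackUtility", "-c", _check(locale)]


if __name__ == "__main__":
    print(" ".join(build_remove_command(["fr_FR"])))
    print(" ".join(build_change_command("pt_BR")))
```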

See “About support for character sets, languages, and locales” on page 91.
Section 2
Managing the Enforce Server platform

■ Chapter 4. Managing Enforce Server services and settings

■ Chapter 5. Managing roles and users

■ Chapter 6. Connecting to group directories

■ Chapter 7. Managing stored credentials

■ Chapter 8. Managing system events and messages

■ Chapter 9. Managing the Symantec Data Loss Prevention database

■ Chapter 10. Working with Symantec Information Centric Encryption

■ Chapter 11. Working with Symantec Information Centric Tagging

■ Chapter 12. Adding a new product module

■ Chapter 13. Applying a Maintenance Pack


Chapter 4
Managing Enforce Server services and settings
This chapter includes the following topics:

■ About Symantec Data Loss Prevention services

■ About starting and stopping services on Windows

■ About starting and stopping services on Linux

About Symantec Data Loss Prevention services


The Symantec Data Loss Prevention services may need to be stopped and started periodically.
This section provides a brief description of each service and how to start and stop the services
on supported platforms.
The Symantec Data Loss Prevention services for the Enforce Server are described in the
following table:

Table 4-1 Symantec Data Loss Prevention Enforce Server services

Service Name                    Description

Symantec DLP Manager            Provides the centralized reporting and management services for
                                Symantec Data Loss Prevention.

                                If you have more than 50 policies, 50 detection servers, or
                                50,000 agents, increase the Max Memory for this service from
                                2048 to 4096. You can adjust this setting in the
                                SymantecDLPManager.conf file.

                                See “To increase memory for the Symantec DLP Manager service”
                                on page 102.

Symantec DLP Detection          Controls the detection servers.
Server Controller

Symantec DLP Notifier           Provides the database notifications.

Symantec DLP Incident           Writes the incidents to the database.
Persister

To increase memory for the Symantec DLP Manager service


1 Open the SymantecDLPManager.conf file in a text editor.
You can find this configuration file in one of the following locations:
■ Windows: \Program Files\Symantec\DataLossPrevention\EnforceServer\Services

■ Linux: /opt/Symantec/DataLossPrevention/EnforceServer/Services

2 Change the value of the wrapper.java.maxmemory parameter to 4096.

wrapper.java.maxmemory = 4096

3 Save and close the file.
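If you prefer to script the edit rather than use a text editor, the change amounts to rewriting one line. The following Python sketch operates on the text of the configuration file. It is a minimal illustration, not a supported tool: the function names are hypothetical, and it assumes that the parameter appears on its own line in the "wrapper.java.maxmemory = value" form shown in step 2.

```python
import re

# Matches a line such as "wrapper.java.maxmemory = 2048" and captures everything
# up to the numeric value, so that only the number is replaced.
_MAXMEM = re.compile(
    r"^([ \t]*wrapper\.java\.maxmemory[ \t]*=[ \t]*)\d+(?=[ \t]*\r?$)",
    re.MULTILINE,
)


def exceeds_default_sizing(policies, detection_servers, agents):
    """True when any threshold from Table 4-1 (50 / 50 / 50,000) is exceeded."""
    return policies > 50 or detection_servers > 50 or agents > 50_000


def set_max_memory(conf_text, value=4096):
    """Return conf_text with wrapper.java.maxmemory set to the given value."""
    new_text, count = _MAXMEM.subn(lambda m: f"{m.group(1)}{value}", conf_text)
    if count == 0:
        raise ValueError("wrapper.java.maxmemory was not found in the file")
    return new_text


if __name__ == "__main__":
    sample = "wrapper.java.maxmemory = 2048\n"
    if exceeds_default_sizing(policies=75, detection_servers=10, agents=20_000):
        print(set_max_memory(sample), end="")
```

Back up SymantecDLPManager.conf before you write any changed text back to it.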


See “About starting and stopping services on Windows” on page 102.

About starting and stopping services on Windows


The procedures for starting and stopping services vary according to installation configurations
and between Enforce and detection servers.
■ See “Starting an Enforce Server on Windows” on page 103.
■ See “Stopping an Enforce Server on Windows” on page 103.
■ See “Starting a detection server on Windows” on page 104.
■ See “Stopping a detection server on Windows” on page 104.
■ See “Starting services on single-tier Windows installations” on page 104.
■ See “Stopping services on single-tier Windows installations” on page 105.

Starting an Enforce Server on Windows


Use the following procedure to start the Symantec Data Loss Prevention services on a Windows
Enforce Server.
To start the Symantec Data Loss Prevention services on a Windows Enforce Server
1 On the computer that hosts the Enforce Server, navigate to Start > All Programs >
Administrative Tools > Services to open the Windows Services menu.
2 Start the Symantec Data Loss Prevention services in the following order:
■ SymantecDLPNotifierService

■ SymantecDLPManagerService

■ SymantecDLPIncidentPersisterService

■ SymantecDLPDetectionServerControllerService

Note: Start the SymantecDLPNotifierService service before starting the other services.

See “Stopping an Enforce Server on Windows” on page 103.

Stopping an Enforce Server on Windows


Use the following procedure to stop the Symantec Data Loss Prevention services on a Windows
Enforce Server.
To stop the Symantec Data Loss Prevention services on a Windows Enforce Server
1 On the computer that hosts the Enforce Server, navigate to Start > All Programs >
Administrative Tools > Services to open the Windows Services menu.
2 From the Services menu, stop all running Symantec Data Loss Prevention services in the
following order:
■ SymantecDLPDetectionServerControllerService

■ SymantecDLPIncidentPersisterService

■ SymantecDLPManagerService

■ SymantecDLPNotifierService

See “Starting an Enforce Server on Windows” on page 103.
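The start and stop lists for a Windows Enforce Server mirror each other: the stop order is the start order reversed. The following Python sketch captures that relationship. It is illustrative only. The helper names are hypothetical, and the net start and net stop command lines are an assumed alternative to the Services menu that relies on the names listed above being the registered Windows service names, which this guide does not state.

```python
# Start order as listed for a Windows Enforce Server.
ENFORCE_START_ORDER = [
    "SymantecDLPNotifierService",
    "SymantecDLPManagerService",
    "SymantecDLPIncidentPersisterService",
    "SymantecDLPDetectionServerControllerService",
]


def stop_order(start_order=ENFORCE_START_ORDER):
    """The documented Windows stop order is the start order reversed."""
    return list(reversed(start_order))


def net_commands(action):
    """Build 'net start' or 'net stop' command lines in the documented order."""
    if action not in ("start", "stop"):
        raise ValueError(f"unsupported action: {action}")
    names = ENFORCE_START_ORDER if action == "start" else stop_order()
    return [f'net {action} "{name}"' for name in names]


if __name__ == "__main__":
    for line in net_commands("stop"):
        print(line)
```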



Starting a detection server on Windows


To start the Symantec Data Loss Prevention service on a Windows detection server
1 On the computer that hosts the detection server, navigate to Start > All Programs >
Administrative Tools > Services to open the Windows Services menu.
2 Start the SymantecDLPDetectionServerService service.
See “Stopping a detection server on Windows” on page 104.

Stopping a detection server on Windows


Use the following procedure to stop the Symantec Data Loss Prevention service on a Windows
detection server.
To stop the Symantec Data Loss Prevention service on a Windows detection server
1 On the computer that hosts the detection server, navigate to Start > All Programs >
Administrative Tools > Services to open the Windows Services menu.
2 Stop the SymantecDLPDetectionServerService service.
See “Starting a detection server on Windows” on page 104.

Starting services on single-tier Windows installations


Use the following procedure to start the Symantec Data Loss Prevention services on a single-tier
installation on Windows.
To start the Symantec Data Loss Prevention services on a single-tier Windows installation
1 On the computer that hosts the Symantec Data Loss Prevention server applications,
navigate to Start > All Programs > Administrative Tools > Services to open the Windows
Services menu.
2 Start the Symantec Data Loss Prevention services in the following order:
■ SymantecDLPNotifierService

■ SymantecDLPManagerService

■ SymantecDLPIncidentPersisterService

■ SymantecDLPDetectionServerControllerService

■ SymantecDLPDetectionServerService

Note: Start the SymantecDLPNotifierService service before starting other services.

See “Stopping services on single-tier Windows installations” on page 105.



Stopping services on single-tier Windows installations


Use the following procedure to stop the Symantec Data Loss Prevention services on a single-tier
installation on Windows.
To stop the Symantec Data Loss Prevention services on a single-tier Windows installation
1 On the computer that hosts the Symantec Data Loss Prevention server applications,
navigate to Start > All Programs > Administrative Tools > Services to open the Windows
Services menu.
2 From the Services menu, stop all running Symantec Data Loss Prevention services in the
following order:
■ SymantecDLPDetectionServerService

■ SymantecDLPDetectionServerControllerService

■ SymantecDLPIncidentPersisterService

■ SymantecDLPManagerService

■ SymantecDLPNotifierService

See “Starting services on single-tier Windows installations” on page 104.
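If you maintain your own startup script for a single-tier installation, you may want to check its sequence against the requirement that the SymantecDLPNotifierService service starts before the others. The following Python sketch shows one such check. The function name is hypothetical, and the check covers only the two conditions named in its comments. It does not verify the full documented order.

```python
NOTIFIER = "SymantecDLPNotifierService"

# The five services listed for a single-tier Windows installation.
SINGLE_TIER_SERVICES = {
    NOTIFIER,
    "SymantecDLPManagerService",
    "SymantecDLPIncidentPersisterService",
    "SymantecDLPDetectionServerControllerService",
    "SymantecDLPDetectionServerService",
}


def check_start_sequence(sequence):
    """Return a list of problems found in a proposed start sequence."""
    problems = []
    # Condition 1: the Notifier service must be started before the others.
    if not sequence or sequence[0] != NOTIFIER:
        problems.append(f"{NOTIFIER} must be started first")
    # Condition 2: every single-tier service must appear in the sequence.
    missing = SINGLE_TIER_SERVICES - set(sequence)
    if missing:
        problems.append("missing services: " + ", ".join(sorted(missing)))
    return problems


if __name__ == "__main__":
    proposed = ["SymantecDLPManagerService", NOTIFIER]
    for problem in check_start_sequence(proposed):
        print(problem)
```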

About starting and stopping services on Linux


The procedures for starting and stopping services vary according to installation configurations
and between Enforce and detection servers.
■ See “Starting an Enforce Server on Linux” on page 105.
■ See “Stopping an Enforce Server on Linux” on page 106.
■ See “Starting a detection server on Linux” on page 106.
■ See “Stopping a detection server on Linux” on page 107.
■ See “Starting services on single-tier Linux installations” on page 107.
■ See “Stopping services on single-tier Linux installations” on page 107.

Starting an Enforce Server on Linux


Use the following procedure to start the Symantec Data Loss Prevention services on a Linux
Enforce Server.
To start the Symantec Data Loss Prevention services on a Linux Enforce Server
1 On the computer that hosts the Enforce Server, log on as root.
2 Start the Symantec DLP Notifier service by running the following command:

service SymantecDLPNotifierService start

3 Start the remaining Symantec Data Loss Prevention services, by running the following
commands:

service SymantecDLPManagerService start
service SymantecDLPIncidentPersisterService start
service SymantecDLPDetectionServerControllerService start

See “Stopping an Enforce Server on Linux” on page 106.

Stopping an Enforce Server on Linux


Use the following procedure to stop the Symantec Data Loss Prevention services on a Linux
Enforce Server.
To stop the Symantec Data Loss Prevention services on a Linux Enforce Server
1 On the computer that hosts the Enforce Server, log on as root.
2 Stop all running Symantec Data Loss Prevention services by running the following
commands:

service SymantecDLPIncidentPersisterService stop
service SymantecDLPManagerService stop
service SymantecDLPDetectionServerControllerService stop
service SymantecDLPNotifierService stop

See “Starting an Enforce Server on Linux” on page 105.
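The start and stop command sequences for a Linux Enforce Server can be wrapped in a small script that halts at the first failure. The following Python sketch is a minimal illustration, assuming it runs as root on the Enforce Server host. The function names and the dry-run default are hypothetical choices, and the service order is copied from the two procedures above.

```python
import subprocess

# Orders copied from the start and stop procedures for a Linux Enforce Server.
ENFORCE_START = [
    "SymantecDLPNotifierService",
    "SymantecDLPManagerService",
    "SymantecDLPIncidentPersisterService",
    "SymantecDLPDetectionServerControllerService",
]
ENFORCE_STOP = [
    "SymantecDLPIncidentPersisterService",
    "SymantecDLPManagerService",
    "SymantecDLPDetectionServerControllerService",
    "SymantecDLPNotifierService",
]


def service_commands(action):
    """Return the 'service <name> <action>' argument lists in documented order."""
    names = {"start": ENFORCE_START, "stop": ENFORCE_STOP}[action]
    return [["service", name, action] for name in names]


def run(action, dry_run=True):
    """Print each command, or execute it and stop at the first failure."""
    for argv in service_commands(action):
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True raises CalledProcessError if a command fails.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run("start")  # dry run: prints the commands without executing them
```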

Starting a detection server on Linux


Use the following procedure to start the Symantec Data Loss Prevention service on a Linux
detection server.
To start the Symantec Data Loss Prevention service on a Linux detection server
1 On the computer that hosts the detection server, log on as root.
2 Start the Symantec Data Loss Prevention service by running the following command:

service SymantecDLPDetectionServerService start


See “Stopping a detection server on Linux” on page 107.

Stopping a detection server on Linux


Use the following procedure to stop the Symantec Data Loss Prevention service on a Linux
detection server.
To stop the Symantec Data Loss Prevention service on a Linux detection server
1 On the computer that hosts the detection server, log on as root.
2 Stop the Symantec Data Loss Prevention service by running the following command:

service SymantecDLPDetectionServerService stop

See “Starting a detection server on Linux” on page 106.

Starting services on single-tier Linux installations


Use the following procedure to start the Symantec Data Loss Prevention services on a single-tier
installation on Linux.
To start the Symantec Data Loss Prevention services on a single-tier Linux installation
1 On the computer that hosts the Symantec Data Loss Prevention server applications, log
on as root.
2 Start the Symantec DLP Notifier service by running the following command:

service SymantecDLPNotifierService start

3 Start the remaining Symantec Data Loss Prevention services by running the following
commands:

service SymantecDLPManagerService start
service SymantecDLPDetectionServerService start
service SymantecDLPIncidentPersisterService start
service SymantecDLPDetectionServerControllerService start

See “Stopping services on single-tier Linux installations” on page 107.

Stopping services on single-tier Linux installations


Use the following procedure to stop the Symantec Data Loss Prevention services on a single-tier
installation on Linux.
To stop the Symantec Data Loss Prevention services on a single-tier Linux installation
1 On the computer that hosts the Symantec Data Loss Prevention servers, log on as root.
2 Stop all running Symantec Data Loss Prevention services by running the following
commands:

service SymantecDLPIncidentPersisterService stop
service SymantecDLPManagerService stop
service SymantecDLPDetectionServerService stop
service SymantecDLPDetectionServerControllerService stop
service SymantecDLPNotifierService stop

See “Starting services on single-tier Linux installations” on page 107.
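The Linux start procedures in this chapter differ only in which services they list. The following Python sketch gathers the three documented start sequences into one lookup table and prints the matching commands. It is an illustration only: the installation-type keys and the function name are hypothetical, and the sequences are copied from the procedures above.

```python
# Start sequences copied from the Linux procedures in this chapter.
LINUX_START_SEQUENCES = {
    "enforce": [
        "SymantecDLPNotifierService",
        "SymantecDLPManagerService",
        "SymantecDLPIncidentPersisterService",
        "SymantecDLPDetectionServerControllerService",
    ],
    "detection": [
        "SymantecDLPDetectionServerService",
    ],
    "single-tier": [
        "SymantecDLPNotifierService",
        "SymantecDLPManagerService",
        "SymantecDLPDetectionServerService",
        "SymantecDLPIncidentPersisterService",
        "SymantecDLPDetectionServerControllerService",
    ],
}


def start_script(installation_type):
    """Return the shell lines that start services for an installation type."""
    try:
        names = LINUX_START_SEQUENCES[installation_type]
    except KeyError:
        raise ValueError(f"unknown installation type: {installation_type}") from None
    return "\n".join(f"service {name} start" for name in names)


if __name__ == "__main__":
    print(start_script("single-tier"))
```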


Chapter 5
Managing roles and users
This chapter includes the following topics:

■ About role-based access control

■ About configuring roles and users

■ About recommended roles for your organization

■ Roles included with solution packs

■ Configuring roles

■ Configuring user accounts

■ Configuring password enforcement settings

■ Resetting the Administrator password

■ Manage and add roles

■ Manage and add users

■ About authenticating users

■ Configuring user authentication

■ Integrating Active Directory for user authentication

■ About certificate authentication configuration

About role-based access control


Symantec Data Loss Prevention provides role-based access control to govern how users
access product features and functionality. For example, a role might let users view reports,
but prevent users from creating policies or deleting incidents. Or, a role might let users author
policy response rules but not detection rules.
Roles determine what a user can see and do in the Enforce Server administration console.
For example, the Report role is a specific role that is included in most Symantec Data Loss
Prevention solution packs. Users in the Report role can view incidents, create policies,
and configure Discover targets (if you are running a Discover Server). However, users in the
Report role cannot create Exact Data or Document Profiles. Also, users in the Report role
cannot perform system administration tasks. When a user logs on to the system in the Report
role, the Manage > Data Profiles and the System > Login Management modules in the
Enforce Server administration console are not visible to this user.
You can assign a user to more than one role. Membership in multiple roles allows a user to
perform different kinds of work in the system. For example, you grant the information security
manager user (InfoSec Manager) membership in two roles: ISR (information security first
responder) and ISM (information security manager). The InfoSec Manager can log on to the
system as either a first responder (ISR) or a manager (ISM), depending on the task(s) to
perform. The InfoSec Manager only sees the Enforce Server components appropriate for those
tasks.
You can also combine roles and policy groups to limit the policies and detection servers that
a user can configure. For example, you associate a role with the European Office policy group.
This role grants access to the policies that are designed only for the European office.
See “Policy deployment” on page 373.
Users who are assigned to multiple roles must specify the desired role at log on. Consider an
example where you assign the user named "User01" to two roles, "Report" and "System
Admin." If "User01" wanted to log on to the system to administer the system, the user would
log on with the following syntax: Login: System Admin\User01
See “Logging on and off the Enforce Server administration console” on page 84.
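The logon rule for users with multiple roles can be expressed as a short function. The following Python sketch is illustrative only. The function name is hypothetical, the role\user form follows the "System Admin\User01" example above, and entering the bare user name for a single-role user is an assumption.

```python
def login_name(user, roles, desired_role=None):
    """Return the text to enter in the Login field for a user."""
    if desired_role is not None and desired_role not in roles:
        raise ValueError(f"{user} is not a member of the role {desired_role!r}")
    if len(roles) > 1:
        # Users who are assigned to multiple roles must name the role at logon.
        if desired_role is None:
            raise ValueError("a user with multiple roles must specify a role")
        return f"{desired_role}\\{user}"
    # Assumption: a user with a single role enters only the user name.
    return user


if __name__ == "__main__":
    print(login_name("User01", ["Report", "System Admin"], "System Admin"))
```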
The Administrator user (created during installation) has access to every part of the system
and therefore is not a member of any access-control role.
See “About the administrator account” on page 85.

About configuring roles and users


When you install the Enforce Server, you create a default Administrator user that has access
to all roles. If you import a solution pack to the Enforce Server, the solution pack includes
several roles and users to get you started.
See “About the administrator account” on page 85.
You may want to add roles and users to the Enforce Server. When adding roles and users,
consider the following guidelines:
■ Understand the roles necessary for your business users and for the information security
requirements and procedures of your organization.
See “About recommended roles for your organization” on page 111.


■ Review the roles that were created when you installed a solution pack. You can likely use several
of them (or modified versions of them) for users in your organization.
See “Roles included with solution packs” on page 112.
■ If necessary, modify the solution-pack roles and create any required new roles.
See “Configuring roles” on page 114.
■ Create users and assign each of them to one or more roles.
See “Configuring user accounts” on page 121.
■ Manage roles and users and remove those not being used.
See “Manage and add roles” on page 126.
See “Manage and add users” on page 126.

About recommended roles for your organization


To determine the most useful roles for your organization, review your business processes and
security requirements.
Most businesses and organizations find the following roles fundamental when they implement
the Symantec Data Loss Prevention system:
■ System Administrator
This role provides access to the System module and associated menu options in the
Enforce Server administration console. Users in this role can monitor and manage the
Enforce Server and detection server(s). Users in this role can also deploy detection servers
and run Discover scans. However, users in this role cannot view detailed incident information
or author policies. All solution packs create a "Sys Admin" role that has system administrator
privileges.
■ User Administrator
This role grants users the right to manage users and roles. Typically this role grants no
other access or privileges. Because of the potential for misuse, it is recommended that no
more than two people in the organization be assigned this role (primary and backup).
■ Policy Administrator
This role grants users the right to manage policies and response rules. Typically this role
grants no other access or privileges. Because of the potential for misuse, it is recommended
that no more than two people in the organization be assigned this role (primary and backup).
■ Policy Author
This role provides access to the Policies module and associated menu options in the
Enforce Server administration console. This role is suited for information security managers
who track incidents and respond to risk trends. An information security manager can author
new policies or modify existing policies to prevent data loss. All solution packs create
an "InfoSec Manager" (ISM) role that has policy authoring privileges.
■ Incident Responder
This role provides access to the Incidents module and associated menu options in the
Enforce Server administration console. Users in this role can track and remediate incidents.
Businesses often have at least two incident responder roles that provide two levels of
privileges for viewing and responding to incidents.
A first-level responder may view generic incident information, but cannot access incident
details (such as sender or recipient identity). In addition, a first-level responder may also
perform some incident remediation, such as escalating an incident or informing the violator
of corporate security policies. A second-level responder might be an escalation responder
who has the ability to view incident details and edit custom attributes. A third-level responder
might be an investigation responder who can create response rules, author policies, and
create policy groups.
All solution packs create an "InfoSec Responder" (ISR) role. This role serves as a first-level
responder. You can use the ISM (InfoSec Manager) role to provide second-level responder
access.
Your business probably requires variations on these roles, as well as other roles. For more
ideas about these and other possible roles, see the descriptions of the roles that are imported
with solution packs.
See “Roles included with solution packs” on page 112.

Roles included with solution packs


The various solution packs offered with Symantec Data Loss Prevention create roles and users
when installed. For all solution packs there is a standard set of roles and users. You may see
some variation in those roles and users, depending on the solution pack you import.
The following table summarizes the Financial Services Solution Pack roles. These roles are
largely the same as the roles that are found in other Symantec Data Loss Prevention solution
packs.
See Table 5-1 on page 113.

Table 5-1 Financial Services Solution Pack roles

Role Name Description

Compliance Compliance Officer:


■ Users in this role can view, remediate, and delete incidents; look up attributes;
and edit all custom attributes.
■ This comprehensive role provides users with privileges to ensure that
compliance regulations are met. It also allows users to develop strategies for
risk reduction at a business unit (BU) level, and view incident trends and risk
scorecards.

Exec Executive:

■ Users in this role can view, remediate, and delete incidents; look up attributes;
and view all custom attributes.
■ This role provides users with access privileges to prevent data loss risk at the
macro level. Users in this role can review the risk trends and performance
metrics, as well as incident dashboards.

HRM HR Manager:

■ Users in this role can view, remediate, and delete incidents; look up attributes;
and edit all custom attributes.
■ This role provides users with access privileges to respond to the security
incidents that are related to employee breaches.

Investigator Incident Investigator:

■ Users in this role can view, remediate, and delete incidents; look up attributes;
and edit all custom attributes.
■ This role provides users with access privileges to research details of incidents,
including forwarding incidents to forensics. Users in this role may also
investigate specific employees.

ISM InfoSec Manager:

■ Users in this role can view, remediate, and delete incidents. They can look
up attributes, edit all custom attributes, author policies and response rules.
■ This role provides users with second-level incident response privileges. Users
can manage escalated incidents within the information security team.

ISR InfoSec Responder:

■ Users in this role can view, remediate, and delete incidents; look up attributes;
and view or edit some custom attributes. They have no access to sender or
recipient identity details.
■ This role provides users with first-level incident response privileges. Users
can view policy incidents, find broken business processes, and enlist the
support of the extended remediation team to remediate incidents.

Report Reporting and Policy Authoring:


■ Users in this role can view and remediate incidents, and author policies. They
have no access to incident details.
■ This role provides a single role for policy authoring and data loss risk
management.

Sys Admin System administrator:

■ Users in this role can administer the system and the system users, and can
view incidents. They have no access to incident details.

Configuring roles
Each Symantec Data Loss Prevention user is assigned to one or more roles that define the
privileges and rights that user has within the system. A user’s role determines system
administration privileges, policy authoring rights, incident access, and more. If a user is a
member of multiple roles, the user must specify the role when logging on, for example: Login:
Sys Admin/sysadmin01.

See “About role-based access control” on page 109.


See “About configuring roles and users” on page 110.
To configure a role
1 Navigate to the System > Login Management > Roles screen.
2 Click Add Role.
The Configure Role screen appears, displaying the following tabs: General, Incident
Access, Policy Management, and Users.
3 In the General tab:
■ Enter a unique Name for the role. The name field is case-sensitive and is limited to
30 characters. The name you enter should be short and self-describing. Use the
Description field to annotate the role name and explain its purpose in more detail.
The role name and description appear in the Role List screen.
■ In the User Privileges section, you grant user privileges for the role.
System privilege(s):

User Administration    Select the User Administration option to enable users to create
(Superuser)            additional roles and users in the Enforce Server.

Server Administration  Select the Server Administration option to enable users to perform the
                       following functions:
■ Configure detection servers.
■ Create and manage Data Profiles for Exact Data Matching (EDM),
Form Recognition, Indexed Document Matching (IDM), and Vector
Machine Learning (VML).
■ Configure and assign incident attributes.
■ Configure system settings.
■ Configure response rules.
■ Create policy groups.
■ Configure recognition protocols.
■ View system event and traffic reports.
■ Import policies.
Note: Selecting Server Administration also provides Agent Management
privileges.

Agent Management Select the Agent Management option to enable users to perform the
following functions:
■ Review agent status
■ Review agent events
■ Manage agents and perform troubleshooting tasks
■ Delete, restart, and shut down agents
■ Change the Endpoint Server to which agents connect
■ Pull agent logs
■ Access agent summary reports
■ Add and update agent configurations
■ Manage and create agent groups
■ View agent group conflicts
■ Review server logs
■ Manage server logs, including canceling log collection, configuring
logs, and downloading and deleting logs

People privilege:

User Reporting         Select the User Reporting option to enable users to view the user risk summary.
(Risk Summary,         Note: The Incident > View privilege is automatically enabled for all incident
User Snapshot)         types for users with the User Reporting privilege.
                       See “About user risk” on page 1973.

■ In the Incidents section, you grant users in this role the following incident privilege(s).
These settings apply to all incident reports in the system, including the Executive
Summary, Incident Summary, Incident List, and Incident Snapshots.

View Select the View option to enable users in this role to view policy violation
incidents.
You can customize incident viewing access by selecting various Actions
and Display Attribute options as follows:
■ By default the View option is enabled (selected) for all types of
incidents: Network Incidents, Discover Incidents, and Endpoint
Incidents.
■ To restrict viewing access to only certain incident types, select
(highlight) the type of incident you want to authorize this role to view.
(Hold down the Ctrl key to make multiple selections.) If a role does
not allow a user to view part of an incident report, the option is
replaced with "Not Authorized" or is blank.
Note: If you revoke an incident-viewing privilege for a role, the system
deletes any saved reports for that role that rely on the revoked privilege.
For example, if you revoke (deselect) the privilege to view network
incidents, the system deletes any saved network incident reports
associated with the role.

Actions Select among the following Actions to customize the actions a user can
perform when an incident occurs:
■ Remediate Incidents
This privilege lets users change the status or severity of an incident,
set a data owner, add a comment to the incident history, set the Do
Not Hide and Allow Hiding options, and execute response rule
actions. In addition, if you are using the Incident Reporting and Update
API, select this privilege to remediate the location and status attributes.
■ Smart Response Rules to execute
You specify which Smart Response Rules can be executed on
a per-role basis. Configured Smart Response Rules are listed in the
"Available" column on the left. To expose a Smart Response Rule
for execution by a user of this role, select it and click the arrow to add
it to the right-side column. Use the CTRL key to select multiple rules.
■ Perform attribute lookup
Lets users look up incident attributes from external sources and
populate their values for incident remediation.
■ Delete incidents
Lets users delete an incident.
■ Hide incidents
Lets users hide an incident.
■ Unhide incidents
Lets users restore previously hidden incidents.
■ Export Web archive
Lets users export a report that the system compiles from a Web
archive of incidents.
■ Export XML
Lets users export a report of incidents in XML format.
■ Email incident report as CSV attachment
Lets users email as an attachment a report containing a
comma-separated listing of incident details.

Incident Reporting and Update API Select among the following user privileges to enable access for Web
Services clients that use the Incident Reporting and Update API or the
deprecated Reporting API:
■ Incident Reporting
Enables Web Services clients to retrieve incident details.
■ Incident Update
Enables Web Services clients to update incident details. (Does not
apply to clients that use the deprecated Reporting API.)

See the Symantec Data Loss Prevention Incident Reporting and Update
API Developers Guide for more information.

Display Attributes Select among the following Display Attributes to customize what
attributes appear in the Incidents view for the policy violations that users
of the role can view.

Shared attributes are common to all types of incidents:


■ Matches
The highlighted text of the message that violated the policy appears
on the Matches tab of the Incident Snapshot screen.
■ History
The incident history.
■ Body
The body of the message.
■ Attachments
The names of any attachments or files.
■ Sender
The message sender.
■ Recipients
The message recipients.
■ Subject
The subject of the message.
■ Original Message
Controls whether or not the original message that caused the policy
violation incident can be viewed.
Note: To view an attachment properly, both the "Attachments" and the
"Original Message" options must be checked.

Endpoint attributes are specific to Endpoint incidents:


■ Username
The name of the Endpoint user.
■ Machine name
The name of the computer where the Endpoint Agent is installed.
Discover attributes are specific to Discover incidents:
■ File Owner
The name of the owner of the file being scanned.
■ Location
The location of the file being scanned.

Custom Attributes The Custom Attributes list includes all of the custom attributes
configured by your system administrator, if any.
■ Select View All if you want users to be able to view all custom attribute
values.
■ Select Edit All if you want users to edit all custom attribute values.
■ To restrict the users to certain custom attributes, clear the View All
and Edit All check boxes and individually select the View and/or Edit
check box for each custom attribute you want viewable or editable.
Note: If you select Edit for any custom attribute, the View check box is
automatically selected (indicated by being grayed out). If you want the
users in this role to be able to view all custom attribute values, select
View All.

■ In the Discover section, you grant users in this role the following privileges:

Folder Risk Reporting This privilege lets users view Folder Risk Reports. Refer to the Symantec
Data Loss Prevention Data Insight Implementation Guide.
Note: This privilege is only available for Symantec Data Loss Prevention
Data Insight licenses.

Content Root Enumeration This privilege lets users configure and run Content Root Enumeration
scans. For more information about Content Root Enumeration scans, see
“Working with Content Root Enumeration scans” on page 2162.

4 In the Incident Access tab, configure any conditions (filters) on the types of incidents
that users in this role can view.

Note: You must select the View option on the General tab for settings on the Incident
Access tab to have any effect.

To add an Incident Access condition:


■ Click Add Condition.
■ Select the type of condition and its parameters from left to right, as if writing a sentence.
(Note that the first drop-down list in a condition contains the alphabetized
system-provided conditions that are associated with any custom attributes.)
For example, select Policy Group from the first drop-down list, select Is Any Of from
the second list, and then select Default Policy Group from the final listbox. These
settings would limit users to viewing only those incidents that the default policy group
detected.

5 In the Policy Management tab, select one of the following policy privileges for the role:

■ Import Policies
This privilege lets users import policy files that have been exported from an Enforce
Server.
To enable this privilege, the role must also have the Server Administration, Author
Policies, Author Response Rules, and All Policy Groups privileges.
■ Author Policies
This privilege lets users add, edit, and delete policies within the policy groups that are
selected.
It also lets users modify system data identifiers, and create custom data identifiers.
It also lets users create and modify User Groups.
This privilege does not let users create or manage Data Profiles. This activity requires
Enforce Server administrator privileges.
■ Discover Scan Control
Lets the users in this role create Discover targets, run scans, and view Discover
Servers.
■ Credential Management
Lets users create and modify the credentials that the system requires to access target
systems and perform Discover scans.
■ Policy Groups
Select All Policy Groups only if users in this role need access to all existing policy
groups and any that will be created in the future.
Otherwise you can select individual policy groups or the Default Policy Group.

Note: These options do not grant the right to create, modify, or delete policy groups.
Only the users whose role includes the Server Administration privilege can work with
policy groups.

■ Author Response Rules
Enables users in this role to create, edit, and delete response rules.

Note: Users cannot edit or author response rules for policy remediation unless you
select the Author Response Rules option.

Note: Preventing users from authoring response rules does not prevent them from executing
response rules. For example, a user with no response-rule authoring privileges can still
execute smart response rules from an incident list or incident snapshot.

6 In the Users tab, select the users to whom you want to assign this role. If you have not yet
configured any users, you can assign users to roles after you create the users.
7 Click Save to save your newly created role to the Enforce Server database.
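
The Incident Access condition from step 4 (Policy Group, Is Any Of, Default Policy Group) can be pictured as a simple filter over incidents. The following Python sketch is illustrative only and does not reflect how the Enforce Server is implemented: the `Condition` class, the `visible_incidents` function, the dictionary layout of an incident, and the "HR Policies" group are all hypothetical, and the sketch assumes that multiple conditions must all match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    """One Incident Access condition, read left to right like a sentence."""
    attribute: str        # first drop-down list, for example "Policy Group"
    operator: str         # second list; only "Is Any Of" is modeled here
    values: frozenset     # final listbox selection(s)


def visible_incidents(incidents, conditions):
    """Return the incidents that satisfy every condition (assumed AND logic)."""
    def matches(incident, cond):
        if cond.operator != "Is Any Of":
            raise ValueError(f"unsupported operator: {cond.operator}")
        return incident.get(cond.attribute) in cond.values

    return [i for i in incidents if all(matches(i, c) for c in conditions)]


# Hypothetical incident records for illustration.
incidents = [
    {"id": 1, "Policy Group": "Default Policy Group"},
    {"id": 2, "Policy Group": "HR Policies"},
]
condition = Condition(
    "Policy Group", "Is Any Of", frozenset({"Default Policy Group"})
)
visible = visible_incidents(incidents, [condition])
```

With this condition, only the incident that the default policy group detected remains in `visible`.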

Configuring user accounts


User accounts are the means by which users log on to the system and perform tasks. The role
that the user account belongs to limits what the user can do in the system.
To configure a user account:
1 In the Enforce Server Administration Console, select System > Login Management >
DLP Users to create a new user account or to reconfigure an existing user account. Or,
click Profile to reconfigure the user account to which you are currently logged on.
2 Click Add DLP User to add a new user, or click the name of an existing user to modify
that user's configuration.
3 Enter a name for a new user account in the Name field.
■ The user account name must be between 8 and 30 characters long, is case-sensitive,
and cannot contain backslashes (\).
■ If you use certificate authentication, the Name field value does not have to match the
user's Common Name (CN). However, you may choose to use the same value for
both the Name and Common Name (CN) so that you can easily locate the configuration
for a specific CN. The Enforce Server administration console shows only the Name
field value in the list of configured users.
■ If you are using Active Directory authentication, the user account name must match
the name of the Active Directory user account. Note that all Symantec Data Loss
Prevention user names are case-sensitive, even though Active Directory user names
are not. Active Directory users must enter the case-sensitive account name
when logging on to the Enforce Server administration console.
See “Integrating Active Directory for user authentication” on page 137.

4 Configure the Authentication section of the Configure User page. Only options that are
enabled are available on this page.

Option Instructions

Use Single Sign On Mapping If SAML authentication has been enabled, the user can sign on using Single Sign On Mapping
on the Configure User page.

Use Password access Select this option to use password authentication and allow the user to sign on using the
Enforce Server administration console log on page. This option is required if the user account
will be used for a Reporting API Web Service client.

If you select this option, also enter the user password in the Password and the Re-enter
Password fields. The password must be at least eight characters long and is case-sensitive.
For security purposes, the password is obfuscated and each character appears as an asterisk.

If you configure advanced password settings, the user must specify a strong password. In
addition, the password may expire at a certain date and the user has to define a new one
periodically.

See “Configuring password enforcement settings” on page 124.

You can choose password authentication even if you also use certificate authentication. If you
use certificate authentication, you can optionally disable sign on from the Enforce Server
administration console log on page.

See “Disabling password authentication and forms-based logon” on page 154.

Symantec Data Loss Prevention authenticates all Reporting API clients using password
authentication. If you configure Symantec Data Loss Prevention to use certificate authentication,
any user account that is used to access the Reporting API Web Service must have a valid
password. See the Symantec Data Loss Prevention Reporting API Developers Guide.
Note: If you configure Active Directory integration with the Enforce Server, users authenticate
using their Active Directory passwords. In this case the password field does not appear on
the Users screen.

See “Integrating Active Directory for user authentication” on page 137.



Use Certificate authentication Select this option to use certificate authentication and allow the user to sign on automatically
(single sign-on) with a certificate that is generated by a separate Public Key Infrastructure (PKI). This
option is available only if you have manually configured support for certificate authentication.

See “About authenticating users” on page 127.

See “About certificate authentication configuration” on page 142.

If you select this option, you must specify the common name (CN) value for the user in the
Common Name (CN) field. The CN value appears in the Subject field of the user's certificate,
which is generated by the PKI. Common names generally use the format, first_name
last_name identification_number.

The Enforce Server uses the CN value to map the certificate to this user account. If an
authenticated certificate contains the specified CN value, all other attributes of this user
account, such as the default role and reporting preferences, are applied when the user logs
on.
Note: You cannot specify the same Common Name (CN) value in multiple Enforce Server
user accounts.

Account Disabled Select this option to lock the user out of the Enforce Server administration console. This option
disables access for the user account regardless of which authentication mechanism you use.

For security, after a certain number of consecutive failed logon attempts, the system
automatically disables the account and locks out the user. In this case the Account Disabled
option is checked. To reinstate the user account and allow the user to log on to the system,
clear this option by unchecking it.

5 Optionally enter an Email Address and select a Language for the user in the General
section of the page. The Language selection depends on the language pack(s) you have
installed.
6 In the Report Preferences section of the Users screen you specify the preferences for
how this user is to receive incident reports, including Text File Encoding and CSV
Delimiter.
If the role grants the privilege for XML Export, you can select to include incident violations
and incident history in the XML export.
7 In the Roles section, select the roles that are available to this user to assign data and
incident access privileges.
You must assign the user at least one role to access the Enforce Server administration
console.
See “Configuring roles” on page 114.

8 Select the Default Role to assign to this user at log on.


The default role is applied if no specific role is requested when the user logs on.
For example, the Enforce Server administration console uses the default role if the user
uses single sign-on with certificate authentication or uses the logon page.

Note: Individual users can change their default role by clicking Profile and selecting a
different option from the Default Role menu. The new default role is applied at the next
logon.

See “About authenticating users” on page 127.


9 Click Save to save the user configuration.

Note: Once you have saved a new user, you cannot edit the user name.

10 Manage users and roles as necessary.


See “Manage and add roles” on page 126.
See “Manage and add users” on page 126.
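
The naming rules from step 3 of this procedure (8 to 30 characters, no backslashes, case-sensitive) can be checked before you create accounts in bulk. The following Python sketch is a hypothetical pre-check, not part of Symantec Data Loss Prevention; the function name `account_name_problems` and the sample names are assumptions for illustration.

```python
def account_name_problems(name):
    """Return a list of rule violations for a proposed user account name."""
    problems = []
    if not 8 <= len(name) <= 30:
        problems.append("must be between 8 and 30 characters long")
    if "\\" in name:
        problems.append("cannot contain backslashes")
    return problems


# Names are case-sensitive, so these are two different accounts.
same_account = "JSmith01" == "jsmith01"   # False

ok = account_name_problems("jsmith01")           # no violations
too_short = account_name_problems("jsmith")      # length rule
with_domain = account_name_problems("CORP\\jsmith")  # backslash rule
```

An empty list means that the name satisfies the rules listed in step 3. The sketch does not check whether the name matches an Active Directory account.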

Configuring password enforcement settings


On the System > Settings > General screen you can require users to use strong passwords.
Strong passwords must contain at least eight characters, at least one number, and at least
one uppercase letter. Strong passwords cannot have more than two repeated characters in a
row. If you enable strong passwords, the effect is system-wide. Existing users without a strong
password must update their profiles at next logon.
You can also require users to change their passwords at regular intervals. In this case at the
end of the interval you specify, the system forces users to create a new password.
If you use Active Directory authentication, these password settings only apply to the
Administrator password. All other user account passwords are derived from Active Directory.
See “Integrating Active Directory for user authentication” on page 137.
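
The strong password rules described above can be expressed as a short check. The following Python sketch is illustrative and is not the validation code that the Enforce Server uses. The function name `strong_password_problems` is hypothetical, and the sketch assumes that "more than two repeated characters in a row" means a run of three or more identical characters.

```python
import re


def strong_password_problems(password):
    """Return the strong-password rules that the given password breaks."""
    problems = []
    if len(password) < 8:
        problems.append("fewer than eight characters")
    if not any(ch.isdigit() for ch in password):
        problems.append("no number")
    if not any(ch.isupper() for ch in password):
        problems.append("no uppercase letter")
    # Assumption: three identical characters in a row violate the rule.
    if re.search(r"(.)\1\1", password, re.DOTALL):
        problems.append("more than two repeated characters in a row")
    return problems


accepted = strong_password_problems("Passw0rdX")   # "ss" is allowed
repeated = strong_password_problems("Passs0rdX")   # "sss" is not
weak = strong_password_problems("password")
```

An empty list means that the password meets all four rules as interpreted here.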

To configure advanced authentication settings


1 Go to System > Settings > General and click Configure.
2 To require strong passwords, locate the DLP User Authentication section and select
Require Strong Passwords.
Symantec Data Loss Prevention prompts existing users who do not have strong passwords
to create one at next logon.
3 To set the period for which passwords remain valid, type a number (representing the
number of days) in the Password Rotation Period field.
To let passwords remain valid forever, type 0 (the character for zero).
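
The effect of the Password Rotation Period value can be sketched as a date calculation. The following Python example is an assumption-laden illustration: it assumes that the period is counted in days from the date the password was last changed, and the function name `password_expiry` is hypothetical. The only behavior taken from this guide is that 0 means the password never expires.

```python
from datetime import date, timedelta


def password_expiry(last_changed, rotation_days):
    """Return the date a password stops being valid, or None if it never does."""
    if rotation_days < 0:
        raise ValueError("rotation period cannot be negative")
    if rotation_days == 0:
        # 0 lets passwords remain valid forever.
        return None
    return last_changed + timedelta(days=rotation_days)


never = password_expiry(date(2019, 8, 19), 0)
ninety = password_expiry(date(2019, 8, 19), 90)
```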

Resetting the Administrator password


Symantec Data Loss Prevention provides the AdminPasswordReset utility to reset the
Administrator's password. There is no method to recover a lost password, but you can use
this utility to assign a new password. You can also use this utility if certificate authentication
mechanisms are disabled and you have not yet defined a password for the Administrator
account.
To use the AdminPasswordReset utility, you must specify the password to the Enforce Server
database. Use the following procedure to reset the password.
To reset the Administrator password for forms-based logon
1 Log on to the Enforce Server computer using the account that you created during Symantec
Data Loss Prevention installation.

Note: Do not change permissions or ownership on any configuration file from another
root or Administrator account.

2 Change directory to the /opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/bin
(Linux) or c:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\bin (Windows)
directory. If you installed Symantec Data Loss Prevention into a different directory,
substitute the correct path.
3 Execute the AdminPasswordReset utility using the following syntax:

AdminPasswordReset -dbpass oracle_password -newpass new_administrator_password

Replace oracle_password with the password to the Enforce Server database, and replace
new_administrator_password with the password you want to set.
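
If you script this step, the command line can be assembled as an argument list so that passwords containing spaces or shell metacharacters are passed intact. The following Python sketch only builds the arguments for the syntax shown in step 3; it does not run the utility. The function name `admin_password_reset_command` is hypothetical, the sample passwords are placeholders, and the sketch assumes a Linux installation in which the utility is invoked by the name shown above.

```python
from pathlib import PurePosixPath


def admin_password_reset_command(bin_dir, db_password, new_password):
    """Build the argument list for the documented AdminPasswordReset syntax."""
    utility = str(PurePosixPath(bin_dir) / "AdminPasswordReset")
    return [utility, "-dbpass", db_password, "-newpass", new_password]


command = admin_password_reset_command(
    "/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/bin",
    "oracle_password",              # placeholder for the database password
    "new_administrator_password",   # placeholder for the new password
)
```

On the Enforce Server computer, a list like `command` could be passed to `subprocess.run(command, check=True)` from the account described in step 1.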

Manage and add roles


The System > Login Management > Roles screen displays an alphabetical list of the roles
that are defined for your organization.
Roles listed on this screen display the following information:
■ Name – The name of the role
■ Description – A brief description of the role
Assuming that you have the appropriate privileges, you can view, add, modify, or delete roles
as follows:
■ Add a new role, or modify an existing one.
Click Add Role to begin adding a new role to the system.
Click anywhere in a row or the pencil icon (far right) to modify that role.
See “Configuring roles” on page 114.
■ Click the red X icon (far right) to delete the role; a dialog box confirms the deletion.
Before editing or deleting roles, note the following guidelines:
■ If you change the privileges for a role, users in that role who are currently logged on to the
system are not affected. For example, if you remove the Edit privilege for a role, users
currently logged on retain permission to edit custom attributes for that session. However,
the next time users log on, the changes to that role take effect, and those users can no
longer edit custom attributes.
■ If you revoke an incident-viewing privilege for a role, the Enforce Server automatically
deletes any saved reports that rely on the revoked privilege. For example, if you revoke
the privilege to view network incidents, the system deletes any saved network incident
reports associated with the newly restricted role.
■ Before you can delete a role, you must make sure there are no users associated with the
role.
■ When you delete a role, you delete all shared saved reports that a user in that role saved.
See “Manage and add users” on page 126.

Manage and add users


The System > Login Management > DLP Users screen lists all the active user accounts in
the system.
For each user account, the following information is listed:
■ User Name – The name the user enters to log on to the Enforce Server
■ Email – The email address of the user

■ Access – The role(s) in which the user is a member


Assuming that you have the appropriate privileges, you can add, edit, or delete user accounts
as follows:
■ Add a new user account, or modify an existing one.
Click Add to begin adding a new user to the system.
Click anywhere in a row or the pencil icon (far right) to view and edit that user account.
See “Configuring user accounts” on page 121.
■ Click the red X icon (far right) to delete the user account; a dialog box confirms the deletion.

Note: The Administrator account is created on install and cannot be removed from the
system.

Note: When you delete a user account, you also delete all private saved reports that are
associated with that user.

See “Manage and add roles” on page 126.

About authenticating users


Enforce Server administration console logon authentication options include SAML, forms-based,
Active Directory/Kerberos, and certificate.
Table 5-2 provides the descriptions of these mechanisms for authenticating users to the Enforce
Server administration console:

Table 5-2 Enforce Server authentication mechanisms

Authentication mechanism    Sign-on mechanism    Description

Authentication mechanism: SAML authentication
Sign-on mechanism: Single sign-on
Description: With SAML authentication, the Enforce Server administration console
authenticates each user by validating the supplied email, user name,
or other user attributes that map to attributes the identity provider uses.

When SAML is enabled, users access the Enforce Server administration console
URL and are redirected to the identity provider logon page, where they
enter their credentials. After they are authenticated with the identity
provider, their user attributes are sent to the Enforce Server. The
Enforce Server attempts to find a user with matching attributes. If the
user is found, they are logged on to the Enforce Server administration
console.

Configuration template file used:


springSecurityContext-SAML.xml

See “About SAML authentication” on page 131.

Authentication mechanism: Password authentication
Sign-on mechanism: Forms-based sign-on
Description: With password authentication, the Enforce Server administration console
authenticates each user. It determines if the supplied user name and
password combination matches an active user account in the Enforce
Server configuration. An active user account is authenticated if it has
been assigned a valid role.

Users enter their credentials into the Enforce Server administration
console's logon page and submit them over an HTTPS connection to
the Tomcat container that hosts the administration console.

With password authentication, you must configure the user name and
password of each user account directly in the Enforce Server
administration console. You must also ensure that each user account
has at least one assigned role.

Configuration template file used:


springSecurityContext-Form.xml

See “Manage and add users” on page 126.



Authentication mechanism: Active Directory authentication
Sign-on mechanism: Forms-based sign-on
Description: With Microsoft Active Directory authentication, the Enforce Server
administration console first evaluates a supplied user name to determine
if the name exists in a configured Active Directory server. If the user
name exists in Active Directory, the supplied password for the user is
evaluated against the Active Directory password. Any password that is
configured in the Enforce Server configuration is ignored.

With Active Directory authentication, you must configure a user account
for each new Active Directory user in the Enforce Server administration
console. When you upgrade to Symantec Data Loss Prevention 15,
your existing users do not have to be set up again.

You do not have to enter a password for an Active Directory user
account. You can switch to Active Directory authentication after you
have already created user accounts in the system. However, only those
existing user names that match Active Directory user names remain
valid after the switch.

Configuration template file used:


springSecurityContext-Kerberos.xml

See “Verifying the Active Directory connection” on page 140.



Authentication mechanism: Certificate authentication
Sign-on mechanism: Single sign-on from Public Key Infrastructure (PKI)
Description: Certificate authentication enables a user to automatically log on to the
Enforce Server administration console using an X.509 client certificate.
This certificate is generated by your public key infrastructure (PKI). To
use certificate-based single sign-on, you must first enable certificate
authentication as described in this section.

See “Configuring certificate authentication for the Enforce Server
administration console” on page 144.

The client certificate must be delivered to the Enforce Server when a
client's browser performs the SSL handshake with the Enforce Server
administration console. For example, you might use a smart card reader
and middleware with your browser to automatically present a certificate
to the Enforce Server. Or, you might obtain an X.509 certificate from a
certificate authority. Then you would upload the certificate to a browser
that is configured to send the certificate to the Enforce Server.

When a user accesses the Enforce Server administration console, the
PKI automatically delivers the user's certificate to the Tomcat container
that hosts the administration console. The Tomcat container validates
the client certificate using the certificate authorities that you have
configured in the Tomcat trust store.

Configuration template file used:


springSecurityContext-Certificate.xml

See “Adding certificate authority (CA) certificates to the Tomcat trust
store” on page 146.

The Enforce Server administration console uses the validated certificate
to determine whether the certificate has been revoked.

See “About certificate revocation checks” on page 150.

If the certificate is valid and has not been revoked, then the Enforce
Server uses the common name (CN) in the certificate to determine if
that CN is mapped to an active user account with a role in the Enforce
Server configuration. For each user that accesses the Enforce Server
administration console using certificate-based single sign-on, you must
create a user account in the Enforce Server that defines the
corresponding user's CN value. You must also assign one or more valid
roles to the user account.
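
The mapping from a certificate's common name (CN) to a user account can be pictured as a lookup that succeeds only for an active account with at least one role. The following Python sketch is a simplified model, not the Enforce Server implementation: the account dictionaries, the function names, and the exact-match comparison of CN values are all assumptions for illustration.

```python
from collections import Counter


def duplicate_common_names(accounts):
    """Return CN values that appear in more than one account (not allowed)."""
    counts = Counter(a["cn"] for a in accounts if a.get("cn"))
    return sorted(cn for cn, n in counts.items() if n > 1)


def account_for_cn(accounts, cn):
    """Return the active account with a role that is mapped to this CN, if any."""
    for account in accounts:
        if (
            account.get("cn") == cn
            and not account.get("disabled")
            and account.get("roles")
        ):
            return account
    return None


# Hypothetical accounts for illustration.
accounts = [
    {"name": "jsmith01", "cn": "John Smith 1001", "roles": ["Reporter"]},
    {"name": "mjones02", "cn": "Mary Jones 1002", "roles": []},
    {"name": "rlee0003", "cn": "Robin Lee 1003", "roles": ["Admin"],
     "disabled": True},
]
mapped = account_for_cn(accounts, "John Smith 1001")
no_role = account_for_cn(accounts, "Mary Jones 1002")
locked = account_for_cn(accounts, "Robin Lee 1003")
```

In this model only the first account can log on: the second has no role assigned and the third is disabled.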

Here are some important things to note when you set up SAML authentication.
■ You must restart the manager when you change the way you authenticate users in SAML.
Changing the mapping criteria in the springSecurityContext file for SAML without
restarting the manager results in users that are out of sync, as the system continues to use
the previous version of the file. For example, if you change the mapping criteria from user name
to email address, you must restart the manager.
■ You must remap each user when you change the way you map users in SAML. Changing
the mapping criteria invalidates the existing user mappings.
■ You must validate the XML syntax before you restart the manager. Some characters such
as "&" that can be part of a user attribute make the XML invalid. You need to replace these
characters with their XML escape string. For example, instead of "&" use "&amp;".
■ Do not delete any XML nodes in the XML files.
■ Attribute names in XML must exactly match (including case) attribute names in the identity
provider.
■ When switching from forms-based to SAML authentication, you must go through each user
and disable password access for non-Web Services users.
■ When switching from Certificate authentication to SAML authentication, make sure that the
clientAuth value in server.xml is set to false.

See “Configuring user authentication” on page 131.
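
The XML syntax check and the escaping of characters such as "&" can be done with the Python standard library before you restart the manager. The following sketch is a generic well-formedness check, not a Symantec tool; the `<value>` element and the sample attribute text are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


raw_value = "R&D Analysts"   # hypothetical attribute value containing "&"

# A bare "&" makes the document invalid.
broken = f"<value>{raw_value}</value>"

# escape() replaces "&", "<", and ">" with their XML escape strings.
fixed = f"<value>{escape(raw_value)}</value>"
```

To check an edited configuration file in the same way, read its contents and pass them to `is_well_formed` before you restart the manager.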

Configuring user authentication


About SAML authentication
SAML (Security Assertion Markup Language) user authentication is now available for logging
on to the Enforce Server administration console. SAML is an XML-based open standard data
format for exchanging authentication and authorization data between service providers and
identity providers. DLP is the service provider.
Before using SAML, you must set up the service provider and the identity provider, and map the
user attributes that are used to identify the user.
Three types of mapping are available: by email, by user name, and by custom user attributes.
When you use SAML, the ROLE\USERNAME logon for local users is not supported.
Symantec supports the following identity providers, both on-premises and cloud based:
■ SAM (Symantec Access Manager)
■ Okta
■ SSOCircle
See the Symantec Data Loss Prevention System Requirements Guide at
http://www.symantec.com/docs/doc10602 for updates on supported IdPs.

See “Setting up authentication” on page 132.

Setting up authentication
Table 5-3 shows a summary of the tasks for the setup with links to more information on each
step.

Table 5-3 Authentication configuration steps

Step    Task    More information

Step 1  Edit the Spring context file for the authentication method.
        See “Set up and configure the authentication method” on page 133.

Step 2  Set up the authentication configuration.
        For SAML: See “Set up the SAML authentication configuration” on page 135.
        For Active Directory/Kerberos: See “Configuring Active Directory authentication” on page 136.
        For Forms-based: See “Configuring forms-based authentication” on page 137.
        For Certificate: See “Configuring certificate authentication” on page 137.

Step 3  Restart the Enforce Server.
        See “About Symantec Data Loss Prevention services” on page 101.

Step 4  For SAML, generate and download the service provider SAML metadata. The Enforce Server
        administration console is the service provider.
        See “Generate or download Enforce (service providers) SAML metadata” on page 135.

Step 5  For SAML, configure Enforce as a SAML service provider with the identity provider.
        See “Configure the Enforce Server as a SAML service provider with the IdP (Create an
        application in your identity provider)” on page 136.

Step 6  For SAML, download the identity provider metadata.
        See “Export the IdP metadata to DLP” on page 136.

Step 7  Complete the process by restarting the Enforce Server.
        See “About Symantec Data Loss Prevention services” on page 101.

Step 8  Log on to the Enforce Server administration console using the Administrator Bypass URL.
        See “Administrator Bypass URL” on page 133.

Note: The Enforce Server administration console (the service provider in SAML) and the IdP
exchange messages using the settings in the configuration. Ensure that your settings match
with your IdP's configuration and capabilities. Unmatched settings break the system.
You must restart the Enforce Server twice: once after you set up the authentication configuration
in the springSecurityContext.xml file, and once after you download the IdP metadata file
and replace the contents of idp-metadata.xml in the Enforce install directory with the IdP
metadata.

See “Administrator Bypass URL” on page 133.

Administrator Bypass URL


The administrator bypass URL, https://<hostnameOrIp>/ProtectManager/admin/Logon,
enables you to bypass SAML authentication. You can log on to the Enforce Server
administration console and use forms-based authentication to set up users. You must enter
this URL in your browser; you cannot navigate to this URL through the Enforce Server
administration console user interface.

Note: Only one active logon is available with the Bypass URL.

See “Set up and configure the authentication method” on page 133.
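
Because the bypass URL must be typed by hand, it can help to generate it from the host name or IP address of the Enforce Server. The following Python sketch only fills in the URL pattern given above; the function name `admin_bypass_url` and the sample host name are hypothetical.

```python
def admin_bypass_url(host):
    """Return the administrator bypass URL for an Enforce Server host or IP."""
    host = host.strip()
    if not host or "/" in host:
        raise ValueError("expected a bare host name or IP address")
    return f"https://{host}/ProtectManager/admin/Logon"


url = admin_bypass_url("enforce.example.com")
```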

Set up and configure the authentication method


These steps present an overview of the common tasks for setting up and configuring all
authentication methods. Additional steps or changes for each method are explained in "Final
steps" following the initial template file configuration.

Note: The files that you must modify are commented with details to help you through the update
process.

To set up the authentication method


1 Delete (or rename) the springSecurityContext.xml file in the [your install
directory]/Protect/tomcat/webapps/ProtectManager/WEB-INF/ folder.

2 Go to the [your install
directory]/Protect/tomcat/webapps/ProtectManager/security/template folder
and select the appropriate configuration template file for your authentication method:
■ springSecurityContext-SAML.xml for SAML authentication configurations
■ springSecurityContext-Form.xml for forms and client certificate-based authentication
configurations
■ springSecurityContext-Certificate.xml for client certificate-based authentication
only
■ springSecurityContext-Kerberos.xml for Active Directory/Kerberos authentication
configurations

3 Copy the file you selected into the
[your install directory]/Protect/tomcat/webapps/ProtectManager/WEB-INF/ folder.

4 Rename the file to springSecurityContext.xml.


5 Configure the springSecurityContext.xml file:
6 Final steps:
■ SAML: For instructions on how to set up the SAML authentication configuration, see
Set up the SAML authentication configuration.
■ Forms Based: If the template file that you copied is for forms-based authentication,
there are no additional settings to configure. The DLP User Authentication section
of the General Settings now indicates that your user authentication method is Forms
Based.
■ Client certificate: To enable client certificate authentication, set clientAuth to want
or true in <InstallDirectory>/Protect/tomcat/conf/server.xml. The DLP
User Authentication section of the General Settings now indicates that your user
authentication method is Certificate.
■ Active Directory: To enable Active Directory authentication, replace the value for
krbConfLocation in
[your install directory]/Protect/tomcat/webapps/ProtectManager/WEB-INF/springSecurityContext.xml
with the path to your krb5.ini file.
The DLP User Authentication section of the General Settings now indicates that
your user authentication method is Active Directory. You can configure the list of
domains in this DLP User Authentication section of the General Settings page.

Note: You can no longer perform the initial setup of Active Directory through the Enforce
Server administration console.

See “Configuring the Enforce Server for Active Directory authentication” on page 141.
See “Set up the SAML authentication configuration” on page 135.
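
The file handling in steps 1 through 4 can be sketched as a short script. This is a minimal sketch, not a Symantec-supplied tool: the function name, the method labels (saml, form, certificate, kerberos), and the .bak suffix are hypothetical, while the folder layout and template file names are taken from the steps above.

```python
import shutil
from pathlib import Path

# Template names as listed in step 2; the dictionary keys are hypothetical labels.
TEMPLATES = {
    "saml": "SpringSecurityContext-SAML.xml",
    "form": "SpringSecurityContext-Form.xml",
    "certificate": "SpringSecurityContext-Certificate.xml",
    "kerberos": "springSecurityContext-Kerberos.xml",
}


def install_template(install_dir, method):
    """Copy the template for `method` into WEB-INF as springSecurityContext.xml."""
    manager = Path(install_dir) / "Protect" / "tomcat" / "webapps" / "ProtectManager"
    source = manager / "security" / "template" / TEMPLATES[method]
    target = manager / "WEB-INF" / "springSecurityContext.xml"
    if target.exists():
        # Step 1: rename rather than delete, so the old file can be restored.
        # replace() overwrites any earlier .bak copy.
        target.replace(target.with_name(target.name + ".bak"))
    # Steps 3 and 4: copy the template and give it the expected name.
    shutil.copyfile(source, target)
    return target
```

For example, install_template("/opt/Symantec/DataLossPrevention/EnforceServer/15.5", "saml") would stage the SAML template. The method-specific edits in step 6 and the Enforce Server restart still have to be done separately.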

Set up the SAML authentication configuration


Get the information about your IdP, such as its choice of authentication methods, available
user identifiers, available user attributes, and the required service provider metadata.
Open the springSecurityContext.xml file in
[your install directory]/Protect/tomcat/webapps/ProtectManager/WEB-INF/
and set the entityBaseURL property to your Enforce URL: https://<host name or
IP>/ProtectManager.

Note: Unless you only want to access the Enforce Server administration console from the host
machine, don't use localhost as the host name.
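
As a small illustration, the two URLs that depend on the host name can be derived in one place. This is a sketch only: the function name and the example host are hypothetical, and the URL paths are the ones given in this guide.

```python
def enforce_urls(host):
    """Return the entityBaseURL value and the administrator bypass URL for `host`."""
    if host.lower() == "localhost":
        # Per the note above, localhost only works from the Enforce Server host itself.
        print("warning: the console will be reachable only from the host machine")
    base = f"https://{host}/ProtectManager"
    return {"entityBaseURL": base, "bypass": f"{base}/admin/Logon"}


# Hypothetical host name, used only for illustration.
urls = enforce_urls("enforce.example.com")
print(urls["entityBaseURL"])
print(urls["bypass"])
```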

Set the nameID property in the Spring file to a name identifier such as emailAddress,
WindowsDomainQualifiedName, or another nameID that your IdP supports. Here's an example
for email address:
<property name="nameID"
value="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress" />

You may want to use a combination of user attributes returned from the IdP to identify a Data
Loss Prevention user. In this case you can set the userAttributes property. For example:

<bean id="userLookupService" class="com.vontu.login.spring.VontuSAMLUserDetailsService">
    <property name="userAttributes">
        <set>
            <value>UserName</value>
            <value>EmailAddress</value>
            <value>EmployeeID</value>
        </set>
    </property>
</bean>
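
Because a missing quotation mark or an unclosed tag in the Spring file can prevent the configuration from loading, it can help to parse the file before you restart the Enforce Server. The following is a minimal sketch that uses only the Python standard library. The function name is hypothetical, and it assumes that properties are written as property elements with name and value attributes, as in the nameID example.

```python
import xml.etree.ElementTree as ET


def property_values(xml_text, name):
    """Return the value attribute of every <property name="..."> element."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    found = []
    for element in root.iter():
        if not isinstance(element.tag, str):
            continue
        # Compare the local tag name so that a namespaced file also matches.
        local = element.tag.rsplit("}", 1)[-1]
        if local == "property" and element.get("name") == name:
            found.append(element.get("value"))
    return found
```

Calling property_values(text, "nameID") on the file contents returns the configured name identifier, or raises an error that points to the line and column of the first syntax problem.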

Generate or download Enforce (service providers) SAML metadata


To download the Enforce SAML metadata
1 Restart the Enforce Server.
2 Log on as Administrator using the bypass URL. You access the bypass URL directly; you
don't need to log on to the Enforce Server administration console to access this URL.

3 Go to System > Settings > General and navigate to the DLP User Authentication
section.
4 Click the link to the right of "The SAML config file for your IdP is at" to download the
metadata.
See “Configure the Enforce Server as a SAML service provider with the IdP (Create an
application in your identity provider)” on page 136.

Configure the Enforce Server as a SAML service provider with the IdP (Create an application in your identity provider)

These steps vary depending on the IdP that you use. Here is a broad overview of the steps if
you use Symantec VIP Access Manager as your IdP:
To configure the Enforce Server as a SAML service provider with the IdP (create an application)
1 Log on to the VIP Access Manager administration console as administrator.
2 Click generic template.
3 Name the connector.
4 Select the access policy as SSO (single sign-on).
5 Configure your portal by selecting an icon for your site (this icon appears on the identity
provider's dashboard).
6 Upload the Enforce Server metadata.
See “Export the IdP metadata to DLP” on page 136.

Export the IdP metadata to DLP


Download the IdP metadata and replace the contents of the idp-metadata.xml file at
<installdirectory>/Protect/tomcat/webapps/ProtectManager/security/idp-metadata.xml
with the IdP metadata that you downloaded.
See “Configuring Active Directory authentication” on page 136.

Configuring Active Directory authentication


If the template file that you copied is for Active Directory/Kerberos authentication, open the
<InstallDirectory>/Protect/tomcat/webapps/ProtectManager/WEB-INF/springSecurityContext.xml
file in a text editor. This is the springSecurityContext-Kerberos.xml file that you previously
renamed to springSecurityContext.xml. Set the krbConfLocation value to your Kerberos
authentication file. For example (line breaks added for legibility):

<!-- Set krbConfLocation in System properties -->
<bean class="org.springframework.security.kerberos.authentication.sun.
GlobalSunJaasKerberosConfig">
    <!-- krb5 configuration file location.
    For example:
    C:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\
    Protect\config\krb5.ini on Windows
    or
    /opt/Symantec/DataLossPrevention/EnforceServer/15.5/
    Protect/config/krb5.conf on Linux
    -->
    <property name="krbConfLocation" value="C:\Program Files\Symantec\
    DataLossPrevention\EnforceServer\15.5\Protect
    \config\krb5.ini"/>
</bean>

See “Set up and configure the authentication method” on page 133.


See “Configuring forms-based authentication” on page 137.
See “Integrating Active Directory for user authentication” on page 137.

Configuring forms-based authentication


After you copy the template file for forms-based authentication, there are no additional settings
to configure.
See “Configuring certificate authentication” on page 137.

Configuring certificate authentication


After you copy the template file for client certificate-based authentication, go to the
<InstallDirectory>/Protect/tomcat/conf/server.xml file and set the clientAuth value to
want or true.

See “Generate or download Enforce (service providers) SAML metadata” on page 135.

Integrating Active Directory for user authentication


You can configure the Enforce Server to use Microsoft Active Directory for user authentication.
After you switch to Active Directory authentication, you must still define users in the Enforce
Server administration console. If the user names you enter in the Administration Console match
Active Directory users, the system associates any new user accounts with Active Directory
passwords. You can switch to Active Directory authentication after you have already created
user accounts in the system. Only those existing user names that match Active Directory user
names remain valid after the switch.
Users must use their Active Directory passwords when they log on. Note that all Symantec
Data Loss Prevention user names remain case sensitive, even though Active Directory user
names are not. Users therefore still have to enter the case-sensitive Symantec Data Loss
Prevention user name when they log on.
To use Active Directory authentication
1 Verify that the Enforce Server host is time-synchronized with the Active Directory server.

Note: Ensure that the clock on the Active Directory host is synchronized to within five
minutes of the clock on the Enforce Server host.

2 (Linux only) Make sure that the following Red Hat RPMs are installed on the Enforce
Server host:
■ krb5-workstation

■ krb5-libs

■ pam_krb5

3 Create the krb5.ini (or krb5.conf for Linux) configuration file that gives the Enforce
Server information about your Active Directory domain structure and Active Directory
server addresses.
See “Creating the configuration file for Active Directory integration” on page 138.
4 Confirm that the Enforce Server can communicate with the Active Directory server.
See “Verifying the Active Directory connection” on page 140.
5 Configure Symantec Data Loss Prevention to use Active Directory authentication.
See “Configuring the Enforce Server for Active Directory authentication” on page 141.

Creating the configuration file for Active Directory integration


You must create a krb5.ini configuration file (or krb5.conf on Linux) to give Symantec Data
Loss Prevention information about your Active Directory domain structure and server locations.
This step is required if you have more than one Active Directory domain. However, even if
your Active Directory structure includes only one domain, it is still recommended to create this
file. The kinit utility uses this file to confirm that Symantec Data Loss Prevention can
communicate with the Active Directory server.

Note: If you are running Symantec Data Loss Prevention on Linux, verify the Active Directory
connection using the kinit utility. The kinit utility requires the file to be named krb5.conf
on Linux, so you must rename the krb5.ini file as krb5.conf. The procedures in this section
assume that you use kinit to verify the Active Directory connection, and direct you to rename
the file as krb5.conf.

Symantec Data Loss Prevention provides a sample krb5.ini file that you can modify for use
with your own system. The sample file is stored in \15.5\Protect\config (for example,
\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config
on Windows or /opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config
on Linux). If you are running Symantec Data Loss Prevention on Linux, Symantec recommends
renaming the file to krb5.conf. The sample file, which is divided into two sections, looks like
this:

[libdefaults]
default_realm = TEST.LAB
[realms]
ENG.COMPANY.COM = {
kdc = engAD.eng.company.com
}
MARK.COMPANY.COM = {
kdc = markAD.eng.company.com
}
QA.COMPANY.COM = {
kdc = qaAD.eng.company.com
}

The [libdefaults] section identifies the default domain. (Note that Kerberos realms
correspond to Active Directory domains.) The [realms] section defines an Active Directory
server for each domain. In the previous example, the Active Directory server for
ENG.COMPANY.COM is engAD.eng.company.com.
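
To illustrate the file layout, the following sketch reads a file in the format shown above and lists the default realm, each realm's kdc entries, and any realm name that is not in all capital letters. It is a minimal sketch, not a full Kerberos configuration parser: the function names are hypothetical, and it handles only the [libdefaults] and [realms] constructs that appear in the sample.

```python
import re


def parse_krb5(text):
    """Return (default_realm, {realm: [kdc, ...]}) from krb5.ini-style text."""
    default_realm = None
    realms = {}
    section = None
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        header = re.fullmatch(r"\[(\w+)\]", line)
        if header:
            section, current = header.group(1), None
            continue
        if section == "libdefaults":
            match = re.fullmatch(r"default_realm\s*=\s*(\S+)", line)
            if match:
                default_realm = match.group(1)
        elif section == "realms":
            opening = re.fullmatch(r"(\S+)\s*=\s*\{", line)
            kdc = re.fullmatch(r"kdc\s*=\s*(\S+)", line)
            if opening:
                current = opening.group(1)
                realms[current] = []
            elif line == "}":
                current = None
            elif kdc and current is not None:
                realms[current].append(kdc.group(1))
    return default_realm, realms


def lowercase_realms(text):
    """List realm names that are not written in all capital letters."""
    default_realm, realms = parse_krb5(text)
    names = list(realms)
    if default_realm is not None:
        names.insert(0, default_realm)
    return [name for name in names if name != name.upper()]
```

Running lowercase_realms on the edited file is a quick way to catch a domain name that was typed in lowercase before you test the connection.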

To create the krb5.ini or krb5.conf file


1 Go to SymantecDLP\Protect\config and locate the sample krb5.ini file. For example,
locate the file in \Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config (on
Windows) or
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config (on Linux).

2 Copy the sample krb5.ini file to the c:\windows directory (on Windows) or the /etc
directory (on Linux). If you are running Symantec Data Loss Prevention on Linux, rename
the file as krb5.conf, because the kinit command-line tool that you use to verify the Active
Directory connection requires that name.
See “Verifying the Active Directory connection” on page 140.
3 Open the krb5.ini or krb5.conf file in a text editor.
4 Replace the sample default_realm value with the fully qualified name of your default
domain. (The value for default_realm must be all capital letters.) For example, modify
the value to look like the following:

default_realm = MYDOMAIN.LAB

5 Replace the other sample domain names with the names of your actual domains. (Domain
names must be all capital letters.) For example, replace ENG.COMPANY.COM with
ADOMAIN.COMPANY.COM.

6 Replace the sample kdc values with the host names or IP addresses of your Active
Directory servers. (Be sure to follow the specified format, in which opening brackets are
followed immediately by line breaks.) For example, replace engAD.eng.company.com with
ADserver.eng.company.com, and so on.

7 Remove any unused kdc entries from the configuration file. For example, if you have only
two domains besides the default domain, delete the unused kdc entry.
8 Save the file.

Verifying the Active Directory connection


kinit is a command-line tool you can use to confirm that the Active Directory server responds
to requests. It also verifies that the Enforce Server has access to the Active Directory server.
For Microsoft Windows installations, the utility is installed by the Symantec Data Loss Prevention
installer in the C:\Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\jre\bin directory.
For Linux installations, the utility is part of the Red Hat Enterprise Linux distribution, and is in
the following location: /usr/kerberos/bin/kinit. You can also download Java SE 6 and
locate the kinit tool in \java_home\jdk1.6.0\bin.

If you run the Enforce Server on Linux, use the kinit utility to test access from the Enforce
Server to the Active Directory server. Rename the krb5.ini file as krb5.conf. The kinit
utility requires the file to be named krb5.conf on Linux.
To test the connection to the Active Directory server
1 On the Enforce Server host, go to the command line and navigate to the directory where
kinit is located.

2 Issue a kinit command using a known user name and password as parameters. (Note
that the password is visible in clear text when you type it on the command line.) For
example, issue the following:

kinit kchatterjee mypwd10#

The first time you contact Active Directory you may receive an error that it cannot find the
krb5.ini or krb5.conf file in the expected location. On Windows, the error looks similar
to the following:

krb_error 0 Could not load configuration file c:\winnt\krb5.ini
(The system cannot find the file specified) No error.

In this case, copy the krb5.ini or krb5.conf file to the expected location and then rerun
the kinit command shown previously.
3 Depending on how the Active Directory server responds to the command, take one of the
following actions:
■ If the Active Directory server indicates it has successfully created a Kerberos ticket,
continue configuring Symantec Data Loss Prevention.
■ If you receive an error message, consult with your Active Directory administrator.

Configuring the Enforce Server for Active Directory authentication


Perform the procedure in this section when you first set up Active Directory authentication,
and any time you want to modify existing Active Directory settings. Make sure that you have
completed the prerequisite steps before you enable Active Directory authentication.
See “Integrating Active Directory for user authentication” on page 137.
To configure the Enforce Server to use Active Directory for authentication:
1 Make sure all users other than the Administrator are logged out of the system.
2 In the Enforce Server administration console, go to System > Settings > General and
click Configure (at top left).

3 At the Edit General Settings screen that appears, locate the Active Directory
Authentication section near the bottom and select (check) Perform Active Directory
Authentication.
The system then displays several fields to fill out.
4 See “Creating the configuration file for Active Directory integration” on page 138.
5 If your environment has more than one Active Directory domain, click Configure and
enter the domain names (separated by commas) in the Active Directory Domain List
field.
The system displays Active Directory domains in a drop-down list on the user logon page.
Users then select the appropriate domain at logon. Do not list the default domain, as it
already appears in the drop-down list by default.
6 Click Save.
7 Go to the operating system services tool and restart the Symantec Data Loss Prevention
Manager service.

About certificate authentication configuration


Certificate authentication enables a user to automatically log on to the Enforce Server
administration console. The user logs on using a client certificate that your public key
infrastructure (PKI) generates. When a user accesses the Enforce Server administration
console, the PKI automatically delivers the user's certificate to the Tomcat container that hosts
the administration console. The Tomcat container validates the client certificate using the
certificate authorities that you have configured in the Tomcat trust store.
The client certificate is delivered to the Enforce Server computer when a client's browser
performs the SSL handshake with the Enforce Server. For example, some browsers might be
configured to operate with a smart card reader to present the certificate. Alternatively, you can
upload the X.509 certificate to a browser and configure the browser to send the certificate to
the Enforce Server.
If the certificate is valid, the Enforce Server administration console may also determine if the
certificate was revoked.
See “About certificate revocation checks” on page 150.
If the certificate is valid, then the Enforce Server uses the common name (CN) in the certificate
to determine if that CN is mapped to an active user account with a role.

Note: Some browsers cache a user's client certificate, and automatically log the user on to the
Administration Console after the user has chosen to sign out. In this case, users must close
the browser window to complete the log out process.

The following table describes the steps necessary to use certificate authentication with
Symantec Data Loss Prevention.

Table 5-4 Steps to configure certificate authentication

Phase  Action                                      Description

1      Enable certificate authentication on the    You can configure an existing Enforce Server
       Enforce Server computer.                    to enable authentication. Enforce Servers have
                                                   form-based authentication by default.
                                                   See “Configuring certificate authentication for
                                                   the Enforce Server administration console”
                                                   on page 144.

2      Add certificate authority (CA) certificates You can add CA certificates to the Tomcat trust
       to establish the trust chain.               store with the Java keytool utility to manually
                                                   add certificates to an existing Enforce Server.
                                                   See “Adding certificate authority (CA) certificates
                                                   to the Tomcat trust store” on page 146.

3      (Optional) Change the Tomcat trust store    The Symantec Data Loss Prevention installer
       password.                                   configures each new Enforce Server installation
                                                   with a default Tomcat trust store password.
                                                   Follow these instructions to configure a secure
                                                   password.
                                                   See “Changing the Tomcat trust store password”
                                                   on page 147.

4      Map certificate common name (CN) values to  See “Mapping Common Name (CN) values to
       Enforce Server user accounts.               Symantec Data Loss Prevention user accounts”
                                                   on page 149.

5      Configure the Enforce Server to check for   See “About certificate revocation checks”
       certificate revocation.                     on page 150.

6      Verify Enforce Server access using          See “Troubleshooting certificate authentication”
       certificate-based single sign-on.           on page 153.

7      (Optional) Disable forms-based logon.       If you want to use certificate-based single
                                                   sign-on for all access to the Enforce Server,
                                                   disable forms-based logon.
                                                   See “Disabling password authentication and
                                                   forms-based logon” on page 154.

Configuring certificate authentication for the Enforce Server administration console

Form-based authentication is available by default on the Enforce Server. You must add
certificate authentication manually. Follow this procedure to manually enable form and certificate
authentication on a Symantec Data Loss Prevention installation.
To enable form and certificate authentication for users of the Enforce Server administration
console
1 Log on to the Enforce Server computer using the account that you created during Symantec
Data Loss Prevention installation.

Note: Do not change permissions or ownership on any configuration file from another
root or Administrator account.

2 Copy the corresponding springSecurityContext.xml file into the Tomcat WEB-INF
directory.
3 Edit C:\Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\tomcat\conf\server.xml
(Windows) or
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/tomcat/conf/server.xml
(Linux) and change the certificateVerification value from none to optional. Change
the revocationEnabled value from true to false. Save the file.
4 Restart the Enforce Server. This change to the server.xml file that you edited in the
previous step enables the Use Certificate authentication check box in the Enforce Server
administration console user interface.
5 Log on to the Enforce Server administration console and go to System > Login
Management > DLP Users.
6 Check Use Certificate authentication and indicate the corresponding CN mapping.
7 Add the CA certificates to the Tomcat trust store using the Java keytool utility.
See “Adding certificate authority (CA) certificates to the Tomcat trust store” on page 146.
Ensure that you have installed all necessary certificates and that users can log on with
certificate authentication.
Now the end user has both form-based authentication and certificate authentication.
See “About certificate revocation checks” on page 150.
Follow this procedure to enable certificate authentication on Symantec Data Loss Prevention.

To enable certificate authentication for users of the Enforce Server administration console
1 Log on to the Enforce Server computer using the account that you created during Symantec
Data Loss Prevention installation.

Note: Do not change permissions or ownership on any configuration file from another
root or Administrator account.

2 Copy the corresponding springSecurityContext.xml file into the Tomcat WEB-INF
directory.
3 Edit C:\Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\tomcat\conf\server.xml
(Windows) or
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/tomcat/conf/server.xml
(Linux) and change the certificateVerification value from none to optional. Save
the file.
4 Restart the Enforce Server. This change to the server.xml file that you edited in the
previous step enables the Use Certificate authentication check box in the Enforce Server
administration console user interface.
5 Log on to the Enforce Server administration console and go to System > Login
Management > DLP Users.
6 Check Use Certificate authentication and indicate the corresponding Common Name
(CN) mapping.
7 Add the CA certificates to the Tomcat trust store using the Java keytool utility.
See “Adding certificate authority (CA) certificates to the Tomcat trust store” on page 146.
Ensure that you have installed all necessary certificates and that users can log on with
certificate authentication.

8 For certificate authentication only, copy the springSecurityContext-Certificate.xml
file from C:\Program Files\Symantec\DataLossPrevention\EnforceServer\
15.5\Protect\tomcat\webapps\ProtectManager\security\template (Windows) or
/opt/Symantec/DataLossPrevention/EnforceServer/
15.5/Protect/tomcat/webapps/ProtectManager/security/template (Linux) into the
ProtectManager WEB-INF directory and rename it to springSecurityContext.xml.

9 Edit the C:\Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\tomcat\conf\server.xml
(Windows) or
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/tomcat/conf/server.xml
(Linux) file and change the certificateVerification value from optional to required.
Restart the Enforce Server.
Now the user has certificate authentication only.

See “Adding certificate authority (CA) certificates to the Tomcat trust store” on page 146.

Adding certificate authority (CA) certificates to the Tomcat trust store


To use certificate authentication with Symantec Data Loss Prevention, you must add all of the
CA certificates that are required to authenticate users in your system to the Tomcat trust store.
For Symantec Data Loss Prevention 15.0 and later, CA certificates can only be imported to
the Enforce Server using the Java keytool utility. Each X.509 certificate must be provided in
Distinguished Encoding Rules (DER) format in a .cer file. If multiple CAs are required to
establish the certificate chain, then you must add multiple .cer files.
To add CA certificates to the Tomcat trust store
1 Log on to the Enforce Server computer using the account that you created during Symantec
Data Loss Prevention installation.

Note: Do not change permissions or ownership on any configuration file from another
root or Administrator account.

2 Change directory to the
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/tomcat/conf
(Linux) or c:\Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\tomcat\conf
(Windows) directory. If you installed Symantec Data Loss Prevention to a different directory,
substitute the correct path.
3 Copy all certificate files (.cer files) that you want to import to the conf directory on the
Enforce Server computer.

4 Use the keytool utility that is installed with Symantec Data Loss Prevention to add a
certificate to the Tomcat trust store. For Windows systems, enter:

"c:\Program Files\Symantec\DataLossPrevention\EnforceServer\jre\bin\keytool" ^
  -import ^
  -trustcacerts ^
  -alias CA_CERT_1 ^
  -file certificate_1.cer ^
  -keystore .\truststore.jks

For Linux systems, enter:

/opt/Symantec/DataLossPrevention/jre/bin/keytool \
  -import \
  -trustcacerts \
  -alias CA_CERT_1 \
  -file certificate_1.cer \
  -keystore ./truststore.jks

In these commands, replace CA_CERT_1 with a unique alias for the certificate that you
import. Replace certificate_1.cer with the name of the certificate file you copied to the
Enforce Server computer.
5 Enter the password to the keystore at the keytool utility prompt. The default keystore
password is protect.
6 Repeat these steps to install all the certificate files that are necessary to complete the
certificate chain.
7 Stop and then restart the Symantec DLP Manager service to apply your changes.
8 If you have not yet changed the default Tomcat keystore password, do so now.
See “Changing the Tomcat trust store password” on page 147.
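
When a certificate chain needs several .cer files, step 6 repeats the same keytool command with a different alias and file each time. The following sketch builds one command per certificate file so that the aliases stay unique. The function name and the CA_CERT_n alias pattern are illustrative assumptions, and the keytool options are the ones shown in step 4. The commands are only built, not executed.

```python
from pathlib import Path


def keytool_import_commands(keytool, cert_dir, keystore="truststore.jks"):
    """Build one keytool argument list per .cer file found in cert_dir."""
    cert_dir = Path(cert_dir)
    commands = []
    for index, cert in enumerate(sorted(cert_dir.glob("*.cer")), start=1):
        commands.append([
            str(keytool), "-import", "-trustcacerts",
            "-alias", f"CA_CERT_{index}",  # unique alias for each certificate
            "-file", str(cert),
            "-keystore", str(cert_dir / keystore),
        ])
    return commands
```

Each returned list can be passed to subprocess.run from the conf directory. The keytool utility still prompts for the keystore password, as described in step 5.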

Changing the Tomcat trust store password


When you install Symantec Data Loss Prevention, the Tomcat trust store uses protect as
the default password. Follow this procedure to assign a secure password to the Tomcat trust
store when you use certificate authentication.

To change the Tomcat trust store password


1 Log on to the Enforce Server computer using the account that you created during Symantec
Data Loss Prevention installation.

Note: Do not change permissions or ownership on any configuration file from another
root or Administrator account.

2 Change directory to the
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/jre/bin/ (Linux) or
c:\Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\
(Windows) directory. If you installed Symantec Data Loss Prevention to a different directory,
substitute the correct path.
3 Use the keytool utility that is installed with Symantec Data Loss Prevention to change
the Tomcat truststore password. For Windows systems, enter:

"c:\Program Files\Symantec\DataLossPrevention\ServerJRE\1.8.0_162\bin\keytool" ^
  -storepasswd -new new_password -keystore ./truststore.jks

For Linux systems, enter:

/opt/Symantec/DataLossPrevention/EnforceServer/15.5/jre/bin/keytool -storepasswd \
  -new new_password -keystore ./truststore.jks

Replace new_password with a secure password.


4 Enter the current password to the keystore when the keytool utility prompts you to do
so. The default password is protect.
5 Change directory to the
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/tomcat/conf
(Linux) or c:\Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\tomcat\conf
(Windows) directory. If you installed Symantec Data Loss Prevention into a different
directory, substitute the correct path.
6 Open the server.xml file with a text editor.

7 In the following line in the file, edit the truststorePass="protect" entry to specify your
new password:

<Connector URIEncoding="UTF-8" acceptCount="100" clientAuth="want"
    debug="0" disableUploadTimeout="true" enableLookups="false"
    keystoreFile="conf/.keystore" keystorePass="protect"
    maxSpareThreads="75" maxThreads="150" minSpareThreads="25"
    port="443" scheme="https" secure="true" sslProtocol="TLS"
    truststoreFile="conf/truststore.jks" truststorePass="protect"/>

Replace protect with the new password that you defined in the keytool command.
8 Save your changes and exit the text editor.
9 Change directory to the
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config (Linux)
or c:\Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config (Windows)
directory. If you installed Symantec Data Loss Prevention into a different directory,
substitute the correct path.
10 Open the Manager.properties file with a text editor.
Add the following line in the file to specify the new password:

com.vontu.manager.tomcat.truststore.password = password

Replace password with the new password. Do not enclose the password in quotation
marks.
11 Save your changes and exit the text editor.
12 Open the Protect.properties file with a text editor.
13 Edit (or if not present, add) the following line in the file to specify the new password:
com.vontu.manager.tomcat.truststore.password = password

Replace password with the new password. Do not enclose the password in quotation
marks.
14 Save your changes and exit the text editor.
15 Stop and then restart the Symantec DLP Manager service to apply your changes.
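
Steps 10 through 13 make the same edit in two properties files: change the line if it is present, or add it if it is not. The following is a minimal sketch of that edit-or-add logic. The function name is hypothetical, and the sketch assumes that entries use the key = value form shown above.

```python
def set_property(text, key, value):
    """Return properties text with `key` set to `value`, editing or appending the line."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # leave commented-out entries untouched
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = f"{key} = {value}"  # no quotation marks around the value
            break
    else:
        # The key was not found, so add it at the end of the file.
        lines.append(f"{key} = {value}")
    return "\n".join(lines) + "\n"
```

For example, applying set_property(text, "com.vontu.manager.tomcat.truststore.password", new_password) to the contents of Manager.properties and of Protect.properties keeps both files consistent with the password that you set with keytool.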

Mapping Common Name (CN) values to Symantec Data Loss Prevention user accounts

Each user that accesses the Enforce Server administration console using certificate-based
single sign-on must have an active user account in the Enforce Server configuration. The user
account associates the common name (CN) value from the user's client certificate to one or
more roles in the Enforce Server administration console. You can map a CN value to only one
Enforce Server user account.
The user account that you create does not require a separate Enforce Server administration
console password. You can optionally configure a password if you want to allow the user to
also log on from the Enforce Server administration console log-on page. If you enable password
authentication and the user does not provide a certificate when the browser asks for one, then
the Enforce Server displays the log-on page. A log-on failure is displayed if password
authentication is disabled and the user does not provide a certificate.
An active user account must identify a user's CN value and have a valid role assigned in the
Enforce Server to log on using single sign-on with certificate authentication. You can disable
or delete the associated Enforce Server user account to prevent a user from accessing the
Enforce Server administration console without revoking their client certificate.
See “Configuring user accounts” on page 121.

About certificate revocation checks


While managing your public key infrastructure, you may need to revoke a client's certificate
with the CA. For example, you might revoke a certificate if an employee leaves the company,
or if an employee's credentials are lost or stolen. When you revoke a certificate, the CA uses
one or more Certificate Revocation Lists (CRLs) to publish those certificates that are no longer
valid.

Note: Certificate revocation checking is disabled by default. You must enable it and configure
it. See “Configuring certificate revocation checks” on page 151.

Symantec Data Loss Prevention retrieves revocation lists from a Certificate Revocation List
Distribution Point (CRLDP). To check revocation using a CRLDP, the client certificate must
include a CRL distribution point field. The following shows an example CRLDP field definition:

[1]CRL Distribution Point
     Distribution Point Name:
          Full Name: URL=http://my_crldp

Note: Symantec Data Loss Prevention does not support specifying the CRLDP using an LDAP
URL.
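
The LDAP restriction can be checked before client certificates are issued. The following is a minimal sketch in Python; the function name is hypothetical, and it inspects only the scheme of a distribution point URL that you have already read from a certificate.

```python
from urllib.parse import urlparse


def is_ldap_crldp(url: str) -> bool:
    """Return True if the CRL distribution point URL uses an LDAP scheme."""
    scheme = urlparse(url.strip()).scheme.lower()
    return scheme in ("ldap", "ldaps")
```

A URL such as the http://my_crldp example is not flagged, while a distribution point that begins with ldap:// is.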

If the CRL distribution point is defined in each certificate and the Enforce Server can directly
access the server, then no additional configuration is required to perform revocation checks.
If the CRL distribution point is accessible only by a proxy server, then you must configure the
proxy server settings in the Symantec Data Loss Prevention configuration.
See “Accessing the CRLDP with a proxy” on page 152.


Regardless of which revocation checking method you use, you must enable certificate revocation
checks on the Enforce Server computer. Certificate revocation checks are enabled by default
if you select certificate installation during the Enforce Server installation. If you upgraded an
existing Symantec Data Loss Prevention installation, certificate revocation is not enabled by
default.
See “Configuring certificate revocation checks” on page 151.

Configuring certificate revocation checks


When you enable certificate revocation checks, Symantec Data Loss Prevention uses a CRLDP
to determine the revocation status.
Follow this procedure to enable certificate revocation checks.
To configure certificate revocation checks
1 Ensure that the CRLDP is defined in the CRL distribution point field of each client certificate.
2 Log on to the Enforce Server computer using the account that you created during Symantec
Data Loss Prevention installation.

Note: Do not change permissions or ownership on any configuration file from another
root or Administrator account.

3 Open the server.xml file with a text editor and change the revocationEnabled value from
false to true. The file is located at
c:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\tomcat\conf\server.xml
(Windows) or
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/tomcat/conf/server.xml
(Linux).
4 To enable revocation checking using a CRLDP, add or uncomment the following line in
the SymantecDLPManager.conf file:

wrapper.java.additional.22=-Dcom.sun.security.enableCRLDP=true

This option is enabled by default for new Symantec Data Loss Prevention installations.
5 If you use CRLDP revocation checks, optionally configure the cache lifetime using the
property:

wrapper.java.additional.22=-Dsun.security.certpath.ldap.cache.lifetime=30

This parameter specifies the length of time, in seconds, to cache the revocation lists that
are obtained from a CRL distribution point. After this time is reached, a lookup is performed
to refresh the cache the next time there is an authentication request. The default cache
lifetime is 30 seconds. Specify 0 to disable the cache, or -1 to store cache results indefinitely.
6 Stop and then restart the Symantec DLP Manager service to apply your changes.
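
Both example properties in this procedure use the index 22. The following is a minimal sketch in Python for checking a wrapper configuration file before you restart the service. It assumes that each wrapper.java.additional property needs its own index number and that lines beginning with # are comments. The function names are hypothetical and are not part of Symantec Data Loss Prevention.

```python
import re
from collections import defaultdict

# Matches lines such as: wrapper.java.additional.22=-Dname=value
PROP = re.compile(r"^\s*wrapper\.java\.additional\.(\d+)\s*=\s*(.*\S)\s*$")


def additional_args(conf_text):
    """Map each wrapper.java.additional index to the values assigned to it."""
    found = defaultdict(list)
    for line in conf_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip commented-out properties
        match = PROP.match(line)
        if match:
            found[int(match.group(1))].append(match.group(2))
    return dict(found)


def duplicate_indexes(conf_text):
    """Return the index numbers that are assigned more than once."""
    return sorted(i for i, vals in additional_args(conf_text).items() if len(vals) > 1)


def crldp_enabled(conf_text):
    """Return True if an active line sets com.sun.security.enableCRLDP to true."""
    return any(
        value == "-Dcom.sun.security.enableCRLDP=true"
        for vals in additional_args(conf_text).values()
        for value in vals
    )
```

If duplicate_indexes returns any numbers, renumber one of the conflicting lines to an unused index before you restart the Symantec DLP Manager service.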

Note: Symantec Data Loss Prevention supports certificate revocation when the Enforce Server
is in non-FIPS mode.

Accessing the CRLDP with a proxy


Symantec recommends that you allow direct access from the Enforce Server computer to all
CRLDP servers that are required to perform certificate revocation checks. If the CRLDP servers
are accessible only through a proxy, then you must configure the proxy settings on the Enforce
Server computer.
When you configure a proxy, the Enforce Server uses your proxy configuration for all HTTP
connections, such as those connections that are created to connect to a CRLDP server to
fetch certificate revocation lists. Check with your proxy administrator before you configure
these proxy settings, and consider allowing direct access to CRLDP servers if at all possible.
To configure proxy settings for a CRLDP server
1 Ensure that the CRLDP is defined in the CRL distribution point field of each client certificate.
2 Log on to the Enforce Server computer using the account that you created during Symantec
Data Loss Prevention installation.

Note: Do not change permissions or ownership on any configuration file from another
root or Administrator account.

3 Change directory to the
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config (Linux) or
c:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config (Windows)
directory. If you installed Symantec Data Loss Prevention into a different directory,
substitute the correct path.
4 Open the SymantecDLPManager.conf file with a text editor.
5 Add or edit the following configuration properties to identify the proxy:

wrapper.java.additional.22=-Dhttp.proxyHost=myproxy.mydomain.com
wrapper.java.additional.23=-Dhttp.proxyPort=8080
wrapper.java.additional.24=-Dhttp.nonProxyHosts=hosts

Replace myproxy.mydomain.com and 8080 with the host name and port of your proxy
server. Replace hosts with the hosts that the Enforce Server should contact directly,
without going through the proxy. You can include server host names, fully qualified domain
names, or IP addresses separated with a pipe character. For example:

wrapper.java.additional.24=-Dhttp.nonProxyHosts=crldp-server|127.0.0.1|DataInsight_Server_Host

6 Save your changes to the configuration file.


7 Stop and then restart the Symantec DLP Manager service to apply your changes.
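
The three proxy properties can be generated from a host list rather than typed by hand. The following is a minimal sketch in Python. The function name and the first_index parameter are hypothetical; choose a starting index that is not already used in your SymantecDLPManager.conf file.

```python
def proxy_properties(host, port, non_proxy_hosts, first_index=22):
    """Build the three wrapper.java.additional lines that identify the proxy."""
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    hosts = [h.strip() for h in non_proxy_hosts]
    if any(not h or "|" in h for h in hosts):
        raise ValueError("host entries must be non-empty and must not contain '|'")
    values = [
        f"-Dhttp.proxyHost={host}",
        f"-Dhttp.proxyPort={port}",
        # Hosts that bypass the proxy are joined with a pipe character.
        "-Dhttp.nonProxyHosts=" + "|".join(hosts),
    ]
    return [
        f"wrapper.java.additional.{first_index + offset}={value}"
        for offset, value in enumerate(values)
    ]
```

For example, calling the function with myproxy.mydomain.com, 8080, and the hosts crldp-server, 127.0.0.1, and DataInsight_Server_Host produces the three lines shown in step 5.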

Troubleshooting certificate authentication


By default Symantec Data Loss Prevention logs each successful log-on request to the Enforce
Server administration console. Symantec Data Loss Prevention also logs an error message
if a logon request is made without supplying a certificate, or if a valid certificate presents a CN
that does not map to a valid user account in the Enforce Server configuration.

Note: If certificate authentication fails while the browser establishes an HTTPS connection to
the Enforce Server administration console, then Symantec Data Loss Prevention cannot log
an error message.

You can optionally log additional information about certificate revocation checks by adding or
uncommenting the following system property in the SymantecDLPManager.conf file:

wrapper.java.additional.90=-Djava.security.debug=certpath

SymantecDLPManager.conf is located in the
c:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config (Windows)
or /opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config (Linux)
directory. All debug messages are logged to
c:\ProgramData\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\logs\debug\SymantecDLPManager.log
(Windows) or
/var/log/Symantec/DataLossPrevention/EnforceServer/15.5/debug/SymantecDLPManager.log
(Linux).

Disabling password authentication and forms-based logon


Forms-based logon with password authentication can be used as a fallback access mechanism
while you configure and test certificate authentication. After you configure certificate
authentication, you can disable forms-based logon and password authentication. Your public
key infrastructure then handles all logon requests.
After you have configured the common name (CN) mapping with both forms-based logon and
certificate authentication enabled, you can switch to certificate-only authentication. To do
so, replace the springSecurityContext.xml file with the
springSecurityContext-Certificate.xml file and restart the Enforce Server. Forms-based
logon is then completely disabled.
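
The file replacement can be done so that the original file is kept for rollback. The following is a minimal sketch in Python. The function name, the config_dir argument, and the .bak backup name are hypothetical; this section does not state the directory that contains the two files, so supply the correct location for your installation.

```python
import shutil
from pathlib import Path


def switch_to_certificate_only(config_dir):
    """Back up springSecurityContext.xml, then overwrite it with the
    certificate-only variant. Returns the path of the backup file."""
    config_dir = Path(config_dir)
    active = config_dir / "springSecurityContext.xml"
    cert_only = config_dir / "springSecurityContext-Certificate.xml"
    backup = config_dir / "springSecurityContext.xml.bak"
    if not cert_only.is_file():
        raise FileNotFoundError(cert_only)
    if active.exists():
        shutil.copy2(active, backup)   # keep the forms-enabled file
    shutil.copy2(cert_only, active)    # certificate-only file becomes active
    return backup
```

Restart the Enforce Server after the copy. To turn forms-based logon back on, restore the backup file and restart again.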

Note: When you disable forms-based logon you disable the feature for all users, including
those with Administrator privileges. As an alternative, you can disable forms-based logon or
certificate authentication for an individual user by configuring that user's account.
See “Configuring user accounts” on page 121.

If you later turn on forms-based logon but the Administrator user account does not have a
password configured, you can reset the Administrator password. Reset the password using
the AdminPasswordReset utility.
See “Resetting the Administrator password” on page 125.
Chapter 6
Connecting to group directories
This chapter includes the following topics:

■ Creating connections to LDAP servers

■ Configuring directory server connections

■ Scheduling directory server indexing

Creating connections to LDAP servers


Symantec Data Loss Prevention supports directory server connections to LDAP-compliant
directory servers such as Microsoft Active Directory (AD). A group directory connection specifies
how the Enforce Server or Discover Server connects to the directory server.
The connection to the directory server must be established before you create any user groups
in the Enforce Server. The Enforce Server or Discover Server uses the connection to obtain
details about the groups. If this connection is not created, you are not able to define any User
Groups. The connection is not permanent, but you can configure the connection to synchronize
at a specified interval. The directory server contains all of the information that you need to
create User Groups.
See “User Groups” on page 376.

Note: If you use a directory server that contains a self-signed authentication certificate, you
must add the certificate to the Enforce Server or the Discover Server. If your directory server
uses a pre-authorized certificate, it is automatically added to the Enforce Server or Discover
Server. See “Importing SSL certificates to Enforce or Discover servers” on page 277.

To create a group directory connection


1 Navigate to the System > Settings > Directory Connections screen.
2 Click Add Connection.
3 Configure the directory connection.
See “Configuring directory server connections” on page 156.

Configuring directory server connections


The Directory Connections page is the home page for configuring directory server connections.
Once you define the directory connection, you can create one or more User Groups.
See “Configuring User Groups” on page 936.

Table 6-1 Configuring directory server connections

Step  Action                                    Description

1     Navigate to the Directory Connections     This page is available at System > Settings >
      page (if not already there).              Directory Connections.

2     Click Create New Connection.              This action takes you to the Configure Directory
                                                Connection page.

3     Enter a Name for the directory server     The Connection Name is the user-defined name for
      connection.                               the connection. It appears at the Directory
                                                Connections home page once the connection is
                                                configured.

4     Specify the Network Parameters for the    Table 6-2 provides details on these parameters.
      directory server connection.              Enter or specify the following parameters:
                                                ■ The Hostname of the computer where the
                                                  directory server is installed.
                                                ■ The Port on the directory server that supports
                                                  connections.
                                                ■ The Base DN (distinguished name) of the
                                                  directory server.
                                                ■ The Encryption Method for the connection,
                                                  either None or Secure.

5     Specify the Authentication mode for       Table 6-3 provides details on configuring the
      connecting to the directory server.       authentication parameters.

6     Click Test Connection to verify the       If there is anything wrong with the connection,
      connection.                               the system displays an error message describing
                                                the problem.

7     Click Save to save the directory          The system automatically indexes the directory
      connection configuration.                 server after you successfully create, test, and
                                                save the directory server connection.

8     Select the Index and Replication Status   Verify that the directory server was indexed.
      tab.                                      After some time (depending on the size of the
                                                directory server query), you should see that the
                                                Replication Status is "Completed <date> <time>".
                                                If you do not see that the status is completed,
                                                verify that you have configured and tested the
                                                directory connection properly. Contact your
                                                directory server administrator for assistance.

9     Select the Index Settings tab.            You can adjust the directory server indexing
                                                schedule as necessary at the Index Settings tab.
                                                See “Scheduling directory server indexing” on
                                                page 158.

Table 6-2 Directory connection network parameters

Network parameters Description

Hostname Enter the Hostname of the directory server.

For example: enforce.dlp.symantec.com

You must enter the Fully Qualified Name (FQN) of the directory server. Do not use
the IP address.

Port Enter the connection Port for the directory server.

For example: 389

Typically the port is 389 or 636 for secure connections.

Base DN Enter the Base DN for the directory server. This field only accepts one directory
server entry.

For example: dc=enforce,dc=dlp,dc=symantec,dc=com

The Base DN string cannot contain any space characters.


The Base DN is the base distinguished name of the directory server. Typically, this
name is the domain name of the directory server. The Base DN parameter defines
the initial depth of the directory server search.

Encryption Method Select the Secure option if you want the communication between the directory server
and the Enforce Server to be encrypted using SSL.
Note: If you choose to use a secure connection, you may need to import the SSL
certificate for the directory server to the Enforce Server keystore. See “Importing SSL
certificates to Enforce or Discover servers” on page 277.
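
The rules in Table 6-2 can be checked before you enter values on the Configure Directory Connection page. The following is a minimal sketch in Python. The function names are hypothetical, and deriving the Base DN from the domain name assumes the typical case that the table describes, where the Base DN is the domain name of the directory server.

```python
import ipaddress


def base_dn_from_domain(domain):
    """Turn a DNS domain name into a dc=...,dc=... Base DN string."""
    labels = [label for label in domain.strip().strip(".").split(".") if label]
    if not labels:
        raise ValueError("domain name is empty")
    return ",".join(f"dc={label}" for label in labels)


def check_network_parameters(hostname, port, base_dn):
    """Return a list of problems found in the network parameters."""
    problems = []
    try:
        ipaddress.ip_address(hostname)
        problems.append("Hostname is an IP address; use the fully qualified name.")
    except ValueError:
        pass  # not an IP address, which is what the table requires
    if not 1 <= port <= 65535:
        problems.append("Port must be between 1 and 65535.")
    if " " in base_dn:
        problems.append("Base DN cannot contain space characters.")
    return problems
```

With the example host name from the table, base_dn_from_domain("enforce.dlp.symantec.com") returns the example Base DN dc=enforce,dc=dlp,dc=symantec,dc=com.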

Table 6-3 Directory connection authentication parameters

Authentication Description

Authentication Select the Authentication option to connect to the directory server using
authentication mode. Check Connect with Credentials to add your username and
password to authenticate to the directory server.

Username To authenticate with Active Directory, use one of the following methods:

■ Domain and user name, for example: Domain\username
■ User name and domain, for example: username@domain.com
■ Fully distinguished user name and domain (without spaces), for example:
cn=username,cn=Users,dc=domain,dc=com
To authenticate with another type of directory server:

■ A different syntax may be required, for example:
uid=username,ou=people,o=company

Password Enter the password for the user name that was specified in the preceding field.

The password is obfuscated when you enter it.
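
The user name formats in Table 6-3 can be told apart by their punctuation. The following is a minimal sketch in Python. The function name and the returned labels are hypothetical, and the patterns are deliberately simple; they are not a full validation of Active Directory or LDAP naming rules.

```python
import re

# Characters that may not appear inside a simple domain or user name part.
PART = r"[^\\@=,\s]+"


def username_format(name):
    """Classify a directory user name by the syntax it uses."""
    if re.fullmatch(PART + r"\\" + PART, name):
        return "domain\\username"
    if re.fullmatch(PART + "@" + PART, name):
        return "username@domain"
    # One or more attribute=value pairs separated by commas, without spaces.
    if re.fullmatch(r"[A-Za-z]+=[^,=\s]+(,[A-Za-z]+=[^,=\s]+)+", name):
        return "distinguished name"
    return "unknown"
```

A value that returns "unknown", such as a distinguished name that contains spaces, is worth checking against the table before you click Test Connection.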

Scheduling directory server indexing


After you create, test, and save a directory server connection, the system automatically
indexes all of the User Groups that are hosted in the directory whose connection you have
established. Each directory server connection is also set to automatically index the
configured directory server once at 12:00 AM the day after you create the initial connection.
You can modify this setting to specify when and how often the index is synchronized, and
schedule indexing daily, weekly, or monthly.
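
The default schedule can be expressed as a date calculation. The following is a minimal sketch in Python; the function name is hypothetical, and the sketch assumes that 12:00 AM refers to midnight in the same local time as the creation timestamp.

```python
from datetime import datetime, timedelta


def default_index_time(created: datetime) -> datetime:
    """Return 12:00 AM on the day after the connection was created."""
    next_day = created.date() + timedelta(days=1)
    return datetime.combine(next_day, datetime.min.time())
```

A connection created at any time on 19 August is therefore scheduled for 12:00 AM on 20 August.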
To schedule group directory indexing


1 Select an existing group directory server connection from the System > Settings >
Directory Connections screen. Or, create a new connection.
See “Configuring directory server connections” on page 156.
2 Adjust the Index Settings to the desired schedule.
See Table 6-4 on page 159.

Table 6-4 Schedule group directory server indexing and view status

Index Settings                Description

Index the directory server    The Once setting is selected by default and automatically indexes
once.                         the directory server at 12:00 AM the day after you create the
                              initial connection.

                              You can modify the default Once indexing schedule by specifying
                              when and how often the index is supposed to be rebuilt.

Index the directory server    Select the Daily option to schedule the index daily.
daily.
                              Specify the time of day and, optionally, the Until duration for
                              this schedule.

Index the directory server    Select the Weekly option to schedule the index to occur once a
weekly.                       week.

                              Specify the day of the week to index.

                              Specify the time to index.

                              Optionally, specify the Until duration for this schedule.

Index the directory server    Specify the day of the month to index the directory and the time.
monthly.
                              Optionally, specify the Until duration for this schedule.

View the indexing and         Select the Index and Replication Status tab to view the status of
replication status.           the indexing process.

                              ■ Indexing Status
                                Displays the next scheduled index, date and time.
                              ■ Detection Server Name
                                Displays the detection server where the User Group profile is
                                deployed.
                              ■ Replication Status
                                Displays the date and time of the most recent synchronization
                                with the directory group server.
Chapter 7
Managing stored credentials
This chapter includes the following topics:

■ About the credential store

■ Adding new credentials to the credential store

■ Configuring endpoint credentials

■ Managing credentials in the credential store

■ Managing stored credentials

About the credential store


An authentication credential can be stored as a named credential in a central credential store.
It can be defined once, and then referenced by any number of Discover targets. Passwords
are encrypted before they are stored.
The credential store simplifies management of user name and password changes.
You can add, delete, or edit stored credentials.
See “Adding new credentials to the credential store” on page 161.
See “Managing credentials in the credential store” on page 162.
The Credential Management screen is accessible to users with the "Credential Management"
privilege.
Stored credentials can be used when you edit or create a Discover target.
See “Network Discover/Cloud Storage Discover scan target configuration options” on page 2090.

Adding new credentials to the credential store


You can add new credentials to the credential store. These credentials can later be referenced
with the credential name.
To add a stored credential
1 Click System > Settings > Credentials, and click Add Credential.
2 Enter the following information:

Credential Name             Enter your name for this stored credential.
                            The credential name must be unique within the credential store.
                            The name is used only to identify the credential.

Access Username             Enter the user name for authentication as
                            <domain_name>\<username> in the NT4 format. The username must
                            be a Windows domain user account.

Access Password             Enter the password for authentication.

Re-enter Access Password    Re-enter the password.

3 Click Save.
4 You can later edit or delete credentials from the credential store.
See “Managing credentials in the credential store” on page 162.
See “Configuring endpoint credentials” on page 161.
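
The <domain_name>\<username> format that is required for Access Username can be checked before the credential is saved. The following is a minimal sketch in Python; the function name is hypothetical, and the check covers only the presence of a single backslash with text on both sides.

```python
def split_nt4_username(value):
    """Split a <domain_name>\\<username> value into its two parts."""
    domain, separator, user = value.partition("\\")
    if not separator or not domain or not user or "\\" in user:
        raise ValueError("expected <domain_name>\\<username>")
    return domain, user
```

A value without a backslash, or with an empty domain or user name, raises ValueError.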

Configuring endpoint credentials


You must add credentials to the Credential Store before you can access credentials for Endpoint
FlexResponse or the Endpoint Discover Quarantine response rule. The credentials are stored
in an encrypted folder on all endpoints that are connected to an Endpoint Server. Because all
endpoints store the credentials, you must be careful about the type of credentials you store.
Use credentials that cannot access other areas of your system. Before your endpoint credentials
can be used, you must enable the Enforce Server to recognize them.
To create endpoint credentials
1 Go to: System > Settings > General.
2 Click Configure.
3 Under the Credential Management section, ensure that the Allow Saved Credentials
on Endpoint Agent checkbox is selected.
4 Click Save.
5 Go to: System > Settings > Credentials.
6 Click Add Credential.
7 Under the General section, enter the details of the credential you want to add.
8 Under Usage Permission, select Servers and Endpoint agents.
9 Click Save.
See “About the credential store” on page 160.
See “Configuring the Endpoint Discover: Quarantine File action” on page 1815.

Managing credentials in the credential store


You can delete or edit a stored credential.
To delete a stored credential
1 Click System > Settings > Credentials. Locate the name of the stored credential that
you want to remove.
2 Click the delete icon to the right of the name. A credential can be deleted only if it is not
currently referenced in a Discover target or indexed document profile.
To edit a stored credential
1 Click System > Settings > Credentials. Locate the name of the stored credential that
you want to edit.
2 Click the edit icon (pencil) to the right of the name.
3 Update the user name or password.
4 Click Save.
5 If you change the password for a given credential, the new password is used for all
subsequent Discover scans that use that credential.

Managing stored credentials


An authentication credential can be stored in a central credential store. It can be defined once
as a named credential, and then referenced by any number of Network Discover/Cloud Storage
Discover targets.
Store your authentication credentials in a central store to simplify management of user name
and password changes.
You can add, delete, or edit stored credentials.
To add a stored credential
1 In System > Settings > Credentials, click Add Credential.
2 Enter the following information:

Credential Name             Enter your name for this stored credential.
                            The credential name must be unique within the credential store.
                            The name is used only to identify the credential.

Access Username             Enter the user name for authentication.

Access Password             Enter the password for authentication.

Re-enter Access Password    Re-enter the password.

3 Click Save.
To delete a stored credential
1 In System > Settings > Credentials, locate the name of the stored credential that you
want to remove.
2 Click the delete icon to the right of the name. A credential can be deleted only if it is not
currently referenced in a Discover target or indexed document profile.
To edit a stored credential
1 In System > Settings > Credentials, locate the name of the stored credential that you
want to edit.
2 Click the edit icon (pencil) to the right of the name.
3 Update the user name or password.
4 Click Save.
5 If you change the password for a given credential, the new password is used for all
subsequent Discover scans that use that credential.
See “Providing the password authentication for Network Discover scanned content” on page 2095.
Chapter 8
Managing system events and messages
This chapter includes the following topics:

■ About system events

■ System events reports

■ Working with saved system reports

■ Server and Detectors event detail

■ Configuring event thresholds and triggers

■ About system event responses

■ Enabling a syslog server

■ About system alerts

■ Configuring the Enforce Server to send email alerts

■ Configuring system alerts

■ About log review

■ System event codes and messages

About system events


System events related to your Symantec Data Loss Prevention installation are monitored,
reported, and logged. System events include notifications from Cloud Operations for cloud
services.
System event reports are viewed from the Enforce Server administration console:
■ The five most recent system events of severity Warning or Severe are listed on the
Overview screen (System > Servers and Detectors > Overview).
See “About the Overview screen” on page 278.
■ Reports on all system events of any severity can be viewed by going to System > Servers
and Detectors > Events.
See “System events reports” on page 165.
■ Recent system events for a particular detection server or cloud service are listed on the
Server/Detector Detail screen for that server or detector.
See “Server/Detector Detail screen” on page 283.
■ Click on any event in an event list to go to the Event Details screen for that event. The
Event Details screen provides additional information about the event.
See “Server and Detectors event detail” on page 169.
There are three ways that system events can be brought to your attention:
■ System event reports displayed on the administration console
■ System alert email messages
See “About system alerts” on page 175.
■ Syslog functionality
See “Enabling a syslog server” on page 174.
Some system events require a response.
See “About system event responses” on page 172.
To narrow the focus of system event management you can:
■ Use the filters in the various system event notification methods.
See “System events reports” on page 165.
■ Configure the system event thresholds for individual servers.
See “Configuring event thresholds and triggers” on page 170.
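
The selection that the Overview screen makes (the five most recent events of severity Warning or Severe) can be sketched as follows. This is an illustration in Python only: the function name and the dictionary keys type and time are hypothetical and do not describe how the Enforce Server stores events.

```python
def overview_events(events, limit=5):
    """Return the most recent Warning or Severe events, newest first."""
    flagged = [e for e in events if e["type"] in ("Warning", "Severe")]
    return sorted(flagged, key=lambda e: e["time"], reverse=True)[:limit]
```

Events of type Info are never returned, regardless of how recent they are.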

System events reports


To view all system events, go to the system events report screen (System > Servers and
Detectors > Events). This screen lists events, one event per line. The list contains those
events that match the selected date range and any other filter options that are listed in the
Applied Filters bar. For each event, the following information is displayed:

Table 8-1
Events Description

Type The type (severity) of the event. Type may be any one of those listed in Table 8-2.

Time The date and time of the event.

Server The name of the server on which the event occurred.

Host The IP address or host name of the server on which the event occurred.

Code A number that identifies the kind of event.

See the Symantec Data Loss Prevention Administration Guide for information on event
code numbers.

Summary A brief description of the event. Click on the summary for more detail about the event.

Table 8-2 System event types

Event Description

System information

Warning

Severe

You can select from several report handling options.


See “Common incident report features” on page 1933.
Click any event in the list to go to the Event Details screen for that event. The Event Details
screen provides additional information about the event.
See “Server and Detectors event detail” on page 169.
Since the list of events can be long, filters are available to help you select only the events that
you are interested in. By default, only the Date filter is enabled and it is initially set to All Dates.
The Date filter selects events by the dates the events occurred.
To filter the list of system events by date of occurrence
1 Go to the Filter section of the events report screen and select one of the date range
options.
2 Click Apply.
3 Select Custom from the date list to specify beginning and end dates.
In addition to filtering by date range, you can also apply advanced filters. Advanced filters are
cumulative with the current date filter. This means that events are only listed if they match the
advanced filter and also fall within the current date range. Multiple advanced filters can be
applied. If multiple filters are applied, events are only listed if they match all the filters and the
date range.
To apply additional advanced filters
1 Click on Advanced Filters and Summarization.
2 Click on Add Filter.
3 Choose the filter you want to use from the left-most drop-down list. Available filters are
listed in Table 8-3.
4 Choose the filter-operator from the middle drop-down list.
For each advanced filter you can specify a filter-operator Is Any Of or Is None Of.

Note: You can use the Cloud Operations filter value to view events from Cloud Operations
for your detectors.

5 Enter the filter value, or values, in the right-hand text box, or click a value in the list to
select it.
■ To select multiple values from a list, hold down the Control key and click each one.
■ To select a range of values from a list, click the first one, then hold down the Shift key
and click the last value in the range you want.

6 (Optional) Specify additional advanced filters if needed.


7 When you have finished specifying a filter or set of filters, click Apply.
Click the red X to delete an advanced filter.
The Applied Filters bar lists the filters that are used to produce the list of events that is
displayed. Note that multiple filters are cumulative. For an event to appear on the list it must
pass all the applied filters.
The following advanced filters are available:

Table 8-3 System events advanced filter options

Filter Description

Event Code Filter events by the code numbers that identify each
kind of event. You can filter by a single code number
or multiple code numbers separated by commas
(2121, 1202, 1204). Filtering by code number
ranges, or greater than, or less than operators is
not supported.

Event type Filter events by event severity type (Info, Warning, or Severe).

Server Filter events by the server on which the event occurred.
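
The filter behavior described in this section (a date range combined with cumulative advanced filters, each using Is Any Of or Is None Of) can be sketched as follows. This is an illustration in Python, not the product's implementation; the Event class, its field names, and the function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Event:
    code: int
    severity: str   # Info, Warning, or Severe
    server: str
    day: date


def parse_event_codes(text):
    """Parse comma-separated event codes; ranges and comparison operators are rejected."""
    codes = set()
    for part in text.split(","):
        part = part.strip()
        if not part.isdigit():
            raise ValueError(f"not a single event code: {part!r}")
        codes.add(int(part))
    return codes


def matches(event, start, end, filters):
    """Return True only if the event is in the date range and passes every filter.

    Each filter is a (field, operator, values) tuple.
    """
    if not start <= event.day <= end:
        return False
    for field, operator, values in filters:
        hit = getattr(event, field) in values
        if operator == "Is Any Of":
            if not hit:
                return False
        elif operator == "Is None Of":
            if hit:
                return False
        else:
            raise ValueError(f"unknown filter-operator: {operator}")
    return True
```

Because the filters are cumulative, adding a filter can only shorten the list of events; it never adds events that the date range or another filter has already excluded.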

Note: A small subset of the parameters that trigger system events have thresholds that can
be configured. These parameters should only be adjusted with advice from Symantec Support.
Before changing these settings, you should have a thorough understanding of the implications
that are involved. The default values are appropriate for most installations.
See “Configuring event thresholds and triggers” on page 170.

See “About system events” on page 164.


See “Server and Detectors event detail” on page 169.
See “Working with saved system reports” on page 168.
See “Configuring event thresholds and triggers” on page 170.
See “About system alerts” on page 175.

Working with saved system reports


The System Reports screen lists system and agent-related reports that have previously been
saved. To display the System Reports screen, click System > System Reports. Use this
screen to work with saved system reports.
To create a saved system report
1 Go to one of the following screens:
■ System Events (System > Servers and Detectors > Events)
■ Agents Overview (System > Agents > Overview)
■ Agents Events (System > Agents > Events)
See “About the Enforce Server administration console” on page 83.
2 Select the filters and summaries for your custom report.
See “About custom reports and dashboards” on page 1912.
3 Select Report > Save As.
4 Enter the saved report information.
See “Saving custom incident reports” on page 1914.
5 Click Save.
The System Reports screen is divided into two sections:
■ System Event - Saved Reports lists saved system reports.
■ Agent Management - Saved Reports lists saved agent reports.
For each saved report you can perform the following operations:
■ Share the report. Click share to give other Symantec Data Loss Prevention users who
have the same role as you access to the report. Sharing a report cannot be undone; after a
report is shared it cannot be made private. After a report is shared, all users with whom it
is shared can view, edit, or delete the report.
See “Saving custom incident reports” on page 1914.
■ Change the report name or description. Click the pencil icon to the right of the report name
to edit it.
■ Change the report scheduling. Click the calendar icon to the right of the report name to
edit the delivery schedule of the report and to whom it is sent.
See “Saving custom incident reports” on page 1914.
See “Delivery schedule options for incident and system reports” on page 1917.
■ Delete the report. Click the red X to the right of the report name to delete the report.

Server and Detectors event detail


To view the Server and Detectors Event Detail screen, go to System > Servers and
Detectors > Events and click one of the listed events.
See “System events reports” on page 165.
The Server and Detectors Event Detail screen displays all of the information available for
the selected event. The information on this screen is not editable.
The Server and Detectors Event Detail screen is divided into two sections—General and
Message.

Table 8-4 Event detail — General

Item | Description
Type | The event is one of the following types:
■ Info: Information about the system.
■ Warning: A problem that is not severe enough to generate an error.
■ Severe: An error that requires immediate attention.
Time | The date and time of the event.
Server or Detector | The name of the server or detector.
Host | The host name or IP address of the server.

Table 8-5 Event detail — Message

Item | Description
Code | A number that identifies the kind of event. See “System event codes and messages” on page 180.
Summary | A brief description of the event.
Detail | Detailed information about the event.

See “About system events” on page 164.


See “System events reports” on page 165.
See “About system alerts” on page 175.

Configuring event thresholds and triggers


A small subset of the parameters that trigger system events have thresholds that can be
configured. These parameters are configured for each detection server or detector separately.
These parameters should only be adjusted with advice from Symantec Support. Before changing
these settings, you should have a thorough understanding of the implications. The default
values are appropriate for most installations.
See “About system events” on page 164.
To view and change the configurable parameters that trigger system events
1 Go to the Overview screen (System > Servers and Detectors > Overview).
2 Click on the name of a detection server or detector to display that server's Server/Detector
Detail screen.
3 Click Server/Detector Settings.
The Advanced Server/Detector Settings screen for that server is displayed.
4 Change the configurable parameters, as needed.

Table 8-6 Configurable parameters that trigger events

Parameter | Description | Event
BoxMonitor.DiskUsageError | Indicates the amount of filled disk space (as a percentage) that triggers a Severe system event. For example, if a detection server is installed on the C drive and the disk space error value is 90, the detection server creates a Severe system event when the C drive usage is 90% or greater. The default is 90. | Low disk space
BoxMonitor.DiskUsageWarning | Indicates the amount of filled disk space (as a percentage) that triggers a Warning system event. For example, if a detection server is installed on the C drive and the disk space warning value is 80, the detection server generates a Warning system event when the C drive usage is 80% or greater. The default is 80. | Low disk space
BoxMonitor.MaxRestartCount | Indicates the number of times that a system process can be restarted in one hour before a Severe system event is generated. The default is 3. | process_name restarts excessively
IncidentDetection.MessageWaitSevere | Indicates the number of minutes that messages can wait to be processed before a Severe system event is sent about message wait times. The default is 240. | Long message wait time
IncidentDetection.MessageWaitWarning | Indicates the number of minutes that messages can wait to be processed before a Warning system event is sent about message wait times. The default is 60. | Long message wait time
IncidentWriter.BacklogInfo | Indicates the number of incidents that can be queued before an Info system event is generated. This type of backlog usually indicates that incidents are not processed or are not processed correctly because the system may have slowed down or stopped. The default is 1000. | N incidents in queue
IncidentWriter.BacklogWarning | Indicates the number of incidents that can be queued before a Warning system event is generated. This type of backlog usually indicates that incidents are not processed or are not processed correctly because the system may have slowed down or stopped. The default is 3000. | N incidents in queue
IncidentWriter.BacklogSevere | Indicates the number of incidents that can be queued before a Severe system event is generated. This type of backlog usually indicates that incidents are not processed or are not processed correctly because the system may have slowed down or stopped. The default is 10000. | N incidents in queue
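
The way these thresholds map a measurement to an event severity can be sketched in a few lines of Python. This is only an illustration of the table, not the product's implementation: the class, field, and function names are hypothetical, and the comparisons are assumed to be inclusive everywhere (the guide states "or greater" only for disk usage). The sketch also assumes that MessageWaitWarning produces a Warning event, as its name suggests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventThresholds:
    """Default values copied from Table 8-6; the field names are illustrative."""
    disk_usage_warning: int = 80      # BoxMonitor.DiskUsageWarning (percent)
    disk_usage_error: int = 90        # BoxMonitor.DiskUsageError (percent)
    message_wait_warning: int = 60    # IncidentDetection.MessageWaitWarning (minutes)
    message_wait_severe: int = 240    # IncidentDetection.MessageWaitSevere (minutes)
    backlog_info: int = 1000          # IncidentWriter.BacklogInfo (incidents)
    backlog_warning: int = 3000       # IncidentWriter.BacklogWarning (incidents)
    backlog_severe: int = 10000       # IncidentWriter.BacklogSevere (incidents)


DEFAULTS = EventThresholds()


def disk_usage_severity(percent_used: float, t: EventThresholds = DEFAULTS) -> Optional[str]:
    """Return the severity for a disk usage reading, or None below both thresholds."""
    if percent_used >= t.disk_usage_error:
        return "Severe"
    if percent_used >= t.disk_usage_warning:
        return "Warning"
    return None


def message_wait_severity(minutes: float, t: EventThresholds = DEFAULTS) -> Optional[str]:
    """Return the severity for a message wait time (inclusive comparison is assumed)."""
    if minutes >= t.message_wait_severe:
        return "Severe"
    if minutes >= t.message_wait_warning:
        return "Warning"
    return None


def backlog_severity(queued: int, t: EventThresholds = DEFAULTS) -> Optional[str]:
    """Return the severity for an incident queue length (inclusive comparison is assumed)."""
    if queued >= t.backlog_severe:
        return "Severe"
    if queued >= t.backlog_warning:
        return "Warning"
    if queued >= t.backlog_info:
        return "Info"
    return None


print(disk_usage_severity(85))     # Warning
print(backlog_severity(12000))     # Severe
```

Checking the most severe threshold first means a single reading yields exactly one severity, which is one plausible reading of the table; whether the product also emits the lower-severity events is not stated in this guide.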

About system event responses


There are three ways that system events can be brought to your attention:
■ System event reports displayed on the administration console
■ System alert email messages
See “About system alerts” on page 175.
■ Syslog functionality
See “Enabling a syslog server” on page 174.
In most cases, the system event summary and detail information should provide enough
information to direct investigation and remediation steps. The following table provides some
general guidelines for responding to system events.

Table 8-7 System event responses

System event or category | Appropriate response
Low disk space | If this event is reported on a detection server, recycle the Symantec Data Loss Prevention services on the detection server. The detection server may have lost its connection to the Enforce Server. The detection server then queues its incidents locally, and fills up the disk. If this event is reported on an Enforce Server, check the status of the Oracle and the Symantec DLP Incident Persister services. Low disk space may result if incidents do not transfer properly from the file system to the database. This event may also indicate a need to add additional disk space.
Tablespace is almost full | Add additional data files to the database. When the hard disk is at 80% of capacity, obtain a bigger disk instead of adding additional data files. Refer to the Symantec Data Loss Prevention Installation Guide.
Licensing and versioning | Contact Symantec Support.
Monitor not responding | Restart the Symantec DLP Detection Server service. If the event persists, check the network connections. Make sure that the computer that hosts the detection server is turned on by connecting to it. You can connect with terminal services or another remote desktop connection method. If necessary, contact Symantec Support. See “About Symantec Data Loss Prevention services” on page 101.
Alert or scheduled report sending failed | Go to System > Settings > General and ensure that the settings in the Reports and Alerts and SMTP sections are configured correctly. Check network connectivity between the Enforce Server and the SMTP server. Contact Symantec Support.
Auto key ignition failed | Contact Symantec Support.
Cryptographic keys are inconsistent | Contact Symantec Support.
Long message wait time | Increase detection server capacity by adding more CPUs or replacing the computer with a more powerful one. Decrease the load on the detection server. You can decrease the load by applying the traffic filters that have been configured to detect fewer incidents. You can also re-route portions of the traffic to other detection servers. Increase the threshold wait times if all of the following items are true:
■ This message is issued during peak hours.
■ The message wait time drops down to zero before the next peak.
■ The business is willing to have such delays in message processing.
process_name restarts excessively | Check the process by going to System > Servers and Detectors > Overview. To see individual processes on this screen, Process Control must be enabled by going to System > Settings > General > Configure.
N incidents in queue | Investigate the reason for the incidents filling up the queue. The most likely reasons are as follows:
■ Connection problems. Response: Make sure the communication link between the Endpoint Server and the detection server is stable.
■ Insufficient connection bandwidth for the number of generated incidents (typical for WAN connections). Response: Consider changing policies (by configuring the filters) so that they generate fewer incidents.

Enabling a syslog server


Syslog functionality sends Severe system events to a syslog server. Syslog servers allow
system administrators to filter and route the system event notifications on a more granular
level. System administrators who use syslog regularly for monitoring their systems may prefer
to use syslog instead of alerts. Syslog may be preferred if the volume of alerts seems unwieldy
for email.
Syslog functionality is an on or off option. If syslog is turned on, all Severe events are sent to
the syslog server.
To enable syslog functionality


1 Go to the \Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config
directory on Windows or the /opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config
directory on Linux.
2 Open the Manager.properties file.
3 Uncomment the #systemevent.syslog.host= line by removing the # symbol from the
beginning of the line, and enter the hostname or IP address of the syslog server.
4 Uncomment the #systemevent.syslog.port= line by removing the # symbol from the
beginning of the line. Enter the port number that should accept connections from the
Enforce Server. The default is 514.
5 Uncomment the #systemevent.syslog.format= [{0}] {1} - {2} line by removing the
# symbol from the beginning of the line. Then define the system event message format
to be sent to the syslog server:
If the line is uncommented without any changes, the notification messages are sent in the
format: [server name] summary - details. The format variables are:
■ {0} - the name of the server on which the event occurred
■ {1} - the event summary
■ {2} - the event detail
For example, the following configuration specifies that Severe system event notifications
are sent to a syslog host named server1 which uses port 600.

systemevent.syslog.host=server1
systemevent.syslog.port=600
systemevent.syslog.format= [{0}] {1} - {2}

Using this example, a low disk space event notification from an Enforce Server on a host
named dlp-1 would look like:

[dlp-1] Low disk space - Hard disk space for incident data storage server is low. Disk usage is over 82%.
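
To preview what a custom systemevent.syslog.format value would produce before you edit Manager.properties, the placeholder substitution can be sketched in Python. This is a minimal illustration, not the Enforce Server's own formatter: the helper name is hypothetical, and Python's str.format is used only because it accepts the same positional {0}, {1}, {2} placeholders. How the product itself parses the format string (for example, its handling of quote characters) is not described in this guide.

```python
# Hypothetical helper that expands the three positional placeholders
# described for systemevent.syslog.format.
DEFAULT_FORMAT = "[{0}] {1} - {2}"


def render_syslog_message(fmt: str, server: str, summary: str, detail: str) -> str:
    """Substitute {0}, {1}, and {2} with the server name, event summary, and event detail."""
    return fmt.format(server, summary, detail)


message = render_syslog_message(
    DEFAULT_FORMAT,
    "dlp-1",
    "Low disk space",
    "Hard disk space for incident data storage server is low. Disk usage is over 82%.",
)
print(message)
# [dlp-1] Low disk space - Hard disk space for incident data storage server is low. Disk usage is over 82%.

# A reordered format, for example to put the summary first:
print(render_syslog_message("{1} ({0}): {2}", "dlp-1", "Low disk space", "Disk usage is over 82%."))
# Low disk space (dlp-1): Disk usage is over 82%.
```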

See “About system events” on page 164.

About system alerts


System alerts are email messages that are sent to designated addresses when a particular
system event occurs. You define which alerts (if any) you want to use for your installation.
Alerts are specified and edited on the Configure Alert screen, which is reached by System
> Servers and Detectors > Alerts > Add Alert.
Alerts can be specified based on event severity, server name, or event code, or a combination
of those factors. Alerts can be sent for any system event.
The email that is generated by the alert has a subject line that begins with Symantec Data
Loss Prevention System Alert followed by a short event summary. The body of the email
contains the same information that is displayed by the Event Detail screen to provide complete
information about the event.
See “Configuring the Enforce Server to send email alerts” on page 176.
See “Configuring system alerts” on page 177.
See “Server and Detectors event detail” on page 169.

Configuring the Enforce Server to send email alerts


To send out email alerts regarding specified system events, the Enforce Server has to be
configured to support sending of alerts and reports. This section describes how to specify the
report format and how to configure Symantec Data Loss Prevention to communicate with an
SMTP server.
After completing the configuration described here, you can schedule the sending of specific
reports and create specific system alerts.
To configure Symantec Data Loss Prevention to send alerts and reports
1 Go to System > Settings > General and click Configure.
The Edit General Settings screen is displayed.
2 In the Reports and Alerts section, select one of the following distribution methods:
■ Send reports as links, logon is required to view. Symantec Data Loss Prevention
sends email messages with links to reports. You must log on to the Enforce Server to
view the reports.

Note: Reports with incident data cannot be distributed if this option is set.

■ Send report data with emails. Symantec Data Loss Prevention sends email messages
and attaches the report data.
3 Enter the Enforce Server domain name or IP address in the Fully Qualified Manager
Name field.
If you send reports as links, Symantec Data Loss Prevention uses the domain name as
the basis of the URL in the report email.
Do not specify a port number unless you have modified the Enforce Server to run on a
port other than the default of 443.
4 If you want alert recipients to see any correlated incidents, check the Correlations Enabled
box.
When correlations are enabled, users see them on the Incident Snapshot screen.
5 In the SMTP section, identify the SMTP server to use for sending out alerts and reports.
Enter the relevant information in the following fields:
■ Server: The fully qualified hostname or IP address of the SMTP server that Symantec
Data Loss Prevention uses to deliver system events and scheduled reports.
■ System email: The email address for the alert sender. Symantec Data Loss Prevention
specifies this email address as the sender of all outgoing email messages. Your IT
department may require the system email to be a valid email address on your SMTP
server.
■ User ID: If your SMTP server requires it, type a valid user name for accessing the
server. For example, enter DOMAIN\bsmith.
■ Password: If your SMTP server requires it, enter the password for the User ID.

6 Click Save.
See “About system alerts” on page 175.
See “Configuring system alerts” on page 177.
See “About system events” on page 164.

Configuring system alerts


You can configure Symantec Data Loss Prevention to send an email alert whenever it detects
a specified system event. Alerts can be specified based on event severity, server name, or
event code, or a combination of those factors. Alerts can be sent for any system event.
See “About system alerts” on page 175.
Note that the Enforce Server must first be configured to send alerts and reports.
See “Configuring the Enforce Server to send email alerts” on page 176.
Alerts are specified and edited on the Configure Alert screen. To reach it, go to System >
Servers and Detectors > Alerts, and then click Add Alert to create a new alert or click the
name of an existing alert to modify it.
To create or modify an alert
1 Go to the Alerts screen (System > Servers and Detectors > Alerts).
2 Click the Add Alert tab to create a new alert, or click on the name of an alert to modify
it.
The Configure Alert screen is displayed.
3 Fill in (or modify) the name of the alert. The alert name is displayed in the subject line of
the email alert message.
4 Fill in (or modify) a description of the alert.
5 Click Add Condition to specify a condition that will trigger the alert.
Each time you click Add Condition you can add another condition. If you specify multiple
conditions, every one of the conditions must be met to trigger the alert.
Click on the red X next to a condition to remove it from an existing alert.
6 Enter the email address that the alert is to be sent to. Separate multiple addresses by
commas.
7 Limit the maximum number of times this alert can be sent in one hour by entering a number
in the Max Per Hour box.
If no number is entered in this box, there is no limit on the number of times this alert can
be sent out. The recommended practice is to limit alerts to one or two per hour, and to
substitute a larger number later if necessary. If you specify a large number, or no number
at all, recipient mailboxes may be overloaded with continual alerts.
8 Click Save to finish.
The Alerts list is displayed.
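
The effect of the Max Per Hour setting can be illustrated with a small rate limiter. The sketch below is an assumption-laden model, not the Enforce Server's logic: the class name is hypothetical, and it assumes a sliding 60-minute window (the guide does not say whether the hour is a sliding window or a fixed clock hour). An empty Max Per Hour box is modeled as None, meaning no limit.

```python
from collections import deque
from typing import Deque, Optional


class HourlyAlertLimiter:
    """Hypothetical model of Max Per Hour, using a sliding 3600-second window."""

    WINDOW_SECONDS = 3600

    def __init__(self, max_per_hour: Optional[int]) -> None:
        self.max_per_hour = max_per_hour      # None models an empty Max Per Hour box
        self._sent: Deque[float] = deque()    # send times (seconds) inside the window

    def allow(self, now: float) -> bool:
        """Return True and record the send if an alert may go out at time `now`."""
        # Forget sends that are an hour old or older.
        while self._sent and now - self._sent[0] >= self.WINDOW_SECONDS:
            self._sent.popleft()
        if self.max_per_hour is not None and len(self._sent) >= self.max_per_hour:
            return False
        self._sent.append(now)
        return True


limiter = HourlyAlertLimiter(max_per_hour=2)
results = [limiter.allow(t) for t in (0, 10, 20, 3600)]
print(results)   # [True, True, False, True]
```

With a limit of two, the third event inside the hour is suppressed, which is why a small starting value keeps recipient mailboxes manageable while a recurring event is being investigated.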
There are three kinds of conditions that you can specify to trigger an alert:
■ Event type - the severity of the event.
■ Server - the server associated with the event.
■ Event code - a code number that identifies a particular kind of event.
For each kind of condition, you can choose one of two operators:
■ Is any of.
■ Is none of.
For each kind of condition, you can specify appropriate parameters:
■ Event type. You can select one, or a combination of, Information, Warning, Severe. Click
on an event type to specify it. To specify multiple types, hold down the Control key while
clicking on event types. You can specify one, two, or all three types.
■ Server. You can select one or more servers from the list of available servers. Click on the
name of a server to specify it. To specify multiple servers, hold down the Control key while
clicking on server names. You can specify as many different servers as necessary.
■ Event code. Enter the code number. To enter multiple code numbers, separate them with
commas or use the Return key to enter each code on a separate line.
See “System event codes and messages” on page 180.
By combining multiple conditions, you can define alerts that cover a wide variety of system
conditions.

Note: If you define more than one condition, the conditions are treated as if they were connected
by the Boolean "AND" operator. This means that the Enforce Server only sends the alert if all
conditions are met. For example, if you define an event type condition and a server condition,
the Enforce Server only sends the alert if the specified event occurs on the designated server.
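
The combination of condition kinds, the two operators, and the implicit AND can be modeled as follows. This is a sketch for reasoning about alert definitions, not product code: the class and function names are hypothetical, and the server names and event code in the example are made up.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Union


@dataclass(frozen=True)
class SystemEvent:
    event_type: str   # for example "Info", "Warning", or "Severe"
    server: str       # name of the server associated with the event
    code: int         # system event code


@dataclass(frozen=True)
class Condition:
    field: str                             # "event_type", "server", or "code"
    operator: str                          # "is any of" or "is none of"
    values: FrozenSet[Union[str, int]]     # the values selected for the condition

    def matches(self, event: SystemEvent) -> bool:
        present = getattr(event, self.field) in self.values
        if self.operator == "is any of":
            return present
        if self.operator == "is none of":
            return not present
        raise ValueError(f"unknown operator: {self.operator!r}")


def alert_triggers(conditions: Iterable[Condition], event: SystemEvent) -> bool:
    """All conditions must match (Boolean AND) for the alert to be sent."""
    return all(condition.matches(event) for condition in conditions)


# Severe events on any server except a (hypothetical) test server.
conditions = [
    Condition("event_type", "is any of", frozenset({"Severe"})),
    Condition("server", "is none of", frozenset({"dlp-test"})),
]

print(alert_triggers(conditions, SystemEvent("Severe", "dlp-1", 1014)))      # True
print(alert_triggers(conditions, SystemEvent("Severe", "dlp-test", 1014)))   # False
print(alert_triggers(conditions, SystemEvent("Warning", "dlp-1", 1014)))     # False
```

Because the conditions are joined by AND, covering two unrelated situations (for example, any Severe event, or one specific event code at any severity) requires two separate alerts rather than two conditions on one alert.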

See “About system alerts” on page 175.


See “Configuring the Enforce Server to send email alerts” on page 176.
See “System events reports” on page 165.

About log review


Your Symantec Data Loss Prevention installation includes a number of log files. These files
provide information on server communication, Enforce Server and detection server operation,
incident detection, and so on.
By default, logs for the Enforce Server and detection server are stored in the following
directories:
■ Windows: c:\ProgramData\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\logs
■ Linux: /var/log/Symantec/DataLossPrevention/EnforceServer/15.5/
See “About log files” on page 333.
See also the Symantec Data Loss Prevention System Maintenance Guide for additional
information about working with logs.

System event codes and messages


Symantec Data Loss Prevention system events are monitored, reported, and logged. Each
event is identified by a code number, which is listed in the following tables.
See “About system events” on page 164.
System event lists and reports can be filtered by event codes.
See “System events reports” on page 165.

Note: Numbers enclosed in braces, such as {0}, indicate text strings that are dynamically
inserted into the actual event name or description message.

Table 8-8 General detection server events

Code | Summary | Description
1000 | Monitor started | All monitor processes have been started.
1001 | Local monitor started | All monitor processes have been started.
1002 | Monitor started | Some monitor processes are disabled and haven't been started.
1003 | Local monitor started | Some monitor processes are disabled and haven't been started.
1004 | Monitor stopped | All monitor processes have been stopped.
1005 | Local monitor stopped | All monitor processes have been stopped.
1006 | {0} failed to start | Process {0} can't be started. See log files for more detail.
1007 | {0} restarts excessively | Process {0} has restarted {1} times during last {2} minutes.
1008 | {0} is down | {0} process went down before it had fully started.
1010 | Restarted {0} | {0} process was restarted because it went down unexpectedly.
1011 | Restarted {0} | {0} was restarted because it was not responding.
1012 | Unable to start {0} | Cannot bind to the shutdown datagram socket. Will retry.
1013 | {0} resumed starting | Successfully bound to the shutdown socket.
1014 | Low disk space | Hard disk space is low. Symantec Data Loss Prevention server disk usage is over {0}%.

Table 8-9 Endpoint server events

Code | Summary | Description
1100 | Aggregator started | None
1101 | Aggregator failed to start | Error starting Aggregator. {0} No incidents will be detected.
1102 | Communications with non-legacy agents are disabled | SSL keystore and truststore are not configured for this endpoint server. Please go to configure server page to configure SSL keystore and truststore.

Table 8-10 Detection configuration events

Code | Summary | Description
1200 | Loaded policy "{0}" | Policy "{0}" v{1} ({2}) has been successfully loaded.
1201 | Loaded policies {0} | None
1202 | No policies loaded | No relevant policies are found. No incidents will be detected.
1203 | Unloaded policy "{0}" | Policy "{0}" has been unloaded.
1204 | Updated policy "{0}" | Policy "{0}" has been successfully updated. The current policy version is {1}. Active channels: {2}.
1205 | Incident limit reached for Policy "{0}" | The policy "{0}" has found incidents in more than {1} messages within the last {2} hours. The policy will not be enforced until the policy is changed, or the reset period of {2} hours is reached.
1206 | Long message wait time | Message wait time was {0}:{1}:{2}:{3}.
1207 | Failed to load Vector Machine Learning profile | Failed to load [{0}] Vector Machine Learning profile. See server logs for more details.
1208 | Failed to unload Vector Machine Learning profile | Failed to unload [{0}] Vector Machine Learning profile. See server logs for more details.
1209 | Loaded Vector Machine Learning profile | Loaded [{0}] Vector Machine Learning profile.
1210 | Unloaded Vector Machine Learning profile | Unloaded [{0}] Vector Machine Learning profile.
1211 | Vector Machine Learning training successful | Training succeeded for [{0}] Vector Machine Learning profile.
1212 | Vector Machine Learning training failed | Training failed for [{0}] Vector Machine Learning profile.
1213 | {0} messages timed out in Detection recently | {0} messages timed out in Detection in the last {1} minutes. Enable Detection execution trace logs for details.
1214 | Detected regular expression rules with invalid patterns | Policy set contains regular expression rule(s) with invalid patterns. See FileReader.log for details.

Table 8-11 File reader events

Code | Summary | Description
1301 | File Reader started | None
1302 | File Reader failed to start | Error starting File Reader. {0} No incidents will be detected.
1303 | Unable to delete folder | File Reader was unable to delete folder "{0}" in the file system. Please investigate, as this will cause system malfunction.
1304 | Channel enabled | Monitor channel "{0}" has been enabled.
1305 | Channel disabled | Monitor channel "{0}" has been disabled.
1306 | License received. | {0}.
1306 | License received. | None
1307 | started | Process is started.
1308 | down | Process is down.

Table 8-12 ICAP events

Code | Summary | Description
1400 | ICAP channel configured | The channel is in {0} mode
1401 | Invalid license | The ICAP channel is not licensed or the license has expired. No incidents will be detected or prevented by the ICAP channel.
1402 | Content Removal Incorrect | Configuration rule in line {0} is outdated or not written in proper grammar format. Either remove it from the config file or update the rule.
1403 | Out of memory Error (Web Prevent) while processing message | While processing request on connection ID{0}, out of memory error occurred. Please tune your setup for traffic load.
1404 | Host restriction | Any host (ICAP client) can connect to ICAP Server.
1405 | Host restriction error | Unable to get the IP address of host {0}.
1406 | Host restriction error | Unable to get the IP address of any host in Icap.AllowHosts.
1407 | Protocol Trace Enabled | Enabled Traces available at {0}.
1408 | Invalid Load Balance Factor | Icap LoadBalanceFactor configured to 0. Treating it as 1.

Table 8-13 MTA events

Code | Summary | Description
1500 | Invalid license | The SMTP Prevent channel is not licensed or the license has expired. No incidents will be detected or prevented by the SMTP Prevent channel.
1501 | Bind address error | Unable to bind {0}. Please check the configured address or the RequestProcessor log for more information.
1502 | MTA restriction error | Unable to resolve host {0}.
1503 | All MTAs restricted | Client MTAs are restricted, but no hosts were resolved. Please check the RequestProcessor log for more information and correct the RequestProcessor.AllowHosts setting for this Prevent server.
1504 | Downstream TLS Handshake failed | TLS handshake with downstream MTA {0} failed. Please check SmtpPrevent and RequestProcessor logs for more information.
1505 | Downstream TLS Handshake successful | TLS handshake with downstream MTA {0} was successfully completed.

Table 8-14 File inductor events

Code | Summary | Description
1600 | Override folder invalid | Monitor channel {0} has invalid source folder: {1} Using folder: {2}.
1601 | Source folder invalid | Monitor channel {0} has invalid source folder: {1} The channel is disabled.

Table 8-15 File scan events

Code | Summary | Description
1700 | Scan start failed | Discover target with ID {0} does not exist.
1701 | Scan terminated | {0}
1702 | Scan completed | Scan completed. Discover Target Name - "{0}"
1703 | Scan start failed | {0}
1704 | Share list had errors | {0}
1705 | Scheduled scan failed | Failed to start a scheduled scan of Discover target {0}. {1}
1706 | Scan suspend failed | {0}
1707 | Scan resume failed | {0}
1708 | Scheduled scan suspension failed | Scheduled suspension failed for scan of Discover target {0}. {1}
1709 | Scheduled scan resume failed | Scheduled suspension failed for scan of Discover target {0}. {1}
1710 | Maximum Scan Duration Timeout Occurred | Discover target "{0}" timed out because of Maximum Scan Duration.
1711 | Maximum Scan Duration Timeout Failed | Maximum scan time duration timed out for scan: {0}. However, an error occurred while trying to abort the scan.
1712 | Scan Idle Timeout Occurred | Discover target "{0}" timed out because of Scan Idle Timeout.
1713 | Scan Idle Timeout Failed | Maximum idle time duration timed out for scan: {0}. However, an error occurred while trying to abort the scan.
1714 | Scan terminated - Invalid Server State | Scan of discover target "{0}" has been terminated from the state of "{1}" because the associated discover server {2} entered an unexpected state of "{3}".
1715 | Scan terminated - Server Removed | Scan of discover target "{0}" has been terminated because the associated discover server {1} is no longer available.
1716 | Scan terminated - Server Reassigned | Scan of discover target "{0}" has been terminated because the associated discover server {1} is already scanning discover target(s) "{2}".
1717 | Scan terminated - Transition Failed | Failed to handle the state change of discover server {1} while scanning discover target "{0}". See log files for details.
1718 | Scan start failed | Scan of discover target "{0}" has failed to start. See log files for detailed error description.
1719 | Scan start failed due to unsupported target type | Scan of discover target "{0}" has failed, as its target type is no longer supported.
1720 | Scan started | Scan started. Discover Target Name - "{0}"
1721 | Scan paused | Scan paused. Discover Target Name - "{0}"
1722 | Scan stopped | Scan stopped. Discover Target Name - "{0}"
1723 | Scan queued | Scan queued. Discover Target Name - "{0}"
1724 | Scan failed | Scan failed. Discover Target Name - "{0}"

Table 8-16 Incident attachment external storage events

Code | Summary | Description
1750 | Incident attachment migration started | Migration of incident attachments from database to external storage directory has started.
1751 | Incident attachment migration completed | Completed migrating incident attachments from database to external storage directory.
1752 | Incident attachment migration failed | One or more incident attachments could not be migrated from database to external storage directory. Check the incident persister log for more details. Once the error is resolved, restart the SymantecDLPIncidentPersisterService service to resume the migration.
1753 | Incident attachment migration error. | One or more incident attachments migration from database to external storage directory has encountered error. Check the incident persister log for more details. Migration will continue and will retry erred attachment later.
1754 | Failed to update incident attachment deletion schedule | Failed to update the schedule to delete incident attachments in the external directory. Check the incident persister log for more details.
1755 | Incident attachment deletion started | Deletion of obsolete incident attachments from the external storage directory has started.
1756 | Incident attachment deletion completed | Deletion of obsolete incident attachments from the external storage directory has completed.
1757 | Incident attachment deletion failed | One or more incident attachments could not be deleted from the external storage directory. Check the incident persister log for more details.
1758 | Incident attachment external storage directory is not accessible | Incident attachment external storage directory is not accessible. Check the incident persister log for more details.
 | Incident attachment external storage directory is accessible | Incident attachment external storage directory is accessible.

Table 8-17 Incident persister and incident writer events

Code Summary Description

1800 Incident Persister is unable to Persister ran out of memory processing incident {0}.
process incident Incident

1801 Incident Persister failed to


process incident {0}

1802 Corrupted incident received A corrupted incident was received, and renamed to {0}.

1803 Policy misconfigured Policy "{0}" has no associated severity.

1804 Incident Persister is unable to Incident Persister cannot start because it failed to access the
start incident folder {0}. Check folder permissions.

1805 Incident Persister is unable to access Incidents folder The Incident Persister is unable to access
the incident folder {0}. Check folder permissions.

1806 Response rule processing failed Response rule processing failed to start: {0}.
to start

1807 Response rule processing Response rule command runtime execution failed from error:
execution failed {0}.

1808 Unable to write incident Failed to delete old temporary file {0}.

1809 Unable to write incident Failed to rename temporary incident file {0}.

1810 Unable to list incidents Failed to list incident files in folder {0}. Check folder
permissions.

1811 Error sending incident Unexpected error occurred while sending an incident. {0}
Look in the incident writer log for more information.

1812 Incident writer stopped Failed to delete incident file {0} after it was sent. Delete the
file manually, correct the problem and restart the incident
writer.

1813 Failed to list incidents Failed to list incident files in folder {0}. Check folder
permissions.

1814 Incident queue backlogged There are {0} incidents in this server's queue.

1815 Low disk space on incident server Hard disk space for the incident data storage server is low.
Disk usage is over {0}%.

1816 Failed to update policy statistics Failed to update policy statistics for policy {0}.

1817 Daily incident maximum The daily incident maximum for policy {0} has been
exceeded exceeded.\n No further incidents will be generated.

1818 Incident is oversized, has been persisted with a limited number of components and/or violations Incident is oversized,
has been partially persisted with messageID {0}, Incident File Name {1}.

1821 Failure to process an incident Unexpected error occurred while sending an incident {0}
received from the cloud gateway
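
Event 1815 reports that disk usage on the incident data storage server has passed a percentage threshold (the {0} value). As a rough illustration of that kind of check, the following Python sketch computes the used percentage for a path and compares it with a threshold. The function names, the path, and the 90 percent threshold are hypothetical; the guide does not state the actual threshold or how the product measures disk usage.

```python
import shutil


def used_percent(total_bytes, used_bytes):
    """Return used space as a percentage of total space."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def is_disk_usage_over(path, threshold_percent):
    """Return True when usage of the volume holding path exceeds the threshold."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return used_percent(usage.total, usage.used) > threshold_percent


if __name__ == "__main__":
    # Hypothetical values: the current directory and a 90 percent threshold.
    print(is_disk_usage_over(".", 90))
```

A check like this could run on a schedule alongside the product's own monitoring; it does not replace the system event.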

Table 8-18 Install or update events

Code Summary Description

1900 Failed to load update package Database connection error occurred while loading the
software update package {0}.

1901 Software update failed Failed to apply software update from package {0}. Check the
update service log.

Table 8-19 Key ignition password events

Code Summary Description

2000 Key ignition error Failed to ignite keys with the new ignition password. Detection
against Exact Data Profiles will be disabled.

2001 Unable to update key ignition The key ignition password won't be updated, because the
password. cryptographic keys aren't ignited. Exact Data Matching will
be disabled.

Table 8-20 Admin password reset event code

Code Summary Description

2099 Administrator password reset The Administrator password has been reset by the password
reset tool.

Table 8-21 Manager administrator and policy events

Code Summary Description

2100 Administrator saved The administrator settings were successfully saved.

2101 Data source removed The data source with ID {0} was removed by {1}.

2102 Data source saved The {0} data source was saved by {1}.

2103 Document source removed The document source with ID {0} was removed by {1}.

2104 Document source saved The {0} document source was saved by {1}.

2105 New protocol created The new protocol {0} was created by {1}.

2106 Protocol order changed The protocol {0} was moved {1} by {2}.

2107 Protocol removed The protocol {0} was removed by {1}.

2108 Protocol saved The protocol {0} was edited by {1}.

2109 User removed The user with ID {0} was removed by {1}.

2110 User saved The user {0} was saved by {1}.

2111 Runaway lookup detected One of the attribute lookup plug-ins did not complete
gracefully and left a running thread in the system. Manager
restart may be required for cleanup.

2112 Loaded Custom Attribute Lookup Plug-ins The following Custom Attribute
Lookup Plug-ins were loaded: {0}.

2113 No Custom Attribute Lookup No Custom Attribute Lookup Plug-in was found.
Plug-in was loaded

2114 Custom attribute lookup failed Lookup plug-in {0} timed out. It was unloaded.

2115 Custom attribute lookup failed Failed to instantiate lookup plug-in {0}. It was unloaded. Error
message: {1}

2116 Policy changed The {0} policy was changed by {1}.

2117 Policy removed The {0} policy was removed by {1}.



2118 Alert or scheduled report sending failed. {0} configured by {1} contains the following unreachable email
addresses: {2}. Either the addresses are bad or your email
server does not allow relay to those addresses.

2119 System settings changed The system settings were changed by {0}.

2120 Endpoint Location settings The endpoint location settings were changed by {0}.
changed

2121 The account ''{1}'' has been locked out The maximum consecutive failed logon number of {0}
attempts has been exceeded for account ''{1}'', consequently
it has been locked out.

2122 Loaded FlexResponse Actions The following FlexResponse Actions were loaded: {0}.

2123 No FlexResponse Action was loaded. No FlexResponse Action was found.

2124 A runaway FlexResponse action One of the FlexResponse plug-ins did not complete gracefully
was detected. and left a running thread in the system. Manager restart may
be required for cleanup.

2125 Data Insight settings changed. The Data Insight settings were changed by {0}.

2126 Agent configuration created Agent configuration {0} was created by {1}.

2127 Agent configuration modified Agent configuration {0} was modified by {1}.

2128 Agent configuration removed Agent configuration {0} was removed by {1}.

2129 Agent configuration applied Agent configuration {0} was applied to endpoint server {1} by
{2}.

2130 Directory Connection source The directory connection source with ID {0} was removed by
removed {1}.

2131 Directory Connection source The {0} directory connection source was saved by {1}.
saved

2132 Agent Troubleshooting Task Agent Troubleshooting task of type {0} created by user {1}.

2133 Certificate authority file generated. Certificate authority file {0} generated.

2134 Certificate authority file is corrupt. Certificate authority file {0} is corrupt.

2135 Password changed for certificate Password changed for certificate authority file {0}. New
authority file. certificate authority file is {1}.

2136 Server keystore generated. Server keystore {0} generated for endpoint server {1}.

2137 Server keystore is missing or Server keystore {0} for endpoint server {1} is missing or
corrupt. corrupt.

2138 Server truststore generated. Server truststore {0} generated for endpoint server {1}.

2139 Server truststore is missing or Server truststore {0} for endpoint server {1} is missing or
corrupt. corrupt.

2140 Client certificates and key Client certificates and key generated.
generated.

2141 Agent installer package Agent installer package generated for platforms {0}.
generated.
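
The descriptions in these tables are message templates: {0}, {1}, and so on stand for values filled in when the event is raised, and a doubled apostrophe (as in event 2121) appears to stand for a single literal apostrophe. That matches the java.text.MessageFormat convention, although the guide does not say which formatter the product uses, so treat that as an assumption. The Python sketch below shows how such a template could be expanded; the function name and the sample values (5 and jdoe) are hypothetical.

```python
import re


def render_event_message(template, *args):
    """Expand {0}, {1}, ... placeholders and collapse doubled apostrophes."""
    # Collapse '' to ' first so that apostrophes inside argument values are untouched.
    text = template.replace("''", "'")

    def substitute(match):
        index = int(match.group(1))
        # Leave the placeholder as-is when no argument was supplied for it.
        return str(args[index]) if index < len(args) else match.group(0)

    return re.sub(r"\{(\d+)\}", substitute, text)


TEMPLATE_2121 = (
    "The maximum consecutive failed logon number of {0} attempts has been "
    "exceeded for account ''{1}'', consequently it has been locked out."
)

if __name__ == "__main__":
    print(render_event_message(TEMPLATE_2121, 5, "jdoe"))
```

This sketch handles only numbered placeholders and doubled apostrophes; it does not implement the quoting or type-specific formats of a full message formatter.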

Table 8-22 Enforce licensing and key ignition events

Code Summary Description

2200 End User License Agreement The Symantec Data Loss Prevention End User License
accepted Agreement was accepted by {0}, {1}, {2}.

2201 License is invalid None

2202 License has expired One or more of your product licenses has expired. Some
system feature may be disabled. Check the status of your
licenses on the system settings page.

2203 License about to expire One or more of your product licenses will expire soon. Check
the status of your licenses on the system settings page.

2204 No license The license does not exist, is expired or invalid. No incidents
will be detected.

2205 Keys ignited The cryptographic keys were ignited by administrator logon.

2206 Key ignition failed Failed to ignite the cryptographic keys manually. Please look
in the Enforce Server logs for more information. It will be
impossible to create new exact data profiles.

2207 Auto key ignition The cryptographic keys were automatically ignited.

2208 Manual key ignition required The automatic ignition of the cryptographic keys is not
configured. Administrator logon is required to ignite the
cryptographic keys. No new exact data profiles can be created
until the administrator logs on.

Table 8-23 Manager major events

Code Summary Description

2300 Low disk space Hard disk space is low. Symantec Data Loss Prevention
Enforce Server disk usage is over {0}%.

2301 Tablespace is almost full Oracle tablespace {0} is over {1}% full.

2302 {0} not responding Detection Server {0} did not update its heartbeat for at least
20 minutes.

2303 Monitor configuration changed The {0} monitor configuration was changed by {1}.

2304 System update uploaded A system update was uploaded that affected the following
components: {0}.

2305 SMTP server is not reachable. SMTP server is not reachable. Cannot send out alerts or
schedule reports.

2306 Enforce Server started The Enforce Server was started.

2307 Enforce Server stopped The Enforce Server was stopped.

2308 Monitor status updater exception The monitor status updater encountered a general exception.
Please look at the Enforce Server logs for more information.

2309 System statistics update failed Unable to update the Enforce Server disk usage and database
usage statistics. Please look at the Enforce Server logs for
more information.

2310 Statistics aggregation failure The statistics summarization task encountered a general
exception. Refer to the Enforce Server logs for more
information.

2311 Version mismatch Enforce version is {0}, but this monitor's version is {1}.

2312 Incident deletion failed Incident Deletion failed.

2313 Incident deletion completed Incident deletion ran for {0} and deleted {1} incident(s).

2314 Endpoint data deletion failed Endpoint data deletion failed.



2315 Low disk space on incident server Hard disk space for the incident data storage server is low.
Disk usage is over {0}%.

2316 Over {0} incidents currently Persisting over {0} incidents can decrease database
contained in the database performance.

2318 Incident deletion flagging process Incident deletion flagging process started.
started.

2319 Incident deletion flagging process Incident deletion flagging process ended.
ended.
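
Event 2302 is raised when a detection server has not updated its heartbeat for at least 20 minutes. The comparison behind such a check can be sketched in Python as follows. Only the 20-minute figure comes from the table; the function name and the sample timestamps are hypothetical, and the product's actual implementation is not described in this guide.

```python
from datetime import datetime, timedelta

# Threshold quoted in the description of event 2302.
HEARTBEAT_TIMEOUT = timedelta(minutes=20)


def is_not_responding(last_heartbeat, now, timeout=HEARTBEAT_TIMEOUT):
    """Return True when the last heartbeat is at least `timeout` old."""
    return now - last_heartbeat >= timeout


if __name__ == "__main__":
    now = datetime(2019, 8, 19, 12, 0)
    print(is_not_responding(datetime(2019, 8, 19, 11, 40), now))  # exactly 20 minutes old
    print(is_not_responding(datetime(2019, 8, 19, 11, 41), now))  # 19 minutes old
```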

Table 8-24 Monitor version support events

Code Summary Description

2320 Version obsolete Detection server is not supported when two major versions
older than Enforce server version. Enforce version is {0}, and
this detection server's version is {1}. This detection server
must be upgraded.

2321 Version older than Enforce Enforce will not have visibility for this detection server and
version will not be able to send updates to it. Detection server
incidents will be received and processed normally. Enforce
version is {0}, and this detection server's version is {1}.

2322 Version older than Enforce Functionality introduced with recent versions of Enforce
version relevant to this type of detection server will not be supported
by this detection server. Enforce version is {0}, and this
detection server's version is {1}.

2323 Minor version older than Enforce Functionality introduced with recent versions of Enforce
minor version relevant to this type of detection server will not be supported
by this detection server and might be incompatible with this
detection server. Enforce version is {0}, and this detection
server's version is {1}. This detection server should be
upgraded.

2324 Version newer than Enforce Detection server is not supported when its version is newer
version than the Enforce server version. Enforce version is {0}, and
this detection server's version is {1}. Enforce must be
upgraded or detection server must be downgraded.
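
The version events above depend on how the detection server version compares with the Enforce version. The Python sketch below is one possible reading of the table, not the product's logic. It assumes versions are (major, minor) pairs, that "two major versions older" means two or more, and that events 2321 and 2322 both apply when the detection server is exactly one major version behind. The function name is hypothetical.

```python
def version_events(enforce, detection):
    """Return the event codes suggested by comparing (major, minor) versions."""
    if detection > enforce:
        return (2324,)        # detection server is newer than Enforce
    major_gap = enforce[0] - detection[0]
    if major_gap >= 2:
        return (2320,)        # assumed: two or more major versions older
    if major_gap == 1:
        return (2321, 2322)   # assumed: both apply to an older major version
    if detection[1] < enforce[1]:
        return (2323,)        # same major version, older minor version
    return ()                 # versions match, no version event


if __name__ == "__main__":
    print(version_events((15, 5), (15, 1)))
```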

Table 8-25 Manager reporting events

Code Summary Description

2400 Export web archive finished Archive "{0}" for user {1} was created successfully.

2401 Export web archive canceled Archive "{0}" for user {1} was canceled.

2402 Export web archive failed Failed to create archive "{0}" for user {1}. The report specified
had over {2} incidents.

2403 Export web archive failed Failed to create archive "{0}" for user {1}. Failure occurred at
incident {2}.

2404 Unable to run scheduled report The scheduled report job {0} was invalid and has been
removed.

2405 Unable to run scheduled report The scheduled report {0} owned by {1} encountered an error:
{2}.

2406 Report scheduling is disabled The scheduled report {0} owned by {1} cannot be run because
report scheduling is disabled.

2407 Report scheduling is disabled The scheduled report cannot be run because report
scheduling is disabled.

2408 Unable to run scheduled report Unable to connect to mail server when delivery scheduled
report {0}{1}.

2409 Unable to run scheduled report User {0} is no longer in role {1} which scheduled report {2}
belongs to. The schedule has been deleted.

2410 Unable to run scheduled report Unable to run scheduled report {0} for user {1} because the
account is currently locked.

2411 Scheduled report sent The schedule report {0} owned by {1} was successfully sent.

2412 Export XML report failed XML Export of report by user [{0}] failed.

2420 Unable to run scheduled data Unable to distribute report {0} (id={1}) by data owner because
owner report distribution sending of report data has been disabled.

2421 Report distribution by data owner Report distribution by data owner for report {0} (id={1}) failed.
failed

2422 Report distribution by data owner Report distribution by data owner for report {0} (id={1})
finished finished with {2} incidents for {3} data owners. {4} incidents
for {5} data owners failed to be exported.

2423 Report distribution to data owner The report distribution {1} (id={2}) for the data owner "{0}"
truncated exceeded the maximum allowed size. Only the first {3}
incidents were sent to "{0}".

Table 8-26 Messaging events

Code Summary Description

2500 Unexpected Error Processing Message {0} encountered an unexpected error processing a message.
See the log file for details.

2501 Memory Throttler disabled {0} x {1} bytes need to be available for memory throttling.
Only {2} bytes were available. Memory Throttler has been
disabled.

Table 8-27 Detection server communication events

Code Summary Description

2600 Communication error Unexpected error occurred while sending {1} updates to {0}.
{2} Please look at the monitor controller logs for more
information.

2650 Communication error(VML) Unexpected error occurred while sending profile updates
config set {0} to {1} {2}. Please look at the monitor controller
logs for more information.

Table 8-28 Monitor controller events

Code Summary Description

2700 Monitor Controller started Monitor Controller service was started.

2701 Monitor Controller stopped Monitor Controller service was stopped.

2702 Update transferred to {0} Successfully transferred update package {1} to detection
server {0}.

2703 Update transfer complete Successfully transferred update package {0} to all detection
servers.

2704 Update of {0} failed Failed to transfer update package to detection server {0}.

2705 Configuration file delivery Successfully transferred config file {0} to detection server.
complete

2706 Log upload request sent. Successfully sent log upload request {0}.

2707 Unable to send log upload Encountered a recoverable error while attempting to deliver
request log upload request {0}.

2708 Unable to send log upload Encountered an unrecoverable error while attempting to
request deliver log upload request {0}.

2709 Using built-in certificate Using built-in certificate to secure the communication between
Enforce and Detection Servers.

2710 Using user generated certificate Using user generated certificate to secure the communication
between Enforce and Detection Servers.

2711 Time mismatch between Enforce and Monitor. This may affect certain functions in the system. Time mismatch between
Enforce and Monitor. It is recommended to fix the time on the monitor through automatic
time synchronization.

2712 Connected to cloud detector Connected to cloud detector.

2713 Cloud connector disconnected Error {0} - check your network settings.

Table 8-29 Packet capture events

Code Summary Description

2800 Bad spool directory configured Packet Capture has been configured with a spool directory:
for Packet Capture {0}. This directory does not have write privileges. Please
check the directory permissions and monitor configuration
file. Then restart the monitor.

2801 Failed to send list of NICs. {0} {0}.

Table 8-30 EDM index events and messages

Code Summary Description

2900 EDM profile search failed {0}.

2901 Keys are not ignited Exact Data Matching will be disabled until the cryptographic
keys are ignited.

2902 Index folder inaccessible Failed to list files in the index folder {0}. Check the
configuration and the folder permissions.

2903 Created index folder The local index folder {0} specified in the configuration had
not existed. It was created.

2904 Invalid index folder The index folder {0} specified in the configuration does not
exist.

2905 Exact data profile creation failed Data file for exact data profile "{0}" was not created. Please
look in the enforce server logs for more information.

2906 Indexing canceled Creation of database profile "{0}" was canceled.

2907 Replication canceled Canceled replication of database profile "{0}" version {1} to
server {2}.

2908 Replication failed Connection to database was lost while replicating database
profile {0} to server {1}.

2909 Replication failed Database error occurred while replicating database profile
{0} to server {1}.

2910 Failed to remove index file Failed to delete index file {1} of database profile {0}.

2911 Failed to remove index files Failed to delete index files {1} of database profile {0}.

2912 Failed to remove orphaned file Failed to remove orphaned database profile index file {0}.

2913 Replication failed Replication of database profile {0} to server {2} failed.{1}
Check the monitor controller log for more details.

2914 Replication completed Completed replication of database profile {0} to server {2}.
File {1} was transferred successfully.

2915 Replication completed Completed replication of database profile {0} to the server
{2}. Files {1} were transferred successfully.

2916 Database profile removed Database profile {0} was removed. File {1} was deleted
successfully.

2917 Database profile removed Database profile {0} was removed. Files {1} were deleted
successfully.

2918 Loaded database profile Loaded database profile {0} from {1}.

2919 Unloaded database profile Unloaded database profile {0}.

2920 Failed to load database profile {2} No incidents will be detected against database profile "{0}"
version {1}.

2921 Failed to unload database profile {2} It may not be possible to reload the database profile "{0}"
version {1} in the future without detection server restart.

2922 Couldn't find registered content Registered content with ID {0} wasn't found in database during
indexing.

2923 Database error Database error occurred during indexing. {0}

2924 Process shutdown during The process has been shutdown during indexing. Some
indexing registered content may have failed to create.

2925 Policy is inaccurate Policy "{0}" has one or more rules with unsatisfactory
detection accuracy against {1}.{2}

2926 Created exact data profile Created {0} from file "{1}".\nRows processed: {2}\nInvalid
rows: {3}\nThe exact data profile will now be replicated to all
Symantec Data Loss Prevention Servers.

2927 User Group "{0}" synchronization The following User Group directories have been
failed removed/renamed in the Directory Server and could not be
synchronized: {1}.Please update the "{2}" User Group page
to reflect such changes.

2928 One or more EDM profiles are out Check the "Manage > Data Profiles > Exact Data" page for
of date and must be reindexed more details. The following EDM profiles are out of date: {0}.

Table 8-31 IDM index events and messages

Code Summary Description

3000 {0} {1} Document profile wasn't created.

3001 Indexing canceled Creation of document profile "{0}" was canceled.

3002 Replication canceled Canceled replication of document profile "{0}" version {1} to
server {2}.

3003 Replication failed Connection to database was lost while replicating document
profile "{0}" version {1} to server {2}.

3004 Replication failed Database error occurred while replicating document profile
"{0}" version {1} to server {2}.

3005 Failed to remove index file Failed to delete index file {2} of document profile "{0}" version
{1}.

3006 Failed to remove index files Failed to delete index files {2} of document profile "{0}" version
{1}.

3007 Failed to remove orphaned file {0}

3008 Replication failed Replication of document profile "{0}" version {1} to server {3}
failed. {2}\nCheck the monitor controller log for more details.

3009 Replication completed Completed replication of document profile "{0}" version {1}
to server {3}. File {2} was transferred successfully.

3010 Replication completed Completed replication of document profile "{0}" version {1}
to server {3}.\nFiles {2} were transferred successfully.

3011 Document profile removed Document profile "{0}" version {1} was removed. File {2} was
deleted successfully.

3012 Document profile removed Document profile "{0}" version {1} was removed. Files {2}
were deleted successfully.

3013 Loaded document profile Loaded document profile "{0}" version {1} from {2}.

3014 Unloaded document profile Unloaded document profile "{0}" version {1}.

3015 Failed to load document profile {2}No incidents will be detected against document profile "{0}"
version {1}.

3016 Failed to unload document profile {2} It may not be possible to reload the document profile "{0}"
version {1} in the future without monitor restart.

3017 Created document profile Created "{0}" from "{1}". There are {2} accessible files in the
content root. {3} The profile contains index for {4}
document(s). {5} The document profile will now be replicated
to all Symantec Data Loss Prevention Servers.

3018 Document profile {0} has reached maximum size. Only {1} out of {2} documents
are indexed.

3019 Nothing to index Document source "{0}" found no files to index.

3020 Created document profile Created "{0}" from "{1}". There are {2} accessible files in the
content root. {3} The profile contains index for {4}
document(s). Comparing to last indexing run: {5} new
document(s) were added, {6} document(s) were updated, {7}
documents were unchanged, and {8} documents were
removed. The document profile will now be replicated to all
Symantec Data Loss Prevention servers.

3021 Nothing to index The new remote IDM profile for source "{0}" was identical to
the previous imported version.

3022 Profile conversion IDM profile {0} has been converted to {1} on the endpoint.

3023 Endpoint IDM profiles memory IDM profile {0} size plus already deployed profiles size are
usage too large to fit on the endpoint, only exact matching will be
available.

Table 8-32 Attribute lookup events

Code Summary Description

3100 Invalid Attributes detected with Invalid or unsafe Attributes passed from Standard In were
Script Lookup Plugin removed during script execution. Please check the logs for
more details.

3101 Invalid Attributes detected with Invalid or unsafe Attributes passed to Standard Out were
Script Lookup Plugin removed during script execution. Please check the logs for
more details.

Table 8-33 Monitor stub events

Code Summary Description

3200 AggregatorStub started None

3201 {0} updated List of updates:{1}.

3202 {0} store intialized Initial items:{1}.

3203 Received {0} Size: {1} bytes.

3204 FileReaderStub started None

3205 IncidentWriterStub started Using test incidents folder {0}.

3206 Received configuration for {0} {1}.

3207 PacketCaptureStub started None

3208 RequestProcessorStub started None

3209 Received advanced settings None

3210 Updated settings Updated settings:{0}.



3211 Loaded advanced settings None

3212 UpdateServiceStub started None

3213 DetectionServerDatabaseStub started None

Table 8-34 Packet capture events

Code Summary Description

3300 Packet Capture started Packet Capture has successfully started.

3301 Capture failed to start on device Device {0} is configured for capture, but could not be
{0} initialized. Please see PacketCapture.log for more information.

3302 PacketCapture could not elevate PacketCapture could not elevate its privileges. Some
its privilege level initialization tasks are likely to fail. Please check ownership
and permissions of the PacketCapture executable.

3303 PacketCapture failed to drop its Root privileges are still attainable after attempting to drop
privilege level them. PacketCapture will not continue

3304 Packet Capture started again as Packet capture started processing again because some disk
more disk space is available space was freed on the monitor hard drives.

3305 Packet Capture stopped due to Packet capture stopped processing packets because there
disk space limit is too little space on the monitor hard drives.

3306 Endace DAG driver is not Packet Capture was unable to activate Endace device
available support. Please see PacketCapture.log for more information.

3307 PF_RING driver is not available Packet Capture was unable to activate devices using the
PF_RING interface. Please check PacketCapture.log and
your system logs for more information.

3308 PACKET_MMAP driver is not Packet Capture was unable to activate devices using the
available PACKET_MMAP interface. Please check PacketCapture.log
and your system logs for more information.

3309 {0} is not available Packet Capture was unable to load {0} . No native capture
interface is available. Please see PacketCapture.log for more
information.

3310 No {0} Traffic Captured {0} traffic has not been captured in the last {1} seconds.
Please check Protocol filters and the traffic sent to the
monitoring NIC.

3311 Could not create directory Could not create directory {0} : {1}.

Table 8-35 Log collection events

Code Summary Description

3400 Couldn't add files to zip The files requested for collection could not be written to an
archive file.

3401 Couldn't send log collection The files requested for collection could not be sent.

3402 Couldn't read logging properties A properties file could not be read. Logging configuration
changes were not applied.

3403 Couldn't unzip log configuration The zip file containing logging configuration changes could
package not be unpacked. Configuration changes will not be applied.

3404 Couldn't find files to collect There were no files found for the last log collection request
sent to server.

3405 File creation failed Could not create file to collect endpoint logs.

3406 Disk usage exceeded File creation failed due to insufficient disk space.

3407 Max open file limit exceeded File creation failed as max allowed number of files are already
open.

Table 8-36 Enforce SPC events

Code Summary Description

3500 SPC Server successfully SPC Server successfully registered. Product Instance Id [{0}].
registered.

3501 SPC Server successfully SPC Server successfully unregistered. Product Instance Id
unregistered. [{0}].

3502 A self-signed certificate was A self-signed certificate was generated. Certificate alias [{0}].
generated.

Table 8-37 Enforce user data sources events

Code Summary Description

3600 User import completed User import from source {0} completed successfully.
successfully.

3601 User import failed. User import from data source {0} has failed.

3602 Updated user data linked to Updated user data linked to {0} existing incident events.
incidents.

Table 8-38 Catalog item distribution related events

Code Summary Description

3700 Unable to write catalog item Failed to delete old temporary file {0}.

3701 Unable to rename catalog item Failed to rename temporary catalog item file {0}.

3702 Unable to list catalog items Failed to list catalog item files in folder {0}.Check folder
permissions.

3703 Error sending catalog items Unexpected error occurred while sending an catalog
item.{0}Look in the file reader log for more information.

3704 File Reader failed to delete files. Failed to delete catalog file {0} after it was sent.\nDelete the
file manually, correct the problem and restart the File Reader.

3705 Failed to list catalog item files Failed to list catalog item files in folder {0}.Check folder
permissions.

3706 The configuration is not valid. The property {0} was configured with invalid value {1}. Please
make sure that this has correct value provided.

3707 Scan failed: Remediation detection catalog could not be updated Remediation detection catalog update timed out after {0}
seconds for target {1}.

Table 8-39 Detection server database events

Code Summary Description

3800 DetectionServerDatabase started None

3801 DetectionServerDatabase failed to start Error starting DetectionServerDatabase. Reason: {0}.

3802 Invalid Port for Could not retrieve the port for DetectionServerDatabase
DetectionServerDatabase process to listen to connection. Reason: {0}. Check if the
property file setting has the valid port number.

Table 8-40 Endpoint communication layer events

Code Summary Description

3900 Internal communications error. Internal communications error. Please see {0} for errors.
Search for the string {1}.

3901 System events have been System event throttle limit exceeded. {0} events have been
suppressed. suppressed. Internal error code = {1}.

Table 8-41 Agent communication event code

Code | Summary | Description
4000 | Agent Handshaker error | Agent Handshaker error. Please see {0} for errors. Search for the string {1}.

Table 8-42 Monitor controller replication communication layer application error events

Code | Summary | Description
4050 | Agent data batch persist error | Unexpected error occurred while agent data was being persisted: {0}. Please look at the monitor controller logs for more information.
4051 | Agent status attribute batch persist error | Status attribute data for {0} agent(s) could not be persisted. Please look at the monitor controller logs for more information.
4052 | Agent event batch persist | Event data for {0} agent(s) could not be persisted. Please look at the monitor controller logs for more information.

Table 8-43 Enforce Server web services event code

Code | Summary | Description
4101 | Response Rule Execution Service Database failure on request fetch | Request fetch failed even after {0} retries. Database connection still down. The service will be stopped.

Table 8-44 Cloud service enrollment events

Code | Summary | Description
4200 | Cloud Service enrollment: successfully received client certificate from Symantec Managed PKI Service | Cloud Service enrollment: successfully received client certificate from Symantec Managed PKI Service.
4201 | Cloud Service enrollment: error requesting client certificate from Symantec Managed PKI Service | ERROR {0}.
4205 | Symantec Managed PKI certificate expires in {0} days | Symantec Managed PKI certificate expires in {0} days.
4206 | Symantec Managed PKI Service certificate has expired | Symantec Managed PKI Service certificate has expired.
4210 | Cloud Service enrollment bundle error | Invalid enrollment file content.
4211 | Cloud Service enrollment bundle error | Enrollment file missing from ZIP bundle.
4212 | Invalid Cloud Detector enrollment bundle | Detector info doesn't match the existing configuration.

Table 8-45 Cloud detector event code

Code | Summary | Description
4300 | Cloud Detector created in Enforce | Cloud detector {0} created in Enforce.

Table 8-46 User Groups profile event code

Code | Summary | Description
4400 | One or more User Group profiles are out of date and must be reindexed. | Check the "Manage > Policies > User Groups" page for more details. The following User Group profiles are out of date: {0}.

Table 8-47 Cloud operations event code

Code | Summary | Description
4701 | Cloud operations events or notifications | Cloud operations issued an event or notification about the cloud service.

Table 8-48 OCR event codes

Code | Summary | Description
4800 | OCR service is busy | Request not processed. OCR server's request queue is full.
4801 | Request failed to connect to OCR server | Please verify OCR server's address, port, and that it is reachable. Check logs for more detail.
4802 | OCR server had an internal server error | Please check OCR server logs for details about what went wrong.
4803 | OCR request was not successful | {0}
4804 | Failed to initialize OCR Client | {0}
4805 | An Unknown error encountered | {0}
4807 | The client and/or OCR server are not authorized with each other | Unable to verify client and server with each other as authorized endpoints. Please verify that the client and server keystores are configured correctly. Check logs on detection server and OCR server for more details.
Chapter 9
Managing the Symantec Data Loss Prevention database
This chapter includes the following topics:

■ Working with Symantec Data Loss Prevention database diagnostic tools

■ Viewing tablespaces and data file allocations

■ Viewing table details

■ Checking the database update readiness

Working with Symantec Data Loss Prevention database diagnostic tools
The Enforce Server administration console lets you view diagnostic information about the
tablespaces and tables in your database to help you better manage your database resources.
You can see how full your tablespaces and tables are, and whether the data files in the tablespaces
can be automatically extended to accommodate more data. This information helps you decide
where to enable the Oracle Autoextend feature on data files, or how else to manage your
database resources. You can also generate
a detailed database report to share with Symantec Technical Support for help with
troubleshooting database issues.
You can view the allocation of tablespaces, including the size, memory usage, extendability,
status, and number of files in each tablespace. You can also view the name, size, and
Autoextend setting for each file in a tablespace. In addition, you can view table-level allocations
for incident data tables, other tables, indexes, and large object (LOB) tables.

You can generate a full database report in HTML format to share with Symantec Technical
Support at any time by clicking Get full report. The data in the report can help Symantec
Technical Support troubleshoot issues in your database.
See “Generating a database report” on page 208.

Viewing tablespaces and data file allocations


You can view tablespaces and data file allocations on the Database Tablespaces Summary
page (System > Database > Tablespaces Summary).
The Database Tablespaces Summary page displays the following information:
■ Name: The name of the tablespace.
■ Size: The size of the tablespace in megabytes.
■ Used (%): The percentage of the tablespace currently in use. This percentage is calculated
based on the Used (MB) and Size values. It does not take into account the Extendable
To (MB) value.
■ Used (MB): The amount of the tablespace currently in use, in megabytes.
■ Extendable To (MB): The size to which the tablespace can be extended. This value is
based on the Autoextend settings of the files within the tablespace.
■ Status: The current status of the tablespace, determined by comparing the percentage currently
in use against the warning thresholds. With the default warning threshold settings, the status
is one of the following:
■ OK: The tablespace is under 80% full, or the tablespace can be automatically extended.
■ Warning: The tablespace is between 80% and 90% full. If you see a warning on a
tablespace, consider enabling Autoextend on the data files in the tablespace
or extending the maximum value for data file auto-extensibility.
■ Severe: The tablespace is more than 90% full. If you see a severe warning on a
tablespace, you should enable Autoextend on the data files in the tablespace, extend
the maximum value for data file auto-extensibility, or determine whether you can purge
some of the data in the tablespace.

■ Number of Files: The number of data files in the tablespace.
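The Used (%) calculation and the default status rules described above can be sketched in Python. This is only an illustration under stated assumptions, not the console's actual logic: the names used_percent and tablespace_status are hypothetical, the thresholds default to the documented 80% and 90% values, and the handling of values that fall exactly on a threshold is a guess.

```python
def used_percent(used_mb: float, size_mb: float) -> float:
    """Used (%) as described above: Used (MB) relative to Size.

    Extendable To (MB) is deliberately not part of the calculation.
    """
    if size_mb <= 0:
        raise ValueError("size_mb must be positive")
    return 100.0 * used_mb / size_mb


def tablespace_status(used_mb, size_mb, auto_extendable=False,
                      warning=80.0, severe=90.0):
    """Classify a tablespace as OK, Warning, or Severe (hypothetical helper)."""
    pct = used_percent(used_mb, size_mb)
    # Assumption: a tablespace that can be automatically extended is
    # reported as OK regardless of how full it currently is.
    if auto_extendable or pct < warning:
        return "OK"
    # Assumption: exactly `severe` percent still counts as Warning.
    if pct <= severe:
        return "Warning"
    return "Severe"
```

For example, a 1000 MB tablespace with 850 MB in use and no Autoextend would be classified as Warning under these assumptions.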


Select a tablespace from the list to view details about the files it contains. The tablespace file
view displays the following information:
■ Name: The name of the file.
■ Size: The size of the file, in megabytes.
■ Auto Extendable: Specifies if the file is automatically extendable based on the Autoextend
setting of the file in the Oracle database.
■ Extendable To (MB): The maximum size to which the file can be automatically extended,
in megabytes.
■ Path: The path to the file.

Adjusting warning thresholds for tablespace usage in large databases


If your database contains a very large amount of data (1 terabyte or more), you may want to
adjust the warning thresholds for tablespace usage. For such large databases, Symantec
recommends adjusting the Warning threshold to 85% full, and the Severe threshold to 95%
full. You may want to set these thresholds even higher for larger databases. You can specify
these values in the Manager.properties file.
To adjust the tablespace usage warning thresholds
1 Open the Manager.properties file in a text editor.
2 Set the Warning and Severe thresholds to the following values:

com.vontu.manager.tablespaceThreshold.warning=85
com.vontu.manager.tablespaceThreshold.severe=95

3 Save the changes to the Manager.properties file and close it.


4 Restart the Symantec DLP Manager service to apply your changes.
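If you prefer to script the edit in step 2, the following Python sketch shows one possible approach. It is an assumption-laden illustration: the helper name set_properties is hypothetical, it handles only simple key=value lines, and it works on text that you read from and write back to Manager.properties yourself (the file location is not given here). Only the two property names and values come from the procedure above. Back up the file before changing it.

```python
def set_properties(text: str, updates: dict) -> str:
    """Return properties-file text with the given keys set to new values."""
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_comment = line.lstrip().startswith(("#", "!"))
        if "=" in line and not is_comment and key in remaining:
            # Replace the existing value for this key.
            out.append(f"{key}={remaining.pop(key)}")
        else:
            # Keep comments, blank lines, and unrelated properties as they are.
            out.append(line)
    # Append any keys that were not already present in the file.
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


# Threshold values recommended above for very large databases.
THRESHOLDS = {
    "com.vontu.manager.tablespaceThreshold.warning": "85",
    "com.vontu.manager.tablespaceThreshold.severe": "95",
}
```

The service restart in step 4 is still required after the file is saved.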

Generating a database report


You can generate a full database report in HTML format at any time by clicking Get full report
on the Database Tablespaces Summary page. The database report includes the following
information:
■ Detailed database information
■ Incident data distribution
■ Message data distribution
■ Policy group information
■ Policy information
■ Endpoint agent information
■ Detection server (monitor) information
Symantec Technical Support may request this report to help troubleshoot database issues.

To generate a database report


1 Navigate to System > Database > Tablespaces Summary.
2 Click Get full report.
3 The report takes several minutes to generate. Refresh your screen after several minutes
to view the link to the report.
4 To open or save the report, click the link above the Tablespaces Allocation table. The
link includes the timestamp of the report for your convenience.
5 In the Open File dialog box, choose whether to open the file or save it.
6 To view the report, open it in a web browser or text editor.
7 To update the report, click Update full report.

Viewing table details


You can view table-level allocations on the Database Table Details page (System > Database
> Table Details). Viewing table-level allocations can be useful after a large data purge to see
the de-allocation of space within your database segments. You can refresh the information
displayed on this page by clicking Update table data at any time.
The Database Table Details page displays your table-level allocations on one of four tabs:
■ Incident Tables: This tab lists all the incident data tables in the Symantec Data Loss
Prevention database schema. The tab displays the following information:
■ Table Name: The name of the table.
■ In Tablespace: The name of the tablespace that contains the table.
■ Size (MB): The size of the table, in megabytes.
■ % Full: The percentage of the table currently in use.

■ Other Tables: This tab lists all other tables in the schema. The tab displays the following
information:
■ Table Name: The name of the table.
■ In Tablespace: The name of the tablespace that contains the table.
■ Size (MB): The size of the table, in megabytes.
■ % Full: The percentage of the table currently in use.

■ Indices: This tab lists all of the indexes in the schema. The tab displays the following
information:
■ Index Name: The name of the index.
■ Table Name: The name of the table that contains the index.
■ In Tablespace: The name of the tablespace that contains the table.
■ Size (MB): The size of the table, in megabytes.
■ % Full: The percentage of the table currently in use.

■ LOB Segments: This tab lists all of the large object (LOB) tables in the schema. The tab
displays the following information:
■ Table Name: The name of the table.
■ Column Name: The name of the table column containing the LOB data.
■ In Tablespace: The name of the tablespace that contains the table.
■ LOB Segment Size (MB): The size of the LOB segment, in megabytes.
■ LOB Index Size: The size of the LOB index, in megabytes.
■ % Full: The percentage of the table currently in use.

Note: The percentage used value for each table displays the percentage of the table currently
in use as reported by the Oracle database in dark blue. It also includes an additional estimated
percentage used range in light blue. Symantec Data Loss Prevention calculates this range
based on tablespace utilization.

Checking the database update readiness


You use the Update Readiness tool to confirm that the Oracle database is ready to upgrade
to the next Symantec Data Loss Prevention version.
The Update Readiness tool tests the following items in the database schema:
■ Oracle version
■ Oracle patches
■ Permissions
■ Tablespaces
■ Existing schema against standard schema
■ Real Application Clusters
■ Change Data Capture
■ Virtual columns
■ Partitioned tables
■ Numeric overflow
■ Temp Oracle space
Table 9-1 lists tasks you complete to run the tool.

Table 9-1 Using the Update Readiness tool

Step | Task | Details
1 | Prepare to run the Update Readiness tool. | See “Preparing to run the Update Readiness tool” on page 211.
2 | Create the Update Readiness tool database account. | See “Creating the Update Readiness tool database account” on page 213.
3 | Run the tool. | See “Running the Update Readiness tool at the command line” on page 215.
4 | Review the update readiness results. | See “Reviewing update readiness results” on page 218.

Preparing to run the Update Readiness tool


Preparing the Update Readiness tool includes downloading the tool and moving it to the Enforce
Server.
To prepare the Update Readiness tool
1 Obtain the latest version of the tool (for both major and minor release versions of Symantec
Data Loss Prevention) from Software Downloads.
The tool file name is Symantec_DLP_15.5_Update_Readiness_Tool_15.5.0-1.zip. The
tool version changes when updated tools are released.
The latest version of the Update Readiness tool includes important fixes and improvements,
and should be the version that you use before attempting an upgrade. See the Support
Center article About the Symantec Data Loss Prevention Update Readiness tool, and
URT test results for information about the latest version; subscribe to the article to be
informed about new versions.
Symantec recommends that you download the tool to the DLPDownloadHome directory.

Note: Review the Readme file that is included with the tool for a list of Symantec Data
Loss Prevention versions the tool is capable of testing.

2 Log on as Administrator to the database server system.


3 Confirm the following if you are running a three-tier deployment:
■ That you are running the same Oracle Client version as the Oracle Server version.
If the versions do not match, the Oracle Client cannot connect to the database, which
causes the Update Readiness tool to fail.
■ That the Oracle Client is installed as Administrator.
If the Oracle Client is not installed as Administrator, reinstall it and select Administrator
on the Select Installation Type panel. Selecting Administrator enables the
command-line clients, expdp and impdp.

4 Stop Oracle database jobs if your database has scheduled jobs.


See “Stopping Oracle database jobs” on page 212.
5 Unzip the tool, then copy the contents of the unzipped folder to the following location. Do
not copy the unzipped folder itself to this location; the contents of the tool folder must reside
directly in the URT folder:
c:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\Migrator\URT\
(for Windows)
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/Migrator/URT/
(for Linux)
During the upgrade process, the Migration Utility checks the database update readiness
by running the Update Readiness tool from this location.
See “Checking the database update readiness” on page 210.
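The copy in step 5 can be scripted. The Python sketch below is one hedged way to make sure the tool contents land directly in the URT folder rather than inside a nested folder. The function name copy_tool_contents and both path arguments are hypothetical, and the single-wrapper-folder check is an assumption about how the archive may unpack.

```python
import shutil
from pathlib import Path


def copy_tool_contents(unzipped: Path, urt_dir: Path) -> list:
    """Copy the contents of the unzipped tool folder directly into urt_dir."""
    entries = list(unzipped.iterdir())
    # Assumption: if the archive unpacked into a single wrapper folder,
    # the real tool contents are the items inside that folder.
    if len(entries) == 1 and entries[0].is_dir():
        unzipped = entries[0]
        entries = list(unzipped.iterdir())
    urt_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in entries:
        target = urt_dir / entry.name
        if entry.is_dir():
            # Merge into an existing folder of the same name (Python 3.8+).
            shutil.copytree(entry, target, dirs_exist_ok=True)
        else:
            shutil.copy2(entry, target)
        copied.append(entry.name)
    return sorted(copied)
```

The returned list names the items now sitting directly under the URT folder, which makes it easy to confirm that no extra folder level was introduced.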

Stopping Oracle database jobs


If your database has scheduled jobs, you must unschedule them and clear the jobs queue
before you run the Update Readiness tool and start the migration process. After the jobs are
unscheduled and the jobs queue is clear, you can run the Update Readiness tool and continue
your migration.

To unschedule jobs
1 Log on to SQL*Plus using the Symantec Data Loss Prevention database user name and
password.
2 Run the following:

BEGIN
  -- Mark each scheduled job as broken so that it does not run again, then remove it from the queue.
  FOR rec IN (SELECT * FROM user_jobs) LOOP
    dbms_job.broken(rec.job, true);
    dbms_job.remove(rec.job);
  END LOOP;
END;
/

3 Verify that all jobs are unscheduled by running the following:

SELECT COUNT(*) FROM user_jobs;

Confirm that the count is zero. If the count is not zero, run the command to clear the queue
again. If a job is running when you attempt to clear the queue, the job continues to run
until it completes and is not cleared. For long running jobs, Symantec recommends that
you wait for the job to complete instead of terminating the job.
4 Exit SQL*Plus.
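The verification in step 3 can also be automated from a script. The following Python sketch runs the same count query through any DB-API 2.0 cursor. The function names are hypothetical, and the in-memory SQLite table is only a stand-in so that the example runs without an Oracle server. Against a real database you would obtain the cursor from an Oracle driver of your choice, which is outside the scope of this sketch.

```python
import sqlite3


def scheduled_job_count(cursor) -> int:
    """Return the number of rows in user_jobs using a DB-API 2.0 cursor."""
    cursor.execute("SELECT COUNT(*) FROM user_jobs")
    return cursor.fetchone()[0]


def queue_is_clear(cursor) -> bool:
    """True when no scheduled jobs remain, as step 3 requires."""
    return scheduled_job_count(cursor) == 0


# Stand-in demonstration only: a SQLite table named user_jobs imitates
# the Oracle view so that the sketch can run anywhere.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE user_jobs (job INTEGER)")
cur.execute("INSERT INTO user_jobs VALUES (1)")
print(queue_is_clear(cur))  # one job is still scheduled
cur.execute("DELETE FROM user_jobs")
print(queue_is_clear(cur))  # the queue is now empty
```

If the check reports that jobs remain, rerun the unschedule block or wait for a running job to finish, as described above.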

Creating the Update Readiness tool database account


Before you can run the Update Readiness tool, you must create a database account.
To create the new Update Readiness tool database account
1 Navigate to the /script folder where you extracted the Update Readiness tool.
2 Start SQL*Plus:

sqlplus /nolog

3 Run the oracle_create_user.sql script:

SQL> @oracle_create_user.sql

4 At the Please enter the password for sys user prompt, enter the password for the SYS
user.
5 At the Please enter Service Name prompt, enter the service name of the database.
6 At the Please enter required username to be created prompt, enter a name for the new
Update Readiness tool database account.
7 At the Please enter a password for the new username prompt, enter a password for
the new Update Readiness tool database account.
Use the following guidelines to create an acceptable password:
■ Passwords cannot contain more than 30 characters.
■ Passwords cannot contain double quotation marks, commas, or backslashes.
■ Avoid using the & character.
■ Passwords are case-sensitive by default. You can change the case sensitivity through
an Oracle configuration setting.
■ If your password uses special characters other than _, #, or $, or if your password
begins with a number, you must enclose the password in double quotes when you
configure it.
Store the user name and password in a secure location for future use. You use this user
name and password to run the Update Readiness tool.
8 As the database sysdba user, grant permission to the Symantec Data Loss Prevention
schema user name for the following database objects:

sqlplus sys/[sysdba password] as sysdba


GRANT READ,WRITE ON directory DATA_PUMP_DIR TO [schema user name];
GRANT SELECT ON dba_registry_history TO [schema user name];
GRANT SELECT ON dba_temp_free_space TO [schema user name];

See “Preparing to run the Update Readiness tool” on page 211.


See “Checking the database update readiness” on page 210.
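The password guidelines in step 7 can be checked before you run the script. The Python sketch below is a minimal illustration: the function name check_readiness_password is hypothetical, and treating every non-ASCII character as a special character is an assumption. It reports rule violations, flags the & character as a warning, and indicates whether the password would need to be enclosed in double quotes.

```python
def check_readiness_password(password: str) -> dict:
    """Check a candidate password against the guidelines listed in step 7."""
    errors, warnings = [], []
    if len(password) > 30:
        errors.append("longer than 30 characters")
    if any(ch in password for ch in '",\\'):
        errors.append("contains a double quotation mark, comma, or backslash")
    if "&" in password:
        warnings.append("contains &, which the guidelines say to avoid")
    # Special characters other than _, #, and $, or a leading digit, mean
    # the password must be enclosed in double quotes when it is configured.
    plain = all(ch.isascii() and (ch.isalnum() or ch in "_#$") for ch in password)
    needs_quotes = (not plain) or password[:1].isdigit()
    return {"errors": errors, "warnings": warnings, "needs_quotes": needs_quotes}
```

A password such as Str0ng_pass#1 passes with no errors or warnings and needs no quotes, while 1abc is acceptable only if it is enclosed in double quotes.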

Running the Update Readiness tool from the Enforce Server administration console
You can run the Update Readiness tool from the Enforce Server administration console to
check the update readiness for the next Symantec Data Loss Prevention version. To run the
tool, you must have User Administration (Superuser) or Server Administration user privileges.
To run the Update Readiness tool
1 Go to System > Servers and Detectors > Overview, and click System Servers and
Detectors Overview.
2 Click Upload the Update Readiness tool and locate the tool.
If the tool has already been uploaded and you upload a new version, the old version
is deleted.
See “Preparing to run the Update Readiness tool” on page 211.
3 Enter the Update Readiness tool database account user credentials.

Warning: Do not enter the protect user database credentials. Entering credentials other
than the Update Readiness tool database account overwrites the Symantec Data Loss
Prevention database.

See “Creating the Update Readiness tool database account” on page 213.
4 Click Run Update Readiness Tool to begin the update readiness check.
You can click Refresh this page to update the status of the readiness check. When you
refresh, the page displays a link to a summary of the results returned up to that point. The
process may take up to an hour depending on the size of the database.
When the tool completes the test, you are provided with a link you can use to download
the results log.
See “Reviewing update readiness results” on page 218.
See “Checking the database update readiness” on page 210.

Running the Update Readiness tool at the command line


You can run the Update Readiness tool from the command prompt on the Enforce Server host
computer.

To run the Update Readiness tool


1 Open a command prompt window.
2 Go to the URT directory:
c:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\Migrator\URT
(for Windows)
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/Migrator/URT
(for Linux)
3 Run the Update Readiness tool using the following command:

"C:\Program Files\Symantec\DataLossPrevention\ServerJRE\1.8.0_181\bin\java" UpdateReadinessTool
--username <schema user name>
--password <password>
--readiness_username <readiness_username>
--readiness_password <readiness_password>
--sid <database_system_id>
[--quick]
(for Windows)

"/opt/Symantec/DataLossPrevention/ServerJRE/1.8.0_181/bin/java" UpdateReadinessTool
--username <schema user name>
--password <password>
--readiness_username <readiness_username>
--readiness_password <readiness_password>
--sid <database_system_id>
[--quick]
(for Linux)

The following table identifies the command parameters:

Parameter | Description
<schema user name> | The Symantec Data Loss Prevention schema user name.
<password> | The Symantec Data Loss Prevention schema password.
<readiness_username> | The Update Readiness tool database account user you created. See “Creating the Update Readiness tool database account” on page 213.
<readiness_password> | The password for the Update Readiness tool database account user.
<database_system_id> | The database system ID (SERVICE_NAME), typically "protect."
[--quick] | Optional. Runs only the database object check and skips the update readiness test.

After the test completes, you can locate the results in a log file in the /output directory.
This directory is located where you extracted the Update Readiness tool. If you do not
include [--quick] when you run the tool, the test may take up to an hour to complete.
You can verify the status of the test by reviewing log files in the /output directory.
See “Preparing to run the Update Readiness tool” on page 211.
See “Reviewing update readiness results” on page 218.
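If you launch the tool from a script, it can help to assemble the command as an argument list so that values containing spaces or special characters are passed intact. The Python sketch below only builds the list. The function name build_urt_command is hypothetical and the values in the usage note are placeholders; the option names are the ones shown in step 3.

```python
def build_urt_command(java_path, schema_user, schema_password,
                      readiness_user, readiness_password, sid, quick=False):
    """Assemble the argument list for the Update Readiness tool command."""
    args = [
        java_path, "UpdateReadinessTool",
        "--username", schema_user,
        "--password", schema_password,
        "--readiness_username", readiness_user,
        "--readiness_password", readiness_password,
        "--sid", sid,
    ]
    if quick:
        # Optional: run only the database object check.
        args.append("--quick")
    return args
```

The resulting list could be passed to subprocess.run with cwd set to the URT directory from step 2, for example build_urt_command(java_path, "protect", schema_pw, "urt_user", urt_pw, "protect"), where every value shown is a placeholder for your own environment.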

Reviewing update readiness results


After you run the Update Readiness tool, the tool returns test results in a log file. Table 9-2
lists the results summarized in the log file.

Table 9-2 Update Readiness results

Status | Description
Pass | Items that display under this section are confirmed and ready for update.
Warning | If not fixed, items that display under this section may prevent the database from upgrading properly.
Error | These items prevent the upgrade from completing and must be fixed.

See “Checking the database update readiness” on page 210.
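The three statuses in Table 9-2 imply a simple decision rule, sketched below in Python. The function name upgrade_verdict and the verdict labels are hypothetical. The layout of the log file is not described here, so the sketch assumes that you have already collected the status of each item into a list.

```python
def upgrade_verdict(statuses):
    """Summarize a list of Pass/Warning/Error statuses into one verdict."""
    normalized = [s.strip().lower() for s in statuses]
    if "error" in normalized:
        return "blocked"  # errors prevent the upgrade and must be fixed
    if "warning" in normalized:
        return "review"   # warnings may prevent a proper upgrade
    return "ready"        # every item passed
```

For example, a run with one Warning and no Error items yields "review", which signals that the warnings should be addressed before you upgrade.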


Chapter 10
Working with Symantec Information Centric Encryption
This chapter includes the following topics:

■ About Symantec Information Centric Encryption

■ About the Symantec ICE Utility

■ Overview of implementing Information Centric Encryption capabilities

■ Configuring the Enforce Server to connect to the Symantec ICE Cloud

About Symantec Information Centric Encryption


Symantec Information Centric Encryption (ICE) is a risk-reduction solution that lets your
employees, partners, and trusted individuals securely share company email and files. Symantec
ICE can help you to detect confidential email and files and encrypt them so that only the users
that you authorize can access them.
Typical encryption technologies may allow data loss after email or files are decrypted. Once
they are decrypted, they can be sent to other individuals and are no longer protected. However,
ICE encryption technology encrypts and protects email and files throughout their life, regardless
of where they travel.
When an email or file is determined to be confidential or critical, ICE automatically encrypts it
in place by using the ICE library and encryption services. Once it is encrypted, only the users
that you authorize can read it.
ICE also includes the Information Centric Encryption Cloud Console, which provides you with
visibility into the use of ICE-encrypted email and files. You can monitor who has accessed
those email and files, from where they are accessed, and how they are used. You can also
use the ICE Cloud Console to set specific group permissions. You can set permissions for the
saving, sharing, and editing of email and files for policy groups. You can also revoke access
to individual email and files or revoke rights to access email and files for specific policy groups.
How and what you protect depends upon the Symantec solution you integrate with ICE. ICE
is designed to bring end-to-end encryption to multiple Symantec products, enhancing the
security of your emails and files. Table 10-1 lists the most common ways you can use ICE
with Symantec products.

Table 10-1 ICE encryption solutions

To... | Use ICE with...
Protect files in cloud file storage such as Box and OneDrive. | Symantec CloudSOC
Protect files stored in: cloud file storage such as Box and OneDrive; enterprise file storage such as File System servers and Microsoft SharePoint; endpoint content such as removable drives. Protect files uploaded by browsers over HTTPS. Protect emails and attachments for Microsoft Office 365 Exchange Online and Google G Suite Gmail. | Symantec Data Loss Prevention 15 and later. Symantec Data Loss Prevention also allows you to create robust policies and remediation rules to protect these files and emails.
Protect emails and email attachments in the cloud. | Symantec Data Loss Prevention for Email with Cloud Console (DLP Cloud Console). ICE with DLP Cloud Console has a minimal on-premises footprint.
Integrate classification with encryption capabilities for multilevel protection of sensitive information both inside and outside your network. Applies to files and email in a Windows environment. | Symantec Information Centric Tagging (ICT). Integrating the capabilities of ICE and ICT results in a powerful information protection solution known as Symantec Information Centric Security Module.

See the Symantec™ Information Centric Encryption Deployment Guide for details on integrating
Symantec ICE with Symantec Data Loss Prevention.

About the Symantec ICE Utility


The Symantec ICE Utility allows an authorized user to decrypt a file that has been encrypted
by ICE. If a user attempts to access a file that ICE protects, the ICE Utility prompts the user
for authentication. If the user is authenticated, the ICE Utility decrypts the file. The user can
decrypt ICE-encrypted files when endpoints are not connected to the Internet.
The ICE Utility also applies any permission sets assigned to the user in the ICE Cloud Console.
For example, if you have disabled printing for the user or the policy group, the user is not able
to print the document.

Note: On mobile devices, the ICE Utility is called ICE Workspace. You can get ICE Workspace
with the VIP Access for Mobile app.

The ICE Utility is context aware, meaning that it recognizes a user's environment. The ICE
Utility can be deployed in two types of environments: managed environments and unmanaged
environments.
The Symantec ICE Utility automatically detects a network proxy that is configured on an
endpoint and uses it to connect to the Symantec ICE Cloud. Additionally, in managed
environments, the ICE Utility uses the network proxy settings that are stored in the agent
configuration of the DLP Agent that is installed on the same endpoint.
■ In managed environments, your organization provides and maintains the devices on which
users access protected files.
In managed environments, the ICE Utility leverages the policies and security controls that
your organization puts in place over user devices. In this environment, the ICE Utility gives
the user greater flexibility with decrypting and working with protected files. Files open in
their native app, and the user has full access to the file to edit, share, save, save as, and
print the file. Users are required to authenticate at least once every 180 days (configurable
in the ICE Cloud Console).
The managed version of the ICE Utility works the same across Windows and macOS
platforms; however, the Windows version of the ICE Utility installation package also includes
the ICT agent. Users can only install the ICT agent if you have implemented ICT and
correctly configured the ICT agent installation package.
■ In unmanaged environments, such as those of your partners or in which employees bring
their own devices, users' devices are outside your direct control.
Since you have no direct control over the security of the users' devices in unmanaged
environments, the ICE Utility provides additional security. The ICE Utility enforces stricter
restrictions over when and how a file is decrypted, and allows you greater content control
through the use of permission sets.
When users attempt to open a protected file on a device without the ICE Utility, they are
prompted to download the ICE Utility.
Users that attempt to access an encrypted file are required to authenticate at least once
every 24 hours (configurable in the ICE Cloud Console).
■ On Windows, supported file types are decrypted and opened in their native app, but
the permissions that you assigned to the user are enforced. So, if you have restricted
printing for the user or the policy group, the user is unable to print the file.
Files that ICE does not support open in their native app, but ICE does not enforce
permissions.
■ On macOS, supported file types are opened in their native app, if the edit permission
is enabled on the Information Centric Encryption Cloud Console. However, if the
permissions include content lock or print restrictions, such files open in the Mac
Preview application in view-only mode. For Office formats, ICE-encrypted files launch
the Microsoft Office application. If the user does not have Microsoft Office installed,
then Word documents open in Mac TextEdit, and Excel and PowerPoint files open in
Mac Preview.
■ On iOS, supported file types are opened in view-only mode, irrespective of the
permissions that are assigned to the user.

In all environments, when the user finishes with the file, the ICE Utility encrypts it again,
maintaining the file's security throughout its lifetime. However, if the permissions for a user
allow the user to save the file with a new name, the new file is not encrypted.
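The authentication intervals described above (at least once every 180 days in managed environments and at least once every 24 hours in unmanaged environments) can be expressed as a small check. The Python sketch below is illustrative only. The function and constant names are hypothetical, and because both intervals are configurable in the ICE Cloud Console, the defaults shown are assumptions about an unmodified configuration.

```python
from datetime import datetime, timedelta

# Default intervals stated in the text; both are configurable in the
# ICE Cloud Console, so treat these constants as assumptions.
MANAGED_INTERVAL = timedelta(days=180)
UNMANAGED_INTERVAL = timedelta(hours=24)


def must_reauthenticate(last_auth: datetime, now: datetime, managed: bool) -> bool:
    """Return True when the interval for the environment has elapsed."""
    interval = MANAGED_INTERVAL if managed else UNMANAGED_INTERVAL
    return now - last_auth >= interval
```

Under these defaults, a user who last authenticated 25 hours ago would be prompted again on an unmanaged device but not on a managed one.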
See the following for more information about the ICE Utility.

For information about See

How to provide the ICE Utility to your users

How users are authenticated through the ICE Utility

Where ICE Utility logs are stored

How the ICE Utility works on mobile devices

How customers using Symantec Data Loss Prevention Cloud Service for Email with Microsoft
Office 365 Exchange Online can allow users to view emails without the ICE Utility

Overview of implementing Information Centric Encryption capabilities
The high-level steps for implementing Information Centric Encryption with Symantec Data
Loss Prevention are provided in Table 10-2. Specific task steps are provided in the topics
referenced in the "Details" column.
For more information about Information Centric Encryption, refer to the Symantec Information
Centric Encryption Deployment Guide at http://www.symantec.com/docs/DOC9707.

Table 10-2 Overview of implementing Information Centric Encryption capabilities

Step 1
Action: Depending on your organization's security needs, install one or both of the
following licenses:
■ Network Protect ICE
■ Endpoint Prevent ICE
Details: See “Installing a new license file” on page 234.

Step 2
Action: Configure the Enforce Server to connect to the Symantec ICE Cloud.
Details: See “Configuring the Enforce Server to connect to the Symantec ICE Cloud” on page 224.

Step 3
Action: Configure policy response rule actions to protect sensitive files using ICE
encryption.
Details: See “Configuring the Endpoint Prevent: Encrypt action” on page 1821.
See “Configuring the Network Protect: Encrypt File action” on page 1838.
See “Configuring the Server FlexResponse action” on page 1788.

Step 4
Action: Configure Network Protect to enable ICE encryption protection for supported scan
targets.
Details: See “Configuring Network Protect for file shares” on page 2177.

Step 5
Action: Configure Cloud Service for Email policy response rule actions to protect both
sensitive emails and attachments or sensitive email attachments using ICE encryption.
Details: See “Encrypting cloud email with Symantec Information Centric Encryption” on page 2518.

Step 6
Action: Enable ICE encryption in Endpoint Prevent to protect confidential files that are:
■ Stored on removable devices that are connected to endpoints
■ Stored on cloud storage applications
■ Uploaded with browsers using HTTPS
Details: See “Information Centric Encryption settings for DLP Agents” on page 2371.
See “Configuring Network Protect for SharePoint servers” on page 2203.

Step 7
Action: Download and then install the ICE Utility on all managed devices within your
organization. The ICE Utility is required for users to be able to access ICE-encrypted files.
Unmanaged device users will be prompted to download and install the ICE Utility when they
attempt to access an ICE-encrypted file for the first time on a particular device.
Details: The ICE Utility is available for download from Symantec FileConnect.
See “About the Symantec ICE Utility” on page 220.

Configuring the Enforce Server to connect to the Symantec ICE Cloud
After you install the Endpoint Prevent ICE license, or the Network Protect ICE license, or
upload your Cloud Service for Email enrollment bundle, you must configure the Enforce Server
to connect to the Symantec ICE Cloud. This step is a prerequisite for enabling any of the
encryption-related functions that you can configure using the Enforce Server administration
console.
See “Installing a new license file” on page 234.
To configure the Enforce Server to connect to the Symantec ICE Cloud:
1 Go to System > Settings > General and click Configure.
2 At the Edit General Settings screen, scroll down to the ICE Cloud Access Settings
section.
3 Type the following Symantec ICE Cloud details in the provided fields:
■ Service URL
■ Customer ID
■ Domain ID
■ Service User ID
■ Service Password

Note: Obtain this information from the Settings > Advanced Configuration > External
Services page of the ICE Cloud Console. The Service Password is visible only when you
first authorize an external service. If you lose your Service Password, you cannot view it
again and must obtain a new one.

4 Click Save.
5 To enable and configure the ICE functionality in Symantec Data Loss Prevention, do one
or more of the following, depending on which ICE licenses are installed:
■ Configure Network Protect to enable ICE encryption protection for the supported scan
targets.
See “Configuring Network Protect for file shares” on page 2177.
■ Configure Cloud Service for Email to enable ICE email encryption of Office 365 email
and Gmail in the cloud.
See the Cloud Service for Email Implementation Guide at the Symantec Support Center
at http://www.symantec.com/docs/DOC9008.
■ Enable ICE in Endpoint Prevent to encrypt the following sensitive files:
■ Files that are transferred to removable storage
■ Files that are transferred by a cloud storage application
■ Files that are uploaded with browsers using HTTPS
See “Information Centric Encryption settings for DLP Agents” on page 2371.
Chapter 11
Working with Symantec Information Centric Tagging
This chapter includes the following topics:

■ About integrating Information Centric Tagging with Data Loss Prevention

■ Overview of steps to tie Information Centric Tagging to Data Loss Prevention

■ Integrating the ICT server with the Enforce Server

■ Importing the ICT classification taxonomy

■ Supported file types for ICT-Data Loss Prevention integration

About integrating Information Centric Tagging with Data Loss Prevention
Symantec Information Centric Tagging (ICT) is a data classification product that defines and
supports the application of tags and watermarks to emails and files. Information Centric Tagging
is also part of the separately licensed Information Centric Security Module (ICSM). ICSM
additionally offers data protection by providing encryption options--including Symantec
Information Centric Encryption (ICE)--that can be associated with certain tags.
The data classification taxonomy is a hierarchy of configured organization-scope-sensitivity
level tags. You use the administration console to import the taxonomy from the Information
Centric Tagging product into the Data Loss Prevention Enforce Server database.

Note: Import of the taxonomy requires that a Data Loss Prevention domain user, whose name
is identified when ICT server credentials are added to the credential store, is also associated
in ICT with certain Active Directory User Groups. This association provides the user access
to ICT Administration Webservice methods. Additionally, an entry must be added to the Windows
Hosts file, mapping the ICT server IP address to its host name.

Once you have imported the taxonomy, you select appropriate tags from it to define response
rules of the ICT Classification And Tagging action type. You then attach the rules to policies
so that ICT tags are applied to content according to your corporate policy.
Tags can be applied in two ways:
■ You create Endpoint Discover scans. These scans apply the tags in response to policy
violations, or to all targeted content solely as a baseline Classification Scan.
■ ICT end users apply tags. The ICT Administrator enables Data Loss Prevention integration
by selecting the Symantec DLP Policies Integration option during ICT system setup.
Those Data Loss Prevention policies configured with ICT-based response rules are imported
to ICT. Data Loss Prevention policies, not ICT rules, drive automatic classification on the
ICT endpoint.
You can also use the imported taxonomy to create detection rules using the Content Matches
Classification option. You create the rules by selecting the tags displayed on the administration
console. Tagged content is discovered in the metadata of supported emails and files.

Note: Tagging can be used to notify Symantec Endpoint Protection (SEP) about certain files.
(This requires a separate license and the presence of a SEP agent on the Data Loss Prevention
endpoint.) To enable integration with SEP, when the ICT Administrator creates the classification
taxonomy, the Administrator can enable the Information Centric Defense option. This ICD
option appears on the classification level screens. When your Endpoint Discover scan runs
and applies a tag that contains this option, Data Loss Prevention notifies SEP about this file.
In a forthcoming release of SEP that integrates this functionality, SEP administrators will be
able to configure SEP to take necessary action on the classified file.

The integration of ICT with Data Loss Prevention requires ongoing coordination between you
and the ICT Administrator. Some of the events requiring communication include:
■ You decide to use ICT tags in Data Loss Prevention. You notify the ICT Administrator, who
lets you know when the ICT taxonomy is ready. You import the taxonomy into Data Loss
Prevention, create ICT-based response rules that use those tags, and attach them to
policies.
■ If ICT end users will be applying the tags, you notify the ICT Administrator that the policies
are in place. The ICT Administrator confirms that the Symantec DLP Policies Integration
check box is selected on the ICT Administration Console. The Data Loss Prevention policies
are imported to ICT so that automatic classification is driven by Data Loss Prevention
policies, not by ICT rules.
■ If you will be applying the tags as part of Endpoint Discover scans, as a courtesy, you notify
the ICT Administrator. If ICT end users are working with those files, tagging activity may
fail.
See “Overview of steps to tie Information Centric Tagging to Data Loss Prevention” on page 228.
For more information, see the Information Centric Tagging documentation here:
https://support.symantec.com/en_US/article.DOC11257.html

Overview of steps to tie Information Centric Tagging to Data Loss Prevention
The high-level steps for integrating Symantec Information Centric Tagging with Symantec Data
Loss Prevention are provided in Table 11-1. Specific task steps are provided in the topics
referenced in the "Details" column.

Table 11-1 Overview of implementing Information Centric Tagging capabilities

Step 1
Action: Prepare to integrate the ICT server with the Enforce Server by defining the ICT
server credentials, and the ICT Web Service URL or an XML-file pathname.
Details: See “Integrating the ICT server with the Enforce Server” on page 229.

Step 2
Action: Schedule or trigger the Information Centric Tagging classification taxonomy import.
Details: See “Importing the ICT classification taxonomy” on page 231.

Step 3
Action: For detection purposes, define detection rules with the Content Matches
Classification option, then attach them to policies.
Details: See “Configuring the Content Matches Classification condition” on page 863.

Step 4
Action: For tagging purposes, define response rules with the ICT Classification And
Tagging Action type, then attach them to policies.
Details: See “Configuring response rule actions” on page 1765.
See “Configuring the Endpoint: ICT Classification And Tagging action” on page 1814.

Step 5
Action: For ICT tagging driven by Endpoint Discover scans, define the scans, either for
policy-violation tagging or as a baseline Classification Scan.
Note: These tagging scans require the DLP Agent on the endpoint (Mac and Windows).
Details: See “About Endpoint Discover classification scanning” on page 2320.
See “Creating an Endpoint Discover scan” on page 2326.

Step 6
Action: For ICT tagging applied by end users, have the ICT administrator enable Symantec
DLP Policies Integration from the ICT Administration Console.
Note: This form of tagging requires both the DLP Agent and the ICT agent on the endpoint
(Windows only).

Integrating the ICT server with the Enforce Server


To integrate the Enforce Server with the ICT server, define the ICT server settings. These
settings include the ICT server credentials and the ICT Web Service URL or an XML pathname.
To define your Information Centric Tagging server settings
1 In the Enforce Server administration console, navigate to System > Settings > Information
Centric Tagging.
2 To enable the settings, click Edit.
3 In the Server Credential field, select the ICT server credential from the drop-down menu.
The credential name represents the login and password to the ICT server.
To add the credential to the menu, go to the Credentials page in the administration
console and enter it. The credential must be a Windows domain user account with privileges
to access ICT. These privileges are established in ICT, when the administrator associates
the domain user with certain Active Directory User Groups. See "Creating AD users,
groups, and Organizational Units" in the Information Centric Tagging Deployment Guide.
See “Adding new credentials to the credential store” on page 161.
4 In the ICT Web Service URL field, type either the ICT Web Service URL or an XML file
pathname.
See “About automatic and static imports of the ICT classification taxonomy” on page 229.
If you change the ICT Web Service URL: See “Changing the ICT Web Service URL”
on page 230.

About automatic and static imports of the ICT classification taxonomy


You can use the ICT Web Service for automatic, scheduled imports of the ICT classification
taxonomy. If you cannot use the ICT Web Service--perhaps you have a restrictive firewall or
a policy on the Enforce Server that does not allow database updates from external
processes--you can alternatively import a static, XML-based version of the taxonomy. For either
of these methods, you can perform the import immediately, rather than schedule it.
See “Using the ICT Web Service for scheduled classification taxonomy imports” on page 230.
See “Using an XML file for static classification taxonomy imports” on page 231.

Using the ICT Web Service for scheduled classification taxonomy imports
To use the ICT Web Service for ICT classification taxonomy imports
◆ On the Information Centric Tagging page, in the ICT Web Service URL field, type the
ICT Web Service URL.
The URL syntax is
http://<ICT_server>/ICT/Admin-Webservice/Classifications.asmx.

Requirements for using the ICT Web Service for imports are:
■ Network connectivity on port 80 between the Data Loss Prevention Enforce Server and
the Information Centric Tagging server.
■ A Windows Hosts file entry on the Enforce Server that identifies the ICT server:
■ Navigate to %systemdrive%\Windows\System32\drivers\etc\.
■ Edit the Windows Hosts file to map the ICT server IP address to its host name, using
the tabulated format: <IP> <FQDN of ICT server>.
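
The URL syntax and the Hosts file requirement above can be sketched in a few lines of Python. This is an illustrative helper only, not part of Symantec Data Loss Prevention: the function names are hypothetical, and the host name and IP address in the usage note are placeholders.

```python
def ict_web_service_url(ict_server: str) -> str:
    """Fill the documented URL syntax with the ICT server host name."""
    return f"http://{ict_server}/ICT/Admin-Webservice/Classifications.asmx"


def hosts_entry(ip: str, fqdn: str) -> str:
    """Format one Hosts-file line as <IP><TAB><FQDN of ICT server>."""
    return f"{ip}\t{fqdn}"


def hosts_maps(hosts_text: str, ip: str, fqdn: str) -> bool:
    """Return True if the Hosts-file text already maps fqdn to ip.

    Text after '#' is treated as a comment; host names are compared
    case-insensitively.
    """
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == ip:
            if fqdn.lower() in (name.lower() for name in fields[1:]):
                return True
    return False
```

For example, hosts_entry("10.0.0.5", "ict01.example.com") produces the line to append to the Hosts file, and hosts_maps can be run against the file's current contents first to avoid adding a duplicate entry.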

Changing the ICT Web Service URL


The need to change the ICT Web Service URL is rare; however, if you change the name of
the Information Centric Tagging server, for example, and a URL change is necessary, see
Table 11-2 for actions you may need to take.

Table 11-2 Implications of changing the ICT Web Service URL

Circumstance: You have not yet synchronized an ICT classification import using this URL.
Action: Change the URL without taking any other action.
Click Edit to enable the ICT Web Service URL field. Make the change, then click Save.

Circumstance: You have synchronized an ICT classification import using this URL and the new
URL still points to the same taxonomy as before.
Action: Change the URL without taking any other action.

Circumstance: You have synchronized an ICT classification import using this URL, but the new
URL points to a different taxonomy.
Action: If you have existing detection rules in use:
1 Delete any incidents generated from those rules.
2 Delete any detection rules that use the Content Matches Classification option.
3 Define new rules using the taxonomy that results from using the new ICT Web Service URL.

Using an XML file for static classification taxonomy imports


To import the ICT taxonomy using an XML file
1 Log on to the ICT server as a Windows user with privileges to access the ICT SQL
database.
2 Use the local Internet Explorer browser on the server to browse the ICT Web Service.
The Web Service URL uses this syntax:
http://<ICT_server>/ICT/Admin-Webservice/Classifications.asmx

3 Run the GetAllClassifications operation.
On the Classifications tab, click Invoke.
4 Select and copy the entire resulting XML from the IE browser window and save it to a text
file.
5 Drop the file anywhere on the Enforce Server.

Note: This step requires administrator (write) permission on the Enforce Server.

6 On the Information Centric Tagging page, in the ICT Web Service URL field, enter the
XML pathname instead of the URL. A sample XML pathname is:
file://Program Files/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config/ICT.xml
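
Because the XML in this procedure is copied by hand from a browser window, it may be worth confirming that the saved file is still well-formed before you point the Enforce Server at it. The following is a minimal sketch using only the Python standard library. It checks well-formedness only; it makes no assumption about the element names that GetAllClassifications returns, and the element names in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(data: bytes) -> bool:
    """Return True if data parses as a single well-formed XML document."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


def file_is_well_formed(path: str) -> bool:
    """Read the saved taxonomy file and check that it parses as XML."""
    with open(path, "rb") as handle:
        return is_well_formed_xml(handle.read())
```

For instance, is_well_formed_xml(b"<Root><Item/></Root>") returns True, while the same text with stray characters in front of the root element (such as a leading "- " picked up during copy and paste) returns False.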

Importing the ICT classification taxonomy


You can establish a daily import schedule or do an immediate import.

To set a synchronization schedule for the ICT classification taxonomy import


◆ On the Information Centric Tagging page, in the Sync daily at field, from the two
drop-down menus, select the hour and minutes for the import. The ICT Web Service
synchronization will run daily.
To do an immediate import of the ICT classification taxonomy
◆ On the Information Centric Tagging page, to immediately trigger an import, click SYNC
NOW.
After a synchronization runs, the imported taxonomy appears on the Information Centric
Tagging page, under the columns for Organization, Scope, Sensitivity, and Level. Click
any column to sort it.
Be aware that when you resynchronize the taxonomy, any existing taxonomy is deleted and
replaced with the new one.
Note that in Information Centric Tagging, once a classification is created, it cannot be deleted.
Your existing Data Loss Prevention detection policies will continue to work, even when a new
import runs. However, the ICT administrator can make changes to the classifications. Therefore,
over time, you should review your existing policies. Update or delete and recreate them, if
necessary, to reflect changed Organization-Scope-Sensitivity Level tags. Your review should
also include the names of your policies, if they are indicative of the tags being detected.
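
One way to approach the periodic review described above is to compare the tags that your policies reference against the tags in the most recent import. The sketch below assumes you have both lists available as (Organization, Scope, Sensitivity Level) tuples, for example typed in from the Information Centric Tagging page. Symantec Data Loss Prevention is not known to provide this comparison; the function and the sample tag values are hypothetical.

```python
from typing import Iterable, Set, Tuple

# (organization, scope, sensitivity level); the tuple layout is an assumption
Tag = Tuple[str, str, str]


def tags_needing_review(policy_tags: Iterable[Tag],
                        imported_tags: Iterable[Tag]) -> Set[Tag]:
    """Return tags that policies reference but the latest import lacks."""
    return set(policy_tags) - set(imported_tags)
```

A tag that the ICT administrator renamed shows up in the result, which signals that the corresponding policy (and possibly its name) should be updated.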

Supported file types for ICT-Data Loss Prevention integration
Table 11-3 lists the supported file types from which ICT tags can be read by Data Loss
Prevention policies (Detection) and to which the DLP Agent and ICT agent can write tags
(Endpoint Discover).

Table 11-3 Supported file types for ICT-Data Loss Prevention integration

File types                                  Extension                              Read tags  Write tags

Microsoft Office files: CFB (old) format    doc, dot, pot, pps, ppt, xla, xls,     Y          Y
                                            xlt
Microsoft Office files: HTML (new) format   docm, docx, dotm, dotx, potm, potx,    Y          Y
                                            ppsm, ppsx, pptx, xlam, xlsb, xlsm,
                                            xltm, xlsx, xltx
Portable Document Format                    pdf                                    Y          Y
Image files                                 gif                                    Y          Y
                                            png                                    Y          Y
JPEG                                        jpe, jpg, jpeg, jfif                   N          Y
TIFF                                        tif, tiff                              N          Y
Hypertext Markup Language                   htm, html                              N          N
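
If you need to check many file names against Table 11-3, the table can be transcribed into a small lookup. The sketch below is illustrative only. The extension lists are copied from the table; treating any extension that is not listed as unsupported for both reading and writing is an assumption.

```python
from typing import Dict, Tuple

# extension -> (read tags, write tags), transcribed from Table 11-3
_OFFICE_OLD = "doc dot pot pps ppt xla xls xlt".split()
_OFFICE_NEW = ("docm docx dotm dotx potm potx ppsm ppsx pptx "
               "xlam xlsb xlsm xltm xlsx xltx").split()

SUPPORT: Dict[str, Tuple[bool, bool]] = {}
for _ext in _OFFICE_OLD + _OFFICE_NEW + ["pdf", "gif", "png"]:
    SUPPORT[_ext] = (True, True)
for _ext in ["jpe", "jpg", "jpeg", "jfif", "tif", "tiff"]:
    SUPPORT[_ext] = (False, True)
for _ext in ["htm", "html"]:
    SUPPORT[_ext] = (False, False)


def tag_support(filename: str) -> Tuple[bool, bool]:
    """Return (can read tags, can write tags) for a file name.

    Extensions that do not appear in the table return (False, False),
    which is an assumption rather than a documented behavior.
    """
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return SUPPORT.get(ext, (False, False))
```

For example, tag_support("Report.DOCX") returns (True, True) and tag_support("scan.tiff") returns (False, True).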


Chapter 12
Adding a new product module
This chapter includes the following topics:

■ Installing a new license file

■ About system upgrades

Installing a new license file


When you first purchase Symantec Data Loss Prevention, upgrade to a later version, or
purchase additional product modules, you must install one or more Symantec Data Loss
Prevention license files. License files have names in the format name.slf.
You can also enter a license file for one module to start and, later on, enter license files for
additional modules.
To install a license:
1 Download the new license file.
2 Go to System > Settings > General and click Configure.
3 At the Edit General Settings screen, scroll down to the License section.
4 In the Install License field, browse for the new Symantec Data Loss Prevention license
file you downloaded, then click Save to agree to the terms and conditions of the end user
license agreement (EULA) for the software and to install the license.
The Current License list displays the following information for each product license:
■ Product – The individual Symantec Data Loss Prevention product name
■ Count – The number of users licensed to use the product
■ Status – The current state of the product
■ Expiration – The expiration date of the license for the product


A month before Expiration of the license, warning messages appear on the System > Servers
> Overview screen. When you see a message about the expiration of your license, contact
Symantec to purchase a new license key before the current license expires.
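
If you track license expiration dates outside the console, the warning window described above can be approximated with simple date arithmetic. This sketch treats "a month" as 30 days, which is an assumption; the product may calculate the window differently, and the function name is hypothetical.

```python
from datetime import date, timedelta

# Assumption: "a month before Expiration" is approximated as 30 days.
WARNING_WINDOW = timedelta(days=30)


def in_warning_window(expiration: date, today: date) -> bool:
    """Return True if the license is unexpired and expires within the window."""
    return today <= expiration and expiration - today <= WARNING_WINDOW
```

For example, a license that expires on 30 September is inside the window on 1 September, while one that expires on 31 December is not.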

About system upgrades


For information about upgrading the Symantec Data Loss Prevention software, see the
Symantec Data Loss Prevention Upgrade Guide.
See “About Symantec Data Loss Prevention administration” on page 82.
Chapter 13
Applying a Maintenance Pack
This chapter includes the following topics:

■ Applying a Symantec Data Loss Prevention Maintenance Pack

Applying a Symantec Data Loss Prevention Maintenance Pack
Maintenance Packs can only be applied to an already installed version of Symantec Data Loss
Prevention. For example, a maintenance pack for 15.5 can only be applied to Symantec Data
Loss Prevention 15.5 (new or upgraded installation).
Before applying a maintenance pack or installing Symantec Data Loss Prevention, refer to the
Symantec Data Loss Prevention System Requirements and Compatibility Guide for information
about system requirements. This guide is available online here:
https://www.symantec.com/docs/DOC10602

Steps to apply a maintenance pack on Windows servers


The following table describes the high-level steps that are involved in applying the maintenance
pack to a Windows server. Each step is described in more detail elsewhere in this chapter, as
indicated.
Before you apply a maintenance pack, create an EnforceReinstallationResources.zip file
using the Reinstallation Resources Utility. This file includes the CryptoMasterKey.properties
file and the keystore files for your Symantec Data Loss Prevention deployment. You can use
the file to roll back to a previous version.
See the Symantec Data Loss Prevention Upgrade Guide for Windows at the Symantec Support
Center at http://www.symantec.com/docs/DOC9258.

Table 13-1 Steps to apply the maintenance pack to a Windows environment

Step 1
Action: Download and extract the maintenance pack software.
Description: See “Downloading the maintenance pack software for Windows servers” on page 237.

Step 2
Action: Confirm that all users are logged out of the Enforce Server administration console.
Description: If users are logged in during the maintenance pack application process,
subsequent logins fail during the End User Licensing Agreement confirmation.

Step 3
Action: Apply the maintenance pack to the Enforce Server.
Description: See “Updating the Enforce Server on Windows” on page 237.
The process to apply the maintenance pack to a single-tier installation omits the
detection server update step.
See “Updating a single-tier system on Windows” on page 239.

Step 4
Action: Apply the maintenance pack to the detection server.
Description: See “Updating the detection server on Windows” on page 238.

Downloading the maintenance pack software for Windows servers


Copy the MSP files to the computer from where you intend to perform the upgrade. That
computer must have a reliable network connection to the Enforce Server.
Copy the MSP files into a directory on a system that is accessible to you. The root directory
where you move the files is referred to as the DLPDownloadHome directory.
Choose from the following files based on your current installation:
■ Apply the maintenance pack to the Enforce Server: EnforceServer.msp
■ Apply the maintenance pack to the detection server: DetectionServer.msp
■ Apply the maintenance pack to a single-tier installation: SingleTierServer.msp

Updating the Enforce Server on Windows


These instructions assume that Symantec Data Loss Prevention 15.5 is installed and that the
EnforceServer.msp file has been copied into the DLPDownloadHome directory on the Enforce
Server computer.

To update the Enforce Server


◆ Install the maintenance pack by completing the following steps:

Note: You can install the maintenance pack using Silent Mode by running the following
command:
msiexec /p "EnforceServer.msp" ORACLE_PASSWORD=<ORACLE PASSWORD> /qn /norestart /L*v EnforceServer.log

where <ORACLE PASSWORD> is the database password used for Symantec Data Loss
Prevention 15.5.

a Click Start > Run > Browse to navigate to the folder where you copied the
EnforceServer.msp file.

b Double-click EnforceServer.msp to execute the file, and click OK.

c Click Next on the Welcome panel.

d Enter the Symantec Data Loss Prevention database password in the Oracle Database Server
Information panel.

e Click Update.

The update process may take a few minutes. The installation program window may display
for a few minutes while the services start up. After the update process completes, a
completion notice displays.
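
If you script the Silent Mode installation, for example to apply the same maintenance pack to several servers, the msiexec command line from the note above can be assembled as an argument list. The following Python sketch is one way to do that. The function name is hypothetical; the switches are the ones shown in this chapter.

```python
from typing import List, Optional


def msiexec_silent_args(msp_file: str, log_file: str,
                        oracle_password: Optional[str] = None) -> List[str]:
    """Build the Silent Mode msiexec argument list shown in this chapter.

    Pass oracle_password for EnforceServer.msp and SingleTierServer.msp;
    omit it for DetectionServer.msp, whose command takes no password.
    """
    args = ["msiexec", "/p", msp_file]
    if oracle_password is not None:
        args.append(f"ORACLE_PASSWORD={oracle_password}")
    args += ["/qn", "/norestart", "/L*v", log_file]
    return args
```

On the Windows server, the resulting list could be passed to subprocess.run(args, check=True). Passwords that contain spaces or quotation marks may need additional quoting that this sketch does not attempt.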

Updating the detection server on Windows


These instructions assume that Symantec Data Loss Prevention 15.5 is installed and the
DetectionServer.msp file has been copied into the DLPDownloadHome directory on the detection
server computer.

To update the detection server


◆ Install the maintenance pack by completing the following steps:

Note: You can install the maintenance pack using Silent Mode by running the following
command:
msiexec /p "DetectionServer.msp" /qn /norestart /L*v DetectionServer.log

a Click Start > Run > Browse to navigate to the folder where you copied the
DetectionServer.msp file.

b Double-click DetectionServer.msp to execute the file, and click OK.

c Click Next on the Welcome panel.

d Click Update.

The update process may take a few minutes. The installation program window may display
for a few minutes while the services start up. After the update process completes, a
completion notice displays.

Updating a single-tier system on Windows


The following instructions assume that the SingleTierServer.msp file has been copied into
the DLPDownloadHome directory on the Enforce Server computer.

To update a single-tier system


◆ Install the maintenance pack by completing the following steps:

Note: You can install the maintenance pack using Silent Mode by running the following
command:
msiexec /p "SingleTierServer.msp" ORACLE_PASSWORD=<ORACLE PASSWORD> /qn /norestart /L*v EnforceServer.log

where <ORACLE PASSWORD> is the database password used for Symantec Data Loss
Prevention.

a Click Start > Run > Browse to navigate to the folder where you copied the
SingleTierServer.msp file.

b Double-click SingleTierServer.msp to execute the file, and click OK.

c Click Next on the Welcome panel.

d Enter the Symantec Data Loss Prevention database password in the Oracle Database
Server Information panel.

e Click Update.

The update process may take a few minutes. The installation program window may display
for a few minutes while the services start up. After the update process completes, a
completion notice displays.

Steps to apply a maintenance pack on Linux servers


The following table describes the high-level steps that are involved in applying a Symantec
Data Loss Prevention maintenance pack to a Linux server. Each step is described in more
detail elsewhere in this chapter, as indicated.
Before you apply a maintenance pack, create an EnforceReinstallationResources.zip file
using the Reinstallation Resources Utility. This file includes the CryptoMasterKey.properties
file and the keystore files for your Symantec Data Loss Prevention deployment. You can use
the file to roll back to a previous version.
See the Symantec Data Loss Prevention Upgrade Guide for Linux at the Symantec Support
Center at http://www.symantec.com/docs/DOC9258.

Table 13-2 Steps to apply the maintenance pack on Linux

Step 1
Action: Download and extract the upgrade software.
Description: See “Downloading and extracting the maintenance pack software for Linux
servers” on page 241.

Step 2
Action: Confirm that all users are logged out of the Enforce Server administration console.
Description: If users are logged in during the maintenance pack application process,
subsequent logins fail during the End User Licensing Agreement confirmation.

Step 3
Action: Apply the maintenance pack to the Enforce Server.
Description: See “Updating the Enforce Server on Linux” on page 241.
The process to apply the maintenance pack to a single-tier installation omits the
detection server update step.
See “Updating a single-tier system on Linux” on page 243.

Step 4
Action: Apply the maintenance pack to the detection server.
Description: See “Updating the detection server on Linux” on page 242.

Downloading and extracting the maintenance pack software for Linux servers
Copy the ZIP files to the computer from where you intend to perform the upgrade. That computer
must have a reliable network connection to the Enforce Server.
Copy the ZIP files into a directory on a system that is accessible to you. The root directory
where you move the files is referred to as the DLPDownloadHome directory.
Choose from the following files based on your current installation:
■ Apply the maintenance pack to the Enforce Server: EnforceServer.zip
■ Apply the maintenance pack to the detection server: DetectionServer.zip
■ Update a single-tier installation: SingleTierServer.zip

Updating the Enforce Server on Linux


The instructions that follow describe how to install the maintenance pack on an Enforce Server
on a Linux computer.
These instructions assume that Symantec Data Loss Prevention 15.5 is installed and that the
EnforceServer.zip file has been copied into the /opt/temp directory on the Enforce Server
computer.

To update the Enforce Server


1 Log on as root to the Enforce Server system.
2 Navigate to the directory where you copied the EnforceServer.zip file. (/opt/temp)
3 Unzip the file to the same directory.
4 Perform the update process by running the following command:

rpm -Uvh \
  symantec-dlp-15-5-content-extraction-service-15.5-01074.x86_64.rpm \
  symantec-dlp-15-5-server-platform-common-15.5-01074.x86_64.rpm \
  symantec-dlp-15-5-content-extraction-plugins-15.5-01074.x86_64.rpm \
  symantec-dlp-15-5-enforce-server-15.5-01074.x86_64.rpm

Note: Replace the file names with those for the maintenance pack version you are installing.

You can install all of the RPMs at once by running the following command:
rpm -Uvh *.rpm

If you used any relocations (--relocate default-path=new-path) during the initial
installation, you must use them again with the upgrade command.
5 Run the Update Configuration utility by running the following command:
cd "/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/install"
./EnforceServerUpdateConfigurationUtility

Note: You can install the maintenance pack using Silent Mode by running the following
command:
./EnforceServerUpdateConfigurationUtility -silent \
  -ORACLE_HOME=/opt/oracle/product/12.1.0/db_1 -oraclePassword=<ORACLE PASSWORD>

where <ORACLE PASSWORD> is the database password used for Symantec Data Loss
Prevention.

During the update process, services shut down, then restart automatically. You can review
the update log file EnforceServerUpdateConfigurationUtility.log located at
/var/log/Symantec/DataLossPrevention/EnforceServer/15.5/debug.
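
If the same relocations must be repeated on several servers, the upgrade command from step 4 can be assembled by a small script. The following Python sketch is illustrative only and is not part of the product: the helper name build_rpm_upgrade_command is hypothetical, and the only things taken from the steps above are the rpm -Uvh command and the --relocate default-path=new-path form.

```python
import glob
import os


def build_rpm_upgrade_command(rpm_dir, relocations=None):
    """Return the argument list for `rpm -Uvh` over every RPM in rpm_dir.

    relocations maps each default path to the new path that was used
    during the initial installation (values are supplied by the caller).
    """
    rpm_files = sorted(glob.glob(os.path.join(rpm_dir, "*.rpm")))
    if not rpm_files:
        raise FileNotFoundError("no .rpm files found in " + rpm_dir)

    command = ["rpm", "-Uvh"]
    for default_path, new_path in (relocations or {}).items():
        # Same form as on the command line: --relocate default-path=new-path
        command += ["--relocate", default_path + "=" + new_path]
    return command + rpm_files
```

The function only builds the argument list so that it can be reviewed before it is run, for example by printing " ".join(build_rpm_upgrade_command("/opt/temp")).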

Updating the detection server on Linux


The instructions that follow describe how to apply the maintenance pack to a detection server
on a Linux computer.

These instructions assume that Symantec Data Loss Prevention 15.5 is installed and that the
DetectionServer.zip file has been copied into the /opt/temp/ directory on the server
computer.
To update the detection server
1 Log on as root to the system where the detection server is installed.
2 Navigate to the directory where you copied the DetectionServer.zip file (/opt/temp).
3 Unzip the file to the same directory.
4 Apply the maintenance pack to the detection server by running the following command:

rpm -Uvh \
symantec-dlp-15-5-content-extraction-plugins-15.5-01074.x86_64.rpm \
symantec-dlp-15-5-content-extraction-service-15.5-01074.x86_64.rpm \
symantec-dlp-15-5-detection-server-15.5-01074.x86_64.rpm \
symantec-dlp-15-5-server-platform-common-15.5-01074.x86_64.rpm

Note: Replace the file names with those of the maintenance pack version you are installing.

You can install all of the RPMs at once by running the following command:
rpm -Uvh *.rpm

If you used any relocations (--relocate default-path=new-path) during the initial
installation, you must use them again with the upgrade command.

Updating a single-tier system on Linux


The instructions that follow describe how to apply the maintenance pack to a single-tier
installation on a Linux computer.
These instructions assume that Symantec Data Loss Prevention 15.5 is installed and that the
SingleTierServer.zip file has been copied into the /opt/temp directory on the computer.

To update a single-tier installation


1 Log on as root to the Enforce Server system.
2 Navigate to the directory where you copied the SingleTierServer.zip file (/opt/temp).
3 Unzip the file to the same directory.

4 Apply the maintenance pack to the single-tier installation by running the following command:

rpm -Uvh \
symantec-dlp-15-5-content-extraction-plugins-15.5-01074.x86_64.rpm \
symantec-dlp-15-5-content-extraction-service-15.5-01074.x86_64.rpm \
symantec-dlp-15-5-detection-server-15.5-01074.x86_64.rpm \
symantec-dlp-15-5-enforce-server-15.5-01074.x86_64.rpm \
symantec-dlp-15-5-server-platform-common-15.5-01074.x86_64.rpm \
symantec-dlp-15-5-single-tier-server-15.5-01074.x86_64.rpm

Note: Replace the file names with those of the maintenance pack version you are installing.

You can install all of the RPMs at once by running the following command:
rpm -Uvh *.rpm

If you used any relocations (--relocate default-path=new-path) during the initial
installation, you must use them again with the upgrade command.
5 Run the Update Configuration utility by running the following command:
cd "/opt/Symantec/DataLossPrevention/SingleTierServer/15.5/Protect/install"

./SingleTierServerUpdateConfigurationUtility

Note: You can install the maintenance pack using Silent Mode by running the following
command:
./SingleTierServerUpdateConfigurationUtility -silent \
-ORACLE_HOME=/opt/oracle/product/12.1.0/db_1 -oraclePassword=<ORACLE PASSWORD>

where <ORACLE PASSWORD> is the database password used for Symantec Data Loss
Prevention 15.5.

During the update process, services shut down, then restart automatically. You can review
the update log file SingleTierServerUpdateConfigurationUtility.log located at
/var/log/Symantec/DataLossPrevention/SingleTierServer/15.5/debug/.
Section 3
Managing detection servers

■ Chapter 14. Installing and managing detection servers and cloud detectors

■ Chapter 15. Managing log files

■ Chapter 16. Using Symantec Data Loss Prevention utilities


Chapter 14
Installing and managing
detection servers and cloud
detectors
This chapter includes the following topics:

■ About managing Symantec Data Loss Prevention servers

■ Preparing for Microsoft Rights Management file monitoring

■ Enabling Advanced Process Control

■ Server controls

■ Server configuration—basic

■ Editing a detector

■ Server and detector configuration—advanced

■ Adding a detection server

■ Adding a cloud detector

■ Removing a server

■ Importing SSL certificates to Enforce or Discover servers

■ About the Overview screen

■ Configuring the Enforce Server to use a proxy to connect to cloud services

■ Server and detector status overview

■ Recent error and warning events list



■ Server/Detector Detail screen

■ Advanced server settings

■ Advanced detector settings

■ About using load balancers in an endpoint deployment

About managing Symantec Data Loss Prevention servers
Symantec Data Loss Prevention servers and cloud detectors are managed from the System
> Servers and Detectors > Overview screen. This screen provides an overview of your
system, including server status and recent system events. It displays summary information
about all Symantec Data Loss Prevention servers, a list of recent error and warning events,
and information about your license. From this screen you can add or remove detection servers.
■ Click on the name of a server to display its Server/Detector Detail screen, from which you
can control and configure that server.
See “Installing a new license file” on page 234.
See “About the Enforce Server administration console” on page 83.
See “About the Overview screen” on page 278.
See “Server/Detector Detail screen” on page 283.
See “Adding a detection server” on page 273.
See “Adding a cloud detector” on page 275.
See “Removing a server” on page 277.
See “Server controls” on page 251.
See “Server configuration—basic” on page 253.

Preparing for Microsoft Rights Management file monitoring
You must complete prerequisites before enabling Microsoft Rights Management (RMS) file
detection. The following prerequisites apply to RMS administered by Azure RMS or Active
Directory (AD) RMS.

Table 14-1 Microsoft Rights Management file monitoring prerequisites

RMS client Requirements

Azure RMS Install the RMS client, version 2.1, on the detection server.

AD RMS ■ Install the RMS client, version 2.1, on the detection server using a domain service
user that is added to the AD RMS Super Users group.
■ Provide both the AD RMS Service User and the DLP Service User with Read and
Execute permissions to access ServerCertification.asmx. Refer to the
Microsoft Developer Network for additional details:
https://msdn.microsoft.com/en-us/library/mt433203.aspx.
■ Add the detection server to the AD RMS server domain.
■ Run the detection server services using a domain user that is a member of the AD
RMS Super Users group.

After you install the detection server, you enable RMS file detection. See “Enabling Microsoft
Rights Management file monitoring” on page 248.

Enabling Microsoft Rights Management file monitoring


Symantec Data Loss Prevention can detect files that are encrypted using Microsoft Rights
Management (RMS) administered by Azure or Active Directory (AD).
Before you enable Microsoft Rights Management file monitoring, confirm that prerequisites for
the RMS environment and the detection server have been completed. See “Preparing for
Microsoft Rights Management file monitoring” on page 247.

Enabling RMS detection for Azure-managed RMS


For Azure RMS, complete the following on each detection server to enable RMS file monitoring:
1 Locate the Enable-Plugin.ps1 plugin on the detection server at the following path:

C:\Program Files\Symantec\DataLossPrevention\ContentExtractionService\15.5\Protect\plugins\contentextraction\MicrosoftRightsManagementPlugin\

2 Run the plugin by executing the following command:

powershell.exe -ExecutionPolicy RemoteSigned -File "C:\Program Files\Symantec\DataLossPrevention\ContentExtractionService\15.5\Protect\plugins\contentextraction\MicrosoftRightsManagementPlugin\Enable-Plugin.ps1"

3 Run the configuration utility ConfigurationCreator.exe to add the system user. Run
the utility as the protect user.

Note: Enter all credentials accurately to ensure that the feature is enabled.

C:\Program Files\Symantec\DataLossPrevention\ContentExtractionService\15.5\Protect\plugins\contentextraction\MicrosoftRightsManagementPlugin\ConfigurationCreator.exe
Do you want to configure ADAL authentication [y/n]: n
Do you want to configure symmetric key authentication [y/n]: y
Enter your symmetric key (base-64): [user's Azure RMS symmetric key]
Enter your app principal ID: [user's Azure RMS app principal ID]
Enter your BPOS tenant ID: [user's Azure RMS BPOS tenant ID]

After you run this utility, the following files are created in the
MicrosoftRightsManagementPlugin directory at
\Program Files\Symantec\DataLossPrevention\ContentExtractionService\15.5\Protect\plugins\contentextraction\:

■ rightsManagementConfiguration

■ rightsManagementConfigurationProtection

4 Restart each detection server to complete the process.

Note: You can confirm that Symantec Data Loss Prevention is monitoring RMS content
by reviewing the ContentExtractionHost_FileReader.log file (located at
\ProgramData\Symantec\DataLossPrevention\DetectionServer\15.5\protect\Logs\debug).
Error messages that display for the MicrosoftRightsManagementPlugin.cpp item indicate
that the plugin is not monitoring RMS content.

Enabling RMS detection for AD-managed RMS


For AD RMS, complete the following on each detection server to enable RMS file monitoring:
1 Run the plugin, Enable-Plugin.ps1, which is located at \Program
Files\Symantec\DataLossPrevention\Protect\bin on the Enforce Server.

powershell.exe -ExecutionPolicy RemoteSigned -File "C:\Program Files\Symantec\DataLossPrevention\ContentExtractionService\15.5\Protect\plugins\contentextraction\MicrosoftRightsManagementPlugin\Enable-Plugin.ps1"

2 Restart each detection server to complete the process.

Note: You can confirm that Symantec Data Loss Prevention is monitoring RMS content
by reviewing the ContentExtractionHost_FileReader.log file (located at
\ProgramData\Symantec\DataLossPrevention\DetectionServer\15.5\protect\Logs\debug).
Error messages that display for the MicrosoftRightsManagementPlugin.cpp item indicate
that the plugin is not monitoring RMS content.
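
The log review described in the notes above can be partly automated. The following Python sketch is a hedged illustration, not a product tool: the function name find_rms_plugin_entries is hypothetical, and because the layout of entries in ContentExtractionHost_FileReader.log is not described here, the sketch simply returns every line that mentions the MicrosoftRightsManagementPlugin.cpp item so that an administrator can read those lines.

```python
def find_rms_plugin_entries(log_lines, marker="MicrosoftRightsManagementPlugin.cpp"):
    """Return the log lines that mention the RMS plugin item.

    log_lines is any iterable of strings, for example an open file object.
    No assumption is made about the log format beyond the marker text.
    """
    return [line.rstrip("\r\n") for line in log_lines if marker in line]
```

To apply it to a real log, open ContentExtractionHost_FileReader.log from the debug directory named above and pass the file object to the function. An empty result means that no entry mentions the plugin item.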

Enabling Advanced Process Control


Symantec Data Loss Prevention Advanced Process Control lets you start or stop individual
server processes from the Enforce Server administration console. You do not have to start or
stop an entire server. This feature can be useful for debugging. When Advanced Process
Control is off (the default), each Server/Detector Detail screen shows only the status of the
entire server. When you turn Advanced Process Control on, the General section of the
Server/Detector Detail screen displays individual processes.
See “Server/Detector Detail screen” on page 283.
To enable Advanced Process Control
1 Go to System > Settings > General and click Configure.
The Edit General Settings screen is displayed.
2 Scroll down to the Process Control section and check the Advanced Process Control
box.
3 Click Save.
Table 14-2 describes the individual processes and the servers on which they run once advanced
process control is enabled.

Table 14-2 Advanced processes

Process                     Description                            Control

Monitor Controller          The Monitor Controller process         The MonitorController Status is available for
                            controls detection servers.            the Enforce Server.

File Reader                 The File Reader process detects        The FileReader Status is available for all
                            incidents.                             detection servers.

Incident Writer             The Incident Writer process sends      The IncidentWriter Status is available for all
                            incidents to the Enforce Server.       detection servers, unless they are part of a
                                                                   single-tier installation, in which case there is only
                                                                   one Incident Writer process.

Packet Capture              The Packet Capture process             The PacketCapture Status is available for
                            captures network streams.              Network Monitor.

Request Processor           The Request Processor processes        The RequestProcessor Status is available for
                            SMTP requests.                         Network Prevent for Email.

Endpoint Server             The Endpoint Server process            The EndpointServer Status is available for
                            interacts with Symantec DLP            Endpoint Prevent.
                            Agents.

Detection Server Database   The Detection Server Database          The DetectionServerDatabase Status is
                            process is used for automated          available for Network Discover.
                            incident remediation tracking.

See “Server configuration—basic” on page 253.

Server controls
Servers and their processes are controlled from the Server/Detector Detail screen.
■ To reach the Server/Detector Detail screen for a particular server, go to the System >
Servers and Detectors > Overview screen and click a server name, detector name, or
appliance name in the list.
See “Server/Detector Detail screen” on page 283.
The status of the server and its processes appears in the General section of the
Server/Detector Detail screen. The Start, Recycle and Stop buttons control server and
process operations.
Current status of the server is displayed in the General section of the Server/Detector Detail
screen. The possible values are:

Table 14-3 Server status values

Icon Status

Starting - In the process of starting.

Running - Running without errors.

Running Selected - Some processes on the server are stopped or have errors. To see
the statuses of individual processes, you must first enable Advanced Process Control
on the System Settings screen.

Stopping - In the process of stopping.

Stopped - Fully stopped.

Unknown - The Server has encountered one of the following errors:

■ Start. To start a server or process, click Start.


■ Recycle. To stop and restart a server, click Recycle.
■ Stop. To stop a server or process, click Stop.
■ To halt a process during its start-up procedure, click Terminate.
■ To reboot an appliance, click Reboot.

Note: Status and controls for individual server processes are only displayed if Advanced
Process Control is enabled for the Enforce Server. To enable Advanced Process Control, go
to System > Settings > General > Configure, check the Advanced Process Control box,
and click Save.

■ To update the status, click the refresh icon in the upper-right portion of the screen, as
needed.
See “About Symantec Data Loss Prevention administration” on page 82.
See “About the Overview screen” on page 278.
See “Server/Detector Detail screen” on page 283.
See “Server configuration—basic” on page 253.
See “System events reports” on page 165.
See “Server and Detectors event detail” on page 169.

Server configuration—basic
Enforce Servers are configured from the System > Settings > General menu.
Detection servers and detectors are configured from each server's individual Configure Server
screen.
To configure a server
1 Go to the System > Servers and Detectors > Overview screen.
2 Click on the name of the server in the list.
That server's Server/Detector Detail screen is displayed. The following buttons are in
the upper-left portion of a Server/Detector Detail:
■ Done. Click Done to return to the previous screen.
■ Configure. Click Configure to specify a basic configuration for this server.
■ Server Settings. Click Server Settings to specify advanced configuration parameters
for this server. Use caution when modifying advanced server settings. It is
recommended that you check with Symantec Support before changing any of the
advanced settings.
See “Server and detector configuration—advanced” on page 273.
See Symantec Data Loss Prevention online Help for information about advanced
server configuration.

3 Click Configure or Server Settings to display a configuration screen for that type of
server.
4 Specify or change settings on the screen as needed, and then click Save.
Click Cancel to return to the previous screen without changing any settings.

Note: A server must be recycled before new settings take effect.

See “Server controls” on page 251.


The Configure Server screen contains a General section for all detection servers that contains
the following parameters:
■ Name. The name you choose to give the server. This name appears in the Enforce Server
administration console (System > Servers and Detectors > Overview). The name is
limited to 255 characters.
■ Host. The host name or IP address of the system hosting the server. Host names must be
fully qualified. If the host has more than one IP address, specify the address on which the
detection server listens for connections to the Enforce Server.

■ Port. The port number used by the detection server to communicate with the Enforce
Server. The default is 8100.
For Single Tier Monitors, the Host field on the Configure Server page is pre-populated with
the local IP address 127.0.0.1. You cannot change this value.
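
The constraints on the General section parameters can be checked before the values are entered in the console. The following Python sketch is an illustration under stated assumptions, not the validation that the Enforce Server performs: the function name check_general_settings is hypothetical, the hostname pattern is a loose approximation, a name with at least two dot-separated labels is treated as fully qualified, and IPv6 literals are accepted only because the Python ipaddress module accepts them. The 255-character name limit and the default port 8100 come from the parameter list above.

```python
import ipaddress
import re

DEFAULT_PORT = 8100      # default detection server port stated above
MAX_NAME_LENGTH = 255    # name length limit stated above

# Loose hostname-label pattern; an assumption for illustration only.
_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def check_general_settings(name, host, port=DEFAULT_PORT):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # Assumption: an empty name is not useful, so it is reported as well.
    if not name or len(name) > MAX_NAME_LENGTH:
        problems.append("name must be 1 to 255 characters")

    try:
        ipaddress.ip_address(host)  # accepts IPv4 and IPv6 literals
    except ValueError:
        labels = host.split(".")
        # Assumption: fully qualified means at least two valid labels.
        if len(labels) < 2 or not all(_LABEL.match(label) for label in labels):
            problems.append("host must be an IP address or a fully qualified host name")

    if not 1 <= port <= 65535:
        problems.append("port must be between 1 and 65535")
    return problems
```

For example, check_general_settings("Monitor 1", "monitor01.example.com") reports no problems, while a single-label host such as "monitor01" is reported because it is not fully qualified.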
The next portions of a Configure Server screen vary according to the type of server, except
for the OCR Engine and Detection tabs, which are common to all servers.
Click the OCR Engine tab to set up a connection to an OCR server.
See “Server configuration—basic” on page 705.
Click the Detection tab to customize the Inspection Content Size.
See “Increasing the inspection content size” on page 459.
See “Network Monitor Server—basic configuration” on page 254.
See “Network Discover/Cloud Storage Discover Server and Network Protect—basic
configuration” on page 261.
See “Network Prevent for Email Server—basic configuration” on page 256.
See “Network Prevent for Web Server—basic configuration” on page 259.
See “Endpoint Server—basic configuration” on page 262.
See “Single Tier Monitor — basic configuration” on page 263.
See “Server/Detector Detail screen” on page 283.

Network Monitor Server—basic configuration


Detection servers are configured from each server's individual Configure Server screen. To
display the Configure Server screen, go to the Overview screen (System > Servers and
Detectors > Overview) and click the name of the server in the list. That server's
Server/Detector Detail screen appears. Click Configure to display the Configure Server
screen.
A Network Monitor Server's Configure Server screen is divided into a general section and
two tabs:
■ General section. Use this section to specify the server's name, host, and port.
See “Server configuration—basic” on page 253.
■ Packet Capture tab. Use this tab to configure network packet capture settings.
■ SMTP Copy Rule tab. Use this tab to modify the source folder where the server retrieves
SMTP message files.
The top portion of the Packet Capture tab defines general packet capture parameters. It provides
the following fields:

Field                    Description

Source Folder Override   The source folder is the directory the server uses to
                         buffer network streams before it processes them.
                         The recommended setting is to leave the Source
                         Folder Override field blank to accept the default. If
                         you want to specify a custom buffer directory, type
                         the full path to the directory.

Network Interfaces       Select the network interface card(s) to use for
                         monitoring. Note that to monitor a NIC, WinPcap
                         software must be installed on the Network Monitor
                         Server.

                         See the Symantec Data Loss Prevention Installation
                         Guide for more information about NICs.

The Protocol section of the Packet Capture tab specifies the types of network traffic (by protocol)
to capture. It also specifies any custom parameters to apply. This section lists the standard
protocols that you have licensed with Symantec, and any custom TCP protocols you have
added.
To monitor a particular protocol, check its box. When you initially configure a server, the settings
for each selected protocol are inherited from the system-wide protocol settings. You configure
these settings by going to System > Settings > Protocol. System-wide default settings are
listed as Standard.
Consult Symantec Data Loss Prevention online Help for information about working with
system-wide settings.
To override the inherited filtering settings for a protocol, click the name of the protocol. The
following custom settings are available (some settings may not be available for some protocols):
■ IP filter
■ L7 sender filter
■ L7 recipient filter
■ Content filter
■ Search Depth (packets)
■ Sampling rate
■ Maximum wait until written
■ Maximum wait until dropped
■ Maximum stream packets
■ Minimum stream size
■ Maximum stream size


■ Segment Interval
■ No traffic notification timeout (The maximum value for this setting is 360000 seconds.)
Use the SMTP Copy Rule tab to modify the source folder where this server retrieves SMTP
message files. You can modify the Source Folder by entering the full path to a folder.
See “About Symantec Data Loss Prevention administration” on page 82.
See “About the Overview screen” on page 278.
See “Server/Detector Detail screen” on page 283.
See “Server configuration—basic” on page 253.
See “Server controls” on page 251.
In addition to the settings available through the Configure Server screen, you can specify
advanced settings for this server. To specify advanced configuration parameters, click Server
Settings on the server's Server/Detector Detail screen. Use caution when modifying advanced
server settings. Check with Symantec Support before you change any advanced setting.
See “Advanced server settings” on page 285.
See the Symantec Data Loss Prevention online Help for information about advanced server
settings.

Network Prevent for Email Server—basic configuration


Detection servers are configured from each server's individual Configure Server screen. To
display the Configure Server screen, go to the Overview screen (System > Servers and
Detectors > Overview) and click the name of the server in the list. That server's
Server/Detector Detail screen appears. Click Configure to display the Configure Server
screen.
A Network Prevent for Email Server Configure Server screen is divided into a General section
and an Inline SMTP tab. The General section specifies the server's name, host, and port.
See “Server configuration—basic” on page 253.
Use the Inline SMTP tab to configure different Network Prevent for Email Server features:

Field                    Description

Trial Mode               Trial mode lets you test prevention capabilities
                         without blocking requests. When trial mode is
                         selected, the server detects incidents and creates
                         incident reports, but does not block any messages.
                         Deselect this option to block those messages that
                         are found to violate Symantec Data Loss Prevention
                         policies.

Keystore Password        If you use TLS authentication in a forwarding mode
                         configuration, enter the correct password for the
                         keystore file.

Next Hop Configuration   Select Reflect to operate Network Prevent for Email
                         Server in reflecting mode. Select Forward to
                         operate in forwarding mode.
                         Note: If you select Forward, you must also select
                         Enable MX Lookup or Disable MX Lookup to
                         configure the method that is used to determine the
                         next-hop MTA.

Enable MX Lookup         This option applies only to forwarding mode
                         configurations.

                         Select Enable MX Lookup to perform a DNS query
                         on a domain name to obtain the mail exchange (MX)
                         records for the server. Network Prevent for Email
                         Server uses the returned MX records to select the
                         address of the next hop mail server.

                         If you select Enable MX Lookup, also add one or
                         more domain names in the Enter Domains text
                         box. For example:

                         companyname.com

                         Network Prevent for Email Server performs MX
                         record queries for the domain names that you
                         specify.
                         Note: You must include at least one valid entry in
                         the Enter Domains text box to successfully
                         configure forwarding mode behavior.

Disable MX Lookup        This field applies only to forwarding mode
                         configurations.

                         Select Disable MX Lookup if you want to specify
                         the exact hostname or IP address of one or more
                         next-hop MTAs. Network Prevent for Email Server
                         uses the hostnames or addresses that you specify
                         and does not perform an MX record lookup.

                         If you select Disable MX Lookup, also add one or
                         more hostnames or IP addresses for next-hop MTAs
                         in the Enter Hostnames text box. You can specify
                         multiple entries by placing each entry on a separate
                         line. For example:

                         smtp1.companyname.com
                         smtp2.companyname.com
                         smtp3.companyname.com

                         Network Prevent for Email Server always tries to
                         use the first MTA that you specify in the list. If that
                         MTA is not available, Network Prevent for Email
                         Server tries the next available entry in the list.
                         Note: You must include at least one valid entry in
                         the Enter Hostnames text box to successfully
                         configure forwarding mode behavior.

See the Symantec Data Loss Prevention MTA Integration Guide for Network Prevent for Email
for additional information about configuring Network Prevent for Email Server options.
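
The ordered fallback described for Disable MX Lookup (use the first MTA in the list, and move to the next entry only if that MTA is not available) can be sketched as follows. This Python example is an illustration and not the product's implementation: the function name first_available_mta is hypothetical, and treating an MTA as available when it accepts a TCP connection on port 25 is an assumption, because the availability test that the server uses is not described here.

```python
import socket


def first_available_mta(hostnames, port=25, timeout=5.0,
                        connect=socket.create_connection):
    """Return the first host, in list order, that accepts a connection.

    Returns None if no entry is reachable. The connect parameter exists
    so that the network call can be replaced in a test.
    """
    for host in hostnames:
        try:
            connection = connect((host, port), timeout)
        except OSError:
            continue  # not reachable; try the next entry in the list
        connection.close()
        return host
    return None
```

Because the list is always walked from the top, the order of the entries in the Enter Hostnames text box acts as a priority order in this sketch.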
See “About Symantec Data Loss Prevention administration” on page 82.
See “About the Overview screen” on page 278.
See “Server/Detector Detail screen” on page 283.
See “Server configuration—basic” on page 253.
See “Server controls” on page 251.
In addition to the settings available through the Configure Server screen, you can specify
advanced settings for this server. To specify advanced configuration parameters, click Server
Settings on the server's Server/Detector Detail screen. Use caution when modifying advanced
server settings. Check with Symantec Support before you change any advanced setting.
See “Advanced server settings” on page 285.

See the Symantec Data Loss Prevention online Help for information about advanced server
settings.

Network Prevent for Web Server—basic configuration


Detection servers are configured from each server's individual Configure Server screen. To
display the Configure Server screen, go to the Overview screen (System > Servers and
Detectors > Overview) and click the name of the server in the list. That server's
Server/Detector Detail screen appears. Click Configure to display the Configure Server
screen.
A Network Prevent for Web Server Configure Server screen is divided into a general section,
a Symantec Encryption Server Administration section, and two tabs:
■ General section. This section specifies the server's name, host, and port.
■ Symantec Encryption Server Administration section. This section specifies the Symantec
Encryption Server Name, the Universal Service Protocol Port, and the Credential.
■ ICAP tab. This tab is for configuring the Internet Content Adaptation Protocol (ICAP). Use
the ICAP tab to configure web-based network traffic.
The ICAP tab is divided into four sections:
■ The Trial Mode section enables you to test prevention without blocking traffic. When trial
mode is selected, the server detects incidents and creates incident reports, but it does not
block any traffic. This option enables you to test your policies without blocking traffic. Check
the box to enable trial mode.
■ Click the box in the Security Configuration section to enable Secure ICAP with the Blue
Coat ProxySG server. You also must have a keystore configured and provide the keystore
password when you enable secure ICAP.
See “Configuring a secure ICAP keystore for Network Prevent for Web” on page 2067.
For instructions on setting up the Secure ICAP client configuration with Blue Coat ProxySG,
see the Blue Coat ProxySG documentation at
https://www.symantec.com/docs/DOC10459.html.
■ The Request Filtering section configures traffic filtering criteria:

Field Description

Ignore Requests Smaller Than Specify the minimum body size of HTTP
requests to inspect on this server. The
default value is 4096 bytes. HTTP requests
with bodies smaller than this number are
not inspected.

Ignore Requests from Hosts or Domains Enter the host names or domains whose
requests should be filtered out (ignored).
Enter one host or domain name per line.

Ignore Requests from User Agents Enter the names of user agents whose
requests should be filtered out (ignored).
Enter one agent per line.

■ The Response Filtering section configures the filtering criteria to manage HTTP responses:

Field Description

Ignore Responses Smaller Than Enter the minimum body size of HTTP
responses to inspect on this server. The
default value is 4096 bytes. HTTP
responses with bodies smaller than this
number are not inspected.

Inspect Content Type Specify the MIME content types that you
want this server to monitor. By default, this
field contains content type values for
standard Microsoft Office, PDF, and
plain-text formats. You can add other MIME
content type values. Enter separate content
types on separate lines. For example, to
inspect Excel files enter
application/vnd.ms-excel.

Ignore Responses from Hosts or Domains Enter the host names or domains whose
responses are to be ignored. Enter one host
or domain name per line.

Ignore Responses to User Agents Enter the names of user agents whose
responses are to be ignored. Enter one user
agent per line.

■ Click the OCR Engine tab to add an OCR Engine Configuration profile. Scroll to select
a configuration.
See “Server configuration—basic” on page 705.
See “Creating an OCR configuration” on page 711.
■ The Connection section configures settings for the ICAP connection between an HTTP
proxy server and the Network Prevent for Web Server:

Field                          Description

TCP Port                       Specify the TCP port number that this
                               server is to use to listen to ICAP requests.
                               The same value must be configured on the
                               HTTP proxy sending ICAP requests to this
                               server. The recommended value is 1344.

Maximum Number of Requests     Enter the maximum number of simultaneous
                               ICAP request connections. The default is
                               25.

Maximum Number of Responses    Enter the maximum number of simultaneous
                               ICAP response connections from the HTTP
                               proxy or proxies that are allowed. The
                               default is 25.

Connection Backlog             Enter the maximum number of waiting
                               connections allowed. Each waiting
                               connection means that a user waits at their
                               browser. The minimum value is 1.

See “Configuring Network Prevent for Web Server” on page 2064.
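
The combined effect of the Request Filtering fields described above can be sketched as a single decision function. This Python example is an illustration under stated assumptions, not the server's filtering code: the function name should_inspect_request is hypothetical, a domain entry is assumed to cover its subdomains, and user agents are assumed to be compared as exact strings. Only the 4096-byte default and the rule that smaller request bodies are not inspected come from the table.

```python
DEFAULT_MIN_BODY_BYTES = 4096  # default for "Ignore Requests Smaller Than"


def should_inspect_request(body_size, host, user_agent,
                           ignored_hosts=(), ignored_user_agents=(),
                           min_body_bytes=DEFAULT_MIN_BODY_BYTES):
    """Return True if a request passes all three request filters."""
    # Requests with bodies smaller than the threshold are not inspected.
    if body_size < min_body_bytes:
        return False

    host = host.lower()
    for entry in ignored_hosts:
        entry = entry.lower()
        # Assumption: a domain entry also covers its subdomains.
        if host == entry or host.endswith("." + entry):
            return False

    # Assumption: user agents are matched as exact strings.
    return user_agent not in ignored_user_agents
```

A request with a body of exactly 4096 bytes is inspected in this sketch, because only bodies smaller than the configured value are ignored.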


See “About Symantec Data Loss Prevention administration” on page 82.
See “About the Overview screen” on page 278.
See “Server/Detector Detail screen” on page 283.
See “Server configuration—basic” on page 253.
See “Server controls” on page 251.
In addition to the settings available through the Configure Server screen, you can specify
advanced settings for this server. To specify advanced configuration parameters, click Server
Settings on the server's Server/Detector Detail screen. Use caution when modifying Advanced
Server settings. Check with Symantec Support before you change any advanced setting.
See “Advanced server settings” on page 285.
See the Symantec Data Loss Prevention online Help for information about Advanced Server
settings.

Network Discover/Cloud Storage Discover Server and Network Protect—basic configuration
Detection servers are configured from each server's individual Configure Server screen. To
display the Configure screen for a server, go to the System > Servers and Detectors >
Overview screen and click on the name of the server in the list. That server's Server/Detector
Detail screen is displayed. Click Configure. The server's Configure Server screen is displayed.
See “Modifying the Network Discover/Cloud Storage Discover Server configuration” on page 2083.
A Network Discover Server's Configure Server screen is divided into the following sections:
■ General section. This section is for specifying the server's name, host, and port.
See “Server configuration—basic” on page 253.
■ Discover tab. This tab is for performing the following configurations:
■ Modifying the number of parallel scans that run on this Discover Server.
The maximum count can be increased at any time. After it is increased, any queued
scans that are eligible to run on the Network Discover Server are started. The count
can be decreased only if the Network Discover Server has no running scans. Before
you reduce the count, pause or stop all scans running on the server.
To view the scans running on Network Discover Servers, go to Manage > Discover
Scanning > Discover Targets.

■ Configuring network proxy settings for connecting to the Symantec Information Centric
Encryption (ICE) Cloud.
You can specify an existing network proxy in your setup and, optionally, provide the
authentication credentials for connecting to it. Network Discover uses the proxy server
to communicate with the ICE Cloud whenever file share (File System) scans trigger the
Network Protect: Encrypt File response action.
See “Configuring Network Discover to use a proxy to connect to the Symantec ICE
Cloud for file share scans” on page 2085.

See “About Symantec Data Loss Prevention administration” on page 82.


See “Server/Detector Detail screen” on page 283.
See “Server configuration—basic” on page 253.
See “Server controls” on page 251.
In addition to the settings available through the Configure Server screen, you can also specify
advanced settings for this server. To specify advanced configuration parameters, click Server
Settings on the Server/Detector Detail screen. Use caution when modifying advanced server
settings. It is recommended that you check with Symantec Support before changing any of
the advanced settings.
See “Advanced server settings” on page 285.

Endpoint Server—basic configuration


Detection servers are configured from each server's individual Configure Server screen. To
display the Configure screen for a server, go to the System > Servers and Detectors >
Overview screen and click the name of the server. The Server/Detector Detail screen for
that server is displayed. Click Configure to display the Configure Server screen for that
server.
See “Adding a detection server” on page 273.
The Configure Server screen for an Endpoint Server is divided into a general section and the
following tabs:
■ General. This section is for specifying the server name, host, and port.
See “Server configuration—basic” on page 253.
■ Agent. This section is for adding agent security certificates to the Endpoint Server.
See “Adding and editing agent configurations” on page 2348.
Agent Listener. Use this section to configure the Endpoint Server to listen for connections
from Symantec DLP Agents:

Field Description

Bind address Enter the IP address on which the Endpoint Server listens for communications from
the Symantec DLP Agents. The default IP address is 0.0.0.0 which allows the
Endpoint Server to listen on all host IP addresses.

Port Enter the port over which the Endpoint Server listens for communications from the
Symantec DLP Agents.
Note: Many Linux systems restrict ports below 1024 to root access. The Endpoint
Server cannot be configured to listen for connections from Symantec DLP Agents
to these restricted ports on Linux systems.

Note: If you are using FIPS 140-2 mode for communication between the Endpoint Server and
DLP Agents, do not use Diffie-Hellman (DH) cipher suites. Mixing cipher suites prevents the
agent and Endpoint Server from communicating. You can confirm the current cipher suite setting
by referring to the EndpointCommunications.SSLCipherSuites setting on the Server
Settings page. See “Advanced server settings” on page 285.
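The Agent Listener constraints above can be checked before you enter the values in the console. The following Python sketch is an illustration only; the function name is hypothetical and the check covers just the two rules stated in this section: the bind address must be a valid IP address, and on Linux the port should not be below 1024.

```python
import ipaddress
from typing import List


def check_agent_listener(bind_address: str, port: int, on_linux: bool) -> List[str]:
    """Return a list of problems found with the proposed listener settings."""
    problems = []

    # The bind address must parse as an IPv4 or IPv6 address (0.0.0.0 is valid).
    try:
        ipaddress.ip_address(bind_address)
    except ValueError:
        problems.append(f"{bind_address!r} is not a valid IP address")

    # TCP ports range from 1 to 65535; many Linux systems reserve ports
    # below 1024 for root.
    if not 1 <= port <= 65535:
        problems.append(f"port {port} is outside the range 1-65535")
    elif on_linux and port < 1024:
        problems.append(f"port {port} is below 1024 and is typically restricted to root on Linux")

    return problems
```

An empty list means that the values pass both checks. The port value that you pass in is whatever your deployment uses; this sketch does not assume a product default.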

Single Tier Monitor — basic configuration


Detection servers are configured from each server's individual Configure Server screen. To
display the Configure Server screen, go to the System > Servers and Detectors > Overview
screen and click the name of the server in the list. That server's Server/Detector Detail screen
appears. Click Configure to display the Configure Server screen.
The Single Tier Monitor is a detection server that includes the detection capabilities of the
Network Monitor, Network Discover/Cloud Storage Discover, Network Prevent for Web, Network
Prevent for Email, and the Endpoint Prevent and Endpoint Discover detection servers. Each
of these detection server types is associated with one or more detection "channels." The Single
Server deployment simplifies Symantec Data Loss Prevention administration and reduces
maintenance and hardware costs for small organizations, or for branch offices of larger
enterprises that would benefit from on-site deployments of Symantec Data Loss Prevention.

Configuring the channels for Network Monitor


Network Monitor uses two channels: Packet Capture and SMTP Copy Rule. To configure
Network Monitor, enter your configuration information on both the Packet Capture and SMTP
Copy Rule tabs on the Configure Server screen.
To configure the Packet Capture and SMTP Copy Rule tabs
1 Optional: On the Packet Capture tab of the Configure Server Screen, specify the Source
Folder Override.
The source folder is the directory the server uses to buffer network streams before it
processes them. The recommended setting is to leave the Source Folder Override field
blank to accept the default. If you want to specify a custom buffer directory, type the full
path to the directory.
2 Select the Network Interfaces.
Select the network interface card(s) to use for monitoring.
Note that to monitor a NIC, WinPcap software must be installed on the Network Monitor
Server.
See the Symantec Data Loss Prevention Installation Guide for more information about
NICs.
3 In the Protocol section, check the box for each type of network traffic to capture.
When you initially configure a server, the settings for each selected protocol are inherited
from the system-wide protocol settings. You configure these settings by going to System
> Settings > Protocol. System-wide default settings are listed as Standard. To override
the inherited filtering settings for a protocol, click the name of the protocol. The following
custom settings are available (some settings may not be available for some protocols):
■ IP filter
■ L7 sender filter
■ L7 recipient filter
■ Content filter
■ Search Depth (packets)
■ Sampling rate
■ Maximum wait until written
■ Maximum wait until dropped


■ Maximum stream packets
■ Minimum stream size
■ Maximum stream size
■ Segment Interval
■ No traffic notification timeout (The maximum value for this setting is 360000 seconds.)

4 Optional: On the SMTP Copy Rule tab, specify the Source Folder Override to modify
the source folder where this server retrieves SMTP message files.
You can modify the source folder by entering the full path to a folder. Leave this field blank
to use the default source folder.

Configuring the channel for Network Discover/Cloud Storage Discover


Network Discover/Cloud Storage Discover uses the Discover channel. On the Discover tab,
you can modify the number of parallel scans that run on the Single Tier Monitor by entering a
number in the Maximum Parallel Scans field.

Note: If you plan to use the grid scanning feature to distribute the scanning workload across
multiple detection servers, retain the default value (1).

The maximum count can be increased at any time. After it is increased, any queued scans
that are eligible to run on the Network Discover Server are started. The count can be decreased
only if the Network Discover Server has no running scans. Before you reduce the count, pause
or stop all scans running on the server.
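The rule for changing the Maximum Parallel Scans value can be summarized in a few lines of Python. This is a sketch of the rule as described in this section, not product code, and the function name is hypothetical.

```python
def can_change_max_parallel_scans(current: int, requested: int, running_scans: int) -> bool:
    """Return True if the requested value is allowed under the documented rule."""
    # A server must be able to run at least one scan.
    if requested < 1:
        return False
    # The count can be increased (or left unchanged) at any time.
    if requested >= current:
        return True
    # The count can be decreased only when no scans are running.
    return running_scans == 0
```

For example, lowering the value from 4 to 2 while one scan is running is rejected; pause or stop that scan first.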

Configuring the channel for Network Prevent for Web


Network Prevent for Web uses the ICAP channel. The ICAP channel configuration tab is
divided into four sections: Trial Mode, Request Filtering, Response Filtering, and Connection.

To configure the ICAP tab


1 Verify or change the Trial Mode setting. Trial Mode lets you test prevention without
blocking requests in real time. If you select Trial Mode, Symantec Data Loss Prevention
detects incidents and indicates that it has blocked an HTTP communication, but it does
not block the communication.
2 Verify or modify the filter options for requests from HTTP clients (user agents). The options
in the Request Filtering section are as follows:

Ignore Requests Smaller Than Specifies the minimum body size of HTTP
requests to inspect. (The default is 4096 bytes.)
For example, search strings typed into search
engines such as Yahoo or Google are usually
short. By adjusting this value, you can exclude
those searches from inspection.

Ignore Requests without Attachments Causes the server to inspect only the requests
that contain attachments. This option can be
useful if you are mainly concerned with requests
intended to post sensitive files.

Ignore Requests to Hosts or Domains Causes the server to ignore requests to the hosts
or domains you specify. This option can be useful
if you expect a lot of HTTP traffic between the
domains of your corporate headquarters and
branch offices. You can type one or more host
or domain names (for example,
www.company.com), each on its own line.

Ignore Requests from User Agents Causes the server to ignore requests from user
agents (HTTP clients) you specify. This option
can be useful if your organization uses a program
or language (such as Java) that makes frequent
HTTP requests. You can type one or more user
agent values, each on its own line.
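To reason about how the Request Filtering options combine, it can help to model them. The following Python sketch is an illustrative model only and does not reproduce the server's implementation. In particular, the matching rules are assumptions: a listed domain is taken to match the domain itself and its subdomains, and a user agent is matched by exact string.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class RequestFilter:
    min_body_bytes: int = 4096            # Ignore Requests Smaller Than
    require_attachment: bool = False      # Ignore Requests without Attachments
    ignored_hosts: Set[str] = field(default_factory=set)        # Ignore Requests to Hosts or Domains
    ignored_user_agents: Set[str] = field(default_factory=set)  # Ignore Requests from User Agents

    def _host_ignored(self, host: str) -> bool:
        # Assumption: an entry matches the host itself or any subdomain of it.
        host = host.lower()
        return any(host == entry or host.endswith("." + entry)
                   for entry in (e.lower() for e in self.ignored_hosts))

    def should_inspect(self, host: str, user_agent: str,
                       body_size: int, has_attachment: bool) -> bool:
        """Return True only if no filter option excludes the request."""
        if body_size < self.min_body_bytes:
            return False
        if self.require_attachment and not has_attachment:
            return False
        if self._host_ignored(host):
            return False
        if user_agent in self.ignored_user_agents:
            return False
        return True
```

With the default minimum of 4096 bytes, a short search query is skipped regardless of the other options, which matches the intent described for Ignore Requests Smaller Than.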

3 Verify or modify the filter options for responses from web servers. The options in the
Response Filtering section are as follows:

Ignore Responses Smaller Than Specifies the minimum size of the body of HTTP
responses that are inspected by this server.
(Default is 4096 bytes.)

Inspect Content Type Specifies the MIME content types that Symantec
Data Loss Prevention should monitor in
responses. By default, this field contains
content-type values for Microsoft Office, PDF,
and plain text formats. To add others, type one
MIME content type per line. For example, type
application/word2013 to have Symantec
Data Loss Prevention analyze Microsoft Word
2013 files.

Note that it is generally more efficient to specify
MIME content types at the Web proxy level.

Ignore Responses from Hosts or Domains Causes the server to ignore responses from the
hosts or domains you specify. You can type one
or more host or domain names (for example,
www.company.com), each on its own line.

Ignore Responses to User Agents Causes the server to ignore responses to user
agents (HTTP clients) you specify. You can type
one or more user agent values, each on its own
line.
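Because the Inspect Content Type field expects one MIME content type per line, a mistyped entry is easy to miss. The following Python sketch checks a block of text before you paste it into the field. It is a helper for illustration only: it verifies that each entry has the type/subtype form and uses characters allowed in media type names, and it does not verify that the type is registered or that a web server actually sends it.

```python
import re
from typing import List, Tuple

# type "/" subtype, using the characters permitted in media type names.
_MIME_PATTERN = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*/[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*$"
)


def parse_content_types(text: str) -> Tuple[List[str], List[str]]:
    """Split text into (well-formed entries, rejected entries), one per line."""
    valid, rejected = [], []
    for line in text.splitlines():
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        if _MIME_PATTERN.match(entry):
            valid.append(entry)
        else:
            rejected.append(entry)
    return valid, rejected
```

For example, an entry such as "excel files" is rejected because it has no type/subtype separator, while application/vnd.ms-excel is accepted.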

4 Verify or modify settings for the ICAP connection between the HTTP proxy server and the
Web Prevent Server. The Connection options are as follows:

TCP Port Specifies the TCP port number over which this
server listens for ICAP requests. This number
must match the value that is configured on the
HTTP proxy that sends ICAP requests to this
server. The recommended value is 1344.

Maximum Number of Requests Specifies the maximum number of simultaneous
ICAP request connections from the HTTP proxy
or proxies. The default is 25.

Maximum Number of Responses Specifies the maximum number of simultaneous
ICAP response connections from the HTTP proxy
or proxies. The default is 25.

Connection Backlog Specifies the number of waiting connections
allowed. A waiting connection means that a user
is waiting at the browser for an HTTP response. The
minimum value is 1. If the HTTP proxy gets too
many requests (or responses), the proxy handles
them according to your proxy configuration. You
can configure the HTTP proxy to block any
requests (or responses) greater than this number.

Configuring the channel for Network Prevent for Email


Network Prevent for Email uses the Inline SMTP channel. The Inline SMTP configuration tab
is divided into three sections: Maximum number of connections, Security Configuration,
and Next Hop Configuration.
To configure the Inline SMTP tab
1 Verify or change the Trial Mode setting. Trial Mode lets you test prevention without
blocking requests in real time. If you select Trial Mode, Symantec Data Loss Prevention
detects incidents and indicates that it has blocked an email message, but it does not block
the message.
2 Verify or modify the Maximum number of connections. By default, the maximum number
of connections is 12.
3 If you use TLS authentication in a forwarding mode configuration, enter the correct
password for the keystore file in the Keystore Password field of the Security
Configuration section.
4 In the Next Hop Configuration section, configure reflecting mode or forwarding mode by
modifying the following fields:

Field Description

Next Hop Configuration Select Reflect to operate Network Prevent for
Email Server in reflecting mode. Select Forward
to operate in forwarding mode.
Note: If you select Forward you must also select
Enable MX Lookup or Disable MX Lookup to
configure the method used to determine the
next-hop MTA.

Enable MX Lookup This option applies only to forwarding mode
configurations.

Select Enable MX Lookup to perform a DNS
query on a domain name to obtain the mail
exchange (MX) records for the server. Network
Prevent for Email Server uses the returned MX
records to select the address of the next hop mail
server.

If you select Enable MX Lookup, also add one
or more domain names in the Enter Domains
text box. For example:

companyname.com

Network Prevent for Email Server performs MX
record queries for the domain names that you
specify.
Note: You must include at least one valid entry
in the Enter Domains text box to successfully
configure forwarding mode behavior.

Disable MX Lookup This field applies only to forwarding mode
configurations.

Select Disable MX Lookup if you want to specify
the exact hostname or IP address of one or more
next-hop MTAs. Network Prevent for Email
Server uses the hostnames or addresses that
you specify and does not perform an MX record
lookup.

If you select Disable MX Lookup, also add one
or more hostnames or IP addresses for next-hop
MTAs in the Enter Hostnames text box. You can
specify multiple entries by placing each entry on
a separate line. For example:

smtp1.companyname.com
smtp2.companyname.com
smtp3.companyname.com

Network Prevent for Email Server always tries to
proxy to the first MTA that you specify in the list.
If that MTA is not available, Network Prevent for
Email Server tries the next available entry in the
list.
Note: You must include at least one valid entry
in the Enter Hostnames text box to successfully
configure forwarding mode behavior.
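The order in which next-hop MTAs are tried when Disable MX Lookup is selected can be sketched as follows. This Python example models the behavior described in the table (first entry preferred, then the next available entry) and is not the server's implementation. The function names and the TCP reachability check on port 25 are assumptions for illustration.

```python
import socket
from typing import Callable, Iterable, Optional


def tcp_reachable(host: str, port: int = 25, timeout: float = 3.0) -> bool:
    """Assumed availability check: can a TCP connection be opened to the MTA?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def select_next_hop(mtas: Iterable[str],
                    is_available: Callable[[str], bool] = tcp_reachable) -> Optional[str]:
    """Return the first available MTA in list order, or None if none respond."""
    for host in mtas:
        if is_available(host):
            return host
    return None
```

Because the list order determines preference, place your primary MTA on the first line of the Enter Hostnames text box and fallback MTAs after it.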

Configuring the channel for Endpoint


Endpoint uses the Endpoint channel. You can configure the Endpoint channel on the Agent
tab.
To configure the Agent tab
◆ Configure the Agent Listener fields:

Field Description

Bind address Enter the IP address on which the Endpoint Server listens for communications
from the Symantec DLP Agents. The default IP address is 0.0.0.0 which allows
the Endpoint Server to listen on all host IP addresses.

Port Enter the port over which the Endpoint Server listens for communications from
the Symantec DLP Agents.

Configuring Advanced Server Settings for the Single Tier Monitor


Because the Single Tier Monitor runs multiple channels on the same detection server, you
must modify some Advanced Server Settings to get the best performance from your system.
To modify the Advanced Server Settings on your Single Tier Monitor
1 Log on to the Enforce Server as Administrator.
2 Go to System > Servers and Detectors > Overview.
The Overview page appears.
3 Click the Single Tier Monitor detection server row.
The Server/Detector Detail page appears.
4 Click Server Settings.
The Server/Detector Detail - Advanced Settings page appears.
5 Modify the following settings:

Setting Value

MessageChain.NumChains 32

MessageChain.CacheSize 32

PacketCapture.NUMBER_BUFFER_POOL_PACKETS 1,200,000

PacketCapture.NUMBER_SMALL_POOL_PACKETS 1,000,000

6 Click Save.
See “About Symantec Data Loss Prevention administration” on page 82.
See “About the Overview screen” on page 278.
See “Server/Detector Detail screen” on page 283.
See “Server configuration—basic” on page 253.
See “Server controls” on page 251.
See “Advanced server settings” on page 285.
See the Symantec Data Loss Prevention online Help for information about Advanced Server
settings.

Editing a detector
You can change the name of your detector on the Server/Detector Detail screen.

Editing the name of a detector


1 Go to System > Servers and Detectors > Overview and click on the name of the detector.
The Server/Detector Detail screen appears.
2 Click Edit.
The Edit Detector page appears.
3 Enter a new name for the detector in the Detector Name field.
4 Click Save.

Server and detector configuration—advanced


Symantec Data Loss Prevention provides advanced server and detector configuration settings
for each detection server or detector in your system.

Note: Check with Symantec Support before changing any advanced settings. If you make a
mistake when changing advanced settings, you can severely degrade performance or even
disable the server entirely.

To change an advanced configuration setting for a detection server or detector


1 Go to System > Servers and Detectors > Overview and click on the name of the detection
server.
That server's Server/Detector Detail screen appears.
2 Click Server Settings or Detector Settings, as appropriate.
The Server/Detector Detail - Advanced Settings screen appears.
See Symantec Data Loss Prevention online Help for information about advanced server
configuration.
See “Advanced server settings” on page 285.
3 With the guidance of Symantec Support, modify the appropriate setting(s).
4 Click Save.
Changes to settings on this screen normally do not take effect until you restart the server.
See “Server configuration—basic” on page 253.

Adding a detection server


Add the detection servers that you want to your Symantec Data Loss Prevention system from
the System > Servers and Detectors > Overview screen.
You can add the following types of servers:


■ Network Monitor Server, which monitors network traffic.
■ Network Discover/Cloud Storage Discover Server, which inspects stored data for policy
violations.
■ Network Prevent for Email Server, which prevents SMTP violations.
■ Cloud Prevent for Email Server, which prevents Microsoft Office 365 Exchange traffic
violations.
■ Network Prevent for Web Server, which prevents violations in the FTP, HTTP, and
HTTPS traffic that an ICAP proxy server forwards.
■ Endpoint Prevent, which controls Symantec DLP Agents that monitor and scan endpoints.
■ Single-Tier Server: By selecting the Single-Tier Server option, the detection servers that
you have licensed are installed on the same host as the Enforce Server. The single-tier
server performs detection for the following products (you must have a license for each):
Network Monitor, Network Discover, Network Prevent for Email, Network Prevent for Web,
and Endpoint Prevent.

Note: Symantec recommends that you apply the same hardware and software configuration
to all of the detection servers that you intend to use for grid scans. Symantec Data Loss
Prevention supports grid scans that have up to 11 participating detection servers.

To add a detection server


1 Go to the System Overview screen (System > Servers and Detectors > Overview).
See “About the Overview screen” on page 278.
2 Click Add Server.
The Add Server screen appears.
3 Select the type of server you want to install and click Next.
The Configure Server screen for that detection server appears.
4 To perform the basic server configuration, use the Configure Server screen, then click
Save when you are finished.
See “Network Monitor Server—basic configuration” on page 254.
See “Network Prevent for Email Server—basic configuration” on page 256.
See Symantec Data Loss Prevention Cloud Prevent for Microsoft Office 365 Implementation
Guide for more details.
See “Network Prevent for Web Server—basic configuration” on page 259.
See “Network Discover/Cloud Storage Discover Server and Network Protect—basic
configuration” on page 261.
See “Endpoint Server—basic configuration” on page 262.
5 In addition to the configuration steps specific to each server, you can configure the OCR
Engine or Detection server Inspection Content Size from tabs on this screen.
See OCR Engine configuration.
See Inspection Content Size settings.
6 To return to the System Overview screen, click Done.
Your new server is displayed in the Servers and Detectors list with a status of Unknown.
7 Click on the server to display its Server/Detector Detail screen.
See “Server/Detector Detail screen” on page 283.
8 Click [Recycle] to restart the server.
9 Click Done to return to the System Overview screen.
When the server is finished restarting, its status displays Running.
10 If necessary, click Server Settings on the Server/Detector Detail screen to perform
Advanced Server configuration.
See “Advanced server settings” on page 285.
See Symantec Data Loss Prevention online Help for information about Advanced Server
configuration.
See “Server configuration—basic” on page 253.

Adding a cloud detector


A cloud detector is a Symantec Data Loss Prevention detection service deployed in the
Symantec Cloud. After Symantec has set up your detection service in the cloud, Symantec
sends you an enrollment bundle. This bundle contains the information that you need to set up
the connection from your on-premises Enforce Server to the detection service in the Symantec
Cloud.
The enrollment bundle is a ZIP archive. For security reasons, you should save the unextracted
ZIP file to a location that is not accessible by other users. For example, on a Microsoft Windows
system, save the bundle to a folder such as:

c:\Users\username\downloads

On a Linux system, save the bundle to a directory such as:

/home/username/

See the documentation for your cloud detector for more detailed information about the
enrollment process.
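On a Linux system, saving the bundle under your home directory does not by itself prevent other users from reading it; that depends on the file permissions. The following Python sketch shows one way to restrict the saved ZIP file to its owner. It is a suggestion rather than a documented Symantec step, the function name is hypothetical, and it applies to POSIX systems only (on Windows, use the folder's security settings instead).

```python
import os
import stat


def restrict_to_owner(path: str) -> int:
    """Set the file mode to owner read/write only and return the resulting mode bits."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)   # equivalent to mode 600
    return stat.S_IMODE(os.stat(path).st_mode)
```

For example, call restrict_to_owner with the full path of the saved bundle before you register the detector.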
After you have saved the enrollment bundle, register your cloud detector to enable
communication between it and your on-premises Enforce Server.
To register a cloud detector
1 Log on to the Enforce Server as Administrator.
2 Navigate to System > Servers and Detectors > Overview.
The Overview page appears.
3 Click Add Cloud Detector.
The Add Cloud Detector page appears.
4 Click Browse in the Enrollment Bundle File field.
5 Locate your saved enrollment bundle file, then enter a name in the Detector Name field.
6 Click Enroll Detector.
The Server/Detector Detail screen appears.
7 If necessary, click Detector Settings on the Server/Detector Detail screen to perform
advanced detector configuration.
See “Advanced detector settings” on page 326.
8 Click Done.
It may take several minutes for the Enforce Server administration console to show that the
cloud detector is running. To verify that the detector was added, check the System > Servers
and Detectors > Overview page. The detector should appear in the Servers and Detectors
list with the Connected status.

Removing a server
See the appropriate Symantec Data Loss Prevention Installation Guide for information about
uninstalling Symantec Data Loss Prevention from a server.
An Enforce Server administration console lists the detection servers registered with it on the
System > Servers and Detectors > Overview screen. If Symantec Data Loss Prevention is
uninstalled from a detection server, or that server is stopped or disconnected from the network,
its status is shown as Unknown on the console.
A detection server can be removed (de-registered) from an Enforce Server administration
console. Removing a detection server does not halt it: its Symantec Data Loss Prevention
services continue to operate until you take some action to stop them. Incidents that it detects
are stored on the detection server. If the detection server is re-registered with an Enforce
Server, the stored incidents are then forwarded to Enforce.
To remove (de-register) a detection server from Enforce
1 Go to System > Servers and Detectors > Overview.
See “About the Overview screen” on page 278.
2 In the Servers and Detectors section of the screen, click the red X on a server's status
line to remove it from this Enforce Server administration console.
See “Server controls” on page 251.
3 Click OK to confirm.
The server's status line is removed from the System Overview list.

Importing SSL certificates to Enforce or Discover servers
You can import SSL certificates to the Java trusted keystore on the Enforce or Discover servers.
The SSL certificate can be self-signed (server) or issued by a well-known certificate authority
(CA).
You may need to import an SSL certificate to make secure connections to external servers
such as Active Directory (AD). If a recognized authority has signed the certificate of the external
server, the certificate is automatically added to the Enforce Server. If the server certificate is
self-signed, you must manually import it to the Enforce or Discover Servers.

Table 14-4 Importing an SSL certificate to Enforce or Discover

Step Description

1 Copy the certificate file you want to import to the Enforce Server or Discover Server computer.

2 Change directory to c:\Program Files\Symantec\DataLossPrevention\ServerJRE\1.8.0_181\lib\security
on the Enforce Server or Discover Server computer.

3 Execute the keytool utility with the -importcert option to import the public key certificate
to the Enforce Server or Discover Server keystore:

keytool -importcert -alias new_endpointgroup_alias -keystore ..\lib\security\cacerts -file my-domaincontroller.crt

In this example command, new_endpointgroup_alias is a new alias to assign to the imported
certificate and my-domaincontroller.crt is the path to your certificate.

4 When you are prompted, enter the password for the keystore.

By default, the password is changeit. If you want, you can change the password.

To change the password, use: keytool -storepasswd -keystore ..\lib\security\cacerts

5 Answer Yes when you are asked if you trust this certificate.

6 Restart the Enforce Server or Discover Server.

See “Configuring directory server connections” on page 156.
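If you import certificates on several servers, you may prefer to script the keytool call. The following Python sketch builds the same -importcert command that is shown in Table 14-4, plus a keytool -list command that you can run afterward to confirm that the alias is present. The function names are hypothetical, and keytool must be on the PATH or invoked from the directory that contains it. The keystore and certificate paths are passed through unchanged, so supply the same values that you would type at the command prompt.

```python
import subprocess
from typing import List


def importcert_command(alias: str, keystore: str, cert_file: str) -> List[str]:
    """Build the keytool command that imports a certificate under the given alias."""
    return ["keytool", "-importcert", "-alias", alias,
            "-keystore", keystore, "-file", cert_file]


def list_command(alias: str, keystore: str) -> List[str]:
    """Build the keytool command that displays the keystore entry for an alias."""
    return ["keytool", "-list", "-alias", alias, "-keystore", keystore]


def run(command: List[str]) -> None:
    """Run keytool interactively; it prompts for the keystore password."""
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids quoting problems with paths that contain spaces, such as paths under Program Files. You must still restart the Enforce Server or Discover Server after the import.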

About the Overview screen


To reach the System Overview screen, go to System > Servers and Detectors > Overview.
This screen provides a quick snapshot of system status. It lists information about the Enforce
Server, and each registered detection server, cloud detector, or appliance.
The System Overview screen provides the following features:
■ The Add Server button is used to register a detection server. When this screen is first
viewed after installation, only the Enforce Server is listed. You must register your various
detection servers with the Add Server button. After you register detection servers, they
are listed in the Servers and Detectors section of the screen.
See “Adding a detection server” on page 273.
■ The Add Cloud Detector button is used to register a cloud detector. When this screen is
first viewed after installation, only the Enforce Server is listed. You must register your cloud
detectors with the Add Cloud Detector button. After you register cloud detectors, they are
listed in the Servers and Detectors section of the screen.
■ The Add Appliance button is used to register an appliance. When this screen is first
viewed after installation, only the Enforce Server is listed. You must register your appliances
with the Add Appliance button. After you register your appliances, they are listed in the
Servers and Detectors section of the screen.
See “Adding an appliance” on page 2539.
■ The System Readiness and Appliances Update button is used to access the System
Readiness and Appliances Update screen, where you can run tests to confirm
database update readiness and update appliances.

■ The Upgrade button is for upgrading Symantec Data Loss Prevention to a newer version.
See “About system upgrades” on page 235.
See also the appropriate Symantec Data Loss Prevention Upgrade Guide.
■ The Servers and Detectors section of the screen displays summary information about
the status of each server, detector, or appliance. It can also be used to remove (de-register)
a server, detector, or appliance.
See “Server and detector status overview” on page 280.
■ The Recent Error and Warning Events section shows the last five events of error or
warning severity for any of the servers listed in the Servers and Detectors section.
See “Recent error and warning events list” on page 282.
■ The License section of the screen lists the Symantec Data Loss Prevention individual
products that you are licensed to use.
See “Server configuration—basic” on page 253.
See “About Symantec Data Loss Prevention administration” on page 82.

Configuring the Enforce Server to use a proxy to connect to cloud services
To configure the Enforce Server to use a proxy to connect to cloud services, you must set up
your proxy according to the proxy manufacturer's instructions. Then you configure the Enforce
Server to support the use of the proxy. After setting up your proxy, use these instructions to
complete the setup.
If you have configured the Enforce Server to connect to the Symantec ICE Cloud, Network
Protect uses the configured proxy to connect to the ICE Cloud whenever a SharePoint scan
triggers the SharePoint Encrypt response action.
See “Configuring the Enforce Server to connect to the Symantec ICE Cloud” on page 224.
Network Discover also supports network proxies for connecting to the ICE Cloud during file
share (File System) scans. To configure the network proxy settings for file share scans, you
must update the Network Discover/Cloud Storage Discover Server configuration.
See “Configuring Network Discover to use a proxy to connect to the Symantec ICE Cloud for
file share scans” on page 2085.
To configure the Enforce Server to use a proxy to connect to a cloud service
1 Go to System > Settings > General and click Configure. The Edit General Settings
screen is displayed.
2 In the Enforce to Cloud Proxy Settings section, select one of the following proxy
categories:
■ No proxy, or transparent proxy, or
■ Manual proxy

3 If you choose Manual proxy, fields for a URL, Port, and Proxy is Authenticated appear.
■ Enter the HTTP Proxy URL.
■ Enter a port number.

4 If you are using an authenticated proxy, also enter the following:
■ A user ID
■ A password

Note: The Enforce Server supports basic authentication when using a proxy to connect
to cloud services. For connecting to the ICE Cloud, the Enforce Server supports basic,
NTLM, and Kerberos authentication.

5 Click Save.
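
The note in step 4 refers to basic authentication. As a rough illustration of what an authenticated manual proxy expects from a client, the following Python sketch builds the Proxy-Authorization header value used by HTTP basic authentication. The function name and the credentials are hypothetical, and the sketch does not describe how the Enforce Server implements the connection internally.

```python
import base64


def basic_proxy_auth_header(user_id: str, password: str) -> str:
    """Return a Proxy-Authorization header value for HTTP basic authentication."""
    # Basic authentication joins the user ID and password with a colon
    # and base64-encodes the result.
    credentials = f"{user_id}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Hypothetical credentials, for illustration only.
print("Proxy-Authorization:", basic_proxy_auth_header("proxyuser", "example-password"))
```

Because basic authentication only encodes the credentials and does not encrypt them, the value can be decoded by anyone who can read the traffic between the client and the proxy.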

Server and detector status overview


The Servers and Detectors section of the System Overview screen is reached by System
> Servers and Detectors > Overview. This section of the screen provides a quick overview
of system status.

Table 14-5 Server and detector statuses

Icon Status Description

Starting The server is starting up.


Running The server is running normally without errors.

Running Selected Some Symantec Data Loss Prevention processes on the server are
stopped or have errors. To see the statuses of individual processes, you
must first enable Advanced Process Control on the System Settings
screen.

See “Enabling Advanced Process Control” on page 250.

Stopping The server is in the process of stopping Symantec Data Loss Prevention
services.

See “About Symantec Data Loss Prevention services” on page 101.

Stopped All Symantec Data Loss Prevention processes are stopped.

Unknown The server is experiencing one of the following errors:

■ The Enforce Server is not reachable from the server.
■ Symantec Data Loss Prevention is not installed on the server.
■ A license key has not been configured for the Enforce Server.
■ There is a problem with Symantec Data Loss Prevention account
permissions in Windows.

For each server, the following additional information appears. You can also click on any server
name to display the Server/Detector Detail screen for that server.

Table 14-6 Server and detector status additional information

Column name Description

Messages (Last 10 sec) The number of messages processed in the last 10 seconds.

Messages (Today) The number of messages processed since 12:00 AM today.

Incidents (Today) The number of incidents processed since 12:00 AM today.

For Endpoint Servers, the Messages and Incidents are not aligned. This
is because messages are being processed at the Endpoint and not the
Endpoint Server. However, the incident count still increases.
Incident Queue For the Enforce Server, this is the number of incidents that are in the
database, but do not yet have an assigned status. This number is updated
whenever this screen is generated.

For the other types of servers, this is the number of incidents that have
not yet been written to the Enforce Server. This number is updated
approximately every 30 seconds. If the server is shut down, this number
is the last number updated by the server. Presumably the incidents are
still in the incidents folder.

Message Wait Time The amount of time it takes to process a message after it enters the
system. This data applies to the last message processed. If the server
that processed the last message is disconnected, this is N/A.

To see details about a server or detector


◆ Click on any server name to see additional details regarding that server.
See “Server/Detector Detail screen” on page 283.
To remove a server or detector from an Enforce Server
◆ Click the red X for that server, and then confirm your decision.

Note: Removing (de-registering) a server only disconnects it from this Enforce Server; it does
not stop the detection server from operating.

See “Removing a server” on page 277.

Recent error and warning events list


The Recent Error and Warning Events section of the System > Servers and Detectors >
Overview screen shows the last five events of either error or warning severity for any of the
servers listed in the Servers and Detectors section.

Table 14-7 Recent error and warning events information

Column name Description

Type The yellow triangle indicates a warning; the red octagon indicates an error.
Time The date and time when the event occurred.

Server The name of the server on which the event occurred.

Host The IP address or name of the machine where the server resides. The server and
host names may be the same.

Code The system event code. The Message column provides the code text. Event lists
can be filtered by code number.

Message A summary of the error or warning message that is associated with this event code.

■ To display a list of all error and warning events, click Show all.
■ To display the Event Detail screen for additional information about that particular event,
click an event.
See “About the Overview screen” on page 278.
See “System events reports” on page 165.
See “Server and Detectors event detail” on page 169.

Server/Detector Detail screen


The Server/Detector Detail screen provides detailed information about a single selected
server, detector, or appliance. The Server/Detector Detail screen is also used to control and
configure a server, detector, or appliance.
To display the Server/Detector Detail screen for a particular server or detector
1 Navigate to the System > Servers and Detectors > Overview screen.
2 Click the detection server, detector, or appliance name in the Servers and Detectors list.
See “About the Overview screen” on page 278.
The Server/Detector Detail screen is divided into sections. The sections listed below cover
all server, detector, and appliance types. The system displays only the sections that are
relevant to the type of detection.
Table 14-8 Server Detail screen display information

Server Detail display sections Description

General The General section identifies the server, displays system status and statistics,
and provides controls for starting and stopping the server and its processes.

See “Server controls” on page 251.

Configuration The Configuration section displays the Channels, Policy Groups, Agent
Configuration, User Device, and Configuration Status for the detection server.

All Agents The All Agents section displays a summary of all agents that are assigned to
an Endpoint Server.

Click the number next to an agent status to view agent details on the System
> Agents > Overview > Summary Reports screen.
Note: The system only displays the Agent Summary section for an Endpoint
Server.

Recent Error and Warning Events The Recent Error and Warning Events section displays the
five most recent Warning or Severe events that have occurred on this server.

Click on an event to show event details. Click show all to display all error and
warning events.

See “About system events” on page 164.

All Recent Events The All Recent Events section displays all events of all severities that have
occurred on this server during the past 24 hours.

Click on an event to show event details. Click show all to display all detection
server events.

Deployed Exact Data Profiles The Deployed Exact Data Profile section lists any Exact Data
or Document Profiles you have deployed to the detection server. The system displays the
version of the index in the profile.

See “Data Profiles” on page 375.

See “About the Overview screen” on page 278.


See “Server configuration—basic” on page 253.
See “Server controls” on page 251.
See “System events reports” on page 165.
See “Server and Detectors event detail” on page 169.
Advanced server settings


Click Server Settings on the detection server's System > Servers and Detectors > Overview
> Server/Detector Detail screen to modify the settings on that server.
Use caution when modifying these settings on a server. Contact Symantec Support before
changing any of the settings on this screen. Changes to these settings normally do not take
effect until after the server has been restarted.
You cannot change settings for the Enforce Server from the Server/Detector Detail screen.
The Server/Detector Detail - Advanced Settings screen only displays for detection servers
and detectors.

Note: If you change advanced server settings for Endpoint Servers in a load-balanced
environment, you must apply the same changes to all Endpoint Servers in the load-balanced
environment.

Table 14-9 Detection server advanced settings

Setting Default Description

BoxMonitor.Channels Varies The values are case-sensitive and


comma-separated if multiple.
Although any mix of them can be
configured, the following are the
officially supported configurations:

■ Network Monitor Server: Packet Capture, Copy Rule
■ Discover Server: Discover
■ Endpoint Server: Endpoint
■ Network Prevent for Email: Inline SMTP
■ Network Prevent for Web: ICAP

BoxMonitor.DetectionServerDatabase on Enables the BoxMonitor process to


start the Automated Incident
Remediation Tracking database on
the Detection Server. If you set this
to off, you must start the
remediation tracking database
manually.

BoxMonitor.DetectionServerDatabaseMemory -Xrs -Xms300M -Xmx1024M Any combination of JVM memory
flags can be used.

BoxMonitor.DiskUsageError 90 The amount of disk space filled (as


a percentage) that will trigger a
severe system event. For instance,
if Symantec Data Loss Prevention
is installed on the C drive and this
value is 90, then the detection
server creates a severe system
event when the C drive usage is
above 90%.

BoxMonitor.DiskUsageWarning 80 The amount of disk space filled (as


a percentage) that will trigger a
warning system event. For instance,
if Symantec Data Loss Prevention
is installed on the C drive and this
value is 80, then the detection
server generates a warning system
event when the C drive usage is
above 80%.

BoxMonitor.EndpointServer on Enables the Endpoint Server.

BoxMonitor.EndpointServerMemory -Xrs -Xms300M -Xmx4096M Any combination of JVM memory
flags can be used. For example: -Xrs -Xms300m -Xmx1024m.

BoxMonitor.FileReader on If off, the BoxMonitor cannot start


the FileReader, although it can still
be started manually.

BoxMonitor.FileReaderMemory -Xrs -Xms1200M -Xmx4G FileReader JVM command-line
arguments.

BoxMonitor.HeartbeatGapBeforeRestart 960000 The time interval in milliseconds that


the BoxMonitor waits for a monitor
process (for example, FileReader,
IncidentWriter) to report the
heartbeat. If the heartbeat is not
received within this time interval the
BoxMonitor restarts the process.

BoxMonitor.IncidentWriter on If off, the BoxMonitor cannot start


the IncidentWriter in the two-tier
mode, although it can still be started
manually. This setting has no effect
in the single-tier mode.

BoxMonitor.IncidentWriterMemory -Xrs IncidentWriter JVM command-line


arguments. For example: -Xrs

BoxMonitor.InitialRestartWaitTime 5000 The time interval in milliseconds that


the BoxMonitor waits after restarting
a monitor process, such as FileReader
or IncidentWriter.

BoxMonitor.MaxRestartCount 3 The number of times that a process


can be restarted in one hour before
generating a SEVERE system
event.

BoxMonitor.MaxRestartCountDuringStartup 5 The maximum times that the


monitor server will attempt to restart
on its own.

BoxMonitor.PacketCapture on If off, the BoxMonitor cannot start


PacketCapture, although it can still
be started manually. The
PacketCapture channel must be
enabled for this setting to work.

BoxMonitor.PacketCaptureDirectives -Xrs PacketCapture command line


parameters (in Java). For example:
-Xrs

BoxMonitor.ProcessLaunchTimeout 30000 The time interval (in milliseconds)


for a monitor process (e.g.
FileReader) to start.

BoxMonitor.ProcessShutdownTimeout 45000 The time interval (in milliseconds)


allotted to each monitor process to
shut down gracefully. If the process
is still running after this time the
BoxMonitor attempts to kill the
process.

BoxMonitor.RequestProcessor on If off, the BoxMonitor cannot start


the RequestProcessor, although it
can still be started manually. The
Inline SMTP channel must be
enabled for this setting to work.

BoxMonitor.RequestProcessorMemory -Xrs -Xms300M -Xmx1300M Any combination of JVM memory
flags can be used. For example: -Xrs -Xms300M -Xmx1300M

BoxMonitor.RmiConnectionTimeout 15000 The time interval (in milliseconds)


allowed to establish connection to
the RMI object.

BoxMonitor.RmiRegistryPort 37329 The TCP port on which the


BoxMonitor starts the RMI registry.

BoxMonitor.StatisticsUpdatePeriod 10000 The monitor statistics are updated


after this time interval (in
milliseconds).

Classification.WebserviceLogRetentionDays 7 Specifies the number of days


classification web service logs are
retained.

ContentExtraction.DefaultCharsetForSubFileName N/A Defines the default character set


that is used in decoding the
sub-filename if the charset
conversion fails.

ContentExtraction.EnableMetaData off Allows detection on file metadata.


If the setting is turned on, you can
detect metadata for Microsoft Office
and PDF files. For Microsoft Office
files, OLE metadata is supported,
which includes the fields Title,
Subject, Author, and Keywords. For
PDF files, only Document
Information Dictionary metadata is
supported, which includes fields
such as Author, Title, Subject,
Creation, and Update dates.
Extensible Metadata Platform
(XMP) content is not detected. Note
that enabling this metadata
detection option can cause false
positives.

ContentExtraction.ImageExtractorEnabled 1 Allows you to adjust or turn off


content extraction for Form
Recognition.

The default setting, 1, loads the


Image Extractor plug-in on demand.
If one or more Form Recognition
rules are used, the Dynamic Image
Extractor plug-in automatically loads
on the detection server when
corresponding policy updates are
received. When Form Recognition
rules are deleted or disabled, the
plug-in automatically unloads. This
option prevents the Dynamic Image
Extractor plug-in from running if
Form Recognition is not being used.

Enter 0 to disable the Image


Extractor plug-in. This setting
prevents Form Recognition from
extracting images, effectively
disabling the feature.

Enter 2 if you want the Image Extractor plug-in to load when the
content extraction service launches after the detection server starts up.
The plug-in continues to run regardless of whether Form Recognition
policies have been configured.

ContentExtraction.LongContentSize 1M If the message component exceeds


this size (in bytes) then the
ContentExtraction.LongTimeout
is used instead of
ContentExtraction.ShortTimeout.

ContentExtraction.LongTimeout Varies The default value for this setting


varies depending on detection
server type (60,000 or 120,000).

The time interval (in milliseconds) given to the ContentExtractor
to process a document larger than ContentExtraction.LongContentSize.
If the document cannot be processed within the specified time, it is
reported as unprocessed. This value should be greater than
ContentExtraction.ShortTimeout and less than
ContentExtraction.RunawayTimeout.

ContentExtraction.MarkupAsText off Bypasses Content Extraction for


files that are determined to be XML
or HTML. This should be used in
cases such as web pages
containing data in the header block
or script blocks. Default is off.

ContentExtraction.MaxContentSize 30M The maximum size (in MB) of the


document that can be processed by
the ContentExtractor.

ContentExtraction.MaxNumImagesToExtract 10 The maximum number of images to


extract from PDF files and
multi-page TIFF documents.

ContentExtraction.RunawayTimeout 300,000 The time interval (in milliseconds)


given to the ContentExtractor to
finish processing of any document.
If the ContentExtractor does not
finish processing some document
within this time it will be considered
unstable and it will be restarted.
This value should be significantly greater than
ContentExtraction.LongTimeout.

ContentExtraction.ShortTimeout 30,000 The time interval (in milliseconds)


given to the ContentExtractor to
process a document smaller than
ContentExtraction.LongContentSize.
If the document cannot be
processed within the specified time
it is reported as unprocessed. This
value should be less than
ContentExtraction.LongTimeout.
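
The relationship among the ContentExtraction size and timeout settings described above can be sketched as follows. This is a minimal Python illustration, not product code: the class and method names are hypothetical, the byte value used for the 1M default is an assumption, and 60,000 is only one of the two LongTimeout defaults mentioned in the table.

```python
from dataclasses import dataclass


@dataclass
class ExtractionTimeouts:
    """Hypothetical container for the ContentExtraction timeout settings."""
    long_content_size: int = 1_000_000   # assumed byte value for the "1M" default
    short_timeout_ms: int = 30_000       # ContentExtraction.ShortTimeout
    long_timeout_ms: int = 60_000        # ContentExtraction.LongTimeout (varies by server type)
    runaway_timeout_ms: int = 300_000    # ContentExtraction.RunawayTimeout

    def validate(self) -> None:
        # The table recommends ShortTimeout < LongTimeout < RunawayTimeout.
        if not (self.short_timeout_ms < self.long_timeout_ms < self.runaway_timeout_ms):
            raise ValueError("expected ShortTimeout < LongTimeout < RunawayTimeout")

    def timeout_for(self, component_size: int) -> int:
        # Components larger than LongContentSize get the long timeout.
        if component_size > self.long_content_size:
            return self.long_timeout_ms
        return self.short_timeout_ms


settings = ExtractionTimeouts()
settings.validate()
print(settings.timeout_for(200_000))    # small component: short timeout applies
print(settings.timeout_for(5_000_000))  # large component: long timeout applies
```

Checking the ordering before changing any one of the three timeouts helps avoid a configuration in which the runaway timeout fires before the long timeout has elapsed.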

ContentExtraction.TemporaryDirectory N/A Specifies the directory for temporary


content extraction files.

ContentExtraction.TrackedChanges off Allows detection of content that has


changed over time (Track Changes
content) in Microsoft Office
documents.
Note: Using the foregoing option
might reduce the accuracy rate for
IDM and data identifiers. The default
is set to off (disallow).

To index content that has changed over time, set
ContentExtraction.TrackedChanges=on in the
Indexer.properties file. The default and recommended setting
is off.

DDM.MaxBinMatchSize 30,000,000 The maximum size (in bytes) used


to generate the MD5 hash for an
exact binary match in an IDM. This
setting should not be changed. The
following conditions must be
matched for IDM to work correctly:

■ This setting must be identical to the max_bin_match_size setting
on the Enforce Server in the indexer.properties file.
■ This setting must be smaller than or equal to the
FileReader.FileMaxSize value.
■ This setting must be smaller than or equal to the
ContentExtraction.MaxContentSize value on the Enforce Server
in the indexer.properties file.

Note: Changing the first or third


item in the list requires re-indexing
all IDM files.

Detection.EncodingGuessingDefaultEncoding ISO-8859-1 Specifies the backup encoding


assumed for a byte stream.

Detection.EncodingGuessingEnabled on Designates whether the encoding


of unknown byte streams should be
guessed.

Detection.EncodingGuessingMinimumConfidence 50 Specifies the confidence level


required for guessing the encoding
of unknown byte streams.

Detection.MessageTimeoutReportIntervalInSeconds 3600 Number of seconds between each


System Event published to display
the number of messages that have
timed out recently. These System
Events are scheduled to be
published at a fixed rate, but will be
skipped if no messages have timed
out in that period.

DI.MaxViolations 100 Specifies the maximum number of


violations allowed with data
identifiers.

Discover.CountAllFilteredItems false Provides more accurate scan


statistics by counting the items in
folders skipped because of filtering.

Setting the value to false enables


optimized Discover path filters,
which improve performance but may
occasionally lead to unexpected
filter behavior. Optimized filters
normalize slashes, truncate filter
strings before wildcard characters,
and remove trailing slashes.
Therefore, the filter string /Fol*der
will match /Folder, but it will also
match /FolXYZ.

Set this value to true to disable


optimized Discover path filters.
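
The optimized filter behavior described for Discover.CountAllFilteredItems can be approximated with a short Python sketch. It assumes that the wildcard characters are * and ?, and that an optimized filter is applied as a prefix match, which is inferred from the /Fol*der example. The function names are hypothetical and this is not the product's implementation.

```python
def optimize_filter(filter_string: str) -> str:
    """Normalize slashes, truncate before the first wildcard, drop trailing slashes."""
    normalized = filter_string.replace("\\", "/")
    for index, char in enumerate(normalized):
        if char in "*?":  # assumed wildcard characters
            normalized = normalized[:index]
            break
    return normalized.rstrip("/")


def optimized_match(path: str, filter_string: str) -> bool:
    """Assumed prefix comparison against the optimized filter string."""
    return path.replace("\\", "/").startswith(optimize_filter(filter_string))


print(optimize_filter("/Fol*der"))             # the filter is reduced to "/Fol"
print(optimized_match("/Folder", "/Fol*der"))  # matches, as intended
print(optimized_match("/FolXYZ", "/Fol*der"))  # also matches, which may be unexpected
```

The sketch shows why a filter that contains a wildcard in the middle of a name can match more folders than intended when optimized filters are in use.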

Discover.Exchange.FollowRedirects true Specifies whether to follow


redirects. Symantec Data Loss
Prevention follows redirects only
from the public root folder.

Discover.Exchange.ScanHiddenItems false Scan hidden items in Exchange


repositories, when set to true.

Discover.Exchange.UseSecureHttpConnections true Specifies whether connections to


Exchange repositories and Active
Directory are secure when using the
Exchange Web Services crawler.

Discover.IgnorePstMessageClasses IPM.Appointment, IPM.Contact, IPM.Task,
REPORT.IPM.Note.DR, REPORT.IPM.Note.IPNRN This setting specifies a
comma-separated list of .pst message classes. All items in a
.pst file that have a message class in the list will be ignored
(no attempt will be made to extract the .pst item). This setting
is case-sensitive.

Discover.IncludePstMessageClasses IPM.Note This setting specifies a


comma-separated list of .pst
message classes. All items in a
.pst file that have a message class
in the list will be included.

When both the include setting and


the ignore setting are defined,
Discover.IncludePstMessageClasses
takes precedence.

Discover.PollInterval 10000 Specifies the time interval (in


milliseconds) at which Enforce
retrieves data from the Discover
monitor while scanning.

Discover.Sharepoint.FetchACL true Controls ACL fetching for integrated
SharePoint scans. Set to false to turn off ACL fetching. The default value
is true (on).

Discover.Sharepoint.SocketTimeout 60000 Sets the timeout value of the socket


connection (in milliseconds)
between the Network Discover
server and the SharePoint target.

Discover.ValidateSSLCertificates false Set to true to enable validation of


the SSL certificates for the HTTPS
connections for SharePoint and
Exchange targets. When validation
is enabled, scanning SharePoint or
Exchange servers using self-signed
or untrusted certificates fails. If the
SharePoint web application or
Exchange server is signed by a
certificate issued by a certificate
authority (CA), then the server
certificate or the server CA
certificate must reside in the Java
trusted keystore used by the
Discover Server. If the certificate is
not in the keystore, you must import
it manually using the keytool
utility.

See “Importing SSL certificates to


Enforce or Discover servers”
on page 277.

EDM.HighlightAllMatchesInProximity false If false (default), the system


highlights the minimum number of
matches, starting from the leftmost.
For example, if the EDM policy is
configured to match 3 out of 8
column fields in the index, only the
first 3 matches are highlighted in the
incident snapshot.

If true, the system highlights all


matches occurring in the proximity
window, including duplicates. For
example, if the policy is configured
to match 3 of 8 and there are 7
matches occurring within the
proximity window, the system
highlights all 7 matches in the
incident snapshot.

EDM.MatchCountVariant 3 Specifies how matches are counted.


■ 1 - Counts the total number of
token sets matched.
■ 2 - Counts the number of unique
token sets matched.
■ 3 - Counts the number of unique
super sets of token sets.
(default)

See “Configuring Advanced Settings


for EDM policies” on page 557.

EDM.MaximumNumberOfMatchesToReturn 100 Defines a top limit on the number of


matches returned from each RAM
index search.

See “Configuring Advanced Settings


for EDM policies” on page 557.

EDM.RunProximityLogic true If true, runs the token proximity


check.

See “Configuring Advanced Settings


for EDM policies” on page 557.

EDM.SimpleTextProximityRadius 35 Number of tokens that are


evaluated together when the
proximity check is enabled.

See “Configuring Advanced Settings


for EDM policies” on page 557.

EDM.TokenVerifierEnabled false If enabled (true), the server


validates tokens for Chinese,
Japanese, and Korean (CJK)
keywords.

Default is disabled (false).



EndpointCommunications.AllConnInboundDataThrottleInKBPS 0 If enabled, limits the transfer rate of
all inbound traffic in kilobits per second.

Default is disabled.

Changes to this setting apply to all


new connections. Changes do not
affect existing connections.

EndpointCommunications.AllConnOutboundDataThrottleInKBPS 0 If enabled, limits the transfer rate of
all outbound traffic in kilobits per second.

Default is disabled.

Changes to this setting apply to all


new connections. Changes do not
affect existing connections.

EndpointCommunications.ApplicationHandshakeTimeoutInSeconds 60 Maximum time for server to wait for
each round trip during application handshake communications before
closing the server-to-agent connection.

Applies to the duration of time


between when the agent accepts
the TCP connection and when the
agent receives the handshake
message. This duration includes the
SSL handshake and the agent
receiving the HTTP headers. If the
process exceeds the specified
duration, the connection closes.

Changes to this setting apply to all


new connections. Changes do not
affect existing connections.

EndpointCommunications.MaxActiveAgentsPerServer 90000 Sets the maximum number of


agents associated with a given
server at any moment in time.

This setting is implemented after the


next Endpoint Server restart.

EndpointCommunications.MaxActiveAgentsPerServerGroup 150000 Sets the maximum number of
agents that will be associated with a given group of servers behind the
same local load balancer at any moment in time. Used for maximum
sizes of caches for internal endpoint features.

This setting is implemented after the


next Endpoint Server restart.

EndpointCommunications.MaxConcurrentConnections 90000 Sets the maximum number of


simultaneous connections to allow.

Changes to this setting apply to all


new connections. Changes do not
affect existing connections.

EndpointCommunications.MaxConnectionLifetimeInSeconds 86400 (1 day) Sets the maximum time to allow a
connection to remain open. Do not set connections to remain open
indefinitely. Connections that close ensure that SSL session keys are
frequently updated to improve security. This timeout only applies
during the normal operation phase of a connection, after the SSL
handshake and application handshake phases of a connection.

This setting is implemented


immediately to all connections.

EndpointCommunications.ShutdownTimeoutInMillis 5000 (5 seconds) Sets the maximum time to wait to


gracefully close connections during
shutdown before forcing
connections to close.

This setting is implemented


immediately to all connections.

EndpointCommunications.SSLCipherSuites TLS_RSA_WITH_AES_128_CBC_SHA Lists the allowed SSL cipher suites.
Enter multiple entries, separated by commas.

Changes to this setting apply to all


new connections. Changes do not
affect existing connections. You
must restart the Endpoint Server for
changes you make to take effect.
See “Server controls” on page 251.

If you are using FIPS 140-2 mode


for communication between the
Endpoint Server and DLP Agents,
do not use Diffie-Hellman (DH)
cipher suites. Mixing cipher suites
prevents the agent and Endpoint
Server from communicating.

EndpointCommunications.SSLSessionCacheTimeoutInSeconds 86400 Sets the maximum SSL session
entry lifetime in the SSL session cache.

The default setting equals one day.

This setting is implemented after the
next Endpoint Server restart.

EndpointMessageStatistics.MaxFileDetectionCount 100 The maximum number of times a


valid file will be scanned. The file
must not cause an incident. After
exceeding this number, a system
event is generated recommending
that the file be filtered out.

EndpointMessageStatistics.MaxFolderDetectionCount 1800 The maximum number of times a


valid folder will be scanned. The
folder must not cause an incident.
After exceeding this number, a
system event is generated
recommending that the file be
filtered out.

EndpointMessageStatistics.MaxMessageCount 2000 The maximum number of times a


valid message will be scanned. The
message must not cause an
incident. After exceeding this
number, a system event is
generated recommending that the
file be filtered out.

EndpointMessageStatistics.MaxSetSize 3 The maximum number of hosts displayed
in the list of hosts from which valid files, folders, and messages come.
When a system event for
EndpointMessageStatistics.MaxFileDetectionCount,
EndpointMessageStatistics.MaxFolderDetectionCount, or
EndpointMessageStatistics.MaxMessageCount is generated,
Symantec Data Loss Prevention lists the host machines where these
system events were generated. This setting limits the number of hosts
displayed in the list.

EndpointServer.Discover.ScanStatusBatchInterval 60000 The interval of time in milliseconds


the Endpoint Server accumulates
Endpoint Discover scan statuses
before sending them to the Enforce
Server as a batch.

EndpointServer.Discover.ScanStatusBatchSize 1000 The number of scan statuses the


Aggregator accumulates before
sending them to the Enforce Server
as a batch. The Endpoint Server
forwards a batch of statuses to the
Enforce Server when the status
count reaches the configured value.

The batch is forwarded to the


Enforce Server when any of the
thresholds for the following settings
are met:

■ EndpointServer.Discover.ScanStatusBatchInterval
■ EndpointServer.Discover.ScanStatusBatchSize
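
The either-threshold behavior described above can be illustrated with a minimal Python sketch. The class name `ScanStatusBatcher`, its method names, and the injected `clock` parameter are hypothetical and exist only for illustration; this is not the Endpoint Server implementation.

```python
import time


class ScanStatusBatcher:
    """Hypothetical batcher: flushes when either threshold is reached."""

    def __init__(self, batch_interval_ms=60000, batch_size=1000, clock=time.monotonic):
        self.batch_interval_ms = batch_interval_ms  # mirrors ScanStatusBatchInterval
        self.batch_size = batch_size                # mirrors ScanStatusBatchSize
        self._clock = clock                         # returns seconds; injectable for testing
        self._pending = []
        self._window_start = clock()

    def add(self, status):
        """Queue one status; return the flushed batch, or None if no threshold was met."""
        self._pending.append(status)
        elapsed_ms = (self._clock() - self._window_start) * 1000
        if len(self._pending) >= self.batch_size or elapsed_ms >= self.batch_interval_ms:
            batch, self._pending = self._pending, []
            self._window_start = self._clock()
            return batch
        return None
```

In this sketch the interval is checked only when a new status arrives; a real server would presumably also flush on a timer.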

EndpointServer.EndpointSystemEventQueueSize 20000 The maximum number of system


events that can be stored in the
endpoint agent's queue to be sent
to the Endpoint Server. If the
database connection is lost or some
other occurrence results in a
massive number of system events,
any additional system events that
occur after this number is reached
are discarded. This value can be
adjusted according to memory
requirements.

EndpointServer.MaxPercentageMemToStoreEndpointFiles 60 The maximum amount (in
percentage) of memory to use to
store shadow cache files.

EndpointServer.MaxTimeToKeepEndpointFilesOpen 20000 The time interval (in minutes) that


the endpoint file is kept open or the
file size can exceed the
EndpointServer.MaxEndpointFileSize setting,
whichever occurs first.

EndpointServer.MaxTimeToWaitForWriter 1000 The maximum time (in milliseconds)


that the agent will wait to connect
to the server.

EndpointServer.NoOfRecievers 15 The number of endpoint shadow


cache file receivers.

EndpointServer.NoOfWriters 10 The number of endpoint shadow


cache file writers.

FileReader.MaxFileSize 30M The maximum size (in MB) of a


message to be processed. Larger
messages are truncated to this size.
To process large files, ensure that
this value is equal to or greater than
the value of
ContentExtraction.MaxContentSize.

FileReader.MaxFileSystemCrawlerMemory 1024M The maximum memory that is


allocated for the File System
Crawler. If this value is less than
FileReader.MaxFileSize, then
the greater of the two values is
assigned.
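
The "greater of the two values" rule can be expressed as a short Python sketch. The helper names `parse_megabytes` and `effective_crawler_memory_mb` are hypothetical, and the sketch assumes both values are written with an `M` suffix, as the defaults in this table are.

```python
def parse_megabytes(value):
    """Parse a value such as '30M' into an integer number of megabytes."""
    text = value.strip().upper()
    if not text.endswith("M"):
        raise ValueError(f"expected a value ending in 'M', got {value!r}")
    return int(text[:-1])


def effective_crawler_memory_mb(max_crawler_memory="1024M", max_file_size="30M"):
    """Return the memory (in MB) assigned to the crawler: the greater of the two settings."""
    return max(parse_megabytes(max_crawler_memory), parse_megabytes(max_file_size))
```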

FileReader.MaxReadGap 15 The time that a child process can


have data but not have read
anything before it stops sending
heartbeats.

FileReader.ScheduledInterval 1000 The time interval (in milliseconds)


between drop folder checks by the
filereader. This affects Copy Rule,
Packet Capture, and File System
channels only.

FileReader.TempDirectory Path to a secure directory as specified in the
filereader.temp.io.dir attribute in the FileReader.properties
configuration file.

A secure directory on the detection
server in which to store temporary
files for the file reader.

FormRecognition.ALIGNMENT_COEFFICIENT 85.00 A threshold on a scale from 0 to


100, indicating how well an image
should align with an indexed gallery
form in order to create an incident.

FormRecognition.CANONICAL_FORM_WIDTH 930 The width in pixels to which all


images are internally resized for
form recognition.

Icap.AllowHosts any The default value of "any" permits


all systems to make a connection
to the Network Prevent for Web
Server on the ICAP service port.
Replacing "any" with the IP address
or Fully-Qualified Domain Name
(FQDN) of one or more systems
restricts ICAP connections to just
those designated systems. To
designate multiple systems,
separate their IP addresses or
FQDNs by commas.

Icap.AllowStreaming false If true, ICAP output is streamed to


the proxy directly without buffering
the ICAP request first.

Icap.BindAddress 0.0.0.0 IP address to which a Network


Prevent for Web Server listener
binds. When BindAddress is
configured, the server will only
answer a connection to that IP
address. The default value of
0.0.0.0 is a wild card that permits
listening to all available addresses
including 127.0.0.1.

Icap.BufferSize 3K The size (in kilobytes) of the


memory buffer used for ICAP
request streaming and chunking.
The streaming can happen only if
the request is larger than
FileReader.MaxFileSize and the
request has a Content-Length
header.

Icap.DisableHealthCheck false If true, disables the ICAP periodic


self-check. If false, enables the
ICAP periodic self-check. This
setting is useful for debugging to
remove clutter produced by
self-check requests from the logs.

Icap.EnableIncidentSuppression true Enables the Incident Suppression


cache for Gmail Tablet ICAP traffic.

Icap.EnableTrace false If set to true, protocol debug tracing


is enabled once a folder is specified
using the Icap.TraceFolder setting.

Icap.ExchangeActiveSyncCommandsToInspect SendMail A comma-separated, case-sensitive


list of ActiveSync commands which
need to be sent through Symantec
Data Loss Prevention detection. If
this parameter is left blank,
ActiveSync support is disabled. If
this parameter is set to "any", all
ActiveSync commands are
inspected.
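
The three cases described for this setting (blank, "any", and a case-sensitive list) can be sketched in Python as follows. The function name `should_inspect` is hypothetical, and trimming whitespace around list items is an assumption that the guide does not confirm.

```python
def should_inspect(command, setting="SendMail"):
    """Decide whether an ActiveSync command is sent through detection."""
    value = setting.strip()
    if not value:
        return False  # blank: ActiveSync support is disabled
    if value == "any":
        return True   # all ActiveSync commands are inspected
    # Assumption: whitespace around list items is ignored.
    allowed = [item.strip() for item in value.split(",")]
    return command in allowed  # case-sensitive comparison
```

With the default value, `should_inspect("SendMail")` is true, while `should_inspect("sendmail")` is false because the list is case-sensitive.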

Icap.IncidentSuppressionCacheCleanupInterval 120000 The time interval in milliseconds for


running the Incident Suppression
cache clean-up thread.

Icap.IncidentSuppressionCacheTimeout 120000 The time in milliseconds to


invalidate the Incident Suppression
cache entry.

Icap.LoadBalanceFactor 1 The number of web proxy servers


that a Network Prevent for Web
Server is able to communicate
with. For example, if the server is
configured to communicate with 3
proxies, set the
Icap.LoadBalanceFactor value
to 3.

Icap.SpoolFolder N/A This value is needed for ICAP


Spools.

Icap.TraceFolder N/A The fully qualified name of the folder


or directory where protocol debug
trace data is stored when the
Icap.EnableTrace setting is true.
By default, the value for this setting
is left blank.

ImagePreclassifier.ENABLE_FORM_RECOGNITION_PRECLASSIFIER true Determines what types of images
are processed for form recognition.
If true, Symantec Data Loss
Prevention filters out colorful
photographs, images such as logos,
email signatures, and other images
that are not characteristic of forms.
If false, Symantec Data Loss
Prevention processes all images.

ImagePreclassifier.ENABLE_OCR_PRECLASSIFIER true Determines what types of images


are processed for optical character
recognition (OCR). If true,
Symantec Data Loss Prevention
filters out colorful photographs,
images such as logos, email
signatures, and other images that
do not include meaningful text. If
false, Symantec Data Loss
Prevention processes all images.

ImageRecognition.NUM_WORKER_THREADS 2 The number of threads in the pool


used by the image recognition
detection process. The value for this
setting should equal half of the
number of physical cores on your
system.

IncidentDetection.IncidentLimitResetTime 86400000 Specifies the time frame (in
milliseconds) used by the
IncidentDetection.MaxIncidentsPerPolicy
setting. The default setting
86400000 equals one day.

IncidentDetection.MaxContentLength 2000000 Applies only to regular expression


rules. On a per-component basis,
only the first MaxContentLength
number of characters are scanned
for violations. The default
(2,000,000) is equivalent to > 1000
pages of typical text. The limiter
exists to prevent regular expression
rules from taking too long.

IncidentDetection.MaxIncidentsPerPolicy 10000 Defines the maximum number of
incidents detected by a specific
policy on a particular monitor within
the time frame specified in the
IncidentDetection.IncidentLimitResetTime
setting.

The default is 10,000 incidents per
policy per time limit.
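
The interaction between the incident limit and the reset time can be illustrated with a rough Python sketch. The class `IncidentLimiter` is hypothetical, and the fixed-window counting it uses is an assumption; the guide does not state how the detection server measures the time frame.

```python
from collections import defaultdict


class IncidentLimiter:
    """Hypothetical per-policy limiter using a fixed time window (an assumption)."""

    def __init__(self, max_incidents=10000, reset_time_ms=86400000):
        self.max_incidents = max_incidents  # mirrors MaxIncidentsPerPolicy
        self.reset_time_ms = reset_time_ms  # mirrors IncidentLimitResetTime
        self._window_start = {}
        self._count = defaultdict(int)

    def record(self, policy, now_ms):
        """Return True if the incident is within the limit for this policy."""
        start = self._window_start.get(policy)
        if start is None or now_ms - start >= self.reset_time_ms:
            # Start a new window and reset the counter for this policy.
            self._window_start[policy] = now_ms
            self._count[policy] = 0
        if self._count[policy] >= self.max_incidents:
            return False
        self._count[policy] += 1
        return True
```

Each policy is counted independently, so one policy reaching its limit does not affect incidents from another policy.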

IncidentDetection.MessageWaitSevere 240 The number of minutes to wait


before sending a severe system
event about message wait times.

IncidentDetection.MessageWaitWarning 60 The number of minutes to wait


before sending a warning system
event about message wait times.

IncidentDetection.MinNormalizedSize 30 This setting applies to IDM


detection. It MUST be kept in sync
with the corresponding setting in the
Indexer.properties file on the
Enforce Server (which applies to
indexing). Derivative detections only
apply to messages when their
normalized content is greater than
this setting. If the normalized
content size is less than this setting,
IDM detection does a straight binary
match.

IncidentDetection.patternConditionMaxViolations 100 The maximum number of matches
a detection server reports. The
detection server does not report
more matches than the value of the
IncidentDetection.patternConditionMaxViolations
parameter, even if more matches exist.

IncidentDetection.StopCachingWhenMemoryLowerThan 400M Instructs Detection to stop caching


tokenized and cryptographic content
between rule executions if the
available JVM memory drops below
this value (in megabytes). Setting
this attribute to 0 enables caching
regardless of the available memory
and is not recommended because
OutOfMemoryErrors may occur.

Setting this attribute to a value close


to, or larger than, the value of the
-Xmx option in
BoxMonitor.FileReaderMemory
effectively disables the caching.

Note that setting this value too low


can have severe performance
consequences.

IncidentDetection.TrialMode false Prevention trial mode setting to


generate prevention incidents
without having a prevention setup.

If true, SMTP incidents coming from


the Copy Rule and Packet Capture
channels appear as if they were
prevented and HTTP incidents
coming from Packet Capture
channel appear as if they were
prevented.

IncidentWriter.BacklogInfo 1000 The number of incidents that collect


in the log before an information level
message about the number of
messages is generated.

IncidentWriter.BacklogSevere 10000 The number of incidents that collect


in the log before a severe level
message about the number of
messages is generated.

IncidentWriter.BacklogWarning 3000 The number of incidents that collect


in the log before a warning level
message about the number of
messages is generated.
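
The three IncidentWriter backlog thresholds form an escalation ladder, which the following Python sketch illustrates. The function name `backlog_level` is hypothetical, and treating a threshold as reached when the count equals it is an assumption.

```python
def backlog_level(backlog, info=1000, warning=3000, severe=10000):
    """Return the message level for a backlog size, or None below all thresholds."""
    # Check the highest threshold first so the most serious level wins.
    if backlog >= severe:
        return "severe"
    if backlog >= warning:
        return "warning"
    if backlog >= info:
        return "info"
    return None
```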

IncidentWriter.ResolveIncidentDNSNames false If true, only recipient host names


are resolved from IP.

IncidentWriter.ShouldEncryptContent true If true, the monitor will encrypt the


body of every message, message
component and cracked component
before writing to disk or sending to
Enforce.

Keyword.TokenVerifierEnabled false Default is disabled (false).

If enabled (true), the server


validates tokens for Asian language
keywords (Chinese, Japanese, and
Korean).

See “Enabling and using CJK token


verification for server keyword
matching” on page 847.

L7.cleanHttpBody true If true, the HTML entity references


are replaced with spaces.

L7.DefaultBATV Standard This setting determines the tagging


scheme that Network Prevent for
Email uses to interpret Bounce
Address Tag Validation (BATV) tags
in the MAIL FROM header of a
message. If this setting is
“Standard” (the default), Network
Prevent uses the tagging scheme
described in the BATV specification:

http://tools.ietf.org/html/draft-levine-mass-batv-02

Change this setting to “Ironport” to


enable compatibility with the
IronPort proxy’s implementation of
BATV tagging.

L7.DefaultUrlEncodedCharset UTF-8 Defines the default character set to


be used in decoding query
parameters or URL-encoded body
when the character set information
is missing from the header.

L7.discardDuplicateMessages true If true, the Monitor ignores duplicate


messages based on the
messageID.

L7.ExtractBATV true If true (the default), Network Prevent


for Email interprets Bounce Address
Tag Validation (BATV) tags that are
present in the MAIL FROM header
of a message. This allows Network
Prevent to include a meaningful
sender address in incidents that are
generated from messages having
BATV tags. If this setting is false,
Network Prevent for Email does not
interpret BATV tags, and a message
that contains BATV tags may
generate an incident that has an
unreadable sender address.

See http://tools.ietf.org/html/draft-levine-mass-batv-02 for more
information about BATV.

L7.httpClientIdHeader X-Forwarded-For The sender identifier header name.

L7.MAX_NUM_HTTP_HEADERS 30 Any HTTP message that contains
more than the specified number of
header lines is discarded.

L7.maxWordLength 30 The maximum word length (in


characters) allowed in UTCP string
extraction.

L7.messageIDCacheCleanupInterval 600000 The length of time that the


messageID is cached. The system
will not cache duplicate messages
during this time period if the
L7.discardDuplicateMessages
setting is set to true.

L7.minSizeOfGetUrl 100 The minimum size of the GET URL


to process. HTTP GET actions are
not inspected by Symantec Data
Loss Prevention for policy violations
if the number of bytes in the URL is
less than the value of this setting.
For example, with the default value
of 100, no detection check is
performed when a browser displays
the Symantec web site at:
http://www.symantec.com/index.jsp.
The reason is that the URL contains
only 33 characters, which is less
than the 100 minimum.
Note: Other request types such as
POST or PUT are not affected by
L7.minSizeOfGetUrl. In order for
Symantec Data Loss Prevention to
inspect any GET actions at all, the
L7.processGets setting must be set
to true.
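
The GET inspection rule, including its dependency on L7.processGets, can be sketched in Python. The function name `should_inspect_get` is hypothetical, and measuring the URL size as its UTF-8 byte length is an assumption.

```python
def should_inspect_get(url, min_size_of_get_url=100, process_gets=True):
    """Return True if an HTTP GET with this URL would be inspected."""
    if not process_gets:
        return False  # L7.processGets is false: no GET is inspected
    # GETs are skipped when the URL is smaller than the configured minimum.
    return len(url.encode("utf-8")) >= min_size_of_get_url
```

For the 33-character example URL above, the function returns false with the default minimum of 100.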

L7.processGets true If true, the GET requests are


processed. If false, the GET
requests are not processed. Note
that this setting interacts with the
L7.minSizeOfGetUrl setting.

Lexer.IncludePunctuationInWords true If true, punctuation characters


internal to a token are considered
during detection.

See “Configuring Advanced Settings


for EDM policies” on page 557.

Lexer.MaximumNumberOfTokens 30000 Maximum number of tokens


extracted from each message
component for detection. Applicable
to all detection technologies where
tokenization is required (EDM,
profiled DGM, and the system
patterns supported by those
technologies). Increasing the default
value may cause the detection
server to run out of memory and
restart.

See “Configuring Advanced Settings


for EDM policies” on page 557.

Lexer.Validate true If true, performs system


pattern-specific validation.

See “Configuring Advanced Settings


for EDM policies” on page 557.

MessageChain.ArchiveTimedOutStreams false Specifies whether messages should


be archived to the temp folder.

MessageChain.CacheSize 8 Limits the number of messages that


can be queued in the message
chains.

MessageChain.ContentDumpEnabled false If set to true, each message


entering the detection message
chain is logged to
${SymantecDLP.temp.dir}/dump.
This setting is intended for use in
troubleshooting and debugging.

MessageChain.MaximumComponentTime 60,000 The time interval (in milliseconds)


allowed before any chain
component is restarted.

MessageChain.MaximumFailureTime 360000 Number of milliseconds that must


elapse before restarting the file
reader. This is tracked after a
message chain error is detected
and that message chain has not
been recovered.

MessageChain.MaximumMessageTime Varies This setting is either
600,000 or 1,800,000, depending on
detection server type.

The maximum time interval (in


milliseconds) that a message can
remain in a message chain.

MessageChain.MemoryThrottlerReservedBytes 200,000,000 Number of bytes required to be


available before a message is sent
through the message chain. This
setting can avoid out of memory
issues. The default value is 200 MB.
The throttler can be disabled by
setting this value to 0.

MessageChain.MinimumFailureTime 30000 Number of milliseconds that must


elapse before failure of a message
chain is tracked. Failure eventually
leads to restarting the message
chain or file reader.

MessageChain.NumChains Varies This number varies depending on


detection server type. It is either 4
or 8.
The number of messages, in
parallel, that the file reader will
process. Setting this number higher
than 8 (with the other default
settings) is not recommended. A
higher setting does not substantially
increase performance and there is
a much greater risk of running out
of memory. Setting this to less than
8 (in some cases 1) helps when
processing big files, but it may slow
down the system considerably.

MessageChain.StopProcessingWhenMemoryLowerThan 200M Instructs detection to stop drilling
down into and processing sub-files
if JVM available memory drops
below this value. Setting this
attribute to 0 will force sub-file
processing, regardless of how little
memory is available. Setting this
attribute to a value close to or larger
than the value of the -Xmx option
in
BoxMonitor.FileReaderMemory
will effectively disable sub-file
processing.

OCR.ENABLE_AUTO_LANGUAGE_DETECTION true When true, this setting enables the


OCR engine to extract text more
quickly by automatically identifying
the language or languages in an
image, rather than processing every
language in the OCR configuration.
When false, the OCR engine
extracts the text using every
language in the OCR configuration,
making text extraction slower but
improving accuracy.

OCR.ENABLE_SPELL_CHECK true When true, this setting enables the


OCR engine to extract text more
accurately by using internal spelling
dictionaries. When false, the
accuracy of extracted text may be
reduced.

OCR.RECORD_REQUEST_STATISTICS false When true, this setting enables the
OCR sizing tool. The OCR sizing
tool gives you insight into your
image traffic data, which helps you
determine the sizing requirements
for your OCR implementation.

PacketCapture.DISCARD_HTTP_GET true If true, discards HTTP GET


streams.

PacketCapture.DOES_DISCARD_TRIGGER_STREAM_DUMP false If true, a list of tcpstreams is
dumped to an output file in the log
directory the first time a discard
message is received.

PacketCapture.ENDACE_BIN_PATH N/A To enable packet-capture using an


Endace card, enter the path to the
Endace /bin directory. Note that
environment variables (such as
%ENDACE_HOME%) cannot be used
in this setting. For example:
/usr/local/bin

PacketCapture.ENDACE_LIB_PATH N/A To enable packet-capture using an


Endace card, enter the path to the
Endace /lib directory. Note that
environment variables (such as
%ENDACE_HOME%) cannot be used
in this setting. For example:
/usr/local/lib

PacketCapture.ENDACE_XILINX_PATH N/A To enable packet-capture using an


Endace card, enter the path to the
Endace /xilinx directory. Note that
environment variables (such as
%ENDACE_HOME%) cannot be used
in this setting. For example:
/usr/local/dag/xilinx

PacketCapture.Filter tcp || ip proto 47 || (vlan && (tcp || ip proto 47)) When set to the default value all
non-TCP packets are filtered out
and not sent to Network Monitor.
The default value can be overridden
using the tcpdump filter format
documented in the tcpdump
program. This setting allows
specialists to create more exact
filters (source and destination IPs
for given ports).

PacketCapture.INPUT_SOURCE_FILE /dummy.dmp The full path and name of the input


file.

PacketCapture.IS_ARCHIVING_PACKETS false DO NOT USE THIS FIELD.


Diagnostic setting that creates
dumps of packets captured in
packetcapture for later reuse. This
feature is unsupported and does not
have normal error checking. May
cause repeated restarts on pcap.

PacketCapture.IS_ENDACE_ENABLED false To enable packet-capture using an


Endace card, set this value to true.

PacketCapture.IS_FTP_RETR_ENABLED false If true, FTP GETS and FTP PUTS


are processed. If false, only
FTP PUTS are processed.

PacketCapture.IS_INPUT_SOURCE_FILE false If true, continually reads in packets


from a tcpdump formatted file
indicated in INPUT_SOURCE_FILE.
Set to dag when an Endace card is
installed.

PacketCapture.IS_NAPATECH_ENABLED false To enable packet-capture using a


Napatech card, set this value to
true. The default setting is false.

PacketCapture.KERNEL_BUFFER_SIZE_I686 64M For 32-bit Linux platforms, this


setting specifies the amount of
memory allocated to buffer network
packets. Specify K for kilobytes or
M for megabytes. Do not specify a
value larger than 128M.

PacketCapture.KERNEL_BUFFER_SIZE_Win32 16M For 32-bit Windows platforms, this


setting specifies the amount of
memory allocated to buffer network
packets. Specify K for kilobytes or
M for megabytes.

PacketCapture.KERNEL_BUFFER_SIZE_X64 64M For 64-bit Windows platforms, this


setting specifies the amount of
memory allocated to buffer network
packets. Specify K for kilobytes or
M for megabytes.

PacketCapture.KERNEL_BUFFER_SIZE_X86_64 64M For 64-bit Linux platforms, this


setting specifies the amount of
memory allocated to buffer network
packets. Specify K for kilobytes or
M for megabytes. Do not specify a
value larger than 64M.

PacketCapture.MAX_FILES_PER_DIRECTORY 30000 After the specified number of file


streams are processed a new
directory is created.

PacketCapture.MBYTES_LEFT_TO_DISABLE_CAPTURE 1000 If the amount of disk space (in MB)
left on the drop_pcap drive falls
below this specification, packet
capture is suspended. For example,
if this number is 100, pcap will stop
writing out drop_pcap files when
there is less than 100 MB on the
installed drive.

PacketCapture.MBYTES_REQUIRED_TO_RESTART_CAPTURE 1500 The amount of disk space (in MB)
needed on the drop_pcap drive
before packet capture resumes
again after stopping due to lack of
space. For example, if this value is
150 and packet capture is
suspended, packet capture resumes
when more than 150 MB is available
on the drop_pcap drive.
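
Together, the two disk space settings form a hysteresis: capture stops below one threshold and resumes only above a higher one. A minimal Python sketch of that behavior follows. The class name `CaptureSpaceGate` is hypothetical, and the sketch is not the packet capture implementation.

```python
class CaptureSpaceGate:
    """Hypothetical gate modelling the suspend and resume thresholds."""

    def __init__(self, disable_below_mb=1000, restart_above_mb=1500):
        self.disable_below_mb = disable_below_mb  # MBYTES_LEFT_TO_DISABLE_CAPTURE
        self.restart_above_mb = restart_above_mb  # MBYTES_REQUIRED_TO_RESTART_CAPTURE
        self.capturing = True

    def update(self, free_mb):
        """Apply a new free-space reading and return whether capture is active."""
        if self.capturing and free_mb < self.disable_below_mb:
            self.capturing = False  # suspend: space fell below the lower threshold
        elif not self.capturing and free_mb > self.restart_above_mb:
            self.capturing = True   # resume: more than the required space is available
        return self.capturing
```

Because the resume threshold is higher than the suspend threshold, free space that hovers around 1000 MB does not cause capture to toggle repeatedly.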

PacketCapture.NAPATECH_TOOLS_PATH N/A This setting specifies the location of


the Napatech Tools directory. This
directory is not set by default. If
packet-capture is enabled for
Napatech, enter the fully qualified
path to the Napatech Tools
installation directory.

PacketCapture.NO_TRAFFIC_ALERT_PERIOD 86,400 The refresh time (in seconds)
between no traffic alert messages.
No traffic system events are created
for a given protocol based on this
time period. For instance, if this is
set to 24*60*60 seconds, a new
message is sent every day that
there is no new traffic for a given
protocol. Do not confuse this setting
with the per-protocol traffic timeout,
which specifies how long the system
initially goes without traffic before
sending the first alert.

PacketCapture.NUMBER_BUFFER_POOL_PACKETS 600000 The number of standard-sized
preallocated packet buffers used to
buffer and sort incoming traffic.

PacketCapture.NUMBER_JUMBO_POOL_PACKETS 1 The number of large-sized


preallocated packet buffers that are
used to buffer and sort incoming
traffic.

PacketCapture.NUMBER_SMALL_POOL_PACKETS 200000 The number of small-sized


preallocated packet buffers that are
used to buffer and sort incoming
traffic.

PacketCapture.RING_CAPTURE_LENGTH 1518 Controls the amount of packet data


that is captured. The default value
of 1518 is sufficient to capture
typical Ethernet networks and
Ethernet over 802.1Q tagged
VLANs.

PacketCapture.RING_DEVICE_MEM 67108864 This setting is deprecated. Instead,


use the PacketCapture.KERNEL_BUFFER_SIZE_I686 setting (for
32-bit Linux platforms) or the
PacketCapture.KERNEL_BUFFER_SIZE_X86_64 setting (for
64-bit Linux platforms).

Specifies the amount of memory (in


bytes) to be allocated to buffer
packets per device. (The default of
67108864 is equivalent to 64MB.)

PacketCapture.SIZE_BUFFER_POOL_PACKETS 1540 The size of standard-sized buffer


pool packets.

PacketCapture.SIZE_JUMBO_POOL_PACKETS 10000 The size of jumbo-sized buffer pool


packets.

PacketCapture.SIZE_SMALL_POOL_PACKETS 150 The size of small-sized buffer pool


packets.

PacketCapture.SPOOL_DIRECTORY N/A The directory in which to spool


streams with large numbers of
packets. This setting is user
defined.

PacketCapture.STREAM_WRITE_TIMEOUT 5000 The time (in milliseconds) between


each count (StreamManager's write
timeout).

RequestProcessor.AddDefaultHeader true If true, adds a default header to


every email processed (when in
Inline SMTP mode). The default
header is
RequestProcessor.DefaultHeader.
This header is added to all
messages that pass through the
system. That is, the header is added
whether the message is redirected,
another header is added, or the
message has no policy violations.

RequestProcessor.AddHeaderOnMessageTimeout false The default value sets the system


to continue sending messages if
there is a message timeout.

If set to true, the X-Header
“X-Symantec-DLP: Message timed
out (potential Enforce System event
1213)” is inserted in the email
message. The downstream edge
MTA uses this header information
to handle the message, and the log
message displays “Passed
message through due to timeout,
with added timeout header.”

RequestProcessor.AllowExtensions 8BITMIME VRFY DSN HELP PIPELINING SIZE ENHANCEDSTATUSCODES STARTTLS This setting lists the SMTP protocol
extensions that Network Prevent for
Email can use when it
communicates with other MTAs.

RequestProcessor.AllowHosts any The default value of any permits all


systems to make connections to the
Network Prevent for Email Server
on the SMTP service port.
Replacing any with the IP address
or Fully-Qualified Domain Name
(FQDN) of one or more systems
restricts SMTP connections to just
those designated systems. To
designate multiple systems,
separate their addresses with
commas. Use only a comma to
separate addresses; do not include
any spaces between the addresses.
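
The address list format described above (comma-separated, no spaces) can be checked before it is entered. The following Python sketch uses hypothetical function names `parse_allow_hosts` and `is_allowed`, and it compares hosts as plain strings, which is a simplification of whatever matching the server performs.

```python
def parse_allow_hosts(value):
    """Return None for 'any' (no restriction) or the list of allowed hosts."""
    if value == "any":
        return None
    if " " in value:
        raise ValueError("use only commas to separate addresses; do not include spaces")
    hosts = value.split(",")
    if any(not host for host in hosts):
        raise ValueError("empty entry in address list")
    return hosts


def is_allowed(host, value):
    """Return True if the host may connect under the given setting value."""
    hosts = parse_allow_hosts(value)
    return hosts is None or host in hosts
```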

RequestProcessor.AllowUnauthenticatedConnections false The default value ensures that


MTAs must authenticate with
Network Prevent for Email for TLS
communication.

RequestProcessor.Backlog 12 The backlog that the request


processor specifies for the server
socket listener.

RequestProcessor.BindAddress 0.0.0.0 IP address to which a Network


Prevent for Email Server listener
binds. When BindAddress is
configured, the server will only
answer a connection to that IP
address. The default value of
0.0.0.0 is a wild card that permits
listening to all available addresses
including 127.0.0.1.

RequestProcessor.BlockStatusCodeOverride 5.7.1 Enables overriding of the ESMTP


status code sent back to the
upstream MTA when executing a
block response rule.

Accepted values are 5.7.0 and


5.7.1. If any other values are
entered, this setting will fall back to
the default of 5.7.1.

Use of the 5.7.0 value (other or


undefined security status) is
preferred when the detection server
is working with Office365 email,
because the 5.7.1 value provides
an incorrect context for the
Office365 use case.
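
The fallback rule for this setting is simple enough to state as a Python sketch. The function name `effective_block_status_code` is hypothetical.

```python
ACCEPTED_CODES = ("5.7.0", "5.7.1")
DEFAULT_CODE = "5.7.1"


def effective_block_status_code(configured):
    """Return the status code in effect: the configured value if accepted, else the default."""
    return configured if configured in ACCEPTED_CODES else DEFAULT_CODE
```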

RequestProcessor.CacheCleanupInterval 120000 Specifies the interval after which the


cached responses are cleaned from
the cache. Units are in milliseconds.

RequestProcessor.CachedMessageTimeout 120000 Specifies the amount of time after


generation when a given cached
response can be cleared from the
cache. Units are in milliseconds.

RequestProcessor.CacheEnabled false Enables caching of responses for


duplicate SMTP messages. The
cache was added as part of the
cloud solution to support envelope
splitting.

RequestProcessor.DefaultCommandTimeout 300 Specifies the number of seconds


the Network Prevent for Email
Server waits for a response to an
SMTP command before closing
connections to the upstream and
downstream MTAs. The default is
300 seconds. This setting does not
apply to the "." command (the end
of a DATA command). Do not
modify the default without first
consulting Symantec support.

RequestProcessor.DefaultPassHeader X-CFilter-Loop: Reflected This is the default header that will
be added if RequestProcessor.AddDefaultPassHeader is set to
true, when in Inline SMTP mode.
The value must be in a valid header
format; an X header is recommended.

RequestProcessor.DotCommandTimeout 600 Specifies the number of seconds


the Network Prevent for Email
Server waits for a response to the
"." command (the end of a DATA
command) before closing
connections to the upstream and
downstream MTAs. The default is
600 seconds. Do not modify the
default without first consulting
Symantec support.

RequestProcessor.ForwardConnectionTimeout 20000 The timeout value to use when


forwarding to an MTA.

RequestProcessor.KeyManagementAlgorithm SunX509 The key management algorithm


used in TLS communication.

RequestProcessor.MaxLineSize 1048576 The maximum size (in bytes) of data
lines expected from an external
MTA. Data lines larger than this
size are broken down to this size.

RequestProcessor.Mode ESMTP Specifies the protocol mode to use


(SMTP or ESMTP).

RequestProcessor.MTAResubmitPort 10026 This is the port number used by the


request processor on the MTA to
resend the SMTP message.

RequestProcessor.NumberOfDNSAttempts 4 The maximum number of DNS


queries that Network Prevent for
Email performs when it attempts to
obtain mail exchange (MX) records
for a domain. Network Prevent for
Email uses this setting only if you
have enabled MX record lookups.

RequestProcessor.RPLTimeout 360000 The maximum time in milliseconds


allowed for email message
processing by a Prevent server. Any
email messages not processed
during this time interval are passed
on by the server.

RequestProcessor.ServerSocketPort 10025 The port number to be used by the


SMTP monitor to listen for incoming
connections from MTA.

RequestProcessor.TagHighestSeverity false When set to true, an additional


email header that reports the
highest severity of all the violated
policies is added to the message.
For example, if the email violated a
policy of severity HIGH and a policy
of severity LOW, it shows:
X-DLP-MAX-Severity:HIGH.

RequestProcessor.TagPolicyCount false When set to true an additional email


header reporting the total number
of policies that the message violates
is added to the message. For
example, if the message violates 3
policies a header reading:
X-DLP-Policy-Count: 3 is added.

RequestProcessor.TagScore false When set to true an additional email


header reporting the total
cumulative score of all the policies
that the message violates is added
to the message. Scores are
calculated using the formula:
High=4, Medium=3, Low=2, and
Info=1. For example, if a message
violates three policies, one with a
severity of medium and two with a
severity of low a header reading:
X-DLP-Score: 7 is added.

RequestProcessor.TrustManagementAlgorithm PKIX The trust management algorithm


that Network Prevent for Email uses
when it validates certificates for TLS
communication. You can optionally
specify a built-in Java trust manager
algorithm (such as SunX509 or
SunPKIX) or a custom algorithm
that you have developed.

RequestProcessorListener.ServerSocketPort 12355 The local TCP port that FileReader


will use to listen for connections
from RequestProcessor on a
Network Prevent server.

ServerCommunicator.CONNECT_DELAY_POST_WAKEUP_OR_POST_VPN_SECONDS 60 The delay time (in seconds) after
which a detection server returning
online attempts to connect to the
Enforce Server. The default value
is 60 seconds. The range for this
setting is 30 to 600 seconds.

SocketCommunication.BufferSize 8K The size of the buffer that Network


Prevent for Web uses to process
ICAP requests. Increase the default
value only if you need to process
ICAP requests that are greater than
8K. Certain features, such as Active
Directory authentication, may
require an increase in buffer size.

UnicodeNormalizer.AsianCharRanges default Can be used to override the default


definition of characters that are
considered Asian by the detection
engine. Must be either default, or a
comma-separated list of ranges, for
example: 11A80-11F9,3200-321E

UnicodeNormalizer.Enabled on Can be used to disable Unicode


normalization.

Enter off to disable.

UnicodeNormalizer.NewlineEliminationEnabled on Can be used to disable newline


elimination for Asian languages.

Enter off to disable.
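
The fallback rule described for RequestProcessor.BlockStatusCodeOverride can be illustrated with a short sketch. This is only a Python illustration of the documented behavior, not the product's implementation; the function name resolve_block_status_code is hypothetical.

```python
# Illustrative sketch only; the helper name is hypothetical, not product code.
ACCEPTED_CODES = {"5.7.0", "5.7.1"}  # accepted values listed in the guide
DEFAULT_CODE = "5.7.1"               # documented default


def resolve_block_status_code(configured: str) -> str:
    """Return the status code used for a block response rule.

    Any value other than 5.7.0 or 5.7.1 falls back to the default.
    """
    return configured if configured in ACCEPTED_CODES else DEFAULT_CODE


print(resolve_block_status_code("5.7.0"))  # accepted value is kept
print(resolve_block_status_code("5.5.0"))  # unsupported value falls back
```

Under this reading, a deployment that works with Office365 email would set the value to 5.7.0, and a mistyped value would silently behave like the default.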

See “About Symantec Data Loss Prevention administration” on page 82.


See “Advanced agent settings” on page 2372.
See “About the Overview screen” on page 278.
See “Server/Detector Detail screen” on page 283.
See “Server configuration—basic” on page 253.
See “Server controls” on page 251.
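
The three tagging settings (RequestProcessor.TagHighestSeverity, RequestProcessor.TagPolicyCount, and RequestProcessor.TagScore) can be illustrated together. The following Python sketch assumes all three settings are set to true and uses the severity weights and header names given in the table. The function name tag_headers is hypothetical, and returning no headers for a message without violations is an assumption.

```python
# Severity weights as given in the RequestProcessor.TagScore description.
SEVERITY_SCORE = {"HIGH": 4, "MEDIUM": 3, "LOW": 2, "INFO": 1}


def tag_headers(severities):
    """Build the optional headers for one message (hypothetical helper).

    `severities` holds one severity name per violated policy.
    Assumption: a message with no violations gets no extra headers.
    """
    if not severities:
        return {}
    highest = max(severities, key=lambda s: SEVERITY_SCORE[s])
    return {
        "X-DLP-MAX-Severity": highest,
        "X-DLP-Policy-Count": str(len(severities)),
        "X-DLP-Score": str(sum(SEVERITY_SCORE[s] for s in severities)),
    }


# One medium-severity policy and two low-severity policies: 3 + 2 + 2 = 7.
headers = tag_headers(["MEDIUM", "LOW", "LOW"])
print(headers)
```

The example input matches the case described for RequestProcessor.TagScore: three violated policies with a cumulative score of 7.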

Advanced detector settings


Click Detector Settings on the detector's System > Servers and Detectors > Overview >
Server/Detector Detail screen to modify the settings on that server.
Use caution when modifying these settings on a detector. Contact Symantec Support before
changing any of the settings on this screen. Changes to these settings normally do not take
effect until after the detector has been restarted.
You cannot change settings for the Enforce Server from the Server/Detector Detail screen.
The Server/Detector Detail - Advanced Settings screen only displays for detection servers
and detectors.

Table 14-10 Detector advanced settings

Setting Default Description

ContentExtraction.EnableMetaData off Allows detection on file metadata. If the setting is


turned on, you can detect metadata for Microsoft
Office and PDF files. For Microsoft Office files, OLE
metadata is supported, which includes the fields
Title, Subject, Author, and Keywords. For PDF files,
only Document Information Dictionary metadata is
supported, which includes fields such as Author,
Title, Subject, Creation, and Update dates.
Extensible Metadata Platform (XMP) content is not
detected. Note that enabling this metadata detection
option can cause false positives.

ContentExtraction.MarkupAsText off Bypasses Content Extraction for files that are


determined to be XML or HTML. This should be
used in cases such as web pages containing data
in the header block or script blocks. Default is off.

ContentExtraction.TrackedChanges off Allows detection of content that has changed over


time (Track Changes content) in Microsoft Office
documents.
Note: Using the foregoing option might reduce the
accuracy rate for IDM and data identifiers. The
default is set to off (disallow).

To index content that has changed over time, set


ContentExtraction.TrackedChanges=on in file
\Protect\config\Indexer.properties. The
default and recommended setting is
ContentExtraction.TrackedChanges=off.

DDM.MaxBinMatchSize 30,000,000 The maximum size (in bytes) used to generate the
MD5 hash for an exact binary match in an IDM. This
setting should not be changed. The following
conditions must be matched for IDM to work
correctly:

■ This setting must be identical to the
max_bin_match_size setting on the Enforce
Server in file indexer.properties.
■ This setting must be less than or equal to the
FileReader.FileMaxSize value.
■ This setting must be less than or equal to the
ContentExtraction.MaxContentSize value on the
Enforce Server in file indexer.properties.

Note: Changing the first or third item in the list


requires re-indexing all IDM files.

Detection.EncodingGuessingDefaultEncoding ISO-8859-1 Specifies the backup encoding assumed for a byte


stream.

Detection.EncodingGuessingEnabled on Designates whether the encoding of unknown byte


streams should be guessed.

Detection.EncodingGuessingMinimumConfidence 50 Specifies the confidence level required for guessing


the encoding of unknown byte streams.

DI.MaxViolations 100 Specifies the maximum number of violations allowed


with data identifiers.

EDM.MatchCountVariant 3 Specifies how matches are counted.

■ 1 - Counts the total number of token sets


matched.
■ 2 - Counts the number of unique token sets
matched.
■ 3 - Counts the number of unique super sets of
token sets. (default)

See “Configuring Advanced Settings for EDM


policies” on page 557.

EDM.MaximumNumberOfMatchesToReturn 100 Defines a top limit on the number of matches


returned from each RAM index search.

See “Configuring Advanced Settings for EDM


policies” on page 557.

EDM.SimpleTextProximityRadius 35 Number of tokens that are evaluated together when


the proximity check is enabled.

See “Configuring Advanced Settings for EDM


policies” on page 557.

EDM.TokenVerifierEnabled false If enabled (true), the server validates tokens for


Chinese, Japanese, and Korean (CJK) keywords.

Default is disabled (false).

IncidentDetection.MaxContentLength 2000000 Applies only to regular expression rules. On a


per-component basis, only the first
MaxContentLength number of characters are
scanned for violations. The default (2,000,000) is
equivalent to > 1000 pages of typical text. The
limiter exists to prevent regular expression rules
from taking too long.

IncidentDetection.MinNormalizedSize 30 This setting applies to IDM detection. It must be


kept in sync with the corresponding setting in the
Indexer.properties file on the Enforce Server
(which applies to indexing). Derivative detections
only apply to messages when their normalized
content is greater than this setting. If the normalized
content size is less than this setting, IDM detection
does a straight binary match.

IncidentDetection.patternConditionMaxViolations 100 The maximum number of matches a detector
reports. The detector does not report more matches
than the value of the
IncidentDetection.patternConditionMaxViolations
parameter, even if more matches exist.

Keyword.TokenVerifierEnabled false Default is disabled (false).

If enabled (true), the server validates tokens for


Asian language keywords (Chinese, Japanese, and
Korean).

See “Enabling and using CJK token verification for


server keyword matching” on page 847.

Lexer.IncludePunctuationInWords true If true, punctuation characters internal to a token


are considered during detection.

See “Configuring Advanced Settings for EDM


policies” on page 557.

Lexer.MaximumNumberOfTokens 30000 Maximum number of tokens extracted from each


message component for detection. Applicable to all
detection technologies where tokenization is
required (EDM, profiled DGM, and the system
patterns supported by those technologies).
Increasing the default value may cause the detector
to run out of memory and restart.

See “Configuring Advanced Settings for EDM


policies” on page 557.

Lexer.Validate true If true, performs system pattern-specific validation.

See “Configuring Advanced Settings for EDM


policies” on page 557.

UnicodeNormalizer.AsianCharRanges default Can be used to override the default definition of


characters that are considered Asian by the
detection engine. Must be either default, or a
comma-separated list of ranges, for example:
11A80-11F9,3200-321E

UnicodeNormalizer.Enabled on Can be used to disable Unicode normalization.

Enter off to disable.

UnicodeNormalizer.NewlineEliminationEnabled on Can be used to disable newline elimination for Asian


languages.

Enter off to disable.
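
The three conditions listed for DDM.MaxBinMatchSize can be checked mechanically before any change is made. The Python sketch below is a hypothetical helper, not a Symantec tool. The caller supplies the four values by hand, and every value in the example other than the documented default of 30,000,000 is made up for illustration.

```python
def check_max_bin_match_size(detector_value, indexer_value,
                             file_max_size, max_content_size):
    """Return the documented constraints that are violated (empty if none).

    detector_value   -- DDM.MaxBinMatchSize on the detector
    indexer_value    -- max_bin_match_size in indexer.properties (Enforce Server)
    file_max_size    -- FileReader.FileMaxSize
    max_content_size -- ContentExtraction.MaxContentSize in indexer.properties
    """
    problems = []
    if detector_value != indexer_value:
        problems.append("must be identical to max_bin_match_size")
    if detector_value > file_max_size:
        problems.append("must be <= FileReader.FileMaxSize")
    if detector_value > max_content_size:
        problems.append("must be <= ContentExtraction.MaxContentSize")
    return problems


# 30,000,000 is the documented default; the other values are illustrative.
print(check_max_bin_match_size(30_000_000, 30_000_000, 40_000_000, 40_000_000))
print(check_max_bin_match_size(30_000_000, 20_000_000, 10_000_000, 40_000_000))
```

An empty list means all three conditions hold. Because the guide states that this setting should not be changed, a check of this kind is mainly useful for confirming that an existing deployment is still consistent.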

About using load balancers in an endpoint deployment


You can use a load balancer to manage multiple Endpoint Servers, or a server pool. Adding
Endpoint Servers to a load-balanced server pool enables Symantec Data Loss Prevention to
use less bandwidth while managing more agents. When setting up a server pool to manage
Endpoint Servers and agents, default Symantec Data Loss Prevention settings allow for
communication between servers and agents. However, there are a number of load balancer
settings that may affect how Endpoint Servers and agents communicate. You may have to
make changes to advanced agent and server settings if the load balancer you use does not
use default settings.
In general, load balancers should have the following settings applied to work best with Symantec
Data Loss Prevention:
■ 1-Gbps throughput
■ Source IP persistence. Set the persistence time to be greater than the agent polling period.
■ 24-hour SSL session timeout period
The Endpoint Servers communicate most efficiently with agents when the load balancer is set
up to use source IP persistence. (This protocol name may differ across load balancer brands.)
Using source IP persistence in a Symantec Data Loss Prevention implementation ensures
that if an agent is restarted on the same network, it reconnects to the same Endpoint Server
regardless of the SSL session state. Source IP persistence also uses less bandwidth during
the SSL handshake between agents and Endpoint Servers. This protocol also helps maintain
the event/attribute cache coherence.
For agents that connect to the Endpoint Server over a NAT or a proxy, SSL session server
affinity is the optimal load balancer setting. However, if this setting is used, and the agent is
restarted or if the SSL cached session identity is flushed, a new SSL session is negotiated.
Negotiating a new SSL session may cause the agent to connect to a different monitor more
frequently which may interfere with agent status updates on the Enforce Server.
Review the agent connection settings if the load balancer idle connection setting is not set
to its default. The load balancer idle connection setting can also be called connection timeout
interval, clean idle connection, and so on, depending on the load balancer brand.
You can assess your Symantec Data Loss Prevention and load balancer settings by considering
the following two scenarios:
■ Default DLP settings. Table 14-11
■ Non-default DLP settings. Table 14-12

Note: Contact Symantec Support before changing default advanced agent and advanced
server settings.

Table 14-11 Default Symantec Data Loss Prevention settings scenario

Description: Symantec Data Loss Prevention uses non-persistent connections by default. Using
non-persistent connections means that Endpoint Servers close connections to agents after
agents are idle for 30 seconds.

Resolution: Consider how the agent idle timeout coincides with the load balancer close idle
connection setting. If the load balancer is configured to close idle connections after less
than 30 seconds, agents are prematurely disconnected from Endpoint Servers.
To resolve the issue, complete one of the following:
■ Change the agent idle timeout setting
(EndpointCommunications.IDLE_TIMEOUT_IN_SECONDS.int) to less than the close idle
connection setting on the load balancer.
■ Change the agent heartbeat setting
(EndpointCommunications.HEARTBEAT_INTERVAL_IN_SECONDS.int) to less than the load
balancer close idle connection setting. You must also set the no traffic timeout setting
(CommLayer.NO_TRAFFIC_TIMEOUT_IN_SECONDS.int) to a value greater than the agent
heartbeat setting.

Table 14-12 Non-default Symantec Data Loss Prevention settings scenario

Description: Consider how changes to default Symantec Data Loss Prevention settings affect how
the load balancer handles idle and persistent agent connections. For example, if you change
the idle timeout setting to 0 to create a persistent connection and you leave the default
agent heartbeat setting (270 seconds), you must consider the idle connection setting on
the load balancer. If the idle connection setting on the load balancer is less than 270 seconds,
then agents are prematurely disconnected from Endpoint Servers.

Resolution: To resolve the issue, complete one of the following:
■ Change the agent heartbeat
(EndpointCommunications.HEARTBEAT_INTERVAL_IN_SECONDS.int) and no traffic timeout
settings (CommLayer.NO_TRAFFIC_TIMEOUT_IN_SECONDS.int) to less than the load
balancer idle connection setting.
■ Verify that the no traffic timeout setting is greater than the heartbeat setting.
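
The timing relationships in Table 14-11 and Table 14-12 can be expressed as a small check. The Python sketch below is one reading of those tables, not a Symantec utility. The function name is hypothetical, all values are in seconds, and the no traffic timeout of 300 in the example is a made-up value because the tables do not state its default. For persistent connections the sketch checks every condition listed in Table 14-12, which is an assumption about how the two resolution items combine.

```python
def check_agent_timeouts(lb_idle_close, idle_timeout, heartbeat,
                         no_traffic_timeout):
    """Return warnings for one combination of settings (hypothetical helper).

    idle_timeout == 0 means a persistent connection (Table 14-12 scenario).
    """
    warnings = []
    if idle_timeout > 0:
        # Non-persistent connections: either documented resolution suffices.
        idle_ok = idle_timeout < lb_idle_close
        heartbeat_ok = (heartbeat < lb_idle_close
                        and no_traffic_timeout > heartbeat)
        if not (idle_ok or heartbeat_ok):
            warnings.append("load balancer may close connections prematurely")
    else:
        # Persistent connections: heartbeats must keep the connection alive.
        if heartbeat >= lb_idle_close:
            warnings.append("heartbeat is not less than load balancer idle setting")
        if no_traffic_timeout >= lb_idle_close:
            warnings.append("no traffic timeout is not less than load balancer idle setting")
        if no_traffic_timeout <= heartbeat:
            warnings.append("no traffic timeout is not greater than heartbeat")
    return warnings


# Persistent connection with the default 270-second heartbeat behind a load
# balancer that closes idle connections after 120 seconds (illustrative).
for warning in check_agent_timeouts(120, 0, 270, 300):
    print(warning)
```

As the note in this section says, contact Symantec Support before changing the default advanced agent and advanced server settings that such a check would point to.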

See “Advanced server settings” on page 285.


See “Advanced agent settings” on page 2372.
Chapter 15
Managing log files
This chapter includes the following topics:

■ About log files

■ Log collection and configuration screen

■ Configuring server logging behavior

■ Collecting server logs and configuration files

■ About log event codes

About log files


Symantec Data Loss Prevention provides a number of different log files that record information
about the behavior of the software. Log files fall into these categories:
■ Operational log files record detailed information about the tasks the software performs and
any errors that occur while the software performs those tasks. You can use the contents
of operational log files to verify that the software functions as you expect it to. You can also
use these files to troubleshoot any problems in the way the software integrates with other
components of your system.
For example, you can use operational log files to verify that a Network Prevent for Email
Server communicates with a specific MTA on your network.
See “Operational log files” on page 334.
■ Debug log files record fine-grained technical details about the individual processes or
software components that comprise Symantec Data Loss Prevention. The contents of
debug log files are not intended for use in diagnosing system configuration errors or in
verifying expected software functionality. You do not need to examine debug log files to
administer or maintain a Symantec Data Loss Prevention installation. However, Symantec
Support may ask you to provide debug log files for further analysis when you report a
problem. Some debug log files are not created by default. Symantec Support can explain
how to configure the software to create the file if necessary.
See “Debug log files” on page 337.
■ Installation log files record information about the Symantec Data Loss Prevention installation
tasks that are performed on a particular computer. You can use these log files to verify an
installation or troubleshoot installation errors. Installation log files reside in the following
locations:
■ installdir\SymantecDLP\.install4j\installation.log stores the installation log
for Symantec Data Loss Prevention.
■ installdir\oracle_home\admin\protect\ stores the installation log for Oracle.
See the Symantec Data Loss Prevention Installation Guide for more information.

Operational log files


The Enforce Server and the detection servers store operational log files in the
c:\ProgramData\Symantec\DataLossPrevention\<EnforceServer or
DetectionServer>\15.5\Protect\logs\ directory on Windows installations and in the
/var/log/Symantec/DataLossPrevention/<EnforceServer or DetectionServer>/15.5/
directory on Linux installations. A number at the end of the log file name indicates the count
(shown as 0 in Table 15-1).
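
The directory layout described above can be assembled programmatically, for example in a script that gathers logs for review. The following Python sketch builds the two documented paths for version 15.5. The function name and its parameters are hypothetical and are not part of the product.

```python
from pathlib import PurePosixPath, PureWindowsPath


def operational_log_dir(server, platform, version="15.5"):
    """Return the documented operational log directory (hypothetical helper).

    server   -- "EnforceServer" or "DetectionServer"
    platform -- "windows" or "linux"
    """
    if server not in ("EnforceServer", "DetectionServer"):
        raise ValueError(f"unknown server type: {server}")
    if platform == "windows":
        base = PureWindowsPath(r"c:\ProgramData\Symantec\DataLossPrevention")
        return base / server / version / "Protect" / "logs"
    if platform == "linux":
        base = PurePosixPath("/var/log/Symantec/DataLossPrevention")
        return base / server / version
    raise ValueError(f"unknown platform: {platform}")


print(operational_log_dir("EnforceServer", "windows"))
print(operational_log_dir("DetectionServer", "linux"))
```

The pure path classes are used so that either path can be built on any operating system without touching the file system.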
Table 15-1 lists and describes the Symantec Data Loss Prevention operational log files.

Table 15-1 Operational log files

Log file name Description Server

agentmanagement_webservices_access_0.log
Logs successful and failed attempts to access the Agent Management API web service.
Server: Enforce Server

agentmanagement_webservices_soap_0.log
Logs the entire SOAP request and response for most requests to the Agent Management API
web service.
Server: Enforce Server

boxmonitor_operational_0.log
The BoxMonitor process oversees the detection server processes that pertain to that
particular server type. For example, the processes that run on Network Monitor are file
reader and packet capture. The BoxMonitor log file is typically very small, and it shows
how the application processes are running.
Server: All detection servers

detection_operational_0.log
The detection operational log file provides details about the detection server
configuration and whether it is operating correctly.
Server: All detection servers

detection_operational_trace_0.log
The detection trace log file provides details about each message that the detection
server processes. The log file includes information such as:
■ The policies that were applied to the message
■ The policy rules that were matched in the message
■ The number of incidents the message generated
Server: All detection servers

machinelearning_training_operational_0.log
This log records information about the tasks, logs, and configuration files called on
startup of the VML training process.
Server: Enforce Server

manager_operational_0.log
Logs information about the Symantec Data Loss Prevention manager process, which
implements the Enforce Server administration console user interface.
Server: Enforce Server

monitorcontroller_operational_0.log
Records a detailed log of the connections between the Enforce Server and all detection
servers. It provides details about the information that is exchanged between these
servers, including whether policies have been pushed to the detection servers or not.
Server: Enforce Server

SmtpPrevent_operational0.log
This operational log file pertains to SMTP Prevent only. It is the primary log for
tracking the health and activity of a Network Prevent for Email system. Examine this file
for information about the communication between the MTAs and the detection server.
Server: SMTP Prevent detection servers

WebPrevent_Access0.log
This access log file contains information about the requests that are processed by
Network Prevent for Web detection servers. It is similar to web access logs for a proxy
server.
Server: Network Prevent for Web detection servers

WebPrevent_Operational0.log
This operational log file reports on the operating condition of Network Prevent for Web,
such as whether the system is up or down and connection management.
Server: Network Prevent for Web detection servers

webservices_access_0.log
This log file records successful and failed attempts to access the Incident Reporting
Web Service.
Server: Enforce Server

webservices_soap_0.log
Contains the entire SOAP request and response for most requests to the Incident Reporting
API Web Service. This log records all requests and responses except responses to incident
binary requests. This log file is not created by default. See the Symantec Data Loss
Prevention Incident Reporting API Developers Guide for more information.
Server: Enforce Server

See “Network Prevent for Web operational log files and event codes” on page 351.
See “Network Prevent for Web access log files and fields” on page 352.
See “Network Prevent for Email log levels” on page 355.
See “Network Prevent for Email operational log codes” on page 355.
See “Network Prevent for Email originated responses and codes” on page 359.

Debug log files


The Enforce Server and the detection servers store debug log files in the
c:\ProgramData\Symantec\DataLossPrevention\<Enforce Server or Detection
Server>\15.5\Protect\logs\ directory on Windows installations and in the
/var/log/Symantec/DataLossPrevention/<Enforce Server or Detection Server>/15.5/
directory on Linux installations. A number at the end of the log file name indicates the count
(shown as 0 in debug log files).
The following table lists and describes the Symantec Data Loss Prevention debug log files.

Table 15-2 Debug log files

Log file name Description Server

Aggregator0.log
This file describes communications between the detection server and the agents. Look at
this log to troubleshoot the following problems:
■ Connections to the agents
■ Incidents that do not appear when they should
■ Unexpected agent events
Server: Endpoint detection servers

BoxMonitor0.log
This file is typically very small, and it shows how the application processes are running.
The BoxMonitor process oversees the detection server processes that pertain to that
particular server type. For example, the processes that run on Network Monitor are file
reader and packet capture.
Server: All detection servers

ContentExtractionAPI_FileReader.log
Logs the behavior of the Content Extraction API file reader that sends requests to the
plug-in host. The default logging level is "info", which is configurable using
log4cxx_config_filereader.xml in the
C:\Program Files\Symantec\DataLossPrevention\DetectionServer (Windows) or
/opt/Symantec/DataLossPrevention/DetectionServer/15.5/Protect/config (Linux) directory.
Server: Detection Server

ContentExtractionAPI_Manager.log
Logs the behavior of the Content Extraction API manager that sends requests to the
plug-in host. The default logging level is "info", which is configurable using
log4cxx_config_manager.xml in the
C:\Program Files\Symantec\DataLossPrevention\DetectionServer (Windows) or
/opt/Symantec/DataLossPrevention/DetectionServer/15.5/Protect/config (Linux) directory.
Server: Enforce Server

ContentExtractionHost_FileReader.log
Logs the behavior of the Content Extraction File Reader hosts and plug-ins. The default
logging level is "info", which is configurable using log4cxx_config_filereader.xml in the
C:\Program Files\Symantec\DataLossPrevention\DetectionServer (Windows) or
/opt/Symantec/DataLossPrevention/DetectionServer/15.5/Protect/config (Linux) directory.
Server: Detection Server

ContentExtractionHost_Manager.log
Logs the behavior of the Content Extraction Manager hosts and plug-ins. The default
logging level is "info", which is configurable using log4cxx_config_manager.xml in the
C:\Program Files\Symantec\DataLossPrevention\DetectionServer (Windows) or
/opt/Symantec/DataLossPrevention/DetectionServer/15.5/Protect/config (Linux) directory.
Server: Enforce Server

DiscoverNative.log.0
This log file is located in
c:\Program Files\Symantec\DataLossPrevention\DetectionServer\15.5\Protect\logs\debug.
This log file contains the log statements that the Network Discover/Cloud Storage
Discover native code emits. Currently contains the information that is related to .pst
scanning. This log file applies only to the Network Discover/Cloud Storage Discover
Servers that run on Windows platforms.
You can configure this log in the
c:\Program Files\Symantec\DataLossPrevention\DetectionServer\15.5\Protect\config\DiscoverNativeLogging.properties
file.
Server: Discover detection servers

FileReader0.log
This log file pertains to the file reader process and contains application-specific
logging, which may be helpful in resolving issues in detection and incident creation. One
symptom that shows up is content extractor timeouts.
Server: All detection servers

flash_client_0.log
Logs messages from the Adobe Flex client used for folder risk reports by Network Discover.
Server: Enforce Server

flash_server_remoting_0.log
Contains log messages from BlazeDS, an open-source component that responds to remote
procedure calls from an Adobe Flex client. This log indicates whether the Enforce Server
has received messages from the Flash client. At permissive log levels (FINE, FINER,
FINEST), the BlazeDS logs contain the content of the client requests to the server and
the content of the server responses to the client.
Server: Enforce Server

IncidentPersister0.log
This log file pertains to the Incident Persister process. This process reads incidents
from the incidents folder on the Enforce Server, and writes them to the database. Look at
this log if the incident queue on the Enforce Server (manager) grows too large. This
situation can be observed also by checking the incidents folder on the Enforce Server to
see if incidents have backed up.
Server: Enforce Server

Indexer0.log
This log file contains information when an EDM profile or IDM profile is indexed. It also
includes the information that is collected when the external indexer is used. If indexing
fails then this log should be consulted.
Server: Enforce Server (or computer where the external indexer is running)

jdbc.log
This log file is a trace of JDBC calls to the database. By default, writing to this log is
turned off.
Server: Enforce Server

machinelearning_native_filereader.log
This log file records the runtime category classification (positive and negative) and
associated confidence levels for each message detected by a VML profile. The default
logging level is "info", which is configurable using log4cxx_config_filereader.xml in the
C:\Program Files\Symantec\DataLossPrevention\DetectionServer (Windows) or
/opt/Symantec/DataLossPrevention/DetectionServer/15.5/Protect/config (Linux) directory.
Server: Detection Server

machinelearning_training_0_0.log
This log file records the design-time base accuracy percentages for the k-fold
evaluations for all VML profiles.
Server: Enforce Server

machinelearning_training_native_manager.log
This log file records the total number of features modeled at design-time for each VML
profile training run. The default logging level is "info", which is configurable using
log4cxx_config_manager.xml in the
C:\Program Files\Symantec\DataLossPrevention\DetectionServer (Windows) or
/opt/Symantec/DataLossPrevention/DetectionServer/15.5/Protect/config (Linux) directory.
Server: Enforce Server

MonitorController0.log
This log file is a detailed log of the connections between the Enforce Server and the
detection servers. It gives details about the information that is exchanged between these
servers, including whether policies have been pushed to the detection servers or not.
Server: Enforce Server

PacketCapture.log
This log file pertains to the packet capture process that reassembles packets into
messages and writes to the drop_pcap directory. Look at this log if there is a problem
with dropped packets or traffic is lower than expected. PacketCapture is not a Java
process, so it does not follow the same logging rules as the other Symantec Data Loss
Prevention system processes.
Server: Network Monitor

PacketCapture0.log This log file describes issues with PacketCapture Network


communications. Monitor

RequestProcessor0.log This log file pertains to SMTP Prevent only. The log SMTP
file is primarily for use in cases where Prevent
SmtpPrevent_operational0.log is not sufficient. detection
servers

ScanDetail-target-0.log    Where target is the name of the scan target. All white spaces in the
target's name are replaced with hyphens. This log file pertains to Discover server scanning. It is
a file by file record of what happened in the scan. If the scan of the file is successful, it reads
success, and then the path, size, time, owner, and ACL information of the file scanned. If it
failed, a warning appears followed by the file name.    Discover detection servers

tomcat\localhost.date.log    These Tomcat log files contain information for any action that
involves the user interface. The logs include the user interface errors from the red error message
box, password failures when logging on, and Oracle errors (ORA –#).    Enforce Server

SymantecDLPIncidentPersister.log    This log file contains minimal information: stdout and stderr
only (fatal events).    Enforce Server

SymantecDLPManager.log    This log file contains minimal information: stdout and stderr only (fatal
events).    Enforce Server

SymantecDLPMonitor.log    This log file contains minimal information: stdout and stderr only (fatal
events).    All detection servers

SymantecDLPMonitorController.log    This log file contains minimal information: stdout and stderr
only (fatal events).    Enforce Server

SymantecDLPNotifier.log    This log file pertains to the Notifier service and its communications
with the Enforce Server and the MonitorController service. Look at this file to see if the
MonitorController service registered a policy change.    Enforce Server

SymantecDLPUpdate.log    This log file is populated when you update Symantec Data Loss
Prevention.    Enforce Server

See “Network Prevent for Web protocol debug log files” on page 354.
See “Network Prevent for Email log levels” on page 355.

Log collection and configuration screen


Use the System > Servers and Detectors > Logs screen to collect log files or to configure
logging behavior for any Symantec Data Loss Prevention server. The Logs screen contains
two tabs that provide the following features:
■ Collection—Use this tab to collect log files and configuration files from one or more
Symantec Data Loss Prevention servers.
See “Collecting server logs and configuration files” on page 347.
■ Configuration—Use this tab to configure basic logging behavior for a Symantec Data Loss
Prevention server, or to apply a custom log configuration file to a server.
See “Configuring server logging behavior” on page 343.
See “About log files” on page 333.

Configuring server logging behavior


Use the Configuration tab of the System > Servers and Detectors > Logs screen to change
logging configuration parameters for any server in the Symantec Data Loss Prevention
deployment. The Select a Diagnostic Log Setting menu provides preconfigured settings for
Enforce Server and detection server logging parameters. You can select an available
preconfigured setting to define common log levels or to enable logging for common server
features. The Select a Diagnostic Log Setting menu also provides a default setting that
returns logging configuration parameters to the default settings used at installation time.
Table 15-3 describes the preconfigured log settings available for the Enforce Server.
Optionally, you can upload a custom log configuration file that you have created or modified
using a text editor. (Use the Collection tab to download a log configuration file that you want
to customize.) You can upload only those configuration files that modify logging properties (file
names that end with Logging.properties). When you upload a new log configuration file to
a server, the server first backs up the existing configuration file of the same name. The new
file is then copied into the configuration file directory and its properties are applied immediately.
You do not need to restart the server process for the changes to take effect, unless you are
directed to do so. As of the current software release, only changes to the
PacketCaptureNativeLogging.properties and DiscoverNativeLogging.properties files
require you to restart the server process.
See “Server controls” on page 251.

Make sure that the configuration file that you upload contains valid property definitions that
are applicable to the type of server you want to configure. If you make a mistake when uploading
a log configuration file, use the preconfigured Restore Defaults setting to revert the log
configuration to its original installed state.
The Enforce Server administration console performs only minimal validation of the log
configuration files that you upload. It ensures that:
■ Configuration file names correspond to actual logging configuration file names.
■ Root level logging is enabled in the configuration file. This configuration ensures that some
basic logging functionality is always available for a server.
■ Properties in the file that define logging levels contain only valid values (such as INFO,
FINE, or WARNING).

If the server detects a problem with any of these items, it displays an error message and
cancels the file upload.
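
The three checks listed above can be approximated locally before you upload a file. The following
Python sketch is illustrative only and is not the console's own validation code. It assumes that
the file uses simple key=value lines, that the root logger level is stored under the key ".level",
and that level names follow java.util.logging conventions; the function names are hypothetical.

```python
# Assumption: level names follow java.util.logging. The guide itself names only
# INFO, FINE, and WARNING (plus FINER and FINEST for Network Prevent for Email).
VALID_LEVELS = {"OFF", "SEVERE", "WARNING", "INFO", "CONFIG",
                "FINE", "FINER", "FINEST", "ALL"}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def precheck(file_name, text):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if not file_name.endswith("Logging.properties"):
        problems.append("file name does not end with Logging.properties")
    props = parse_properties(text)
    # Assumption: the root logger level lives under the key ".level".
    root = props.get(".level", "")
    if root.upper() in ("", "OFF"):
        problems.append("root level logging is not enabled")
    for key, value in props.items():
        if key.endswith(".level") and value.upper() not in VALID_LEVELS:
            problems.append(f"{key} has invalid level {value!r}")
    return problems
```

An empty result does not guarantee that the server accepts the file; it only indicates that the
sketch found none of the three problems described above.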
If the Enforce Server successfully uploads a log configuration file change to a detection server,
the administration console reports that the configuration change was submitted. If the detection
server then encounters any problems when it tries to apply the configuration change, it logs a
system event warning to indicate the problem.

Table 15-3 Preconfigured log settings for the Enforce Server

Select a Diagnostic Log Setting value    Description

Restore Defaults Restores log file parameters to their default values.

Incident Reporting API SOAP Logging    Logs the entire SOAP request and response message for most
requests to the Incident Reporting API Web Service. The logged messages are stored in the
webservices_soap.log file. To begin logging to this file, edit the
c:\ProgramData\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\ManagerLogging.properties
(Windows) or
/var/log/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config/ManagerLogging.properties
(Linux) file to set the
com.vontu.enforce.reportingapi.webservice.log.WebServiceSOAPLogHandler.level property to INFO.

You can use the contents of webservices_soap.log to diagnose problems when developing Incident
Reporting API Web Service clients. See the Symantec Data Loss Prevention Incident Reporting API
Developers Guide for more information.

Custom Attribute Lookup Logging    Logs diagnostic information each time the Enforce Server uses a
lookup plug-in to populate custom attributes for an incident. Lookup plug-ins populate custom
attribute data using LDAP, CSV files, or other data repositories. The diagnostic information is
recorded in the Tomcat log file
(c:\ProgramData\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\logs\tomcat\localhost.date.log
[Windows] or
/var/log/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/tomcat/localhost.date.log [Linux])
and the IncidentPersister_0.log file.

See “About custom attributes” on page 1968.

See “About using custom attributes” on page 1969.

Table 15-4 Preconfigured log settings for detection servers

Select a Diagnostic Log Setting value    Detection server uses    Description

Restore Defaults    All detection servers    Restores log file parameters to their default values.

Discover Trace Logging    Network Discover Servers    Enables informational logging for Network
Discover scans. These log messages are stored in FileReader0.log.

Detection Trace Logging    All detection servers    Logs information about each message that the
detection server processes. This includes information such as:

■ The policies that were applied to the message
■ The policy rules that were matched in the message
■ The number of incidents that the message generated.

When you enable Detection Trace Logging, the resulting messages are stored in the
detection_operational_trace_0.log file.
Note: Trace logging can produce a large amount of data, and the data is stored in clear text
format. Use trace logging only when you need to debug a specific problem.

Packet Capture Debug Logging    Network Monitor Servers    Enables basic debug logging for packet
capture with Network Monitor. This setting logs information in the PacketCapture.log file.

While this type of logging can produce a large amount of data, the Packet Capture Debug Logging
setting limits the log file size to 50 MB and the maximum number of log files to 10.

If you apply this log configuration setting to a server, you must restart the server process to
enable the change.

Email Prevent Logging    Network Prevent for Email servers    Enables full message logging for
Network Prevent for Email servers. This setting logs the complete message content and includes
execution and error tracing information. Logged information is stored in the
RequestProcessor0.log file.
Note: Trace logging can produce a large amount of data, and the data is stored in clear text
format. Use trace logging only when you need to debug a specific problem.

See “Network Prevent for Email operational log codes” on page 355.

See “Network Prevent for Email originated responses and codes” on page 359.

ICAP Prevent Message Processing Logging    Network Prevent for Web servers    Enables operational
and access logging for Network Prevent for Web. This setting logs information in the
FileReader0.log file.

See “Network Prevent for Web operational log files and event codes” on page 351.

See “Network Prevent for Web access log files and fields” on page 352.

Follow this procedure to change the log configuration for a Symantec Data Loss Prevention
server.
To configure logging properties for a server
1 Click the Configuration tab if it is not already selected.
2 If you want to configure logging properties for a detection server, select the server name
from the Select a Detection Server menu.

3 If you want to apply preconfigured log settings to a server, select the configuration name
from the Select a Diagnostic Configuration menu next to the server you want to
configure.
See Table 15-3 and Table 15-4 for a description of the diagnostic configurations.
4 If you instead want to use a customized log configuration file, click Browse... next to the
server you want to configure. Then select the logging configuration file to use from the
File Upload dialog, and click Open. You can upload only logging configuration files, and not
configuration files that affect other server features.

Note: If the Browse button is unavailable because of a previous menu selection, click
Clear Form.

5 Click Configure Logs to apply the preconfigured setting or custom log configuration file
to the selected server.
6 Check for any system event warnings that indicate a problem in applying configuration
changes on a server.
See “Log collection and configuration screen” on page 343.

Note: The following debug log files are configured manually outside of the logging framework
available through the Enforce Server administration console:
ContentExtractionAPI_FileReader.log, ContentExtractionAPI_Manager.log,
ContentExtractionHost_FileReader.log, ContentExtractionHost_Manager.log,
machinelearning_native_filereader.log, and
machinelearning_training_native_manager.log. Refer to the entry for each of these log
files in the debug log file list for configuration details. See “Debug log files” on page 337.

Collecting server logs and configuration files


Use the Collection tab of the System > Servers and Detectors > Logs screen to collect log
files and configuration files from one or more Symantec Data Loss Prevention servers. You
can collect files from a single detection server or from all detection servers, as well as from
the Enforce Server computer. You can limit the collected files to only those files that were last
updated in a specified range of dates.
The Enforce Server administration console stores all log and configuration files that you collect
in a single ZIP file on the Enforce Server computer. If you retrieve files from multiple Symantec
Data Loss Prevention servers, each server's files are stored in a separate subdirectory of the
ZIP file.

Checkboxes on the Collection tab enable you to collect different types of files from the selected
servers. Table 15-5 describes each type of file.

Table 15-5 File types for collection

File type Description

Operational Operational log files record detailed information about the tasks the software performs and any errors
Logs that occur while the software performs those tasks. You can use the contents of operational log files
to verify that the software functions as you expect it to. You can also use these files to troubleshoot
any problems in the way the software integrates with other components of your system.

For example, you can use operational log files to verify that a Network Prevent for Email Server
communicates with a specific MTA on your network.

Debug and Debug log files record fine-grained technical details about the individual processes or software
Trace Logs components that comprise Symantec Data Loss Prevention. The contents of debug log files are not
intended for use in diagnosing system configuration errors or in verifying expected software
functionality. You do not need to examine debug log files to administer or maintain a Symantec
Data Loss Prevention installation. However, Symantec Support may ask you to provide debug log
files for further analysis when you report a problem. Some debug log files are not created by default.
Symantec Support can explain how to configure the software to create the file if necessary.

Configuration Use the Configuration Files option to retrieve both logging configuration files and server feature
Files configuration files.

Logging configuration files define the overall level of logging detail that is recorded in server log files.
Logging configuration files also determine whether specific features or subsystem events are recorded
to log files.

For example, by default the Enforce console does not log SOAP messages that are generated from
Incident Reporting API Web service clients. The ManagerLogging.properties file contains a
property that enables logging for SOAP messages.

You can modify many common logging configuration properties by using the presets that are available
on the Configuration tab.

If you want to update a logging configuration file by hand, use the Configuration Files checkbox to
download the configuration files for a server. You can modify individual logging properties using a
text editor and then use the Configuration tab to upload the modified file to the server.

See “Configuring server logging behavior” on page 343.

The Configuration Files option retrieves the active logging configuration files and also any backup
log configuration files that were created when you used the Configuration tab. This option also
retrieves server feature configuration files. Server feature configuration files affect many different
aspects of server behavior, such as the location of a syslog server or the communication settings of
the server. You can collect these configuration files to help diagnose problems or verify server settings.
However, you cannot use the Configuration tab to change server feature configuration files. You
can only use the tab to change logging configuration files.

Agent Logs Use the Agent Logs option to collect DLP agent service and operational log files from an Endpoint
Prevent detection server. This option is available only for Endpoint Prevent servers. To collect agent
logs using this option, you must have already pulled the log files from individual agents to the Endpoint
Prevent detection server using a Pull Logs action.

Use the Agent List screen to select individual agents and pull selected log files to the Endpoint
Prevent detection server. Then use the Agent Logs option on this page to collect the log files.

When the logs are pulled from the endpoint, they are stored on the Endpoint Server in an unencrypted
format. After you collect the logs from the Endpoint Server, the logs are deleted from the Endpoint
Server and are stored only on the Enforce Server. You can only collect logs from one endpoint at a
time.

See “Using the Agent List screen” on page 2430.

Operational, debug, and trace log files are stored in the server_identifier/logs subdirectory
of the ZIP file. server_identifier identifies the server that generated the log files, and it
corresponds to one of the following values:
■ If you collect log files from the Enforce Server, Symantec Data Loss Prevention replaces
server_identifier with the string Enforce. Note that Symantec Data Loss Prevention does
not use the localized name of the Enforce Server.
■ If a detection server’s name includes only ASCII characters, Symantec Data Loss Prevention
uses the detection server name for the server_identifier value.
■ If a detection server’s name contains non-ASCII characters, Symantec Data Loss Prevention
uses the string DetectionServer-ID-id_number for the server_identifier value. id_number
is a unique identification number for the detection server.
If you collect agent service log files or operational log files from an Endpoint Prevent server,
the files are placed in the server_identifier/agentlogs subdirectory. Each agent log file
uses the individual agent name as the log file prefix.
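
The naming rules above can be summarized in a short Python sketch. It is a minimal illustration of
the rules as stated in this section, not product code; the function names and parameters are
hypothetical.

```python
def server_identifier(name, id_number=None, is_enforce=False):
    """Return the ZIP subdirectory name used for a server, per the rules above."""
    if is_enforce:
        # The Enforce Server always uses the literal string, never a localized name.
        return "Enforce"
    if name.isascii():
        # ASCII-only detection server names are used unchanged.
        return name
    # Names with non-ASCII characters fall back to the numeric server ID.
    return f"DetectionServer-ID-{id_number}"


def collection_subdirectory(identifier, agent_logs=False):
    """Return the path inside the ZIP file where a server's files are placed."""
    return f"{identifier}/{'agentlogs' if agent_logs else 'logs'}"
```

For example, a detection server named Monitor1 maps to Monitor1/logs, while a server whose name
contains non-ASCII characters and whose ID is 7 maps to DetectionServer-ID-7/logs.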
Follow this procedure to collect log files and log configuration files from Symantec Data Loss
Prevention servers.
To collect log files from one or more servers
1 Click the Collection tab if it is not already selected.
2 Use the Date Range menu to select a range of dates for the files you want to collect. Note
that the collection process does not truncate downloaded log files in any way. The date
range limits collected files to those files that were last updated in the specified range.
3 To collect log files from the Enforce Server, select one or more of the checkboxes next
to the Enforce Server entry to indicate the type of files you want to collect.

4 To collect log files from one or all detection servers, use the Select a Detection Server
menu to select either the name of a detection server or the Collect Logs from All
Detection Servers option. Then select one or more of the checkboxes next to the menu
to indicate the type of files you want to collect.
5 Click Collect Logs to begin the log collection process.
The administration console adds a new entry for the log collection process in the Previous
Log Collections list at the bottom of the screen. If you are retrieving many log files, you
may need to refresh the screen periodically to determine when the log collection process
has completed.

Note: You can run only one log collection process at a time.

6 To cancel an active log collection process, click Cancel next to the log collection entry.
You may need to cancel log collection if one or more servers are offline and the collection
process cannot complete. When you cancel the log collection, the ZIP file contains only
those files that were successfully collected.
7 To download collected logs to your local computer, click Download next to the log collection
entry.
8 To remove ZIP files stored on the Enforce Server, click Delete next to a log collection
entry.
See “Log collection and configuration screen” on page 343.
See “About log files” on page 333.

About log event codes


Operational log file messages are formatted to closely match industry standards for the various
protocols involved. These log messages contain event codes that describe the specific task
that the software was trying to perform when the message was recorded. Log messages are
generally formatted as:

Timestamp [Log Level] (Event Code) Event description [event parameters]

■ See “Network Prevent for Web operational log files and event codes” on page 351.
■ See “Network Prevent for Email operational log codes” on page 355.
■ See “Network Prevent for Email originated responses and codes” on page 359.
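
When you search operational logs with a script, the general message format described above can be
split into its parts with a regular expression. The following Python sketch assumes that the square
brackets and parentheses in the format appear literally in each message; the exact timestamp layout
is not specified here, so the sample line in the usage note is hypothetical.

```python
import re

# Timestamp [Log Level] (Event Code) Event description [event parameters]
EVENT_RE = re.compile(
    r"^(?P<timestamp>.*?)\s+"
    r"\[(?P<level>[^\]]+)\]\s+"
    r"\((?P<code>[^)]+)\)\s+"
    r"(?P<description>.*)$"
)


def parse_event(line):
    """Return the message parts as a dict, or None if the line does not match."""
    match = EVENT_RE.match(line.strip())
    return match.groupdict() if match else None
```

For a hypothetical line such as "2019-08-19 10:15:00 [INFO] (1100) Starting Network Prevent for
Web", the code field is "1100" and the description field holds the remaining text, including any
event parameters.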

Network Prevent for Web operational log files and event codes
Network Prevent for Web log file names use the format of WebPrevent_OperationalX.log
(where X is a number). The number of files that are stored and their sizes can be specified by
changing the values in the FileReaderLogging.properties file. This file is in the c:\Program
Files\Symantec\DataLossPrevention\DetectionServer\15.5\Protect\config (Windows)
or /opt/Symantec/DataLossPrevention/DetectionServer/15.5/Protect/config (Linux)
directory. By default, the values are:
■ com.vontu.icap.log.IcapOperationalLogHandler.limit = 5000000

■ com.vontu.icap.log.IcapOperationalLogHandler.count = 5
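
To estimate how much disk space a handler can use, multiply its limit by its count. The Python
sketch below assumes that limit is a per-file size in bytes and that count is the number of files
kept, which matches the description above but is not stated explicitly; the function name is
hypothetical.

```python
def max_log_bytes(props, handler):
    """Upper bound on disk use for one log handler: per-file limit times file count."""
    limit = int(props[f"{handler}.limit"])
    count = int(props[f"{handler}.count"])
    return limit * count


defaults = {
    "com.vontu.icap.log.IcapOperationalLogHandler.limit": "5000000",
    "com.vontu.icap.log.IcapOperationalLogHandler.count": "5",
}
handler = "com.vontu.icap.log.IcapOperationalLogHandler"
print(max_log_bytes(defaults, handler))  # 25000000
```

Under these assumptions, the default values allow up to 25,000,000 bytes of operational log data
per server.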

Table 15-6 lists the Network Prevent for Web-defined operational logging codes by category.
The italicized part of the text contains event parameters.

Table 15-6 Status codes for Network Prevent for Web operational logs

Code Text and Description

Operational Events

1100 Starting Network Prevent for Web

1101 Shutting down Network Prevent for Web

Connectivity Events

1200 Listening for incoming connections at icap_bind_address:icap_bind_port

Where:

■ icap_bind_address is the Network Prevent for Web bind address to which the server listens.
This address is specified with the Icap.BindAddress Advanced Setting.
■ icap_bind_port is the port at which the server listens. This port is set in the Server >
Configure page.

1201 Connection (id=conn_id) opened from host(icap_client_ip:icap_client_port)

Where:

■ conn_id is the connection ID that is allocated to this connection. This ID can be helpful in
doing correlations between multiple logs.
■ icap_client_ip and icap_client_port are the proxy's IP address and port from which the
connect operation to Network Prevent for Web was performed.

1202 Connection (id=conn_id) closed (close_reason)

Where:

■ conn_id is the connection ID that is allocated to the connect operation.


■ close_reason provides the reason for closing the connection.

1203 Connection states: REQMOD=N, RESPMOD=N, OPTIONS=N, OTHERS=N

Where N indicates the number of connections in each state, when the message was logged.

This message provides the system state in terms of connection management. It is logged
whenever a connection is opened or closed.

Connectivity Errors

5200 Failed to create listener at icap_bind_address:icap_bind_port

Where:

■ icap_bind_address is the Network Prevent for Web bind address to which the server listens.
This address can be specified with the Icap.BindAddress Advanced Setting.
■ icap_bind_port is the port at which the server listens. This port is set on the Server >
Configure page.

5201 Connection was rejected from unauthorized host (host_ip:port)

Where host_ip and port are the proxy system IP address and port from which a connect attempt
to Network Prevent for Web was performed. If the host is not listed in the Icap.AllowHosts
Advanced setting, it is unable to form a connection.

See “About log files” on page 333.

Network Prevent for Web access log files and fields


Network Prevent for Web log file names use the format of WebPrevent_AccessX.log (where
X is a number). The number of files that are stored and their sizes can be specified by changing
the values in the FileReaderLogging.properties file. By default, the values are:
■ com.vontu.icap.log.IcapAccessLogHandler.limit = 5000000

■ com.vontu.icap.log.IcapAccessLogHandler.count = 5

A Network Prevent for Web access log is similar to a proxy server’s web access log. The “start”
log message format is:

# Web Prevent starting: start_time

Where start_time format is date:time, for example: 13/Aug/2018:03:11:22:015-0700.


The description message format is:

# host_ip "auth_user" time_stamp "request_line" icap_status_code
request_size "referer" "user_agent" processing_time(ms) conn_id client_ip
client_port action_code icap_method_code traffic_source_code

Table 15-7 lists the fields. The values of fields that are enclosed in quotes in this example are
quoted in an actual message. If field values cannot be determined, the message displays -
or "" as a default value.
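
An access log entry in this format can be split into named fields with Python's standard shlex
module, which keeps quoted values together. This is a minimal sketch, not a supported tool. It
assumes that entries are whitespace-separated, that lines beginning with # are header lines rather
than entries, and that the timestamp contains no spaces (as in the example above). The field name
processing_time_ms is a label chosen for this sketch.

```python
import shlex

FIELDS = [
    "host_ip", "auth_user", "time_stamp", "request_line", "icap_status_code",
    "request_size", "referer", "user_agent", "processing_time_ms", "conn_id",
    "client_ip", "client_port", "action_code", "icap_method_code",
    "traffic_source_code",
]


def parse_access_line(line):
    """Return a dict of access log fields, or None for header (#) or blank lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    tokens = shlex.split(line)  # quoted fields become single tokens
    if len(tokens) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, found {len(tokens)}")
    return dict(zip(FIELDS, tokens))
```

Values are returned as strings. A field that the log records as - or "" is returned as "-" or as
an empty string.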

Table 15-7 Network Prevent for Web access log fields

Fields Explanation

host_ip IP address of the host that made the request.

auth_user Authorized user for this request.

time_stamp Time that Network Prevent for Web receives the request.

request_line Line that represents the request.

icap_status_code ICAP response code that Network Prevent for Web sends for this
request.

request_size Request size in bytes.

referer Header value from the request that contains the URI from which this request
came.

user_agent User agent that is associated with the request.

processing_time (milliseconds) Request processing time in milliseconds. This value is the total of
the receiving, content inspection, and sending times.

conn_id Connection ID associated with the request.

client_ip IP of the ICAP client (proxy).

client_port Port of the ICAP client (proxy).



action_code An integer representing the action that Network Prevent for Web takes.
Where the action code is one of the following:

■ 0 = UNKNOWN
■ 1 = ALLOW
■ 2 = BLOCK
■ 3 = REDACT
■ 4 = ERROR
■ 5 = ALLOW_WITHOUT_INSPECTION
■ 6 = OPTIONS_RESPONSE
■ 7 = REDIRECT

icap_method_code An integer representing the ICAP method that is associated with this
request. Where the ICAP method code is one of the following:

■ -1 = ILLEGAL
■ 0 = OPTIONS
■ 1 = REQMOD
■ 2 = RESPMOD
■ 3 = LOG

traffic_source_code An integer that represents the source of the network traffic. Where the
traffic source code is one of the following:

■ 1 = WEB
■ 2 = UNKNOWN
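
The integer codes listed in Table 15-7 can be translated into their names when you review access
logs with a script. The mappings in the following Python sketch are copied from the table; the
function name and the fallback text for unlisted values are assumptions made for this sketch.

```python
ACTION_CODES = {
    0: "UNKNOWN", 1: "ALLOW", 2: "BLOCK", 3: "REDACT", 4: "ERROR",
    5: "ALLOW_WITHOUT_INSPECTION", 6: "OPTIONS_RESPONSE", 7: "REDIRECT",
}
ICAP_METHOD_CODES = {-1: "ILLEGAL", 0: "OPTIONS", 1: "REQMOD", 2: "RESPMOD", 3: "LOG"}
TRAFFIC_SOURCE_CODES = {1: "WEB", 2: "UNKNOWN"}


def describe_codes(action_code, icap_method_code, traffic_source_code):
    """Map the three integer codes of an access log entry to their names."""
    def lookup(table, value):
        # Values that are not listed in Table 15-7 are reported as unlisted.
        return table.get(int(value), f"unlisted ({value})")

    return {
        "action": lookup(ACTION_CODES, action_code),
        "icap_method": lookup(ICAP_METHOD_CODES, icap_method_code),
        "traffic_source": lookup(TRAFFIC_SOURCE_CODES, traffic_source_code),
    }
```

The function accepts integers or numeric strings, so values read directly from a log line can be
passed in without conversion.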

See “About log files” on page 333.

Network Prevent for Web protocol debug log files


To enable ICAP trace logging, set the Icap.EnableTrace advanced setting to true and use
the Icap.TraceFolder advanced setting to specify a directory to receive the traces. The Symantec
Data Loss Prevention service must be restarted for this change to take effect.
Trace files that are placed in the specified directory have file names in the format:
timestamp-conn_id. The first line of a trace file provides information about the connecting host
IP and port along with a timestamp. File data that is read from the socket is displayed in the
format <<timestamp number_of_bytes_read. Data that is written to the socket is displayed
in the format >>timestamp number_of_bytes_written. The last line should note that the
connection has been closed.
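
To get a quick summary of a trace file, you can total the byte counts from the << and >> marker
lines. The Python sketch below is a rough illustration only. It assumes that each marker line
consists of the marker, a timestamp, whitespace, and a byte count, and that no other line begins
with << or >>; the exact timestamp layout is not documented here, and the function name is
hypothetical.

```python
import re

# "<<timestamp number_of_bytes_read" or ">>timestamp number_of_bytes_written"
MARKER_RE = re.compile(r"^(<<|>>)\s*(\S.*?)\s+(\d+)\s*$")


def tally_trace_bytes(lines):
    """Sum the byte counts that the marker lines report for each direction."""
    totals = {"read": 0, "written": 0}
    for line in lines:
        match = MARKER_RE.match(line)
        if not match:
            continue  # payload data, or the opening and closing lines
        direction, _timestamp, size = match.groups()
        totals["read" if direction == "<<" else "written"] += int(size)
    return totals
```

Pass the function the lines of one trace file, for example from open(path, errors="replace"), to
compare how much data the proxy sent with how much Network Prevent for Web returned.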

Note: Trace logging produces a large amount of data and therefore requires a large amount
of free disk storage space. Trace logging should be used only for debugging an issue because
the data that is written in the file is in clear text.

See “About log files” on page 333.

Network Prevent for Email log levels


Network Prevent for Email log file names use the format of EmailPrevent_OperationalX.log
(where X is a number). The number of files that are stored and their sizes can be specified by
changing the values in the FileReaderLogging.properties file. By default, the values are:
■ com.vontu.mta.log.SmtpOperationalLogHandler.limit = 5000000

■ com.vontu.mta.log.SmtpOperationalLogHandler.count = 5

At various log levels, components in the com.vontu.mta.rp package output varying levels of
detail. The com.vontu.mta.rp.level setting specifies the log level. Set it in the
RequestProcessorLogging.properties file, which is stored alongside the
FileReaderLogging.properties file in the c:\Program
Files\Symantec\DataLossPrevention\DetectionServer\15.5\Protect\config (Windows)
or /opt/Symantec/DataLossPrevention/DetectionServer/15.5/Protect/config (Linux)
directory. For example, com.vontu.mta.rp.level = FINE specifies the FINE level of detail.
Table 15-8 describes the Network Prevent for Email log levels.

Table 15-8 Network Prevent for Email log levels

Level Guidelines

INFO General events: connect and disconnect notices, information on the messages that are
processed per connection.

FINE Some additional execution tracing information.

FINER Envelope command streams, message headers, detection results.

FINEST Complete message content, deepest execution tracing, and error tracing.

See “About log files” on page 333.

Network Prevent for Email operational log codes


Table 15-9 lists the defined Network Prevent for Email operational logging codes by category.

Table 15-9 Status codes for Network Prevent for Email operational log

Code Description

Core Events

1100 Starting Network Prevent for Email

1101 Shutting down Network Prevent for Email

1102 Reconnecting to FileReader (tid=id)

Where id is the thread identifier.

The RequestProcessor attempts to re-establish its connection with the FileReader for detection.

1103 Reconnected to the FileReader successfully (tid=id)

The RequestProcessor was able to re-establish its connection to the FileReader.

Core Errors

5100 Could not connect to the FileReader (tid=id timeout=.3s)

An attempt to re-connect to the FileReader failed.

5101 FileReader connection lost (tid=id)

The RequestProcessor connection to the FileReader was lost.

Connectivity Events

1200 Listening for incoming connections (local=hostname)

Where hostname is an IP address or fully-qualified domain name.

1201 Connection accepted (tid=id cid=N local=hostname:port remote=hostname:port)

Where N is the connection identifier.

1202 Peer disconnected (tid=id cid=N local=hostname:port remote=hostname:port)

1203 Forward connection established (tid=id cid=N local=hostname:port remote=hostname:port)

1204 Forward connection closed (tid=id cid=N local=hostname:port remote=hostname:port)

1205 Service connection closed (tid=id cid=N local=hostname:port remote=hostname:port
messages=1 time=0.14s)

Connectivity Errors

5200 Connection is rejected from the unauthorized host (tid=id local=hostname:port
remote=hostname:port)

5201 Local connection error (tid=id cid=N local=hostname:port remote=hostname:port
reason=Explanation)

5202 Sender connection error (tid=id cid=N local=hostname:port remote=hostname:port
reason=Explanation)

5203 Forwarding connection error (tid=id cid=N local=hostname:port remote=hostname:port
reason=Explanation)

5204 Peer disconnected unexpectedly (tid=id cid=N local=hostname:port remote=hostname:port
reason=Explanation)

5205 Could not create listener (address=local=hostname:port reason=Explanation)

5206 Authorized MTAs contains invalid hosts: hostname, hostname, ...

5207 MTA restrictions are active, but no MTAs are authorized to communicate with this host

5208 TLS handshake failed (reason=Explanation tid=id cid=N local=hostname remote=hostname)

5209 TLS handshake completed (tid=id cid=N local=hostname remote=hostname)

5210 All forward hosts unavailable (tid=id cid=N reason=Explanation)

5211 DNS lookup failure (tid=id cid=N NextHop=hostname reason=Explanation)

5303 Failed to encrypt incoming message (tid=id cid=N local=hostname remote=hostname)

5304 Failed to decrypt outgoing message (tid=id cid=N local=hostname remote=hostname)

Message Events

1300 Message complete (cid=N message_id=3 dlp_id=message_identifier
     size=number sender=email_address recipient_count=N
     disposition=response estatus=statuscode rtime=N
     dtime=N mtime=N)

Where:

■ recipient_count is the total number of addressees in the To, CC, and BCC fields.
■ response is the Network Prevent for Email response, which can be one of: PASS, BLOCK,
BLOCK_AND_REDIRECT, REDIRECT, MODIFY, or ERROR.
■ The estatus is an Enhanced Status code.
See “Network Prevent for Email originated responses and codes” on page 359.
■ The rtime is the time in seconds for Network Prevent for Email to fully receive the message
from the sending MTA.
■ The dtime is the time in seconds for Network Prevent for Email to perform detection on
the message.
■ The mtime is the total time in seconds for Network Prevent for Email to process the
message.
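
The field layout of a "Message complete" event can be read programmatically. The following Python sketch is illustrative only: the sample line and its values are invented to match the documented 1300 layout, the helper name parse_event is hypothetical, and the sketch assumes that the line starts with the event code and that no value contains spaces (a free-text reason= value would need extra handling). Real log lines may carry additional prefixes, such as timestamps, that are not handled here.

```python
# Sketch only: the sample line is invented to match the documented 1300 layout.
SAMPLE_LINE = (
    "1300 Message complete (cid=7 message_id=3 dlp_id=abc123 size=2048 "
    "sender=user@example.com recipient_count=2 disposition=PASS "
    "estatus=2.0.0 rtime=0.10 dtime=0.25 mtime=0.40)"
)


def parse_event(line):
    """Split an event line into its code, description, and key=value fields."""
    head, _, body = line.partition("(")
    code, _, description = head.strip().partition(" ")
    # Drop the closing parenthesis, then read each whitespace-separated pair.
    pairs = body.strip().rstrip(")").split()
    fields = dict(pair.split("=", 1) for pair in pairs if "=" in pair)
    return {"code": int(code), "description": description, "fields": fields}


event = parse_event(SAMPLE_LINE)
# rtime, dtime, and mtime are reported in seconds.
timing = {name: float(event["fields"][name]) for name in ("rtime", "dtime", "mtime")}
print(event["code"], event["fields"]["disposition"], timing)
```

Keeping the values as strings until they are needed avoids guessing at types for fields such as dlp_id or estatus.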

Message Errors

5300 Error while processing message (cid=N message_id=header_ID
     dlp_id=message_identifier size=0 sender=email_address
     recipient_count=N disposition=response estatus=statuscode
     rtime=N dtime=N mtime=N reason=Explanation)

Where header_ID is an RFC 822 Message-Id header if one exists.

5301 Sender rejected during re-submit

5302 Recipient rejected during re-submit

See “About log files” on page 333.

Network Prevent for Email originated responses and codes


Network Prevent for Email originates the following responses. Other protocol responses are
expected as Network Prevent for Email relays command stream responses from the forwarding
MTA to the sending MTA. Table 15-10 shows the responses that occur in situations where
Network Prevent must override the receiving MTA. It also shows the situations where Network
Prevent generates a specific response to an event that is not relayed from downstream.
“Enhanced Status” is the RFC 1893 Enhanced Status Code associated with the response.

Table 15-10 Network Prevent for Email originated responses

Code  Enhanced  Text                      Description
      Status

250   2.0.0     Ok: Carry on.             Success code that Network Prevent for Email uses.

221   2.0.0     Service closing.          The normal connection termination code that Network
                                          Prevent for Email generates if a QUIT request is
                                          received when no forward MTA connection is active.

451   4.3.0     Error: Processing error.  This “general, transient” error response is issued
                                          when a (potentially) recoverable error condition
                                          arises. This error response is issued when a more
                                          specific error response is not available. Forward
                                          connections are sometimes closed, and their
                                          unexpected termination is occasionally a cause of a
                                          code 451, status 4.3.0. However, sending connections
                                          should remain open when such a condition arises
                                          unless the sending MTA chooses to terminate.

421   4.3.0     Fatal: Processing error.  This “general, terminal” error response is issued
                Closing connection.       when a fatal, unrecoverable error condition arises.
                                          This error results in the immediate termination of
                                          any sender or receiver connections.

421   4.4.1     Fatal: Forwarding agent   An attempt to connect to the forward MTA was refused
                unavailable.              or otherwise failed to establish properly.

421   4.4.2     Fatal: Connection lost to Closing connection. The forward MTA connection is
                forwarding agent.         lost in a state where further conversation with the
                                          sending MTA is not possible. The loss usually occurs
                                          in the middle of message header or body buffering.
                                          The connection is terminated immediately.

451   4.4.2     Error: Connection lost to The forward MTA connection was lost in a state that
                forwarding agent.         may be recoverable if the connection can be
                                          re-established. The sending MTA connection is
                                          maintained unless it chooses to terminate.

421   4.4.7     Error: Request timeout    The last command issued did not receive a response
                exceeded.                 within the time window that is defined in
                                          RequestProcessor.DefaultCommandTimeout. (The time
                                          window may be from RequestProcessor.DotCommandTimeout
                                          if the command issued was the “.”.) The connection
                                          is closed immediately.

421   4.4.7     Error: Connection timeout The connection was idle (no commands actively
                exceeded.                 awaiting response) in excess of the time window
                                          that is defined in
                                          RequestProcessor.DefaultCommandTimeout.

501   5.5.2     Fatal: Invalid            A fatal violation of the SMTP protocol (or the
                transmission request.     constraints that are placed on it) occurred. The
                                          violation is not expected to change on a resubmitted
                                          message attempt. This message is only issued in
                                          response to a single command or data line that
                                          exceeds the boundaries that are defined in
                                          RequestProcessor.MaxLineLength.

502   5.5.1     Error: Unrecognized       Defined but not currently used.
                command.

550   5.7.1     User Supplied.            This combination of code and status indicates that a
                                          Blocking response rule has been engaged. The text
                                          that is returned is supplied as part of the response
                                          rule definition.

Note that a 4xx code and a 4.x.x enhanced status indicate a temporary error. In such cases
the MTA can resubmit the message to the Network Prevent for Email Server. A 5xx code and
a 5.x.x enhanced status indicate a permanent error. In such cases the MTA should treat the
message as undeliverable.
See “About log files” on page 333.
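
The temporary-versus-permanent distinction in the preceding note can be expressed as a small check. This Python sketch is an illustration rather than product code: the function name classify_response and the returned labels are assumptions, and it relies only on the first digit of the code and of the enhanced status, as the note describes.

```python
def classify_response(code, enhanced_status):
    """Label a response by the leading digit of its code and enhanced status."""
    code_class = str(code)[0]
    status_class = enhanced_status.split(".")[0]
    if code_class != status_class:
        # The note above only covers matching pairs such as 4xx with 4.x.x.
        return "other"
    labels = {"2": "success", "4": "temporary", "5": "permanent"}
    return labels.get(code_class, "other")


# Pairs taken from Table 15-10.
for code, status in [(250, "2.0.0"), (451, "4.3.0"), (421, "4.4.7"), (550, "5.7.1")]:
    print(code, status, classify_response(code, status))
```

A sending MTA would typically retry on a "temporary" result and treat a "permanent" result as undeliverable.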
Chapter 16
Using Symantec Data Loss Prevention utilities
This chapter includes the following topics:

■ About Symantec Data Loss Prevention utilities

■ About Endpoint utilities

■ About DBPasswordChanger

About Symantec Data Loss Prevention utilities


Symantec provides a suite of utilities to help users accomplish those tasks that need to be
done on an infrequent basis. The utilities are typically used to perform troubleshooting and
maintenance tasks. They are also used to prepare data and files for use with the Symantec
Data Loss Prevention software.
The Symantec Data Loss Prevention utilities are provided for both Windows and Linux operating
systems. You use the command line to run the utilities on both operating systems. The utilities
operate in a similar manner regardless of operating system.
Table 16-1 describes how and when to use each utility.

Table 16-1 Symantec Data Loss Prevention utilities

Name Description

DBPasswordChanger Changes the encrypted password that the Enforce Server uses to connect to the Oracle
database.

See “About DBPasswordChanger” on page 364.



sslkeytool Generates custom authentication keys to improve the security of the data that is transmitted
between the Enforce Server and detection servers. The custom authentication keys must be
copied to each Symantec Data Loss Prevention server.

See the topic "About the sslkeytool utility and server certificates" in the Symantec Data Loss
Prevention Installation Guide.

SQL Preindexer Indexes an SQL database or runs an SQL query on specific data tables within the database.
This utility is designed to pipe its output directly to the Remote EDM Indexer utility.

See “About the SQL Preindexer for EDM” on page 586.

Remote EDM Indexer Converts a comma-separated or tab-delimited data file into an exact data matching index.
The utility can be run on a remote machine to provide the same indexing functionality that is
available locally on the Enforce Server.

This utility is often used with the SQL Preindexer. The SQL Preindexer can run an SQL query
and pass the resulting data directly to the Remote EDM Indexer to create an EDM index.

See “About the Remote EDM Indexer” on page 586.

About Endpoint utilities


Table 16-2 describes those utilities that apply to the Endpoint products.
See “About agent password management” on page 2489.

Table 16-2 Endpoint utilities

Name Description

Service_Shutdown.exe This utility enables an administrator to turn off both the agent and the watchdog services on
an endpoint. (As a tamper-proofing measure, it is not possible for a user to stop either the
agent or the watchdog service.)

See “Shutting down the agent and the watchdog services on Windows endpoints” on page 2492.

Vontu_sqlite3.exe This utility provides an SQL interface that enables you to view or modify the encrypted
database files that the Symantec DLP Agent uses. Use this tool when you want to investigate
or make changes to the Symantec Data Loss Prevention files.

See “Inspecting the database files accessed by the agent” on page 2493.

Logdump.exe This tool lets you view the Symantec DLP Agent extended log files, which are hidden for
security reasons.

See “Viewing extended log files” on page 2494.



Start_agent This utility enables an administrator to start agents running on Mac endpoints that have been
shut down using the shutdown task.

See “Starting DLP Agents that run on Mac endpoints” on page 2499.

About DBPasswordChanger
Symantec Data Loss Prevention stores encrypted passwords to the Oracle database in a file
that is called DatabasePassword.properties, located in C:\Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config (Windows)
or /opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config (Linux).
Because the contents of the file are encrypted, you cannot directly modify the file. The
DBPasswordChanger utility changes the stored Oracle database passwords that the Enforce
Server uses.
Before you can use DBPasswordChanger to change the password to the Oracle database
you must:
■ Shut down the Enforce Server.
■ Change the Oracle database password using Oracle utilities.
See “Example of using DBPasswordChanger” on page 365.

DBPasswordChanger syntax
The DBPasswordChanger utility uses the following syntax:

DBPasswordChanger password_file new_oracle_password

All command-line parameters are required. The following table describes each command-line
parameter.
See “Example of using DBPasswordChanger” on page 365.

Table 16-3 DBPasswordChanger command-line parameters

Parameter Description

password_file Specifies the file that contains the encrypted password. By
default, this file is named DatabasePassword.properties
and is stored in
C:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config (Windows) or
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config (Linux).

new_oracle_password Specifies the new Oracle password to encrypt and store.

Example of using DBPasswordChanger


If Symantec Data Loss Prevention was installed in the default location, then the
DBPasswordChanger utility is located at C:\Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\bin (Windows) or
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/bin (Linux). You must
be an Administrator (or root) to run DBPasswordChanger.
For example, type:

DBPasswordChanger \Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\Datab
protect_oracle

See “DBPasswordChanger syntax” on page 364.
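
If you script this step, the two required parameters can be assembled from the default locations quoted in this section. The Python sketch below only builds the argument list and does not run the utility. The helper name build_command is hypothetical, the paths assume a default 15.5 installation, and the utility name is used exactly as written above (any file extension on Windows is not stated here and is not assumed).

```python
from pathlib import PurePosixPath, PureWindowsPath

# Default 15.5 locations quoted in this section; adjust for a custom install.
WINDOWS_ROOT = PureWindowsPath(
    r"C:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect"
)
LINUX_ROOT = PurePosixPath(
    "/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect"
)


def build_command(new_oracle_password, windows=False):
    """Return [utility, password_file, new_oracle_password] as strings."""
    root = WINDOWS_ROOT if windows else LINUX_ROOT
    utility = root / "bin" / "DBPasswordChanger"
    password_file = root / "config" / "DatabasePassword.properties"
    return [str(utility), str(password_file), new_oracle_password]


# Prerequisites from this section: shut down the Enforce Server and change the
# password with Oracle utilities first, then run the command as Administrator
# (or root).
print(build_command("protect_oracle"))
```

Building the argument list as separate items, rather than one string, avoids quoting problems with the space in "Program Files".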


Section 4
Authoring policies

■ Chapter 17. Introduction to policies

■ Chapter 18. Overview of policy detection

■ Chapter 19. Creating policies from templates

■ Chapter 20. Configuring policies

■ Chapter 21. Administering policies

■ Chapter 22. Best practices for authoring policies

■ Chapter 23. Increasing the Inspection Content Size

■ Chapter 24. Installing remote indexers

■ Chapter 25. Detecting content using Exact Match Data Identifiers (EMDI)

■ Chapter 26. Detecting content using Exact Data Matching (EDM)

■ Chapter 27. Detecting content using Indexed Document Matching (IDM)

■ Chapter 28. Detecting content using Vector Machine Learning (VML)

■ Chapter 29. Detecting content using Form Recognition - Sensitive Image Recognition

■ Chapter 30. Detecting Content using OCR - Sensitive Image Recognition

■ Chapter 31. Detecting content using data identifiers


■ Chapter 32. Detecting content using keyword matching

■ Chapter 33. Detecting content using regular expressions

■ Chapter 34. Detecting content using classification matching

■ Chapter 35. Detecting international language content

■ Chapter 36. Detecting file properties

■ Chapter 37. Detecting network incidents

■ Chapter 38. Detecting endpoint events

■ Chapter 39. Detecting described identities

■ Chapter 40. Detecting synchronized identities

■ Chapter 41. Detecting profiled identities

■ Chapter 42. Using contextual attributes for Application Detection

■ Chapter 43. Supported file formats for detection

■ Chapter 44. Supported Office Open XML formats for high-performance content extraction

■ Chapter 45. Library of system data identifiers

■ Chapter 46. Library of policy templates


Chapter 17
Introduction to policies
This chapter includes the following topics:

■ About Data Loss Prevention policies

■ Policy components

■ Policy templates

■ Solution packs

■ Policy groups

■ Policy deployment

■ Policy severity

■ Policy authoring privileges

■ Data Profiles

■ User Groups

■ Policy template import and export

■ Workflow for implementing policies

■ Viewing, printing, and downloading policy details

About Data Loss Prevention policies


You implement policies to detect and prevent data loss. A Symantec Data Loss Prevention
policy combines detection rules and response actions. If a policy rule is violated, the system
generates an incident that you can report and act on. The policy rules you implement are
based on your information security objectives. The actions you take in response to policy
violations are based on your compliance requirements. The Enforce Server administration
console provides an intuitive, centralized, Web-based interface for authoring policies.
See “Workflow for implementing policies” on page 378.
Table 17-1 describes the policy authoring features provided by Symantec Data Loss Prevention.

Table 17-1 Policy authoring features

Feature Description

Intuitive policy The policy builder interface supports Boolean logic for detection configuration.
building
You can combine different detection methods and technologies in a single policy.

See “Detecting data loss” on page 381.

See “Best practices for authoring policies” on page 449.

Decoupled The system stores response rules and policies as separate entities.
response rules
You can manage and update response rules without having to change policies; you can reuse
response rules across policies.

See “About response rules” on page 1738.

Fine-grained policy The system provides severity levels for policy violations.
reporting
You can report the overall severity of a policy violation by the highest severity.

See “Policy severity” on page 374.

Centralized data The system stores data and group profiles separate from policies.
and group profiling
This separation enables you to manage and update profiles without changing policies.

See “Data Profiles” on page 375.

See “User Groups” on page 376.

Template-based The system provides 65 pre-built policy templates.
policy detection
You can use these templates to quickly configure and deploy policies.

See “Policy templates” on page 371.

Policy sharing The system supports policy template import and export.

You can share policy templates across environments and systems.

See “Policy template import and export” on page 377.

Role-based access The system provides role-based access control for various user and administrative functions.
control
You can create roles for policy authoring, policy administration, and response rule authoring.

See “Policy authoring privileges” on page 375.


Policy components
A valid policy has at least one detection or group rule with at least one match condition.
Response rules are optional policy components.
Table 17-2 describes Data Loss Prevention policy components.

Table 17-2 Policy components

Component Use Description

Policy group Required A policy must be assigned to a single Policy Group.

See “Policy groups” on page 372.

Policy name Required The policy name must be unique within the Policy Group.

See “Manage and add policies” on page 432.

Policy rule Required A valid policy must contain at least one rule that declares at least one
match condition.

See “Policy matching conditions” on page 386.

Data Profile May be Exact Data Matching (EDM), Indexed Document Matching (IDM), Vector
required Machine Learning (VML), and Form Recognition policies all require data
profiles.

See “Data Profiles” on page 375.

User group May be A policy requires a User Group only if a group method in the policy
required requires it.

Synchronized DGM rules and exceptions require a User Group.

See “User Groups” on page 376.

Policy description Optional A policy description helps users identify the purpose of the policy.

See “Configuring policies” on page 413.

Policy label Optional A policy label helps Veritas Data Insight business users identify the
purpose of the policy when using the Self-Service Portal.

See “Configuring policies” on page 413.

Response Rule Optional A policy can implement one or more response rules to report and
remediate incidents.

See “About response rules” on page 1738.

Policy exception Optional A policy can contain one or more exceptions to exclude data from
matching.

See “Exception conditions” on page 393.



Compound match Optional A policy rule or exception can implement multiple match conditions.
conditions
See “Compound conditions” on page 394.
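
The required components in Table 17-2 can be summarized as a validity check. The following Python sketch is a simplified model for illustration, not the Enforce Server's implementation: the Rule and Policy classes, the validation_errors function, and the error strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    name: str
    match_conditions: list = field(default_factory=list)


@dataclass
class Policy:
    name: str
    policy_group: str  # a policy belongs to a single policy group
    rules: list = field(default_factory=list)
    response_rules: list = field(default_factory=list)  # optional component


def validation_errors(policy, names_in_group=()):
    """Return a list of problems with the required policy components."""
    errors = []
    if not policy.policy_group:
        errors.append("policy must be assigned to a policy group")
    if not policy.name:
        errors.append("policy name is required")
    elif policy.name in names_in_group:
        errors.append("policy name must be unique within the policy group")
    if not any(rule.match_conditions for rule in policy.rules):
        errors.append("policy needs at least one rule with a match condition")
    return errors


policy = Policy("Card data", "Default Policy Group",
                [Rule("card numbers", ["data identifier"])])
print(validation_errors(policy))
```

Response rules are left out of the check because they are optional components.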

Policy templates
Symantec Data Loss Prevention provides policy templates to help you quickly deploy detection
policies in your enterprise. You can share policies across systems and environments by
importing and exporting policy rules and exceptions as templates.
Using policy templates saves you time and helps you avoid errors and information gaps in
your policies because the detection methods are predefined. You can edit a template to create
a policy that precisely suits your needs. You can also export and import your own policy
templates.
Some policy templates are based on well-known sets of regulations, such as the Payment
Card Industry Security Standard, Gramm-Leach-Bliley, California SB1386, and HIPAA. Other
policy templates are more generic, such as Customer Data Protection, Employee Data
Protection, and Encrypted Data. Although the regulation-based templates can help address
the requirements of the relevant regulations, consult with your legal counsel to verify compliance.
See “Creating a policy from a template” on page 397.
Table 17-3 describes the system-defined policy templates provided by Symantec Data Loss
Prevention.

Table 17-3 System-defined policy templates

Policy template type Description

US Regulatory Enforcement See “US Regulatory Enforcement policy templates” on page 400.

General Data Protection Regulation See “General Data Protection Regulation (GDPR) policy templates”
on page 402.

International Regulatory Enforcement See “International Regulatory Enforcement policy templates” on page 403.

Customer and Employee Data Protection See “Customer and Employee Data Protection policy templates”
on page 404.

Confidential or Classified Data Protection See “Confidential or Classified Data Protection policy templates”
on page 405.

Network Security Enforcement See “Network Security Enforcement policy templates” on page 406.

Acceptable Use Enforcement See “Acceptable Use Enforcement policy templates” on page 407.

Imported Templates See “Policy template import and export” on page 377.

Solution packs
Symantec Data Loss Prevention provides solution packs for several industry verticals. A
solution pack contains configured policies, response rules, user roles, reports, protocols, and
the incident statuses that support a particular industry or organization. For a list of available
solution packs and instructions, refer to chapter 4, "Importing a solution pack" in the Symantec
Data Loss Prevention Installation Guide. You can import one solution pack to the Enforce
Server.
Once you have imported the solution pack, start by reviewing its policies. By default the solution
pack activates the policies it provides.
See “Manage and add policies” on page 432.

Policy groups
You deploy policies to detection servers using policy groups. Policy groups limit the policies,
incidents, and detection mechanisms that are accessible to specific users.
Each policy belongs to one policy group. When you configure a policy, you assign it to a policy
group. You can change the policy group assignment, but you cannot assign a policy to more
than one policy group. You deploy policy groups to one or more detection servers.
The Enforce Server is configured with a single policy group called the Default Policy Group.
The system deploys the default policy group to all detection servers. If you define a new policy,
the system assigns the policy to the default policy group, unless you create and specify a
different policy group. You can change the name of the default policy group. A solution pack
creates several policy groups and assigns policies to them.
After you create a policy group, you can link policies, Discover targets, and roles to the policy
group. When you create a Discover target, you must associate it with a single policy group.
When you associate a role with particular policy groups, you can restrict users in that role.
Policies in that policy group detect incidents and report them to users in the role that is assigned
to that policy group.
The relationship between policy groups and detection servers depends on the server type.
You can deploy a policy group to one or more Network Monitor, Network Prevent, or Endpoint
Servers. Policy groups that you deploy to an Endpoint Server apply to any DLP Agent that is
registered with that server. The Enforce Server automatically associates all policy groups with
all Network Discover Servers.
For Network Monitor and Network Prevent, each policy group is assigned to one or more
Network Monitor Servers, Network Prevent for Email Servers, or Network Prevent for Web
Servers. For Network Discover, policy groups are assigned to individual Discover targets. A
single detection server may handle as many policy groups as necessary to scan its targets.
For Endpoint Monitor, policy groups are assigned to the Endpoint Server and apply to all
registered DLP Agents.
See “Manage and add policy groups” on page 435.
See “Creating and modifying policy groups” on page 436.

Policy deployment
You can use policy groups to organize and deploy your policies in different ways. For example,
consider a situation in which your detection servers are set up across a system that spans
several countries. You can use policy groups to ensure that a detection server runs only the
policies that are valid for a specific location.
You can dedicate some of your detection servers to monitor internal network traffic and dedicate
others to monitor network exit points. You can use policy groups to deploy less restrictive
policies to servers that monitor internal traffic. At the same time, you can deploy stricter policies
to servers that monitor traffic leaving your network.
You can use policy groups to organize policies and incidents by business units, departments,
geographic regions, or any other organizational unit. For example, policy groups for specific
departments may be appropriate where security responsibilities are distributed among various
groups. In such cases, policy groups provide for role-based access control over the viewing
and editing of incidents. You deploy policy groups according to the required division of access
rights within your organization (for example, by business unit).
You can use policy groups for detection-server allocation, which may be more common where
security departments are centralized. In these cases, you would carefully choose the detection
server allocation for each role and reflect the server name in the policy group name. For
example, you might name the groups Inbound and Outbound, United States and International,
or Testing and Production.
In more complex environments, you might consider some combination of the following policy
groups for deploying policies:
■ Sales and Marketing - US
■ Sales and Marketing - Europe
■ Sales and Marketing - Asia
■ Sales and Marketing - Australia, New Zealand
■ Human Resources - US
■ Human Resources - International
■ Research and Development
■ Customer service
Lastly, you can use policy groups to test policies before deploying them in production, to
manage legacy policies, and to import and export policy templates.
See “Policy groups” on page 372.
See “About role-based access control” on page 109.

Policy severity
When you configure a detection rule, you can select a policy severity level. You can then use
response rules to take action based on a severity level. For example, you can configure a
response rule to take action after a specified number of "High" severity violations.
See “About response rule conditions” on page 1752.
The default severity level is set to "High," unless you change it. The default severity level
applies to any condition that the detection rule matches. For example, if the default severity
level is set to "High," every detection rule violation is labeled with this severity level. If you do
not want to tag every violation with a specific severity, you can define the criteria by which a
severity level is established. In this case the default behavior is overridden. For example, you
can define the "High" severity level to be applied only after a specified number of condition
matches have occurred.
See “Defining rule severity” on page 420.
In addition, you can define multiple severity levels to layer severity reporting. For example,
you can set the "High" severity level to apply after 100 matches and the "Medium" severity
level to apply after 50 matches.

Table 17-4 Rule severity levels

Rule severity level Description

High If a condition match occurs, it is labeled "High" severity.

Medium If a condition match occurs, it is labeled "Medium" severity.

Low If a condition match occurs, it is labeled "Low" severity.

Info If a condition match occurs, it is labeled "Info" severity.
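
The layered example above (High after 100 matches, Medium after 50) can be sketched as a threshold lookup. This Python sketch is an illustration under stated assumptions: the function name severity_for is hypothetical, "after N matches" is read as "N or more matches", and the level used below every threshold is left to the caller through the fallback parameter because this section does not specify it.

```python
def severity_for(match_count, thresholds=((100, "High"), (50, "Medium")), fallback=None):
    """Return the level of the highest threshold that match_count reaches."""
    # Check the largest threshold first so that 120 matches reports High, not Medium.
    for minimum, level in sorted(thresholds, reverse=True):
        if match_count >= minimum:
            return level
    return fallback


for count in (120, 75, 10):
    print(count, severity_for(count, fallback="Low"))
```

Sorting the thresholds inside the function means the caller can list them in any order.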


Policy authoring privileges


Policy authors configure and manage policies and their rules and exceptions. To author policies,
a user must be assigned to a role that grants the policy authoring privilege. This role can be
expanded to include management of policy groups, scanning targets, and credentials.
Response rule authoring privileges are separate credentials from policy authoring and
administration privileges. Whether or not policy authors have response rule authoring privileges
is based on your enterprise needs.
Table 17-5 describes the typical privileges for the policy and response rule authoring roles.

Table 17-5 Policy authoring privileges

Role privilege Description

Author Policies Add, configure, and manage policies.

Add, configure, and manage policy rules and exceptions.

Import and export policy templates.

Modify system-defined data identifiers and create custom data identifiers.

Add, configure, and manage User Groups.

Add response rules to policies (but do not create response rules).

See “About role-based access control” on page 109.

Enforce Server Add, configure, and manage policy groups.
Administration
Add, configure, and manage Data Profiles.

See “Configuring roles” on page 114.

Author Response Add, configure, and manage response rules (but do not add them to policies).
Rules
See “About response rule authoring privileges” on page 1757.

Data Profiles
Data Profiles are user-defined configurations that you create to implement Exact Data Matching
(EDM), Indexed Document Matching (IDM), Form Recognition, and Vector Machine Learning
(VML) policy conditions.
See “Data Loss Prevention policy detection technologies” on page 383.
Table 17-6 describes the types of Data Profiles that the system supports.

Table 17-6 Types of Data Profiles

Data Profile type Description

Exact Data Profile An Exact Data Profile is used for Exact Data Matching (EDM) policies. The Exact Data Profile
contains data that has been indexed from a structured data source, such as a database,
directory server, or CSV file. The Exact Data Profile runs on the detection server. If an EDM
policy is deployed to an endpoint, the DLP Agent sends the message to the detection server
for evaluation (two-tier detection).

See “About the Exact Data Profile and index” on page 528.

See “Introducing profiled Directory Group Matching (DGM)” on page 942.

See “About two-tier detection for EDM on the endpoint” on page 533.

Indexed Document An Indexed Document Profile is used for Indexed Document Matching (IDM) policies. The
Profile Indexed Document Profile contains data that has been indexed from a collection of confidential
documents. The Indexed Document Profile runs on the detection server. If an IDM policy is
deployed to an endpoint, the DLP Agent sends the message to the detection server for
evaluation (two-tier detection).

See “About the Indexed Document Profile” on page 615.

Vector Machine A Vector Machine Learning Profile is used for Vector Machine Learning (VML) policies. The
Learning Profile Vector Machine Learning Profile contains a statistical model of the features (keywords)
extracted from content that you want to protect. The VML profile is loaded into memory by
the detection server and DLP Agent. VML does not require two-tier detection.

See “About the Vector Machine Learning Profile” on page 665.

Form Recognition A Form Recognition Profile is used for Form Recognition policies. The Form Recognition
Profile Profile contains blank images of forms you want to detect.

When you configure a profile, you specify a numeric value to represent the Fill Threshold.
This number is a value from 1 to 10, where 1 represents a form that has been filled out
minimally and 10 represents a form that is completely filled in. If the Fill Threshold is met
or exceeded, an incident is opened.

See “Managing Form Recognition profiles” on page 700.
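
The Fill Threshold comparison for Form Recognition profiles can be sketched as follows. This is an illustrative Python sketch only: the function name opens_incident is hypothetical, and it assumes that the detected fill level of a form is expressed on the same 1 to 10 scale as the threshold.

```python
def opens_incident(fill_level, fill_threshold):
    """Return True when the detected fill level meets or exceeds the threshold."""
    if not 1 <= fill_threshold <= 10:
        raise ValueError("Fill Threshold must be a value from 1 to 10")
    return fill_level >= fill_threshold


# With a threshold of 4, a form scored 4 or higher opens an incident.
print(opens_incident(4, 4), opens_incident(3, 4))
```

A lower threshold therefore reports forms that are only minimally filled out, and a higher threshold reports only forms that are mostly or completely filled in.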

User Groups
You define User Groups on the Enforce Server. User Groups contain user identity information
that you populate by synchronizing the Enforce Server with a group directory server (Microsoft
Active Directory).
You must have at least policy authoring or server administrator privileges to define User Groups.
You must define the User Groups before you synchronize users.
Once you define a User Group, you populate it with users, groups, and business units from
your directory server. After the user group is populated, you associate it with the User/Sender
and Recipient detection rules or exceptions. The policy only applies to members of that User
Group.
See “Introducing synchronized Directory Group Matching (DGM)” on page 935.
See “Configuring directory server connections” on page 156.
See “Configuring User Groups” on page 936.

Policy template import and export


You can export and import policy templates to and from the Enforce Server. This feature lets
you share policy templates across environments, version existing policies, and archive legacy
policies.
Consider a scenario where you author and refine a policy on a test system and then export
the policy as a template. You then import this policy template to a production system for
deployment to one or more detection servers. Or, if you want to retire a policy, you export it
as a template for archiving, then remove it from the system.
See “Importing policy templates” on page 441.
See “Exporting policy detection as a template” on page 442.
A policy template is an XML file. The template contains the policy metadata, and the detection
and the group rules and exceptions. If a policy template contains more than one condition that
requires a Data Profile, the system imports only one of these conditions. A policy template
does not include policy response rules, or modified or custom data identifiers.
Table 17-7 describes policy template components.

Table 17-7 Components included in policy templates

Policy component    Description    Included in template

Policy metadata (name, description, label)
The name of the template has to be less than 60 characters or it does not appear in the
Imported Templates list.
Included in template: YES

Described Content Matching (DCM) rules and exceptions
If the template contains only DCM methods, it imports as exported without changes.
Included in template: YES

Exact Data Matching (EDM) and Indexed Document Matching (IDM) conditions
If the template contains multiple EDM or IDM match conditions, only one is exported.
If the template contains an EDM and an IDM condition, the system drops the IDM.
Included in template: YES

User Group
User group methods are maintained on import only if the user groups exist on the target
before import.
Included in template: NO

Policy Group
Policy groups do not export. On import you can select a local policy group, otherwise the
system assigns the policy to the Default Policy group.
Included in template: NO

Response Rules
You must define and add response rules to policies from the local Enforce Server instance.
Included in template: NO

Data Profiles
On import you must reference a locally defined Data Profile, otherwise the system drops any
methods that require a Data Profile.
Included in template: NO

Custom data identifiers
Modified and custom data identifiers do not export.
Included in template: NO

Custom protocols
Custom protocols do not export.
Included in template: NO

Policy state
Policy state (Active/Suspended) does not export.
Included in template: NO
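
Before you export a policy as a template, it can help to check it against the limits in
Table 17-7. The following is a minimal sketch of such a check, not a product utility. The
function name template_warnings and the short condition labels ("DCM", "EDM", "IDM") are
assumptions made for illustration; the sketch does not read or write the template XML.

```python
MAX_NAME_LENGTH = 60  # per Table 17-7, names must be shorter than this


def template_warnings(name, conditions):
    """Return warnings for a policy name and a list of condition labels."""
    warnings = []
    if len(name) >= MAX_NAME_LENGTH:
        warnings.append(
            "name too long: template will not appear in the Imported Templates list"
        )
    profile_conditions = [c for c in conditions if c in ("EDM", "IDM")]
    if len(profile_conditions) > 1:
        warnings.append("multiple EDM or IDM conditions: only one is exported")
    if "EDM" in conditions and "IDM" in conditions:
        warnings.append("EDM and IDM together: the IDM condition is dropped")
    return warnings


for warning in template_warnings("Customer data protection", ["DCM", "EDM", "IDM"]):
    print(warning)
```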

Workflow for implementing policies


Policies define the content, event context, and identities you want to detect. Policies may also
define response rule actions if a policy is violated. Successful policy creation is a process that
requires careful analysis and proper configuration to achieve optimum results.
Table 17-8 describes the typical workflow for implementing Data Loss Prevention policies.

Table 17-8 Policy implementation process

Action    Description

Familiarize yourself with the different types of detection technologies and methods that
Symantec Data Loss Prevention provides, and considerations for authoring data loss
prevention policies.
See “Detecting data loss” on page 381.
See “Data Loss Prevention policy detection technologies” on page 383.
See “Policy matching conditions” on page 386.
See “Best practices for authoring policies” on page 449.

Develop a policy detection strategy that defines the type of data you want to protect from
data loss.
See “Develop a policy strategy that supports your data security objectives” on page 451.

Review the policy templates that ship with Symantec Data Loss Prevention, and any templates
that you import manually or by solution pack.
See “Policy templates” on page 371.
See “Solution packs” on page 372.

Create policy groups to control how your policies are accessed, edited, and deployed.
See “Policy groups” on page 372.
See “Policy deployment” on page 373.

To detect exact data or content or similar unstructured data, create one or more Data
Profiles.
See “Data Profiles” on page 375.

To detect exact identities from a synchronized directory server (Active Directory),
configure one or more User Groups.
See “User Groups” on page 376.

Configure conditions for detection and group rules and exceptions.
See “Creating a policy from a template” on page 397.

Test and tune your policies.
See “Test and tune policies to improve match accuracy” on page 453.

Add response rules to the policy to take action when the policy is violated.
See “About response rules” on page 1738.

Manage the policies in your enterprise.
See “Manage and add policies” on page 432.

Viewing, printing, and downloading policy details


You may be required to share high-level details about your policies with individuals who are
not Symantec Data Loss Prevention users. For example, you might be asked to provide policy
details to an information security officer in your company, or to an outside security auditor.
To facilitate such an action, you can view and print policy details in an easily readable format
from the Policy List screen. The policy detail view does not include any technical nomenclature
or branding specific to Symantec Data Loss Prevention. It displays the policy name, description,
label, group, status, version, and last modified date for the policy. It also displays the detection
and the response rules for that policy.
Any user with the Author Policies privilege for a given policy or set of policies can view and
print policy details.
See “Policy authoring privileges” on page 375.
Table 17-9 describes how to work with policy details.

Table 17-9 Working with policy details

Action Description

View and print details for a single policy. See “Viewing and printing policy details”
on page 444.

Download details for all policies. See “Downloading policy details” on page 444.
Chapter 18
Overview of policy detection
This chapter includes the following topics:

■ Detecting data loss

■ Data Loss Prevention policy detection technologies

■ Policy matching conditions

■ Detection messages and message components

■ Exception conditions

■ Compound conditions

■ Policy detection execution

■ Two-tier detection for DLP Agents

Detecting data loss


Symantec Data Loss Prevention detects data from virtually any type of message or file, any
user, sender, or recipient, wherever your data or endpoints exist. You can use Data Loss
Prevention to detect both the content and the context of data within your enterprise. You define
and manage your detection policies from the centralized, Web-based Enforce Server
administration console.
See “Content that can be detected” on page 382.
See “Files that can be detected” on page 382.
See “Protocols that can be monitored” on page 382.
See “Endpoint events that can be detected” on page 383.
See “Identities that can be detected” on page 383.
See “Languages that can be detected” on page 383.

Content that can be detected


Symantec Data Loss Prevention detects data and document content, including text, markup,
presentations, spreadsheets, archive files and their contents, email messages, database files,
designs and graphics, multimedia files, image-based forms and more. For example, the system
can open a compressed file and scan a Microsoft Word document within the compressed file
for the keyword "confidential." If the keyword is matched, the detection engine flags the message
as an incident.
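
The compressed-file example above can be sketched in a few lines. This is an illustration
only, not the detection engine: it assumes a ZIP archive whose members are plain text, and it
does not perform the content extraction that a real Word document would need. The function
names are hypothetical.

```python
import io
import zipfile


def archive_contains_keyword(archive_bytes: bytes, keyword: str) -> bool:
    """Return True if any member of a ZIP archive contains the keyword."""
    needle = keyword.lower().encode("utf-8")
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for member in archive.namelist():
            # Case-insensitive search of the decompressed member bytes.
            if needle in archive.read(member).lower():
                return True
    return False


def build_sample_archive() -> bytes:
    """Build a small in-memory archive to scan (sample data only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("memo.txt", "This memo is CONFIDENTIAL.")
        archive.writestr("notes.txt", "Lunch menu for Friday.")
    return buffer.getvalue()


print(archive_contains_keyword(build_sample_archive(), "confidential"))
```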
Content-based detection is based on actual content, not the file itself. A detection server can
detect extracts or derivatives of protected or described content. This content may include
sections of documents that have been copied and pasted to other documents or emails. A
detection server can also identify sensitive data in a different file format than the source file.
For example, if a confidential Word file is fingerprinted, the detection engine can match the
content emailed in a PDF attachment.
See “Content matching conditions” on page 387.

Files that can be detected


Symantec Data Loss Prevention recognizes many types of files and attachments based on
their context, including file type, file name, and file size. Symantec Data Loss Prevention
identifies over 300 types of files, including word-processing formats, multimedia files,
spreadsheets, presentations, pictures, encapsulation formats, encryption formats, and others.
For file type detection, the system does not rely on the file extension to identify the file type.
For example, the system recognizes a Microsoft Word file even if a user changes the file
extension to .txt. In this case the detection engine checks the binary signature of the file to
match its type.
See “File property matching conditions” on page 388.
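
To illustrate why a renamed extension does not hide a file type, the following sketch
inspects the leading bytes of a file instead of its name. It is a simplified illustration,
not the product's detection logic. The two signatures shown are the well-known OLE2 compound
file header (used by legacy Office formats such as .doc) and the ZIP local file header (used
by containers such as .docx); a signature alone identifies the container, not necessarily a
Word document.

```python
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy Office compound file
ZIP_MAGIC = b"PK\x03\x04"                       # ZIP container, for example .docx


def sniff_container(data: bytes) -> str:
    """Classify content by its binary signature, ignoring any file name."""
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    if data.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


# A legacy Word file renamed to report.txt still starts with the OLE2 signature.
renamed_file_bytes = OLE2_MAGIC + b"\x00" * 16
print(sniff_container(renamed_file_bytes))
```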

Protocols that can be monitored


Symantec Data Loss Prevention detects messages on the network by identifying the protocol
signature: email (SMTP), Web (HTTP), file transfer (FTP), newsgroups (NNTP), TCP, Telnet,
and SSL.
You can configure a detection server to listen on non-default ports for data loss violations. For
example, if your network transmits Web traffic on port 81 instead of port 80, the system still
recognizes the transmitted content as HTTP.
See “Protocol matching condition for network” on page 389.
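
The port 81 example can be sketched as follows. The idea is that the traffic is classified
by what the payload looks like, not by the port number. This is a simplified illustration
under the assumption that an HTTP request line is enough to recognize the protocol; the
function names are hypothetical, and the product's signature matching is not shown.

```python
import re

# An HTTP/1.x request starts with a method, a target, and a version.
HTTP_REQUEST_LINE = re.compile(
    rb"^(?:GET|POST|PUT|DELETE|HEAD|OPTIONS|PATCH) \S+ HTTP/\d\.\d\r\n"
)


def classify(payload: bytes, port: int) -> str:
    """Classify a payload by its signature; the port is deliberately ignored."""
    if HTTP_REQUEST_LINE.match(payload):
        return "HTTP"
    return "unknown"


request = b"GET /index.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
print(classify(request, port=81))
```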
Overview of policy detection 383
Data Loss Prevention policy detection technologies

Endpoint events that can be detected


Symantec Data Loss Prevention lets you detect data loss violations at several endpoint
destinations. These destinations include the local drive, CD/DVD drive, removable storage
devices, network file shares, Windows Clipboard, printers and faxes, and application files. You
can also detect protocol events on the endpoint for email (SMTP), Web (HTTP), and file transfer
(FTP) traffic.
For example, the DLP Agent (installed on each endpoint computer) can detect the copying of
a confidential file to a USB device. Or, the DLP Agent can allow the copying of files only to a
specific class of USB device that meets corporate encryption requirements.
See “Endpoint matching conditions” on page 389.

Identities that can be detected


Symantec Data Loss Prevention lets you detect the identity of data users, message senders,
and message recipients using a variety of methods. These methods include described identity
patterns and exact identities matched from a directory server or a corporate database.
For example, you can detect email messages sent by a specific user, or allow email messages
sent to or from a specific group of users as defined in your Microsoft Active Directory server.
See “Groups (identity) matching conditions” on page 390.

Languages that can be detected


Symantec Data Loss Prevention provides broad international support for detecting data loss
in many languages. Supported languages include most Western and Central European
languages, Hebrew, Arabic, Chinese (simplified and traditional), Japanese, Korean, and more.
The detection engine uses Unicode internally. You can build localized policy rules and
exceptions using any detection technology in any supported language.
See “Supported languages for detection” on page 92.
See “Detecting non-English language content” on page 866.

Data Loss Prevention policy detection technologies


Symantec Data Loss Prevention provides several types of detection technologies to help you
author policies to detect data loss. Each type of detection technology provides unique
capabilities. Often you combine technologies in policies to achieve precise detection results.
In addition, Symantec Data Loss Prevention provides you with several ways to extend policy
detection and match any type of data, content, or files you want.
See “About Data Loss Prevention policies” on page 368.

See “Best practices for authoring policies” on page 449.


Table 18-1 lists the detection technologies and customizations provided by Data Loss
Prevention.

Table 18-1 Data Loss Prevention detection technologies

Technology    Description

Exact Data Matching (EDM)
Use EDM to detect personally identifiable information.
See “Introducing Exact Data Matching (EDM)” on page 525.

Exact Match Data Identifiers (EMDI)
Use EMDI to detect structured data, especially personally-identifiable information.
EMDI provides better matching performance and greater memory efficiency than EDM.
See “Introducing Exact Match Data Identifiers (EMDI)” on page 468.

Indexed Document Matching (IDM)
Use IDM to detect exact files and file contents, and derivative content.
See “Introducing Indexed Document Matching (IDM)” on page 612.

Vector Machine Learning (VML)
Use VML to detect similar document content.
See “Introducing Vector Machine Learning (VML)” on page 664.

Form Recognition
Use Form Recognition to detect images of forms that belong to a gallery associated with a
Form Recognition policy.
See “About Form Recognition detection” on page 695.

Directory Group Matching (DGM)
Use DGM to detect exact identities synchronized from a directory server or profiled from a
database.
See “Introducing synchronized Directory Group Matching (DGM)” on page 935.
See “Introducing profiled Directory Group Matching (DGM)” on page 942.

Described Content Matching (DCM)
Use DCM to detect message content and context, including:
■ Data Identifiers to match content using precise patterns and data validators.
See “Introducing data identifiers” on page 717.
■ Keywords to detect content using key words, key phrases, and keyword dictionaries.
See “Introducing keyword matching” on page 838.
■ Regular Expressions to detect characters, patterns, and strings.
See “Introducing regular expression matching” on page 852.
■ File properties to detect files by type, name, size, and custom type.
See “Introducing file property detection” on page 900.
■ User, sender, and recipient patterns to detect described identities.
See “Introducing described identity matching” on page 925.
■ Protocol signatures to detect network traffic.
See “Introducing protocol monitoring for network” on page 912.
■ Destinations, devices, and protocols to detect endpoint events.
See “Introducing endpoint event detection” on page 915.

Information Centric Tagging (ICT)
■ Classifications to detect Information Centric Tagging tags
See “Introducing classification matching” on page 858.

Custom policy detection methods
Data Loss Prevention provides methods for customizing and extending detection, including:
■ Custom Data Identifiers
Implement your own data identifier patterns and system-defined validators.
See “Introducing data identifiers” on page 717.
■ Custom script validators for Data Identifiers
Use the Symantec Data Loss Prevention Scripting Language to validate custom data types.
See “Workflow for creating custom data identifiers” on page 812.
■ Custom file type identification
Use the Symantec Data Loss Prevention Scripting Language to detect custom file types.
See “About custom file type identification” on page 901.
■ Custom endpoint device detection
Detect or allow any endpoint device using regular expressions.
See “About endpoint device detection” on page 917.
■ Custom network protocol detection
Define custom TCP ports to tap.
See “Introducing protocol monitoring for network” on page 912.
■ Custom content extraction
Use a plug-in to identify custom file formats and extract file contents for analysis by the
detection server.
See “Overview of detection file format support” on page 962.

Policy matching conditions


Symantec Data Loss Prevention provides several types of match conditions, each offering
unique detection capabilities. You implement match conditions in policies as rules or exceptions.
Detection rules use conditions to match message content or context. Group rules use conditions
to match identities. You can also use conditions as detection and group policy exceptions.
See “Exception conditions” on page 393.
Table 18-2 lists the various types of policy matching conditions provided by Data Loss
Prevention.

Table 18-2 Policy match condition types

Condition type Description

Content See “Content matching conditions” on page 387.


File property See “File property matching conditions” on page 388.

Protocol See “Protocol matching condition for network” on page 389.

Endpoint See “Endpoint matching conditions” on page 389.

Groups (identity) See “Groups (identity) matching conditions” on page 390.

Content matching conditions


Symantec Data Loss Prevention provides several conditions to match message content. Certain
content conditions require an associated Data Profile and index. For content detection, you
can match on individual message components, including header, body, attachments, and
subject for some conditions.
See “Detection messages and message components” on page 391.
See “Content that can be detected” on page 382.
Table 18-3 lists the content matching conditions that you can use without a Data Profile and
index.

Table 18-3 Content matching conditions

Content rule type    Description

Content Matches Regular Expression
Match described content using regular expressions.
See “Introducing regular expression matching” on page 852.
See “Configuring the Content Matches Regular Expression condition” on page 854.

Content Matches Keyword
Match described content using keywords, key phrases, and keyword dictionaries.
See “Introducing keyword matching” on page 838.
See “Configuring the Content Matches Keyword condition” on page 844.

Content Matches Data Identifier
Match described content using Data Identifier patterns and validators.
See “Introducing data identifiers” on page 717.
See “Configuring the Content Matches data identifier condition” on page 737.

Content Matches Classification
Match described content using Information Centric Tagging tagged files and emails.
See “Introducing classification matching” on page 858.
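
The combination of a pattern and a validator, as described for the Content Matches Data
Identifier condition, can be sketched generically. The example below is not a Symantec data
identifier definition. It assumes a 16-digit card number pattern and uses the Luhn checksum
as the validator; both are illustrative choices, and the function names are hypothetical.

```python
import re

# Pattern step: 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")


def luhn_valid(digits: str) -> bool:
    """Validator step: return True if the digit string passes the Luhn check."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return pattern matches that also pass the validator."""
    matches = []
    for candidate in CARD_PATTERN.findall(text):
        digits = re.sub(r"\D", "", candidate)
        if luhn_valid(digits):
            matches.append(digits)
    return matches


print(find_card_numbers("Test card: 4111 1111 1111 1111. Invalid: 4111 1111 1111 1112."))
```

The pattern alone would match both numbers in the sample text. The validator rejects the
second one, which is the reason for pairing patterns with validators: fewer false positives.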

Table 18-4 lists the content matching conditions that require a Data Profile and index.

See “Data Profiles” on page 375.


See “Two-tier detection for DLP Agents” on page 395.

Table 18-4 Index-based content matching conditions

Content rule type    Description

Content Matches Exact Data From an Exact Data Profile (EDM)
Match exact data profiled from a structured data source such as a database or CSV file.
See “Introducing Exact Data Matching (EDM)” on page 525.
See “Configuring the Content Matches Exact Data policy condition for EDM” on page 551.
Note: This condition requires two-tier detection on the endpoint. See “About two-tier
detection for EDM on the endpoint” on page 533.

Content Matches Document Signature From an Indexed Document Profile (IDM)
Match files and file contents exactly or partially using fingerprinting.
See “Introducing Indexed Document Matching (IDM)” on page 612.
See “Configuring the Content Matches Document Signature policy condition” on page 646.
Note: This condition requires two-tier detection on the endpoint. See “About the Indexed
Document Profile” on page 615.

Detect using Vector Machine Learning profile (VML)
Match file contents with features similar to example content you have trained.
See “Introducing Vector Machine Learning (VML)” on page 664.
See “Configuring the Detect using Vector Machine Learning Profile condition” on page 679.

File property matching conditions


Symantec Data Loss Prevention provides several conditions to match file properties, including
file type, file size, and file name.
See “Files that can be detected” on page 382.

Table 18-5 File property match conditions

Condition type    Description

Message Attachment or File Type Match
Match specific file formats and document attachments.
See “About file type matching” on page 900.
See “Configuring the Message Attachment or File Type Match condition” on page 904.

Message Attachment or File Size Match
Match files or attachments over or under a specified size.
See “About file size matching” on page 902.
See “Configuring the Message Attachment or File Size Match condition” on page 905.

Message Attachment or File Name Match
Match files or attachments that have a specific name or match wildcards.
See “About file name matching” on page 903.
See “Configuring the Message Attachment or File Name Match condition” on page 906.

Message/Email Properties and Attributes
Classify Microsoft Exchange email messages based on specific message attributes (MAPI
attributes).

Custom File Type Signature
Match custom file types based on their binary signature using scripting.
See “About custom file type identification” on page 901.
See “Enabling the Custom File Type Signature condition in the policy console” on page 908.
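
The file name and file size conditions in Table 18-5 can be pictured with a short sketch.
It assumes shell-style wildcards (* and ?) and case-insensitive name comparison, which may
differ from the wildcard syntax that the product accepts. The function names are
hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Optional, Sequence


def name_matches(file_name: str, patterns: Sequence[str]) -> bool:
    """Return True if the file name matches any wildcard pattern."""
    lowered = file_name.lower()
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in patterns)


def size_matches(size_bytes: int,
                 over: Optional[int] = None,
                 under: Optional[int] = None) -> bool:
    """Return True if the size is over and/or under the given limits."""
    if over is not None and not size_bytes > over:
        return False
    if under is not None and not size_bytes < under:
        return False
    return True


print(name_matches("Budget_2019.XLSX", ["*.xlsx", "budget*"]))
print(size_matches(2048, over=1024))
```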

Protocol matching condition for network


Symantec Data Loss Prevention provides the single Protocol Monitoring condition to match
network traffic for policy detection rules and exceptions.
See “Protocols that can be monitored” on page 382.

Table 18-6 Protocol matching condition for network monitoring

Match condition Description

Protocol Monitoring Match incidents on the network transmitted using a specified protocol, including
SMTP, FTP, HTTP/S, IM, and NNTP.

See “Introducing protocol monitoring for network” on page 912.

See “Configuring the Protocol Monitoring condition for network detection” on page 913.

Endpoint matching conditions


Symantec Data Loss Prevention provides several conditions for matching endpoint events.
See “Endpoint events that can be detected” on page 383.

Table 18-7 Endpoint matching conditions

Condition    Description

Protocol or Endpoint Monitoring
Match endpoint messages transmitted using a specified transport protocol or when data is
moved or copied to a particular destination.
See “Introducing endpoint event detection” on page 915.
See “Configuring the Endpoint Monitoring condition” on page 918.

Endpoint Device Class or ID
Match endpoint events occurring on specified hardware devices.
See “Introducing endpoint event detection” on page 915.
See “Configuring the Endpoint Device Class or ID condition” on page 920.

Endpoint Location
Match endpoint events depending on whether the DLP Agent is on or off the corporate network.
See “Introducing endpoint event detection” on page 915.
See “Configuring the Endpoint Location condition” on page 919.

Groups (identity) matching conditions


Symantec Data Loss Prevention provides several conditions for matching the identity of users
and groups, and message senders and recipients.
The sender and recipient pattern rules are reusable across policies. The Directory Group
Matching (DGM) rules let you match on sender and recipients derived from Active Directory
(synchronized DGM) or from an Exact Data Profile (profiled DGM).
See “Identities that can be detected” on page 383.
See “Two-tier detection for DLP Agents” on page 395.

Table 18-8 Available group rules for identity matching

Group rule    Description

Sender/User Matches Pattern
Match message senders and users by email address, user ID, IM screen name, and IP address.
See “Introducing described identity matching” on page 925.
See “Configuring the Sender/User Matches Pattern condition” on page 927.

Recipient Matches Pattern
Match message recipients by email or IP address, or Web domain.
See “Introducing described identity matching” on page 925.
See “Configuring the Recipient Matches Pattern condition” on page 930.

Sender/User based on a Directory Server Group
Match message senders and users from a synchronized directory server.
See “Introducing synchronized Directory Group Matching (DGM)” on page 935.
See “Configuring the Sender/User based on a Directory Server Group condition” on page 939.

Sender/User based on a Directory from: an Exact Data Profile
Match message senders and users from a profiled directory server.
See “Introducing profiled Directory Group Matching (DGM)” on page 942.
See “Configuring the Sender/User based on a Profiled Directory condition” on page 944.
Note: This condition requires two-tier detection on the endpoint. See “About two-tier
detection for profiled DGM” on page 942.

Recipient based on a Directory Server Group
Match message recipients from a synchronized directory server.
See “Introducing synchronized Directory Group Matching (DGM)” on page 935.
See “Configuring the Recipient based on a Directory Server Group condition” on page 940.
Note: This condition requires two-tier detection on the endpoint. See “About two-tier
detection for synchronized DGM” on page 936.

Recipient based on a Directory from: an Exact Data Profile
Match message recipients from a profiled directory server.
See “Configuring Exact Data profiles for DGM” on page 943.
See “Configuring the Recipient based on a Profiled Directory condition” on page 945.
Note: This condition requires two-tier detection on the endpoint. See “About two-tier
detection for profiled DGM” on page 942.

Detection messages and message components


Data Loss Prevention detection servers and DLP Agents receive input data for analysis in the
form of messages. The system determines the message type; for example, an email or a Word
document. Depending on the message type, the system either parses the message content
into components (header, subject, body, attachments), or it leaves the message intact. The
system evaluates the message or message components to see if any policy match conditions
apply. If a condition applies and it supports component matching, the system evaluates the
content against each selected message component. If the condition does not support component
matching, the system evaluates the entire message against the match condition.
See “Selecting components to match on” on page 423.
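
As an illustration of how an email message breaks down into the components named above, the
following sketch uses the Python standard library email package. It is not how the detection
server parses messages; the dictionary keys and function names are assumptions chosen to
mirror the component names in this section, and the sample addresses are placeholders.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_sample() -> bytes:
    """Build a small sample email with one attachment (sample data only)."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "recipient@example.com"
    msg["Subject"] = "Quarterly report"
    msg.set_content("Please see the attached file.")
    msg.add_attachment(b"confidential figures", maintype="application",
                       subtype="octet-stream", filename="report.bin")
    return msg.as_bytes()


def split_components(raw: bytes) -> dict:
    """Parse a raw email into header, subject, body, and attachment names."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    return {
        "header": {"from": str(msg["From"]), "to": str(msg["To"])},
        "subject": str(msg["Subject"]),
        "body": body.get_content() if body is not None else "",
        "attachments": [part.get_filename() for part in msg.iter_attachments()],
    }


components = split_components(build_sample())
print(components["subject"], components["attachments"])
```

A condition that supports component matching would then be evaluated against each selected
entry; a condition that does not would be evaluated against the message as a whole.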

The content-based conditions support cross-component matching. You can configure the DCM
content conditions to match across all message components. The EDM condition matches on
message envelope, body, and attachments. The document conditions match on the message
body and attachments, except File Type and Name which only match on the attachment.
Protocol, endpoint, and identity conditions match on the entire message, as does any condition
evaluated by the DLP Agent. The subject component only applies to SMTP email or NNTP
messages.
Table 18-9 summarizes the component matching supported by each match condition type.

Table 18-9 Message components to match on

Condition type    Envelope    Subject    Body    Attachment(s)

Described content (DCM) conditions for content detection: Keyword, Data Identifier, Regular
Expression
match match match match

Information Centric Tagging (ICT) classifications for content detection: Classification
match match

Exact Data Matching (EDM)
match match match

Indexed Document Matching (IDM)
match match

Vector Machine Learning (VML)
match match

Form Recognition
match

File Size (DCM)
match match

File Type and File Name (DCM)
match

Protocol (DCM)
match (entire message)

Endpoint (DCM)
match (entire message)

Identity (DCM and DGM)
match (entire message)

Any condition evaluated by the DLP Agent
match (entire message)

Exception conditions
Symantec Data Loss Prevention provides policy exceptions to exclude messages and message
components from matching. You can use exception conditions to refine the scope of your
detection and group rules.
See “Use a limited number of exceptions to narrow detection scope” on page 455.

Warning: Do not use multiple compound exceptions in a single policy. Doing so can cause
detection to run out of memory. If you find that the policy needs multiple compound exceptions
to produce matches, you should reconsider the design of the matching conditions.

The system evaluates an inbound message or message component against policy exceptions
before policy rules. If the exception supports cross-component matching (content-based
exceptions), the exception can be configured to match on individual message components.
Otherwise, the exception matches on the entire message.
If an exception is met, the system ejects the entire message or message component containing
the content that triggered the exception. The ejected message or message component is no
longer available for evaluation against policy rules. The system does not discard only the
matched content or data item; it discards the entire message or message component that
contained the excepted item.

Note: Symantec Data Loss Prevention does not support match-level exceptions, only component
or message-level exceptions.

For example, consider a policy that has a detection rule with one condition and an exception
with one condition. The rule matches messages containing Microsoft Word attachments and
generates an incident for each match. The exception excludes from matching messages from
[email protected]. An email from [email protected] that contains a Word attachment is
excepted from matching and does not trigger an incident. The detection exception condition
excluding [email protected] messages takes precedence over the detection rule match
condition that would otherwise match on the message.
See “Policy detection execution” on page 394.
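
The example above can be sketched as follows. The sketch models message-level exceptions
only and is not the product's detection engine. The sender address sales@example.com, the
Message class, and the function names are hypothetical and stand in for the values in the
example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Message:
    sender: str
    attachment_types: List[str] = field(default_factory=list)


Condition = Callable[[Message], bool]

EXCEPTED_SENDER = "sales@example.com"  # hypothetical address for illustration


def has_word_attachment(message: Message) -> bool:
    return "word" in message.attachment_types


def from_excepted_sender(message: Message) -> bool:
    return message.sender == EXCEPTED_SENDER


def generates_incident(message: Message,
                       rules: List[Condition],
                       exceptions: List[Condition]) -> bool:
    # Exceptions are evaluated first. A match ejects the whole message,
    # so it is never evaluated against the rules.
    if any(exception(message) for exception in exceptions):
        return False
    return any(rule(message) for rule in rules)


excepted = Message(sender=EXCEPTED_SENDER, attachment_types=["word"])
other = Message(sender="someone@example.com", attachment_types=["word"])
print(generates_incident(excepted, [has_word_attachment], [from_excepted_sender]))
print(generates_incident(other, [has_word_attachment], [from_excepted_sender]))
```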
You can implement any condition as an exception, except the EDM condition Content Matches
Exact Data From. In addition, Network Prevent for Web does not support synchronized DGM
exceptions. You can implement IDM as an exception, but the exception excludes exact files
from matching, not file contents. To exclude file contents, you "whitelist" them. VML can be used
as an exception if the content is from the same category.
See “Adding an exception to a policy” on page 424.
See “CAN-SPAM Act policy template” on page 1563.

See “White listing file contents to exclude from partial matching” on page 627.

Compound conditions
A valid policy must declare at least one rule that defines at least one match condition. The
condition matches input data to detect data loss. A rule with a single condition is a simple rule.
Optionally, you can declare multiple conditions within a single detection or group rule. A rule
with multiple conditions is a compound condition.
For compound conditions, each condition in the rule must match to trigger a violation. Thus,
for a single policy that declares one rule with two conditions, if one condition matches but the
other does not, detection does not report a match. If both conditions match, detection reports
a match, assuming that the rule is set to count all matches. In programmatic terms, two or
more conditions in the same rule are ANDed together.
As with rules, you can declare multiple conditions within a single exception. In this case, all
conditions in the exception must match for the exception to apply.
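The AND behavior of a compound condition can be sketched as follows. The condition functions and the dictionary-based message are hypothetical; only the "all conditions must match" logic comes from the text above.

```python
def compound_matches(conditions, message):
    """Return True only if every condition in the rule or exception matches."""
    # A rule needs at least one condition; an empty list never matches.
    return bool(conditions) and all(condition(message) for condition in conditions)


# Two hypothetical conditions over a message modeled as a dict.
def has_keyword(message):
    return "confidential" in message["body"].lower()


def has_attachment(message):
    return len(message["attachments"]) > 0


rule = [has_keyword, has_attachment]
both = {"body": "Confidential roadmap", "attachments": ["plan.docx"]}
one = {"body": "Confidential roadmap", "attachments": []}

print(compound_matches(rule, both))  # True: both conditions match
print(compound_matches(rule, one))   # False: only one condition matches
```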
See “Policy detection execution” on page 394.
See “Use compound conditions to improve match accuracy” on page 455.
See “Exception conditions” on page 393.

Policy detection execution


You can include any combination of detection rules, group rules, and exceptions in a single
policy. A detection server evaluates policy exceptions first. If any exception is met, the entire
message or message component matching the exception is ejected and is no longer available
for policy matching.
The detection server evaluates the detection and group rules in the policy on a per-rule basis.
In programmatic terms, where you have a single policy definition, the connection between
conditions in the same rule or exception is AND (compound conditions). The connection
between two or more rules of the same type is OR (for example, two detection rules). But if you
combine rules of different types in a single policy (for example, one detection rule and one group
rule), the connection between the rules is AND. In this configuration both rules must match to
trigger an incident. Exception conditions created across the "Detection" and "Groups" tabs,
by contrast, are connected by an implicit OR.
See “Compound conditions” on page 394.
See “Exception conditions” on page 393.
Table 18-10 summarizes the policy condition execution logic for the detection server for various
policy configurations.

Table 18-10 Policy condition execution logic

Policy configuration              Logic  Description

Compound conditions               AND    If a single rule or exception in a policy contains two or more
                                         match conditions, all conditions must match.

Rules or exceptions of same type  OR     If there are two detection rules in a single policy, or two group
                                         rules in a single policy, or two exceptions of the same type
                                         (detection or group), the rules or exceptions are independent
                                         of each other.

Rules of different type           AND    If one or more detection rules is combined with one or more
                                         group rules in a single policy, the rules are dependent.

Exceptions of different type      OR     If one or more detection exceptions is combined with one or
                                         more group exceptions in a single policy, the exceptions are
                                         independent.
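The execution logic summarized in Table 18-10 can be sketched in Python. This is a simplified model under stated assumptions: all function and parameter names are hypothetical, each rule or exception is modeled as a list of condition functions, and a matching exception excludes the whole message (the text notes that only the matching message component may be excluded).

```python
def block_matches(conditions, msg):
    """Conditions inside one rule or exception are ANDed."""
    return bool(conditions) and all(cond(msg) for cond in conditions)


def any_block_matches(blocks, msg):
    """Rules (or exceptions) of the same type are ORed."""
    return any(block_matches(block, msg) for block in blocks)


def policy_triggers(msg, detection_rules=(), group_rules=(),
                    detection_exceptions=(), group_exceptions=()):
    """Return True if the policy would report a match for msg."""
    # Exceptions are evaluated first; detection and group exceptions are ORed.
    if (any_block_matches(detection_exceptions, msg)
            or any_block_matches(group_exceptions, msg)):
        return False
    # Only rule types that the policy actually declares take part.
    declared = [rules for rules in (detection_rules, group_rules) if rules]
    if not declared:
        return False
    # Rules of different types are ANDed; rules of the same type are ORed.
    return all(any_block_matches(rules, msg) for rules in declared)


# Hypothetical conditions over a message modeled as a dict.
has_ssn = lambda m: "ssn" in m["tags"]
from_finance = lambda m: m["sender_group"] == "finance"
is_test = lambda m: m["is_test"]

msg = {"tags": ["ssn"], "sender_group": "finance", "is_test": False}
print(policy_triggers(msg, detection_rules=[[has_ssn]],
                      group_rules=[[from_finance]]))  # True
```

In this sketch a policy that declares both a detection rule and a group rule triggers only when one rule of each type matches and no exception applies.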

Two-tier detection for DLP Agents


Symantec Data Loss Prevention uses a two-tier detection architecture to analyze activity on
endpoints for some index-based match conditions.
Two-tier detection requires communication and data transfer between the DLP Agent and the
Endpoint Server to detect incidents. If a match condition requires two-tier detection, the condition
is not evaluated locally on the endpoint by the DLP Agent. Instead, the DLP Agent sends the
data to the Endpoint Server for policy evaluation.
See “Guidelines for authoring Endpoint policies” on page 2275.
The effect of two-tier detection is that policy evaluation is delayed for the time it takes the data
to be sent to and evaluated by the Endpoint Server. If the DLP Agent is not connected to the
network or cannot communicate with the Endpoint Server, the condition requiring two-tier
detection is not evaluated until the DLP Agent connects. This delay can impact performance
of the DLP Agent if the message is a large file or attachment.
See “Troubleshooting policies” on page 445.
Two-tier detection has implications for the kinds of policies you author for endpoints. You can
reduce the potential bottleneck of two-tier detection by knowing which detection conditions
require it and by authoring your endpoint policies to eliminate or reduce the need for it.
See “Author policies to limit the potential effect of two-tier detection” on page 456.
Table 18-11 lists the detection conditions that require two-tier detection on the endpoint.

Note: You cannot combine an Endpoint Prevent: Notify or Block response rule with two-tier
match conditions, including Exact Data Matching (EDM), Directory Group Matching (DGM),
and Indexed Document Matching (IDM), when two-tier detection is enabled. If you do, the
system displays a warning for both the detection condition and the response rule.

Table 18-11 Policy matching conditions requiring two-tier detection

Detection technology                Match condition                     Description

Exact Data Matching (EDM)           Content Matches Exact Data from an  See “Introducing Exact Data Matching (EDM)”
                                    Exact Data Profile                  on page 525.
                                                                        See “About two-tier detection for EDM on the
                                                                        endpoint” on page 533.

Profiled Directory Group Matching   Sender/User based on a Directory    See “Introducing profiled Directory Group
(DGM)                               from an Exact Data Profile          Matching (DGM)” on page 942.
                                                                        See “About two-tier detection for profiled
                                    Recipient based on a Directory from DGM” on page 942.
                                    an Exact Data Profile

Synchronized Directory Group        Recipient based on a Directory      See “Introducing synchronized Directory Group
Matching (DGM)                      Server Group                        Matching (DGM)” on page 935.
                                                                        See “About two-tier detection for synchronized
                                                                        DGM” on page 936.

Indexed Document Matching (IDM)     Content Matches Document Signature  See “Introducing Indexed Document Matching
                                    from an Indexed Document Profile    (IDM)” on page 612.
                                                                        See “Two-tier IDM detection” on page 615.
                                                                        Note: Two-tier detection for IDM only applies
                                                                        if it is enabled on the Endpoint Server
                                                                        (two_tier_idm = on). If Endpoint IDM is
                                                                        enabled (two_tier_idm = off), two-tier
                                                                        detection is not used.
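The following Python sketch models where a match condition is evaluated, based on Table 18-11 and the description of two-tier detection above. The technology labels, function names, and return values are hypothetical; only the `two_tier_idm` setting name comes from this guide, and how the product reads that setting is not shown here.

```python
# Technologies from Table 18-11 that always use two-tier detection on the endpoint.
ALWAYS_TWO_TIER = {"EDM", "PROFILED_DGM", "SYNCHRONIZED_DGM"}


def requires_two_tier(technology, two_tier_idm="off"):
    """Return True if a condition of this technology needs the Endpoint Server."""
    if technology in ALWAYS_TWO_TIER:
        return True
    if technology == "IDM":
        # IDM uses two-tier detection only when two_tier_idm is on.
        return two_tier_idm == "on"
    return False


def evaluation_site(technology, agent_connected, two_tier_idm="off"):
    """Say where (or whether) a condition is evaluated right now."""
    if not requires_two_tier(technology, two_tier_idm):
        return "agent"
    # Two-tier conditions wait until the agent can reach the Endpoint Server.
    return "endpoint_server" if agent_connected else "deferred"
```

For example, `evaluation_site("EDM", agent_connected=False)` returns `"deferred"` in this model, which mirrors the delay described for disconnected DLP Agents.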
Chapter 19
Creating policies from templates
This chapter includes the following topics:

■ Creating a policy from a template

■ US Regulatory Enforcement policy templates

■ General Data Protection Regulation (GDPR) policy templates

■ International Regulatory Enforcement policy templates

■ Customer and Employee Data Protection policy templates

■ Confidential or Classified Data Protection policy templates

■ Network Security Enforcement policy templates

■ Acceptable Use Enforcement policy templates

■ Columbia Personal Data Regulatory Enforcement policy template

■ Choosing an Exact Data Profile

■ Choosing an Indexed Document Profile

Creating a policy from a template


You can create a policy from a system-provided template or from a template you import to the
Enforce Server.
See “Policy templates” on page 371.
See “Policy template import and export” on page 377.

Table 19-1 Create a policy from a template

Action                                Description

Add a policy from a template.         See “Adding a new policy or policy template” on page 412.

Choose the template you want to       At the Manage > Policies > Policy List > New Policy - Template List
use.                                  screen the system lists all policy templates.
                                      System-provided template categories:
                                      ■ See “US Regulatory Enforcement policy templates” on page 400.
                                      ■ See “General Data Protection Regulation (GDPR) policy templates”
                                        on page 402.
                                      ■ See “International Regulatory Enforcement policy templates”
                                        on page 403.
                                      ■ See “Customer and Employee Data Protection policy templates”
                                        on page 404.
                                      ■ See “Confidential or Classified Data Protection policy templates”
                                        on page 405.
                                      ■ See “Network Security Enforcement policy templates” on page 406.
                                      ■ See “Acceptable Use Enforcement policy templates” on page 407.
                                      ■ See “Columbia Personal Data Regulatory Enforcement policy template”
                                        on page 408.
                                      Imported Templates appear individually after import:
                                      ■ See “Importing policy templates” on page 441.

Click Next to configure the policy.   For example, select the Webmail policy template and click Next.
                                      See “Configuring policies” on page 413.

Choose a Data Profile (if             If the template relies on one or more Data Profiles, the system prompts
prompted).                            you to select each:
                                      ■ Exact Data Profile
                                        See “Choosing an Exact Data Profile” on page 409.
                                      ■ Indexed Document Profile
                                        See “Choosing an Indexed Document Profile” on page 411.
                                      If you do not have a Data Profile, you can either:
                                      ■ Cancel the policy definition process, define the profile, and resume
                                        creating the policy from the template.
                                      ■ Click Next to configure the policy.
                                        On creation of the policy, the system drops any rules or exceptions
                                        that rely on the Data Profile.
                                      Note: You should use a profile if a template calls for it.

Edit the policy name or               If you intend to modify a system-defined template, you may want to
description (optional).               change the name so you can distinguish it from the original.
                                      See “Configuring policies” on page 413.
                                      Note: If you want to export the policy as a template, the policy name
                                      must be less than 60 characters. If it is more, the template does not
                                      appear in the Imported Templates section of the Template List screen.
                                      Note: The Policy Label field is reserved for the Veritas Data Insight
                                      Self-Service Portal.

Select a policy group (if             If you have defined a policy group, select it from the Policy Group list.
necessary).                           See “Creating and modifying policy groups” on page 436.
                                      If you have not defined a policy group, the system deploys the policy to
                                      the Default Policy Group.

Edit the policy rules or exceptions   The Configure Policy screen displays the rules and exceptions (if any)
(if necessary).                       provided by the policy.
                                      You can modify, add, and remove policy rules and exceptions to meet your
                                      requirements.
                                      See “Configuring policy rules” on page 417.
                                      See “Configuring policy exceptions” on page 426.

Save the policy and export it         Click Save to save the policy.
(optional).                           You can export policy detection as a template for sharing or archiving.
                                      See “Exporting policy detection as a template” on page 442.
                                      For example, if you changed the configuration of a system-defined policy
                                      template, you may want to export it for sharing across environments.

Test and tune the policy              Test and tune the policy using data the policy should and should not
(recommended).                        detect.
                                      Review the incidents that the policy generates. Refine the policy rules
                                      and exceptions as necessary to reduce false positives and false negatives.

Add response rules (optional).        Add response rules to the policy to report and remediate violations.
                                      See “Implementing response rules” on page 1758.
                                      Note: Response rules are not included in policy templates.

US Regulatory Enforcement policy templates


Symantec Data Loss Prevention provides several policy templates supporting US Regulatory
Enforcement guidelines.
See “Creating a policy from a template” on page 397.

Table 19-2 US Regulatory Enforcement policy templates

Policy template                                   Description

CAN-SPAM Act                                      Establishes requirements for sending commercial email.
                                                  See “CAN-SPAM Act policy template” on page 1563.

Defense Message System (DMS) GENSER               Detects information classified as confidential.
Classification                                    See “Defense Message System (DMS) GENSER Classification policy
                                                  template” on page 1572.

Export Administration Regulations (EAR)           Enforces the U.S. Department of Commerce Export Administration
                                                  Regulations (EAR).
                                                  See “Export Administration Regulations (EAR) policy template”
                                                  on page 1576.

FACTA 2003 (Red Flag Rules)                       Enforces sections 114 and 315 (or Red Flag Rules) of the Fair
                                                  and Accurate Credit Transactions Act (FACTA) of 2003.
                                                  See “FACTA 2003 (Red Flag Rules) policy template” on page 1577.

Gramm-Leach-Bliley                                This policy limits sharing of consumer information by financial
                                                  institutions.
                                                  See “Gramm-Leach-Bliley policy template” on page 1688.

HIPAA and HITECH (including PHI)                  This policy enforces the US Health Insurance Portability and
                                                  Accountability Act (HIPAA).
                                                  See “HIPAA and HITECH (including PHI) policy template”
                                                  on page 1690.

International Traffic in Arms Regulations         This policy enforces the US Department of State ITAR provisions.
(ITAR)                                            See “International Traffic in Arms Regulations (ITAR) policy
                                                  template” on page 1696.

Medicare and Medicaid (including PHI)             This policy detects protected health information (PHI) associated
                                                  with the United States Medicare and Medicaid programs.
                                                  See “Medicare and Medicaid (including PHI)” on page 1698.

NASD Rule 2711 and NYSE Rules 351 and 472         This policy protects the name(s) of any companies that are involved
                                                  in an upcoming stock offering.
                                                  See “NASD Rule 2711 and NYSE Rules 351 and 472 policy
                                                  template” on page 1700.

NASD Rule 3010 and NYSE Rule 342                  This policy monitors broker-dealer communications.
                                                  See “NASD Rule 3010 and NYSE Rule 342 policy template”
                                                  on page 1702.

NERC Security Guidelines for Electric Utilities   This policy detects the information that is outlined in the North
                                                  American Electric Reliability Council (NERC) security guidelines
                                                  for the electricity sector.
                                                  See “NERC Security Guidelines for Electric Utilities policy template”
                                                  on page 1703.

Office of Foreign Assets Control (OFAC)           This template detects communications involving targeted OFAC
                                                  groups.
                                                  See “Office of Foreign Assets Control (OFAC) policy template”
                                                  on page 1706.

OMB Memo 06-16 and FIPS 199 Regulations           This template detects information that is classified as confidential.
                                                  See “OMB Memo 06-16 and FIPS 199 Regulations policy template”
                                                  on page 1707.

Payment Card Industry Data Security Standard      This template detects credit card number data.
                                                  See “Payment Card Industry (PCI) Data Security Standard policy
                                                  template” on page 1709.

Sarbanes-Oxley                                    This template detects sensitive financial data.
                                                  See “Sarbanes-Oxley policy template” on page 1716.

SEC Fair Disclosure Regulation                    This template detects data disclosure of material financial
                                                  information.
                                                  See “SEC Fair Disclosure Regulation policy template” on page 1719.

State Data Privacy                                This template detects breaches of state-mandated confidentiality.
                                                  See “State Data Privacy policy template” on page 1723.

US Intelligence Control Markings (CAPCO) and      This template detects authorized terms to identify classified
DCID 1/7                                          information in the US Federal Intelligence community.
                                                  See “US Intelligence Control Markings (CAPCO) and DCID 1/7
                                                  policy template” on page 1729.

General Data Protection Regulation (GDPR) policy templates
The General Data Protection Regulation (GDPR) is a regulation by which the European
Commission intends to strengthen and unify data protection for individuals within the EU. It
also addresses export of personal data outside the EU. The primary objectives of the GDPR
are to give citizens back the control of their personal data and to simplify the regulatory
environment for international business by unifying the regulation within the EU. The GDPR
replaces the EU Data Protection Directives as of 25 May 2018.
Symantec Data Loss Prevention provides several policy templates for General Data Protection
Regulation (GDPR) compliance.
See “Creating a policy from a template” on page 397.

Table 19-3
Policy template                                   Description

General Data Protection Regulations               This policy protects personal identifiable information related
(Banking and Finance)                             to banking and finance.
                                                  See “General Data Protection Regulation (Banking and
                                                  Finance)” on page 1583.

General Data Protection Regulation                This policy protects personal identifiable information related
(Digital Identity)                                to digital identity.
                                                  See “General Data Protection Regulation (Digital Identity)”
                                                  on page 1617.

General Data Protection Regulation                This policy protects personal identifiable information related
(Government Identification)                       to government identification.
                                                  See “General Data Protection Regulation (Government
                                                  Identification)” on page 1618.

General Data Protection Regulation                This policy protects personal identifiable information related
(Healthcare and Insurance)                        to healthcare and insurance.
                                                  See “General Data Protection Regulation (Healthcare and
                                                  Insurance)” on page 1656.

General Data Protection Regulation                This policy protects personal identifiable information related
(Personal Profile)                                to personal profile data.
                                                  See “General Data Protection Regulation (Personal
                                                  Profile)” on page 1672.

General Data Protection Regulation (Travel)       This policy protects personal identifiable information related
                                                  to travel.
                                                  See “General Data Protection Regulation (Travel)”
                                                  on page 1675.

International Regulatory Enforcement policy templates
Symantec Data Loss Prevention provides several policy templates for International Regulatory
Enforcement.
See “Creating a policy from a template” on page 397.

Table 19-4 International Regulatory Enforcement policy templates

Policy template                                   Description

Caldicott Report                                  This policy protects UK patient information.
                                                  See “Caldicott Report policy template” on page 1561.

Data Protection Act 1998                          This policy protects personal identifiable information.
                                                  See “Data Protection Act 1998 policy template” on page 1568.

EU Data Protection Directives                     This policy detects personal data specific to the EU directives.
                                                  See “Data Protection Directives (EU) policy template” on page 1570.
                                                  Note: The EU Data Protection Directives are replaced by the General
                                                  Data Protection Regulation (GDPR) on 25 May 2018. See “General Data
                                                  Protection Regulation (GDPR) policy templates” on page 402.

Human Rights Act 1998                             This policy enforces Article 8 of the act for UK citizens.
                                                  See “Human Rights Act 1998 policy template” on page 1694.

PIPEDA                                            This policy detects Canadian citizen customer data.
                                                  See “PIPEDA policy template” on page 1711.

Customer and Employee Data Protection policy templates
Symantec Data Loss Prevention provides several policy templates for Customer and Employee
Data Protection.
See “Creating a policy from a template” on page 397.

Table 19-5 Customer and Employee Data Protection policy templates

Policy template                                   Description

Canadian Social Insurance Numbers                 This policy detects patterns indicating Canadian social insurance
                                                  numbers.
                                                  See “Canadian Social Insurance Numbers policy template”
                                                  on page 1562.

Credit Card Numbers                               This policy detects patterns indicating credit card numbers.
                                                  See “Credit Card Numbers policy template” on page 1566.

Customer Data Protection                          This policy detects customer data.
                                                  See “Customer Data Protection policy template” on page 1567.

Employee Data Protection                          This policy detects employee data.
                                                  See “Employee Data Protection policy template” on page 1574.

Individual Taxpayer Identification Numbers        This policy detects IRS-issued tax processing numbers.
(ITIN)                                            See “Individual Taxpayer Identification Numbers (ITIN) policy
                                                  template” on page 1695.

SWIFT Codes                                       This policy detects codes banks use to transfer money across
                                                  international borders.
                                                  See “SWIFT Codes policy template” on page 1726.

UK Drivers License Numbers                        This policy detects UK Drivers License Numbers.
                                                  See “UK Drivers License Numbers policy template” on page 1727.

UK Electoral Roll Numbers                         This policy detects UK Electoral Roll Numbers.
                                                  See “UK Electoral Roll Numbers policy template” on page 1727.

UK National Insurance Numbers                     This policy detects UK National Insurance Numbers.
                                                  See “UK National Insurance Numbers policy template” on page 1728.

UK National Health Service Number                 This policy detects personal identification numbers issued by the
                                                  NHS.
                                                  See “UK National Health Service (NHS) Number policy template”
                                                  on page 1728.

UK Passport Numbers                               This policy detects valid UK passports.
                                                  See “UK Passport Numbers policy template” on page 1728.

UK Tax ID Numbers                                 This policy detects UK Tax ID Numbers.
                                                  See “UK Tax ID Numbers policy template” on page 1729.

US Social Security Numbers                        This policy detects patterns indicating social security numbers.
                                                  See “US Social Security Numbers policy template” on page 1730.

Confidential or Classified Data Protection policy templates
Symantec Data Loss Prevention provides several policy templates for Confidential or Classified
Data Protection.
See “Creating a policy from a template” on page 397.

Table 19-6 Confidential or Classified Data Protection policy templates

Policy template                                   Description

Confidential Documents                            This policy detects company-confidential documents.
                                                  See “Confidential Documents policy template” on page 1565.

Design Documents                                  This policy detects various types of design documents.
                                                  See “Design Documents policy template” on page 1573.

Encrypted Data                                    This policy detects the use of encryption by a variety of methods.
                                                  See “Encrypted Data policy template” on page 1575.

Financial Information                             This policy detects financial data and information.
                                                  See “Financial Information policy template” on page 1581.

Merger and Acquisition Agreements                 This policy detects information and communications about upcoming
                                                  merger and acquisition activity.
                                                  See “Merger and Acquisition Agreements policy template”
                                                  on page 1699.

Price Information                                 This policy detects specific SKU and pricing information.
                                                  See “Price Information policy template” on page 1713.

Project Data                                      This policy detects discussions of sensitive projects.
                                                  See “Project Data policy template” on page 1713.

Proprietary Media Files                           This policy detects various types of video and audio files.
                                                  See “Proprietary Media Files policy template” on page 1713.

Publishing Documents                              This policy detects various types of publishing documents.
                                                  See “Publishing Documents policy template” on page 1714.

Resumes                                           This policy detects active job searches.
                                                  See “Resumes policy template” on page 1716.

Source Code                                       This policy detects various types of source code.
                                                  See “Source Code policy template” on page 1722.

Symantec DLP Awareness and Avoidance              This policy detects any communications that refer to Symantec DLP
                                                  or other data loss prevention systems and possible avoidance of
                                                  detection.
                                                  See “Symantec DLP Awareness and Avoidance policy template”
                                                  on page 1726.

Network Security Enforcement policy templates


Symantec Data Loss Prevention provides several policy templates for Network Security
Enforcement.
See “Creating a policy from a template” on page 397.

Table 19-7 Network Security Enforcement policy templates

Policy template                                   Description

Common Spyware Upload Sites                       This policy detects access to common spyware upload Web sites.
                                                  See “Common Spyware Upload Sites policy template” on page 1564.

Network Diagrams                                  This policy detects computer network diagrams.
                                                  See “Network Diagrams policy template” on page 1704.

Network Security                                  This policy detects evidence of hacking tools and attack planning.
                                                  See “Network Security policy template” on page 1705.

Password Files                                    This policy detects password file formats.
                                                  See “Password Files policy template” on page 1709.

Acceptable Use Enforcement policy templates


Symantec Data Loss Prevention provides several policy templates for enforcing acceptable
use of information.
See “Creating a policy from a template” on page 397.

Table 19-8 Acceptable Use Enforcement policy templates

Policy template                                   Description

Competitor Communications                         This policy detects forbidden communications with competitors.
                                                  See “Competitor Communications policy template” on page 1565.

Forbidden Websites                                This policy detects access to specified Web sites.
                                                  See “Forbidden Websites policy template” on page 1581.

Gambling                                          This policy detects any reference to gambling.
                                                  See “Gambling policy template” on page 1582.

Illegal Drugs                                     This policy detects conversations about illegal drugs and controlled
                                                  substances.
                                                  See “Illegal Drugs policy template” on page 1695.

Media Files                                       This policy detects various types of video and audio files.
                                                  See “Media Files policy template” on page 1697.

Offensive Language                                This policy detects the use of offensive language.
                                                  See “Offensive Language policy template” on page 1705.

Racist Language                                   This policy detects the use of racist language.
                                                  See “Racist Language policy template” on page 1715.

Restricted Files                                  This policy detects various file types that are generally
                                                  inappropriate to send out of the company.
                                                  See “Restricted Files policy template” on page 1715.

Restricted Recipients                             This policy detects communications with specified recipients.
                                                  See “Restricted Recipients policy template” on page 1715.

Sexually Explicit Language                        This policy detects sexually explicit content.
                                                  See “Sexually Explicit Language policy template” on page 1721.

Violence and Weapons                              This policy detects violent language and discussions about weapons.
                                                  See “Violence and Weapons policy template” on page 1731.

Webmail                                           This policy detects the use of a variety of Webmail services.
                                                  See “Webmail policy template” on page 1731.

Yahoo Message Board Activity                      This policy detects Yahoo message board activity.
                                                  See “Yahoo Message Board Activity policy template” on page 1732.

Yahoo and MSN Messengers on Port 80               This policy detects Yahoo IM and MSN Messenger activity.
                                                  See “Yahoo and MSN Messengers on Port 80 policy template”
                                                  on page 1733.

Columbia Personal Data Regulatory Enforcement policy template
Symantec Data Loss Prevention provides a policy template for the enforcement of Colombian
personal data regulations.
See “Creating a policy from a template” on page 397.

Table 19-9 Columbia Personal Data Regulatory Enforcement policy template

Policy template                                   Description

Columbian Personal Data Protection Law 1581       This policy detects violations of the Colombian Personal
                                                  Data Protection Law 1581.
                                                  See “Colombian Personal Data Protection Law 1581 policy
                                                  template” on page 1564.

Choosing an Exact Data Profile


If the policy template you select implements Exact Data Matching (EDM), the system prompts
you to choose an Exact Data Profile. Table 19-10 lists the policy templates that are based on
Exact Data Profiles.
If you do not have an Exact Data Profile, you can cancel policy creation and define a profile.
Or, you can choose not to use an Exact Data Profile. In this case the system disables the
associated EDM detection rules in the policy template. You can use any DCM rules or
exceptions the policy template provides.
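The effect of skipping a Data Profile can be sketched as a simple filter. This is an illustrative model only: the rule dictionaries, the `REQUIRED_PROFILE` mapping, and the profile labels are hypothetical, and the sketch treats a disabled rule as simply omitted from the resulting policy.

```python
# Which kind of Data Profile each index-based technology depends on (assumed labels).
REQUIRED_PROFILE = {"EDM": "exact_data", "IDM": "indexed_document"}


def usable_rules(template_rules, selected_profiles):
    """Keep the rules whose required profile (if any) was selected."""
    kept = []
    for rule in template_rules:
        needed = REQUIRED_PROFILE.get(rule["technology"])
        # Rules that need no profile (for example, DCM rules) are always kept.
        if needed is None or needed in selected_profiles:
            kept.append(rule)
    return kept


template = [
    {"name": "Customer records (hypothetical)", "technology": "EDM"},
    {"name": "Card number pattern (hypothetical)", "technology": "DCM"},
]

# No Exact Data Profile chosen: only the DCM rule remains usable.
print([r["name"] for r in usable_rules(template, set())])
```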
See “Introducing Exact Data Matching (EDM)” on page 525.
See “About the Exact Data Profile and index” on page 528.
To choose an Exact Data Profile
1 Select an Exact Data Profile from the list of available profiles.
2 Click Next to continue with creating the policy from the template.
Click Previous to return to the list of policy templates.
See “Creating a policy from a template” on page 397.

Note: When the system prompts you to select an Exact Data Profile, the display lists the data
columns to include in the profile to provide the highest level of accuracy. If data fields in your
Exact Data Profile are not represented in the selected policy template, the system displays
those fields for content matching when you define the detection rule.

Table 19-10 Policy templates that implement Exact Data Matching (EDM)

Policy template                                   Description

Caldicott Report                                  See “Caldicott Report policy template” on page 1561.

Customer Data Protection                          See “Customer Data Protection policy template” on page 1567.

Data Protection Act 1998                          See “Data Protection Act 1998 policy template” on page 1568.

Employee Data Protection                          See “Employee Data Protection policy template” on page 1574.

EU Data Protection Directives                     See “Data Protection Directives (EU) policy template” on page 1570.

Export Administration Regulations (EAR)           See “Export Administration Regulations (EAR) policy template”
                                                  on page 1576.

FACTA 2003 (Red Flag Rules)                       See “FACTA 2003 (Red Flag Rules) policy template” on page 1577.

General Data Protection Regulations               See “General Data Protection Regulation (Banking and Finance)”
(Banking and Finance)                             on page 1583.

General Data Protection Regulations               See “General Data Protection Regulation (Digital Identity)”
(Digital Identity)                                on page 1617.

General Data Protection Regulations               See “General Data Protection Regulation (Government Identification)”
(Government Identification)                       on page 1618.

General Data Protection Regulations               See “General Data Protection Regulation (Healthcare and Insurance)”
(Healthcare and Insurance)                        on page 1656.

General Data Protection Regulations               See “General Data Protection Regulation (Personal Profile)”
(Personal Profile)                                on page 1672.

General Data Protection Regulations (Travel)      See “General Data Protection Regulation (Travel)” on page 1675.

Gramm-Leach-Bliley                                See “Gramm-Leach-Bliley policy template” on page 1688.

HIPAA and HITECH (including PHI)                  See “HIPAA and HITECH (including PHI) policy template”
                                                  on page 1690.

Human Rights Act 1998                             See “Human Rights Act 1998 policy template” on page 1694.

International Traffic in Arms Regulations         See “International Traffic in Arms Regulations (ITAR) policy
(ITAR)                                            template” on page 1696.

Payment Card Industry Data Security Standard      See “Payment Card Industry (PCI) Data Security Standard policy
                                                  template” on page 1709.

PIPEDA                                            See “PIPEDA policy template” on page 1711.

Price Information                                 See “Price Information policy template” on page 1713.

Resumes                                           See “Resumes policy template” on page 1716.

State Data Privacy                                See “State Data Privacy policy template” on page 1723.

Choosing an Indexed Document Profile


If the policy template you chose uses Indexed Document Matching (IDM) detection, the system
prompts you to select the Document Profile.
See “Introducing Indexed Document Matching (IDM)” on page 612.
To use a Document Profile
1 Select the Document Profile from the list of available profiles.
2 Click Next to create the policy from the template.
See “Creating a policy from a template” on page 397.
If you do not have a Document Profile, you can cancel policy creation and define the Document
Profile. Or, you can choose to not use a Document Profile. In this case the system disables
any IDM rules or exceptions for the policy instance. If the policy template contains DCM rules
or exceptions, you may use them.
See “About the Indexed Document Profile” on page 615.

Table 19-11 Policy templates that implement Indexed Document Matching (IDM)

Policy template                                   Description

CAN-SPAM Act (IDM exception)                      See “CAN-SPAM Act policy template” on page 1563.

NASD Rule 2711 and NYSE Rules 351 and 472         See “NASD Rule 2711 and NYSE Rules 351 and 472 policy template”
                                                  on page 1700.

NERC Security Guidelines for Electric Utilities   See “NERC Security Guidelines for Electric Utilities policy template”
                                                  on page 1703.

Sarbanes-Oxley                                    See “Sarbanes-Oxley policy template” on page 1716.

SEC Fair Disclosure Regulation                    See “SEC Fair Disclosure Regulation policy template” on page 1719.

Confidential Documents                            See “Confidential Documents policy template” on page 1565.

Design Documents                                  See “Design Documents policy template” on page 1573.

Financial Information                             See “Financial Information policy template” on page 1581.

Project Data                                      See “Project Data policy template” on page 1713.

Proprietary Media Files                           See “Proprietary Media Files policy template” on page 1713.

Publishing Documents                              See “Publishing Documents policy template” on page 1714.

Source Code                                       See “Source Code policy template” on page 1722.

Network Diagrams                                  See “Network Diagrams policy template” on page 1704.


Chapter 20
Configuring policies
This chapter includes the following topics:

■ Adding a new policy or policy template

■ Configuring policies

■ Adding a rule to a policy

■ Configuring policy rules

■ Defining rule severity

■ Configuring match counting

■ Selecting components to match on

■ Adding an exception to a policy

■ Configuring policy exceptions

■ Configuring compound match conditions

■ Input character limits for policy configuration

Adding a new policy or policy template


As a policy author you can define a new policy from scratch or from a template.
See “Workflow for implementing policies” on page 378.

To add a new policy or a policy template


1 Click New at the Manage > Policies > Policy List screen.
See “Manage and add policies” on page 432.
2 Choose the type of policy you want to add at the New Policy screen.
Select Add a blank policy to add a new empty policy.
See “Policy components” on page 370.
Select Add a policy from a template to add a policy from a template.
See “Policy templates” on page 371.
3 Click Next to configure the policy or the policy template.
See “Configuring policies” on page 413.
See “Creating a policy from a template” on page 397.
Click Cancel to return to the Policy List screen without adding a policy.

Configuring policies
The Manage > Policies > Policy List > Configure Policy screen is the home page for
configuring policies.
Table 20-1 describes the workflow for configuring policies.

Table 20-1 Configuring policies

Action Description

Define a new policy, or edit an existing policy. Add a new blank policy.

See “Adding a new policy or policy template” on page 412.

Create a policy from a template.

See “Creating a policy from a template” on page 397.

Select an existing policy at the Manage > Policies > Policy List screen to edit it.

See “Manage and add policies” on page 432.

Enter a policy Name and Description. The policy name must be unique in the policy group you deploy
the policy to.

See “Input character limits for policy configuration” on page 431.


Note: The Policy Label field is reserved for the Veritas Data
Insight Self-Service Portal.

Select the Policy Group from the list where the The Default Policy Group is selected if there is no policy group
policy is to be deployed. configured.

See “Creating and modifying policy groups” on page 436.

Set the Status for the policy. You can enable (default setting) or disable a policy. A disabled
policy is deployed but is not loaded into memory to detect
incidents.

See “Manage and add policies” on page 432.

Add a rule to the policy, or edit an existing rule. Click Add Rule to add a rule.

See “Adding a rule to a policy” on page 415.

Select an existing rule to edit it.

Configure the rule with one or more conditions. For a valid policy, you must configure at least one rule that
declares at least one condition. Compound conditions and
exceptions are optional.

See “Configuring policy rules” on page 417.

Optionally, add one or more policy exceptions, or Click Add Exception to add it.
edit an existing exception.
See “Adding an exception to a policy” on page 424.

Select an existing exception to edit it.

Configure any exception(s). See “Configuring policy exceptions” on page 426.

Save the policy configuration. Click Save to save the policy configuration to the Enforce Server
database.

See “Policy components” on page 370.

Export the policy as a template. Optionally, you can export the policy rules and exceptions as a
template.

See “Exporting policy detection as a template” on page 442.

Add one or more response rules to the policy. You configure response rules independent of policies.

See “Configuring response rules” on page 1763.

See “Adding an automated response rule to a policy” on page 442.
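
Two of the constraints in Table 20-1 (a policy name that is unique within its policy group, and at least one rule that declares at least one condition) can be expressed as a small check. The following is a minimal sketch for illustration only; the `Policy` and `Rule` classes and the `validation_errors` helper are hypothetical and are not part of the Enforce Server.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    name: str
    conditions: list[str] = field(default_factory=list)


@dataclass
class Policy:
    name: str
    group: str = "Default Policy Group"
    enabled: bool = True  # a policy is enabled by default
    rules: list[Rule] = field(default_factory=list)


def validation_errors(policy: Policy, existing: list[Policy]) -> list[str]:
    """Return the reasons a policy would not be valid (hypothetical helper)."""
    errors = []
    # The policy name must be unique in the policy group it is deployed to.
    if any(p.name == policy.name and p.group == policy.group for p in existing):
        errors.append("name is not unique in the policy group")
    # A valid policy needs at least one rule that declares at least one condition.
    if not any(rule.conditions for rule in policy.rules):
        errors.append("no rule declares a condition")
    return errors
```

Compound conditions and exceptions are optional, so the sketch does not check for them.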

Adding a rule to a policy


At the Manage > Policies > Policy List > Configure Policy – Add Rule screen you add one
or more rules to a policy.
You can add two types of rules to a policy: detection and group. If two or more rules in a policy
are the same type, the system connects them by OR. If two or more rules in the same policy
are different types, the system connects them by AND.
See “Policy detection execution” on page 394.
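
The way rule types combine can be sketched in a few lines. This is an illustration of the OR/AND behavior described above, not the detection engine's implementation; the function name and the handling of a policy with no rules of one type are assumptions.

```python
def policy_matches(detection_results: list[bool], group_results: list[bool]) -> bool:
    """Combine per-rule results: same type joined by OR, different types by AND.

    An empty list means the policy has no rules of that type, so that type
    places no constraint on the outcome (an assumption of this sketch).
    """
    detection_ok = any(detection_results) if detection_results else True
    group_ok = any(group_results) if group_results else True
    has_rules = bool(detection_results or group_results)
    return has_rules and detection_ok and group_ok
```

For example, a policy with two detection rules and one group rule matches only if at least one detection rule matches and the group rule matches.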

Note: Exceptions are added separately from rules. See “Adding an exception to a policy”
on page 424.

To add one or more rules to a policy


1 Choose the type of rule (detection or group) to add to the policy.
To add a detection rule, select the Detection tab and click Add Rule.
To add a group (identity) rule, select the Groups tab and click Add Rule.
See “Policy matching conditions” on page 386.
2 Select the detection or the group rule you want to implement from the list of rules.
See Table 20-2 on page 415.
3 Select the prerequisite component, if required.
If the policy rule requires a Data Profile, Data Identifier, or User Group select it from
the list.
4 Click Next to configure the policy rule.
See “Configuring policy rules” on page 417.

Table 20-2 Adding policy rules

Rule Prerequisite Description

Content match conditions

Content Matches Regular See “Introducing regular expression matching”
Expression on page 852.

Content Matches Exact Data Exact Data Profile See “About the Exact Data Profile and index”
on page 528.

See “Choosing an Exact Data Profile” on page 409.

Content Matches Keyword See “Introducing keyword matching” on page 838.



Content Matches Document Indexed Document See “Introducing Indexed Document Matching (IDM)”
Signature Profile on page 612.

See “Choosing an Indexed Document Profile” on page 411.

Content Matches Data Identifier Data Identifier See “Introducing data identifiers” on page 717.

See “Selecting a data identifier breadth” on page 739.

Content Matches Classification ICT See “Overview of steps to tie Information Centric
Tagging to Data Loss Prevention” on page 228.

See “Configuring the Content Matches Classification condition” on page 863.

Detect using Vector Machine VML Profile See “Introducing Vector Machine Learning (VML)”
Learning on page 664.

See “Configuring VML profiles and policy conditions” on page 668.

Context match conditions

Contextual Attributes (Cloud Cloud Detection Service See “Introducing contextual attributes for cloud
Applications and API Detection or API Detection applications” on page 948.
Appliance only) Appliance

File Properties match conditions

Message Attachment or File See “About file type matching” on page 900.
Type Match

Message Attachment or File See “About file size matching” on page 902.
Size Match

Message Attachment or File See “About file name matching” on page 903.
Name Match

Custom File Type Signature Rule enabled See “About custom file type identification” on page 901.

Custom script See “Enabling the Custom File Type Signature
condition in the policy console” on page 908.

Protocol and Endpoint match conditions

Protocol Monitoring Custom protocols (if any) See “Introducing protocol monitoring for network”
on page 912.

Endpoint Monitoring See “About endpoint protocol monitoring” on page 915.

Endpoint Device Class or ID Custom device(s) See “About endpoint device detection” on page 917.

Endpoint Location See “About endpoint location detection” on page 917.

Form Recognition

Detect using Form Recognition Form Recognition Profile See “About Form Recognition detection” on page 695.
Profile
See “Configuring the Form Recognition detection rule”
on page 699.

Groups (Identities) match conditions

Sender/User Matches Pattern See “Introducing described identity matching”
on page 925.
Recipient Matches Pattern

Sender/User based on a User Group See “Introducing synchronized Directory Group
Directory Server Group Matching (DGM)” on page 935.

Recipient based on a Directory See “Configuring User Groups” on page 936.
Server Group

Sender/User based on a Exact Data Profile See “Introducing profiled Directory Group Matching
Directory from: (DGM)” on page 942.

Recipient based on a Directory See “Configuring Exact Data profiles for DGM”
from: on page 943.

Configuring policy rules


At the Manage > Policies > Policy List > Configure Policy – Edit Rule screen, you configure
a policy rule with one or more match conditions. The configuration of each rule condition
depends on its type.
See Table 20-4 on page 419.

Table 20-3 Configuring policy rules

Step Action Description

Step 1 Add a rule to a policy, or modify See “Adding a rule to a policy” on page 415.
a rule.
To modify an existing rule, select the rule in the policy builder interface at
the Configure Policy – Edit Rule screen.

Step 2 Name the rule, or modify a In the General section of the rule, enter a name in the Rule Name field,
name. or modify the name of an existing rule.

Step 3 Set the rule severity. In the Severity section of the rule, select or modify a "Default" severity
level.

In addition to the default severity, you can add multiple severity levels to
a rule.

See “Defining rule severity” on page 420.

Step 4 Configure the match condition. In the Conditions section of the rule, you configure one or more match
conditions for the rule. The configuration of a condition depends on its
type.

See Table 20-4 on page 419.

Step 5 Configure match counting (if If the rule calls for it, configure how you want to count matches.
required).
See “Configuring match counting” on page 421.

Step 6 Select components to match on If the rule is content-based, select one or more available content rules to
(if available). match on.

See “Selecting components to match on” on page 423.

Step 7 Add and configure one or more To define a compound rule, Add another match condition from the Also
additional match conditions Match list.
(optional).
Configure the additional condition according to its type (Step 4).

See “Configuring compound match conditions” on page 429.


Note: All conditions in a single rule must match to trigger an incident. See
“Policy detection execution” on page 394.

Step 8 Save the policy configuration. When you are done configuring the rule, click OK.

This action returns you to the Configure Policy screen where you can
Save the policy.

See “Manage and add policies” on page 432.

Table 20-4 lists each of the available match conditions and provides links to topics for
configuring each condition.

Table 20-4 Configuring policy match conditions

Rule Description

Content match conditions

Content Matches Regular See “Configuring the Content Matches Regular Expression condition”
Expression on page 854.

Content Matches Exact Data from See “Configuring the Content Matches Exact Data policy condition
an Exact Data Profile for EDM” on page 551.

Content Matches Keyword See “Configuring the Content Matches Keyword condition”
on page 844.

Content Matches Document See “Configuring the Content Matches Document Signature policy
Signature condition” on page 646.

Content Matches Data Identifier See “Configuring the Content Matches data identifier condition”
on page 737.

Detect using Vector Machine See “Configuring the Detect using Vector Machine Learning Profile
Learning profile condition” on page 679.

Content Matches Classification See “Configuring the Content Matches Classification condition”
on page 863.

Detect using Form Recognition See “Configuring the Form Recognition detection rule” on page 699.
profile

Context match conditions

Contextual Attributes (Cloud See “Introducing contextual attributes for cloud applications”
Applications and API Detection on page 948.
Appliance only)

File Properties match conditions

Message Attachment or File Type See “Configuring the Message Attachment or File Type Match
Match condition” on page 904.

Message Attachment or File Size See “Configuring the Message Attachment or File Size Match
Match condition” on page 905.

Message Attachment or File Name See “Configuring the Message Attachment or File Name Match
Match condition” on page 906.

Custom File Type Signature See “Configuring the Custom File Type Signature condition”
on page 908.

Protocol match conditions



Network Monitoring See “Configuring the Protocol Monitoring condition for network
detection” on page 913.

Endpoint Monitoring See “Configuring the Endpoint Monitoring condition” on page 918.

Endpoint Device Class or ID See “Configuring the Endpoint Device Class or ID condition”
on page 920.

Endpoint Location See “Configuring the Endpoint Location condition” on page 919.

Groups match conditions

Sender/User Matches Pattern See “Configuring the Sender/User Matches Pattern condition”
on page 927.

Recipient Matches Pattern See “Configuring the Recipient Matches Pattern condition”
on page 930.

Sender/User based on a Directory See “Configuring the Sender/User based on a Directory Server
Server Group Group condition” on page 939.

Sender/User based on a Directory See “Configuring the Sender/User based on a Profiled Directory
from an Exact Data Profile condition” on page 944.

Recipient based on a Directory See “Configuring the Recipient based on a Directory Server Group
Server Group condition” on page 940.

Recipient based on a Directory from See “Configuring the Recipient based on a Profiled Directory
an Exact Data Profile condition” on page 945.

Defining rule severity


The system assigns a severity level to a policy rule violation. The default setting is "High." You
can configure the default, and add one or more additional severity levels.
See “Policy severity” on page 374.
Policy rule severity works with the Severity response rule condition. If you set the default
policy rule severity level to "High" and define additional severity levels, the system does not
assign the additional severity to the incident based on match count. The result is that if you
have a response rule set to a match count severity level that is less than the default "High"
severity, the response rule does not execute.
See “Configuring the Severity response condition” on page 1778.

To define policy rule severity


1 Configure a policy rule.
See “Configuring policy rules” on page 417.
2 Select a Default level from the Severity list.
The default severity level is the baseline level that the system reports. The system applies
the default severity level to any rule match, unless additional severity levels override the
default setting.
3 Click Add Severity to define additional severity levels for the rule.
If you add a severity level, it is based on the match count.
4 Select the desired severity level, choose the match count range, and enter the match
count.
For example, you can set a Medium severity with X range to match after 100 matches
have been counted.
5 If you add an additional severity level, you can select it to be the default severity.
6 To remove a defined severity level, click the X icon beside the severity definition.
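
The relationship between the default severity and match-count-based severity levels can be sketched as follows. This is a simplified illustration under stated assumptions: the `SeverityLevel` class, the "at least N matches" range, and the tie-break between overlapping levels are hypothetical, and the sketch does not model the behavior described above for a default severity of "High".

```python
from dataclasses import dataclass


@dataclass
class SeverityLevel:
    severity: str     # for example "High", "Medium", or "Low"
    min_matches: int  # hypothetical "at least N matches" range


def incident_severity(match_count: int, default: str, levels: list[SeverityLevel]) -> str:
    """Pick the severity for a rule match (illustrative only)."""
    # Assumption: when several levels qualify, the one with the highest
    # threshold wins. The guide does not spell out this tie-break.
    qualifying = [lvl for lvl in levels if match_count >= lvl.min_matches]
    if not qualifying:
        # No additional level overrides the baseline, so the default applies.
        return default
    return max(qualifying, key=lambda lvl: lvl.min_matches).severity
```

With a default of "Low" and one additional level of "Medium" at 100 matches, a message with 5 matches is reported as "Low" and a message with 150 matches as "Medium".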

Configuring match counting


Some conditions let you specify how you want to count matches. Count all matches is the
default behavior. You can configure the minimum number of matches required to cause an
incident. Or, you can count all matches as one incident. If a condition supports match counting,
you can configure this setting for both policy rules and exceptions.
See Table 20-6 on page 422.

Table 20-5 Configuring match counting

Parameter Condition type Incident description

Check for Simple This configuration reports a match count of 1 if there are one or more matches; it
existence does not count multiple matches. For example, 10 matches are one incident.

Compound This configuration reports a match count of 1 if there are one or more matches
and ALL conditions in the rule or exception are set to check for existence.

Count all Simple This configuration reports a match count of the exact number of matches detected
matches by the condition. For example, 10 matches count as 10 incidents.

Compound This configuration reports a match count of the sum of all condition matches in
the rule or exception. The default is one incident per condition match and applies
if any condition in the rule or exception is set to count all matches.

For example, if a rule has two conditions and one is set to count all matches and
detects four matches, and the other condition is set to check for existence and
detects six matches, the reported match count is 10. If a third condition in the rule
detects a match, the match count is 11.

Only report incidents with at least _ matches

You can change the default one incident per match count by specifying the
minimum number of matches required to report an incident.

For example, in a rule with two conditions, if you configure one condition to count
all matches and specify five as the minimum number of matches for each condition,
a sum of 10 matches reported by the two conditions generates two incidents. You
must be consistent and select this option for each condition in the rule or exception
to achieve this behavior.
Note: The count all matches setting applies to each message component you
match on. For example, consider a policy where you specify a match count of 3
and configure a keyword rule that matches on all four message components
(default setting for this condition). If a message is received with two instances of
the keyword in the body and one instance of the keyword in the envelope, the
system does not report this as a match. However, if three instances of the keyword
appear in an attachment (or any other single message component), the system
would report it as a match.

Count all unique Only count Unique match counting is available for Data Identifiers, keyword matching, and
matches unique regular expression matching.
matches
See “About unique match counting” on page 734.
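
The counting behavior in Table 20-5 can be illustrated with a short sketch. It reproduces the compound-rule example (four matches plus six matches reported as 10) and the per-component note (a minimum of 3 is not met by two matches in the body plus one in the envelope). The names `Counting`, `reported_match_count`, and `meets_minimum` are hypothetical, and the sketch is a reading of the table rather than the detection engine's logic.

```python
from enum import Enum


class Counting(Enum):
    CHECK_FOR_EXISTENCE = "check for existence"
    COUNT_ALL_MATCHES = "count all matches"


def reported_match_count(conditions: list[tuple[Counting, int]]) -> int:
    """Match count for a rule whose conditions are (setting, raw matches) pairs."""
    raw = [matches for _, matches in conditions]
    # Every condition in a rule must match, so any zero means no match at all.
    if not raw or min(raw) == 0:
        return 0
    # If every condition only checks for existence, the count is 1.
    if all(setting is Counting.CHECK_FOR_EXISTENCE for setting, _ in conditions):
        return 1
    # Otherwise at least one condition counts all matches: report the sum.
    return sum(raw)


def meets_minimum(matches_per_component: dict[str, int], minimum: int) -> bool:
    """The minimum applies to each component separately, not to the message total."""
    return any(count >= minimum for count in matches_per_component.values())
```

Because the minimum is evaluated per component, `meets_minimum({"body": 2, "envelope": 1}, 3)` is false even though the message contains three instances in total.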

Table 20-6 Conditions that support match counting

Condition Description

Content Matches Regular See “Introducing regular expression matching” on page 852.
Expression
See “Configuring the Content Matches Regular Expression condition” on page 854.

Content Matches Keyword See “Introducing keyword matching” on page 838.

See “Configuring the Content Matches Keyword condition” on page 844.



Content Matches Document See “Configuring the Content Matches Document Signature policy condition”
Signature (IDM) on page 646.

Content Matches Data Identifier See “Introducing data identifiers” on page 717.

See “Configuring the Content Matches data identifier condition” on page 737.

See “Configuring unique match counting” on page 775.

Recipient Matches Pattern See “Introducing described identity matching” on page 925.

See “Configuring the Recipient Matches Pattern condition” on page 930.

Selecting components to match on


The availability of one or more message components to match on depends on the type of rule
or exception condition you implement.
See “Detection messages and message components” on page 391.

Table 20-7 Match on components

Component Description

Envelope If the condition supports matching on the Envelope component, select it to match on the message
metadata. The envelope contains the header, transport information, and the subject if the message
is an SMTP email.

If the condition does not support matching on the Envelope component, this option is grayed out.

If the condition matches on the entire message, the Envelope is selected and cannot be deselected,
and the other components cannot be selected.

Subject Certain detection conditions match on the Subject component for some types of messages.

See “Detection messages and message components” on page 391.


For the detection conditions that support subject component matching, you can match on the Subject
for the following types of messages:

■ SMTP (email) messages from Network Monitor or Network Prevent for Email.
■ NNTP messages from Network Monitor.

To match on the Subject component, you must select (check) the Subject component and uncheck
(deselect) the Envelope component for the policy rule. If you select both components, the system
matches the subject twice because the message subject is included in the envelope as part of the
header.

Body If the condition matches on the Body message component, select it to match on the text or content
of the message.

Attachment(s) If the condition matches on the Attachment(s) message component, select it to detect content in
files sent by, downloaded with, or attached to the message.
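
The reason for deselecting Envelope when you match on Subject (see the Subject row in Table 20-7) can be shown with a small sketch. The message layout and the `count_keyword` helper are hypothetical; the only point taken from the table is that the envelope header already contains the subject.

```python
def count_keyword(message: dict[str, str], keyword: str, match_on: set[str]) -> int:
    """Count keyword occurrences across the selected message components."""
    return sum(message[name].lower().count(keyword.lower()) for name in match_on)


# Hypothetical SMTP message: the envelope header repeats the subject line.
message = {
    "envelope": "From: alice@example.com\nSubject: Confidential roadmap",
    "subject": "Confidential roadmap",
    "body": "See attached.",
}

both = count_keyword(message, "confidential", {"envelope", "subject"})
subject_only = count_keyword(message, "confidential", {"subject"})
```

Selecting both components counts the subject keyword twice, while selecting only Subject counts it once.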

Adding an exception to a policy


At the Manage > Policies > Policy List > Configure Policy – Add Exception screen you
add one or more exception conditions to a policy. Policy exceptions are executed before policy
rules. If there is an exception match, the entire message is discarded.
See “Exception conditions” on page 393.

Note: You can create exceptions for all policy conditions, except the EDM condition Content
Matches Exact Data From. In addition, Network Prevent for Web does not support
synchronized DGM exceptions.

To add an exception to a policy


1 Add an exception to a policy.
To add a detection rule exception, select the Detection tab and click Add Exception.
To add a group rule exception, select the Groups tab and click Add Exception.
2 Select the policy exception to implement.
The Add Detection Exception screen lists all available detection exceptions that you
can add to a policy.
The Add Group Exception screen lists all available group exceptions that you can add
to a policy.
See Table 20-8 on page 425.
3 If necessary, choose the profile, data identifier, or user group.
4 Click Next to configure the exception.
See “Configuring policy exceptions” on page 426.

Table 20-8 Selecting a policy exception

Exception Prerequisite Description

Content

Content Matches Regular See “Introducing regular expression matching” on page 852.
Expression

Content Matches Keyword See “Introducing keyword matching” on page 838.

Content Matches Document Indexed Document See “Choosing an Indexed Document Profile” on page 411.
Signature Profile

Content Matches Data Data Identifier See “Introducing data identifiers” on page 717.
Identifier
See “Selecting a data identifier breadth” on page 739.

Detect using Vector Machine VML Profile See “Configuring VML policy exceptions” on page 680.
Learning profile
See “Configuring VML profiles and policy conditions”
on page 668.

Context

Contextual Attributes (Cloud Cloud Detection See “Introducing contextual attributes for cloud applications”
Applications and API Service or API on page 948.
Detection Appliance only) Detection
Appliance

File Properties

Message Attachment or File See “About file type matching” on page 900.
Type Match

Message Attachment or File See “About file size matching” on page 902.
Size Match

Message Attachment or File See “About file name matching” on page 903.
Name Match

Custom File Type Signature Condition enabled See “About custom file type identification” on page 901.

Custom script
added

Protocol and Endpoint

Network Protocol See “Introducing protocol monitoring for network”
on page 912.

Endpoint Protocol, See “About endpoint protocol monitoring” on page 915.
Destination, Application

Endpoint Device Class or ID See “About endpoint device detection” on page 917.

Endpoint Location See “About endpoint location detection” on page 917.

Form Recognition

Detect using Form Form Recognition See “About Form Recognition detection” on page 695.
Recognition Profile Profile
See “Configuring the Form Recognition exception rule”
on page 700.

Group (identity)

Sender/User Matches Pattern See “Introducing described identity matching” on page 925.

Recipient Matches Pattern

Sender/User based on a User Group See “Introducing synchronized Directory Group Matching
Directory Server Group (DGM)” on page 935.

Recipient based on a Directory See “Configuring User Groups” on page 936.
Server Group
Note: Network Prevent for Web does not support this type
of exception. Use profiled DGM instead.

Sender/User based on a Exact Data Profile See “Introducing profiled Directory Group Matching (DGM)”
Directory from: on page 942.

Recipient based on a Directory See “Configuring Exact Data profiles for DGM” on page 943.
from:

Configuring policy exceptions


At the Manage > Policies > Policy List > Configure Policy – Edit Exception screen you
configure one or more conditions for a policy exception.
See Table 20-10 on page 428.
If an exception condition matches, the system discards the matched component from the
system. This component is no longer available for evaluation.
See “Exception conditions” on page 393.
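
The effect of an exception on later rule evaluation can be sketched as follows. The sketch assumes two scopes, labeled here `entire_message` and `matched_components_only` after the Apply Exception to options, and assumes that an entire-message exception removes every component from evaluation. The function name and data layout are hypothetical.

```python
from typing import Callable


def components_left_for_rules(
    components: dict[str, str],
    is_excepted: Callable[[str], bool],
    apply_to: str,
    match_on: set[str],
) -> dict[str, str]:
    """Return the message components still available after an exception runs."""
    if apply_to == "entire_message":
        # Assumption: a match anywhere excepts the whole message.
        matched = any(is_excepted(text) for text in components.values())
        return {} if matched else dict(components)
    # "matched_components_only": discard only the selected components that match.
    hits = {
        name for name in match_on
        if name in components and is_excepted(components[name])
    }
    return {name: text for name, text in components.items() if name not in hits}
```

Exceptions run before rules, so the rules in the policy see only the components that this step leaves in place.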

Table 20-9 Configure policy exceptions

Step Action Description

Step 1 Add a new policy exception, or See “Adding an exception to a policy” on page 424.
edit an existing exception.
Select an existing policy exception to modify it.

Step 2 Name the exception, or edit an In the General section, enter a unique name for the exception, or modify
existing name or description. the name of an existing exception.
Note: The exception name is limited to 60 characters.

Step 3 Select the components to apply If the exception is content-based, you can match on the entire message
the exception to (if available). or on individual message components.

See “Detection messages and message components” on page 391.


Select one of the Apply Exception to options:

■ Entire Message
This option applies the exception to the entire message.
■ Matched Components Only
This option applies the exception to each message component you
select from the Match On options in the Conditions section of the
exception.

Step 4 Configure the exception condition. In the Conditions section of the Configure Policy - Edit Exception
screen, define the condition for the policy exception. The configuration
of a condition depends on the exception type.

See Table 20-10 on page 428.

Step 5 Add one or more additional You can add conditions until the exception is structured as desired.
conditions to the exception
(optional). See “Configuring compound match conditions” on page 429.

To add another condition to an exception, select the condition from the
Also Match list.

Click Add and configure the condition.

Step 6 Save and manage the policy. Click OK to complete the exception definition process.

Click Save to save the policy.

See “Manage and add policies” on page 432.

Table 20-10 lists the exception conditions that you can configure, with links to configuration
details.

Table 20-10 Policy exception conditions available for configuration

Exception Description

Content

Content Matches Regular Expression See “Configuring the Content Matches Regular Expression condition”
on page 854.

Content Matches Keyword See “Configuring the Content Matches Keyword condition” on page 844.

Content Matches Document Signature See “Configuring the Content Matches Document Signature policy
condition” on page 646.

Content Matches Data Identifier See “Configuring the Content Matches data identifier condition”
on page 737.

Detect using Vector Machine Learning See “Configuring VML policy exceptions” on page 680.
Profile

Context

Contextual Attributes (Cloud See “Introducing contextual attributes for cloud applications”
Applications and API Detection on page 948.
Appliance only)

File Properties

Message Attachment or File Type Match See “Configuring the Message Attachment or File Type Match
condition” on page 904.

Message Attachment or File Size Match See “Configuring the Message Attachment or File Size Match condition”
on page 905.

Message Attachment or File Name Match See “Configuring the Message Attachment or File Name Match
condition” on page 906.

Custom File Type Signature See “Configuring the Custom File Type Signature condition”
on page 908.

Protocol and Endpoint

Network Protocol See “Configuring the Protocol Monitoring condition for network
detection” on page 913.

Endpoint Protocol or Destination See “Configuring the Endpoint Monitoring condition” on page 918.

Endpoint Device Class or ID See “Configuring the Endpoint Device Class or ID condition”
on page 920.

Endpoint Location See “Configuring the Endpoint Location condition” on page 919.

Form Recognition

Detect using Form Recognition profile See “Configuring the Form Recognition exception rule” on page 700.

Group (identity)

Sender/User Matches Pattern See “Configuring the Sender/User Matches Pattern condition”
on page 927.

Recipient Matches Pattern See “Configuring the Recipient Matches Pattern condition” on page 930.

Sender/User based on a Directory Server See “Configuring the Sender/User based on a Directory Server Group
Group condition” on page 939.

Recipient based on a Directory Server See “Configuring the Recipient based on a Directory Server Group
Group condition” on page 940.

Sender/User based on a Directory from See “Configuring the Sender/User based on a Profiled Directory
an EDM Profile condition” on page 944.

Recipient based on a Directory from an See “Configuring the Recipient based on a Profiled Directory condition”
EDM Profile on page 945.

Configuring compound match conditions


You can create compound match conditions for policy rules and exceptions.
See “Configuring compound match conditions” on page 429.
The detection engine connects compound conditions with an AND. All conditions in the rule
or exception must be met to trigger or except an incident.
See “Policy detection execution” on page 394.
There is no limit on the number of match conditions you can include in a rule or exception.
However, the multiple conditions you declare in a single rule or exception should be logically
associated. Do not confuse compound rules or exceptions with multiple rules or exceptions in
a policy.
See “Use compound conditions to improve match accuracy” on page 455.

Table 20-11 Configure a compound policy rule or exception

Step Action Description

Step 1 Modify or configure an existing policy rule or exception. You can add one or more additional match conditions to a rule or exception at the Configure Policy – Edit Rule or Configure Policy – Edit Exception screen.

Step 2 Select an additional match condition. Select the additional match condition from the Also Match list.

This list appears at the bottom of the Conditions section for an existing rule or exception.

Step 3 Review the available conditions. The system lists all available additional conditions you can add to a policy rule or exception.

See “Adding a rule to a policy” on page 415.

See “Adding an exception to a policy” on page 424.

Step 4 Add the additional condition. Click Add to add the additional match condition to the policy rule or exception.

Once added, you can collapse and expand each condition in a rule or exception.

Step 5 Configure the additional condition. See “Configuring policy rules” on page 417.

See “Configuring policy exceptions” on page 426.

Step 6 Select the same or any component to match. If the condition supports component matching, specify where the data must match to generate or except an incident.

Same Component – The matched data must exist in the same component as the other condition(s) that also support component matching to trigger a match.

Any Component – The matched data can exist in any component that you have selected.

See “About cross-component matching” on page 733.

Step 7 Repeat this process to add additional match conditions to the rule or exception. You can add as many conditions to a rule or exception as you need.

All conditions in a single rule or exception must match to trigger an incident, or to trigger the exception.

Step 8 Save the policy. Click OK to close the rule or exception configuration screen.

Click Save to save the policy configuration.



Input character limits for policy configuration


When configuring a policy, consider the following input character limits for policy configuration
components.

Table 20-12 Input character limits for policy configuration

Configuration element Input character limit

Name of a policy component, including: 60 characters

■ Policy
■ Rule
■ Exception
■ Group
■ Condition

Note: To import a policy as a template, the policy name must be less than 60 characters, otherwise it does not appear in the Imported Templates list.

Description of policy component. 255 characters

Name of Data Profile, including: 255 characters

■ Exact Data
■ Indexed Document
■ Vector Machine Learning
■ Form Recognition

Data Identifier pattern limits 100 characters per line

See “Using the data identifier pattern language” on page 814.
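
If you prepare policy names, descriptions, or data identifier patterns outside the console, a pre-check against the limits in Table 20-12 can catch problems early. The following Python sketch is illustrative only. The function and parameter names are hypothetical, and it assumes that each limit is an inclusive maximum.

```python
# Limits from Table 20-12. Function and parameter names are illustrative only.
NAME_LIMIT = 60
DESCRIPTION_LIMIT = 255
PROFILE_NAME_LIMIT = 255
PATTERN_LINE_LIMIT = 100


def check_policy_inputs(name, description="", profile_name="", patterns=""):
    """Return a list of problems; an empty list means all limits are met."""
    problems = []
    if len(name) > NAME_LIMIT:
        problems.append(f"name exceeds {NAME_LIMIT} characters")
    if len(description) > DESCRIPTION_LIMIT:
        problems.append(f"description exceeds {DESCRIPTION_LIMIT} characters")
    if len(profile_name) > PROFILE_NAME_LIMIT:
        problems.append(f"profile name exceeds {PROFILE_NAME_LIMIT} characters")
    # Data identifier patterns are limited per line, not in total.
    for number, line in enumerate(patterns.splitlines(), start=1):
        if len(line) > PATTERN_LINE_LIMIT:
            problems.append(
                f"pattern line {number} exceeds {PATTERN_LINE_LIMIT} characters"
            )
    return problems


def usable_as_template(name):
    # The note in Table 20-12 requires fewer than 60 characters for template import.
    return len(name) < NAME_LIMIT
```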


Chapter 21
Administering policies
This chapter includes the following topics:

■ Manage and add policies

■ Manage and add policy groups

■ Creating and modifying policy groups

■ Importing policies

■ Exporting policies

■ Cloning policies

■ Importing policy templates

■ Exporting policy detection as a template

■ Adding an automated response rule to a policy

■ Removing policies and policy groups

■ Viewing and printing policy details

■ Downloading policy details

■ Troubleshooting policies

■ Updating EDM and IDM profiles to the latest version

■ Updating policies after upgrading to the latest version

Manage and add policies


The Manage > Policies > Policy List screen is the home page for adding and managing
policies. You implement policies to detect and report data loss.

See “Workflow for implementing policies” on page 378.


Table 21-1 lists and describes the actions you can take at the Policy List screen.

Table 21-1 Policy List screen actions

Action Description

Add a policy Click New to create a new policy.

See “Adding a new policy or policy template” on page 412.

Modify a policy Click the policy name or edit icon to modify an existing policy.

See “Configuring policies” on page 413.

Activate a policy Select the policy or policies you want to activate, then click Activate in the policy list
toolbar.

Make a policy inactive Select the policy or policies you want to make inactive, then click Suspend in the policy
list toolbar.
Note: By default, all solution pack policies are activated on installation of the solution
pack.

Sort policies Click any column header to sort the policy list.

Filter policies You can filter your policy list by Status, Name, Description, or Policy Group.

To filter your policy list, click Filter in the policy list toolbar, then select or enter your filter
criteria in the appropriate column or columns.

To remove filters from your policy list, click Clear in the policy list toolbar.

Remove a policy Select the policy or policies you want to remove, then click Delete in the policy list toolbar.

You can also click the red X icon at the end of the policy row to delete an individual
policy.
Note: You cannot remove a policy that has active incidents.

See “Removing policies and policy groups” on page 443.

Import and export policies You can import and export policies using the Import and Export buttons in the policy
list toolbar.

See “Importing policies” on page 437.

See “Exporting policies” on page 439.

Export and import policy templates You can export and import policy templates for reuse when authoring new policies.
See “Importing policy templates” on page 441.

See “Exporting policy detection as a template” on page 442.



Download policy details Click Download Details in the policy list toolbar to download details for the selected
policies in the Policy List. Symantec Data Loss Prevention exports the policy details
as HTML files in a ZIP archive. Open the archive to view and print policy details.

See “Downloading policy details” on page 444.

View and print policy details To view policy details for a single policy, click the printer icon at the end of the policy
row. To print the policy details, use the print feature of your web browser.

See “Viewing and printing policy details” on page 444.

Clone a policy Select the policy or policies you want to clone, then click Clone in the policy list toolbar.

See “Cloning policies” on page 440.

Assign policies to a policy group You can assign individual or multiple policies to a policy group from the policy list page.
Select the policy or policies you want to assign to a policy group, then click Assign
Group in the policy list toolbar. Select the policy group from the drop-down list.

See “Policy groups” on page 372.

Table 21-2 lists and describes the display fields at the Policy List screen.

Table 21-2 Policy List screen display fields

Column Description

Status The status column displays one of three states for the policy:

■ Misconfigured Policy:
The policy icon is a yellow caution sign.
See “Policy components” on page 370.
■ Active Policy:
The policy icon is green. An active policy can detect incidents.
■ Suspended Policy:
The policy icon is red. A suspended policy is deployed but does not detect incidents.

Name View and sort by the name of the policy.

See “About Data Loss Prevention policies” on page 368.

Description View the description of the policy.

See “Policy templates” on page 371.

Policy Group View and sort by the policy group to which the policy is deployed.

See “Policy groups” on page 372.



Last Modified View and sort by the date the policy was last updated.
See “Policy authoring privileges” on page 375.

Manage and add policy groups


The System > Servers and Detectors > Policy Groups screen lists the configured policy
groups in the system.
From the Policy Groups screen you manage existing policy groups and add new ones.

Table 21-3 Policy Groups screen actions

Action Description

Add a policy group Click Add to define a new policy group.

See “Policy groups” on page 372.

Modify a policy group To modify an existing policy group, click the name of the group.

See “Creating and modifying policy groups” on page 436.

Remove a policy group Select the policy group then click Delete.
Note: If you delete a policy group, you delete any policies that are assigned to that group.

See “Removing policies and policy groups” on page 443.

Find a policy group You can search for a policy group by entering a search term in the Search bar.
You can filter your results by Name, Description, or Servers by selecting the filter then
clicking Apply Filter.

View policies in a group To view the policies deployed to an existing policy group, navigate to the System > Servers
and Detectors > Policy Groups > Configure Policy Group screen.

See “Creating and modifying policy groups” on page 436.

Table 21-4 Policy Groups screen display fields

Column Description

Name The name of the policy group.

Description The description of the policy group.



Available Servers and Detectors The detection server or cloud detector to which the policy group is deployed.
See “Policy deployment” on page 373.

Last Modified The date the policy group was last modified.

Creating and modifying policy groups


At the System > Servers and Detectors > Policy Groups screen you configure a new policy
group or modify an existing one.
See “Policy groups” on page 372.
To configure a policy group
1 Add a new policy group, or modify an existing one.
See “Manage and add policy groups” on page 435.
2 Enter the Name of the policy group, or modify an existing name.
Use an informative name. Policy authors and Enforce Server administrators rely on the
policy group name when they associate the policy group with policies, roles, and targets.
The name value is limited to 256 characters.
3 Enter a Description of the policy group, or modify the description of an existing
policy group.
4 Select one or more Servers and Detectors to assign the policy group to.
The system displays a check box for each detection server currently configured and
registered with the Enforce Server.
■ Select the All Servers or Detectors option to assign the policy group to all detection
servers and cloud detectors in your system. If you leave this checkbox unselected,
you can assign the policy group to individual servers.
The All Discover Servers entry is not configurable because the system automatically
assigns all policy groups to all Network Discover Servers. This feature lets you assign
policy groups to individual Discover targets.
See “Configuring the required fields for Network Discover targets” on page 2092.
■ Deselect the All Servers or Detectors option to assign the policy group to individual
detection servers.
The system displays a check box for each server currently configured and registered
with the Enforce Server.

Select each individual detection server to assign the policy group.

5 Click Save to save the policy group configuration.

Note: The Policies in this Group section of the Configure Policy Group screen lists all the policies in
the policy group. You cannot edit these entries. When you create a new policy group, this
section is blank. After you deploy one or more policies to a policy group (during policy
configuration), the Policies in this Group section displays each policy in the policy group.

See “Configuring policies” on page 413.


See “Policy deployment” on page 373.

Importing policies
You can export policies from an Enforce Server and import them to another Enforce Server.
This feature makes it easier to move policies from one environment to another. For example,
you can export policies from your test environment and import them into your production
environment.

About importing policies


To import policies, you must have the Import Policies privilege. To enable this privilege, you
must also have the Server Administration, Author Policies, Author Response Rules, and
All Policy Groups privileges.
See “Configuring roles” on page 114.
When you import a policy, please note the following points:
■ The policy is imported in the same state in which it was exported. For example, if a policy
was active when it was exported, it will be active when you import it. The only exception
to this behavior is for pre-existing policies on the system to which you are importing the policy
(the "target system"). If the existing policy is active, then the imported policy will also be
active, regardless of its state on the exporting system.
■ Imported policies will overwrite existing policies that have the same name. You can change
the name of the exported policy in the XML file if you want to import it without overwriting
the existing policy.
■ If the policy group to which the exported policy belonged exists on the target system, the
policy will be added to that policy group, or overwrite a policy of the same name in that
group. If the policy group does not exist on the target system, it will be created upon import.
If the policy exists on the target system, but it belongs to a different policy group, the
imported policy will be assigned to a newly created policy group on the target system, and
will not overwrite the existing policy.

■ When you import a policy, you can choose whether or not to import its response rules if
those rules conflict with existing response rules on the target system.
■ The Policy Import Preview page will display warnings about any policy elements that will
be created or overwritten when you import the policy.
■ You can only import one policy at a time.
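
The import rules above can be summarized as a small decision function. The following Python sketch is one possible reading of those rules, not the product's implementation. It assumes that a policy is overwritten only when a policy of the same name exists in the same policy group on the target system, and that the state exception applies to that overwritten policy. The function name and data layout are hypothetical.

```python
def plan_policy_import(exported, target):
    """Predict the outcome of importing one policy.

    exported: dict with 'name', 'group', and 'active' (bool) keys.
    target:   dict mapping policy group name -> {policy name: active flag}.
    """
    group = exported["group"]
    name = exported["name"]
    # A missing policy group is created on import.
    create_group = group not in target
    existing = target.get(group, {})
    # Assumption: only a same-name policy in the same group is overwritten.
    overwrite = name in existing
    # An active pre-existing policy stays active regardless of the exported state.
    active = bool(exported["active"] or (overwrite and existing[name]))
    return {"create_group": create_group, "overwrite": overwrite, "active": active}
```

The Policy Import Preview page remains the authoritative source for what will be created or overwritten on your system.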
To import a policy
1 Navigate to Manage > Policies > Policy List.
2 Click Import.
The Import Policy page appears.
3 Click Browse to select the exported policy file you want to import.
4 Click Import Policy.
The Policy import preview page appears. This page will warn you of any policy elements
that may be overwritten when you import this policy. If the policy you are importing includes
any response rules among the elements that may be overwritten, you can exclude those
response rules from import on this page.
5 Click Proceed with import.
The policy is imported. If the policy has any unresolved references, the Policy References
Check page appears.
You can resolve any unresolved policy references on this page.
See “About policy references” on page 438.

About policy references


Policies are exported in XML format. The XML policy files contain policy metadata, references
to any data profiles, response rules, data identifiers, and the detection and group rules and
exceptions. The files do not contain the actual data profiles, directory connections, credentials,
or FlexResponse plug-ins. You must provide those items on the system into which you are
importing the policy.
When you import a policy, Symantec Data Loss Prevention will alert you to any unresolved
references on the Policy References Check page. The Policy References Check page
displays at the end of the policy import process. You can also view this page by clicking the
unresolved references icon on the Policy List and Policy Edit pages.
To resolve policy references, click the edit (pencil) icon on the Policy References Check
page. Symantec Data Loss Prevention displays the appropriate edit page for each unresolved
reference. Table 21-5 provides information about resolving policy references.

Table 21-5 Resolving policy references

Unresolved policy reference Resolution

Policy group where no detection server is specified: Select detection servers for the policy group.

Directory connection with missing credentials: Provide the credentials for the directory connection.

EDM profile with missing source file and index: Specify the correct data source file.

IDM profile with missing import path and file name: Specify the correct data source.

Remote IDM profile with missing credentials: Provide the credentials for the remote IDM profile.

VML profile with trained profile and related data missing: Provide the trained profile and its related data, train and accept the VML profile.

Form Recognition profile with missing gallery ZIP archive: Provide the gallery ZIP archive.

Endpoint quarantine response rule with missing saved credentials: Provide the credentials for the endpoint quarantine response rule.

Response rule with a missing Server FlexResponse plug-in: Deploy the Server FlexResponse JAR file on the target system.

See “Deploying a Server FlexResponse plug-in” on page 2143.

Exporting policies
You can export your policy data to an XML file to easily share policies between Enforce Servers.

About policy export


Policies are exported in XML format. The XML policy files contain policy metadata, references
to any data profiles, response rules, data identifiers, and the detection and group rules and
exceptions. The files do not contain the actual data profiles, directory connections, credentials,
or FlexResponse plug-ins. You must copy those items to the system into which you are importing
the policy.
You can export policies individually or in bulk. To export policies, you must have the Author
Policies privilege.
See “Configuring roles” on page 114.
Exported policies include the following items:
■ Policy name, description, and policy group

■ Policy rules, including Form Recognition, EDM, IDM, and VML definitions
■ Endpoint locations and devices
■ Sender and recipient patterns
■ Response rules
■ Data identifiers
■ Custom protocols
Exported policies do not include the following items:
■ Credentials
■ Form Recognition, EDM, IDM, or VML indexes
■ Form Recognition, EDM or IDM data source files
■ VML training files
■ FlexResponse plug-ins
To export policies
1 Navigate to Manage > Policies > Policy List.
2 Take one of the following actions:
■ To export a single policy, click the export icon for that policy.
■ To export multiple policies to a ZIP archive, select the policies you want to export, then
click Export.

3 Symantec Data Loss Prevention exports your policy or policies using the following naming
conventions:
■ For single policies, the naming convention is ENFORCEHOSTNAME-POLICYNAME-DATE-TIME.XML.
■ For bulk policy export, the naming convention is ENFORCEHOSTNAME-policies-DATE-TIME.ZIP.
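
If you collect exported files from several Enforce Servers, the naming conventions above are enough to tell single-policy exports from bulk exports. The following Python sketch is illustrative. The function name is hypothetical, and it does not parse the DATE and TIME parts because this guide does not specify their format.

```python
def classify_export(file_name):
    """Classify an exported file as 'single', 'bulk', or 'unknown' by its name."""
    upper = file_name.upper()
    # Bulk exports are ZIP archives named ENFORCEHOSTNAME-policies-DATE-TIME.ZIP.
    if upper.endswith(".ZIP") and "-POLICIES-" in upper:
        return "bulk"
    # Single-policy exports are XML files.
    if upper.endswith(".XML"):
        return "single"
    return "unknown"
```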

Cloning policies
You can clone policies from the Policy List page.
Cloned policies are exact copies of the original policy. They include the following items:
■ Modified policy name, description, and policy group.
Cloned policies appear in the Policy List as Copy N of original policy name.
■ Policy rules, including Form Recognition, EDM, IDM, and VML definitions

■ Endpoint locations and devices


■ Sender and recipient patterns
■ Response rules
■ Data identifiers
■ Custom protocols

Note: You must have policy authoring privileges to clone policies.

For information about importing and exporting policies and policy templates, see these topics:
See “Exporting policies” on page 439.
See “Importing policies” on page 437.
See “Exporting policy detection as a template” on page 442.
See “Importing policy templates” on page 441.

Importing policy templates


You can import one or more policy templates to the Enforce Server. You must have policy
system privileges to import policy templates.
See “Policy template import and export” on page 377.
See “Exporting policy detection as a template” on page 442.
To import one or more policy templates to the Enforce Server
1 Place one or more policy template XML files in the
\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\templates
directory on the Enforce Server host.
You can import multiple policy templates by placing them all in the templates directory.
2 Make sure that the directory and file(s) are readable by the "protect" system user.
3 Log on to the Enforce Server Administration Console with policy authoring privileges.
4 Navigate to Manage > Policies > Policy List and click Add Policy.
5 Choose the option Add a policy from a template and click Next.
6 Scroll down to the bottom of the template list to the Imported Templates section.
You should see an entry for each XML file you placed in the templates directory.
7 Select the imported policy template and click Next to configure it.
See “Configuring policies” on page 413.
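
Steps 1 and 2 of this procedure can be scripted. The following Python sketch copies template files into a templates directory that you pass in and checks that each copy is readable. It is a sketch under stated assumptions: the function name is hypothetical, the file name (without the .xml extension) is treated as a stand-in for the policy name when applying the 60-character rule for imported templates, and the readability check applies to the user who runs the script, not to the "protect" system user.

```python
import os
import shutil
from pathlib import Path


def stage_templates(xml_files, templates_dir):
    """Copy template XML files into templates_dir.

    Returns (staged, skipped): staged is a list of destination paths,
    skipped is a list of source file names that were not copied or not readable.
    """
    templates_dir = Path(templates_dir)
    staged, skipped = [], []
    for source in map(Path, xml_files):
        # Assumption: the file stem stands in for the policy name, which must
        # be less than 60 characters to appear in the Imported Templates list.
        if source.suffix.lower() != ".xml" or len(source.stem) >= 60:
            skipped.append(source.name)
            continue
        destination = templates_dir / source.name
        shutil.copy2(source, destination)
        # This checks readability for the current user only.
        if os.access(destination, os.R_OK):
            staged.append(destination)
        else:
            skipped.append(source.name)
    return staged, skipped
```

After staging the files, confirm separately that the "protect" system user can read the directory and files, then continue with step 3.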

Exporting policy detection as a template


You can export policy detection rules and exceptions in a template (XML file). You cannot
export policy response rules. You can only export one policy template at a time.
See “Policy template import and export” on page 377.
To export a policy as a template
1 Log on to the Enforce Server administration console with administrator privileges.
2 Navigate to the Manage > Policies > Policy List > Configure Policy screen for the policy
you want to export.
3 At the bottom of the Configure Policy screen, click the Export this policy as a template
link.
4 Save the policy to a local or network destination of your choice.
For example, the system exports a policy named Webmail to the policy template file
Webmail.xml which you can save to your local drive.

See “Importing policy templates” on page 441.


For information about importing, exporting, and cloning policies, see these topics:
See “Exporting policies” on page 439.
See “Importing policies” on page 437.
See “Cloning policies” on page 440.

Adding an automated response rule to a policy


You can add one or more automated response rules to a policy to take action when that policy
is violated.
See “About response rules” on page 1738.

Note: Smart response rules are executed manually and are not deployed with policies.

To add an automated response rule to a policy


1 Log on to the Enforce Server administration console with policy authoring privileges.
See “Policy authoring privileges” on page 375.
2 Navigate to the Manage > Policies > Policy List > Configure Policy screen for the policy
you want to add a response rule to.

3 Select the response rule you want to add from those available in the drop-down menu.
Policies and response rules are configured separately. To add a response rule to a policy,
the response rule must first be defined and saved independently.
See “Implementing response rules” on page 1758.
4 Click Add Response Rule to add the response rule to the policy.
5 Repeat the process to add additional response rules to the policy.
6 Save the policy when you are done adding response rules.
7 Verify that the policy status is green after adding the response rule to the policy.
See “Manage and add policies” on page 432.

Note: If the policy status is a yellow caution sign, the policy is misconfigured. The system does
not support certain pairings of detection rules and automated response rule actions. See
Table 81-2 on page 2276.

Removing policies and policy groups


Consider the following guidelines before you delete a policy or a policy group from the Enforce
Server.

Table 21-6 Guidelines for removing policies and policy groups

Action Description Guideline

Remove a policy

Description: If you attempt to delete a policy that has associated incidents, the system does not let you remove the policy.

Guideline: If you want to delete a policy, you must first delete all incidents that are associated with that policy from the Enforce Server.

See “Manage and add policies” on page 432.

An alternative is to create an undeployed policy group (one that is not assigned to any detection servers). This method is useful to maintain legacy policies and incidents for review without keeping these policies in a deployed policy group.

See “Policy template import and export” on page 377.



Remove a policy group

Description: If you attempt to delete a policy group that contains one or more policies, the system displays an error message, and the policy group is not deleted.

Guideline: Before you delete a policy group, remove any policies from that group by either deleting them or assigning them to different policy groups.

See “Manage and add policy groups” on page 435.

If you want to remove a policy group, create a maintenance policy group and move the policies you want to remove to the maintenance group.

See “Creating and modifying policy groups” on page 436.

See “About Data Loss Prevention policies” on page 368.


See “Policy groups” on page 372.

Viewing and printing policy details


You can view and print policy details for a single policy from the Policy List screen.
You must have the Author Policies privilege for the policies you want to view and print.
See “Policy authoring privileges” on page 375.
See “Viewing, printing, and downloading policy details” on page 379.
To view and print policy details
1 Navigate to Manage > Policies > Policy List and click the printer icon at the end of the
policy row.
The Policy Snapshot screen appears.
2 View the general policy information, detection rules, and response rules on the Policy
Snapshot screen.
3 To print the policy details, use the Print command in your web browser from the Policy
Snapshot screen.

Downloading policy details


You can download a ZIP archive of details for policies in the Policy List. The ZIP archive
contains HTML documents with details for each selected policy on the Policy List, as well as
an index file to make it easier to find the policy details you want. The files are titled using the
policy ID, such as 123.html. The index file is titled downloaded_policies_DATE.html, and it
contains the policy name, description, status, policy group, and last modified date of all selected
policies in the download, as well as links to the policy details.
You must have the Author Policies privilege for the policies you want to download.
See “Policy authoring privileges” on page 375.
See “Viewing, printing, and downloading policy details” on page 379.
To download policy details
1 Navigate to Manage > Policies > Policy List, select the policy or policies you want, then
click Download Details.
2 In the Open File dialog box, select Save File, then click OK.
3 To view details for a policy, extract the files from the ZIP archive, then open the file you
want to view. Use the index file to search through the downloaded policies by policy name,
description, status, policy group, or last modified date.
The Policy Snapshot screen appears.
4 To print the policy details, use the Print command in your web browser from the Policy
Snapshot screen.
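
If you archive downloaded policy details regularly, step 3 can be scripted. The following Python sketch extracts the ZIP archive and separates the index file from the per-policy files, using only the file naming described in this section. The function name is hypothetical, and the sketch assumes that the index file name begins with downloaded_policies_ and ends with .html.

```python
import zipfile
from pathlib import Path


def extract_policy_details(zip_path, out_dir):
    """Extract the archive and return (index_path, detail_paths)."""
    out_dir = Path(out_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        names = archive.namelist()
    # The index file is titled downloaded_policies_DATE.html.
    index = [
        name for name in names
        if Path(name).name.startswith("downloaded_policies_")
        and name.endswith(".html")
    ]
    # The remaining HTML files are titled using the policy ID, such as 123.html.
    details = sorted(
        name for name in names if name.endswith(".html") and name not in index
    )
    index_path = out_dir / index[0] if index else None
    return index_path, [out_dir / name for name in details]
```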

Troubleshooting policies
Table 21-7 lists log files to consult for troubleshooting policies.

Table 21-7 Log files for troubleshooting policies

Log file Description

SymantecDLPDetectionServer.log Logs when policies and profiles are sent from the Enforce Server to
detection servers and endpoint servers. Displays JRE errors.

See “Debug log files” on page 337.

detection_operational.log and detection_operational_trace.log Log the loading of policies and detection execution.

See “Operational log files” on page 334.

FileReader.log Logs when an index file is loaded into memory. For EDM, look for the
line "loaded database profile." For IDM, look for the line "loaded
document profile."

See “Debug log files” on page 337.

Indexer.log Logs the operations of the Indexer process to generate EDM and IDM
indexes.

See “Debug log files” on page 337.
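
To confirm that an index was loaded, you can search FileReader.log for the two marker strings named in Table 21-7. The following Python sketch is illustrative. The function name is hypothetical, the match is case-insensitive as a precaution, and the sketch makes no assumption about the rest of the log line format.

```python
# Marker strings from Table 21-7.
MARKERS = {
    "EDM": "loaded database profile",
    "IDM": "loaded document profile",
}


def find_index_loads(lines):
    """Return the log lines that report an EDM or IDM index load."""
    hits = {"EDM": [], "IDM": []}
    for line in lines:
        lowered = line.lower()
        for technology, marker in MARKERS.items():
            if marker in lowered:
                hits[technology].append(line.rstrip("\n"))
    return hits
```

For example, open the log file with open(path, encoding="utf-8", errors="replace") and pass the file object to find_index_loads. An empty list for a technology suggests that no index of that type was loaded during the period the log covers.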



See “About log files” on page 333.


See “Log collection and configuration screen” on page 343.
See “Configuring server logging behavior” on page 343.
See “Collecting server logs and configuration files” on page 347.
See “Log files for troubleshooting VML training and policy detection” on page 686.
See “Advanced server settings” on page 285.
See “Advanced agent settings” on page 2372.

Updating EDM and IDM profiles to the latest version


You must reindex your data and document sources when you upgrade. Before deploying an
index into production, test the updated profile and policies based on the profile to ensure that
they detect data loss as expected on the upgraded system.
Table 21-8 lists the reindexing requirements for updating your EDM and IDM profiles and
provides links for more information.

Table 21-8 Reindexing requirements for EDM and IDM data profiles

Technology and features Required action(s) More information

Exact Data Matching (EDM)
■ Multi-token matching
■ Proportional proximity range

Required action(s): If you have existing Exact Data profiles supporting EDM policies and you want to use new EDM features, before upgrading the detection server(s) you must:
■ Reindex each structured data source using the latest EDM indexer, and
■ Load each index into a newly-generated Exact Data profile.

More information: See “Updating EDM indexes to the latest version” on page 574.

Indexed Document Matching (IDM)
■ Exact match IDM on the endpoint (Agent IDM)

Required action(s): If you have existing Indexed Document profiles supporting IDM policies and you want to use Agent IDM, after upgrading you must:
■ Disable two-tier detection on the Endpoint Server, and
■ Reindex each document data source so that the endpoint index is generated and deployed to the Endpoint Server for download by the DLP Agent.

Updating policies after upgrading to the latest version


Several policy templates were updated in Symantec Data Loss Prevention 15.1. When you
upgrade to version 15.1, the system updates the system-defined policy templates. Policies
you have created based on an upgraded policy template are not changed, so that configurations
you have made are not overwritten. If you have created policies based on one or more of the
updated policy templates, you should update your policies so that they are current.
The General Data Protection Regulation (GDPR) policy templates were updated to include
several new European data identifiers. The keyword lists were also updated.
Policy templates that use data identifier patterns to detect Social Security Numbers (SSNs)
were updated to use the Randomized US SSN data identifier in Symantec Data Loss Prevention
12.5. The Randomized US SSN data identifier detects both traditional and randomized SSNs.
Symantec recommends that you update your SSN policies to use the Randomized US SSN
data identifier if you have not done so already.
See “Updating policies to use the Randomized US SSN data identifier” on page 810.
Table 21-9 lists the policy templates updated for this release of Symantec Data Loss Prevention.

Table 21-9 Policy templates updated in Data Loss Prevention version 12.5

Updated template Updated component(s) Policy description

General Data Protection Regulation (Banking and Finance)

Updated component(s): Data identifiers, Keyword lists

Policy description: This policy protects personally identifiable information related to banking and finance.

See “General Data Protection Regulation (Banking and Finance)” on page 1583.

General Data Protection Regulation (Digital Identity)

Updated component(s): Data identifiers, Keyword lists

Policy description: This policy protects personally identifiable information related to digital identity.

See “General Data Protection Regulation (Digital Identity)” on page 1617.

General Data Protection Regulation (Government Identification)

Updated component(s): Data identifiers, Keyword lists

Policy description: This policy protects personally identifiable information related to government identification.

See “General Data Protection Regulation (Government Identification)” on page 1618.

General Data Protection Regulation (Healthcare and Insurance)

Updated component(s): Data identifiers, Keyword lists

Policy description: This policy protects personally identifiable information related to healthcare and insurance.

See “General Data Protection Regulation (Healthcare and Insurance)” on page 1656.
Administering policies 448
Updating policies after upgrading to the latest version

Table 21-9 Policy templates updated in Data Loss Prevention version 12.5 (continued)

Updated template Updated component(s) Policy description

General Data Protection Data identifiers This policy protects personal identifiable information
Regulation (Personal Profile) related to personal profile data.
Keyword lists
See “General Data Protection Regulation (Personal
Profile)” on page 1672.

General Data Protection Data identifiers This policy protects personal identifiable information
Regulation (Travel) related to travel.
Keyword lists
See “General Data Protection Regulation (Travel)”
on page 1675.
Chapter 22
Best practices for authoring policies
This chapter includes the following topics:

■ Best practices for authoring policies

■ Develop a policy strategy that supports your data security objectives

■ Use a limited number of policies to get started

■ Use policy templates but modify them to meet your requirements

■ Use the appropriate match condition for your data loss prevention objectives

■ Test and tune policies to improve match accuracy

■ Start with high match thresholds to reduce false positives

■ Use a limited number of exceptions to narrow detection scope

■ Use compound conditions to improve match accuracy

■ Author policies to limit the potential effect of two-tier detection

■ Use policy groups to manage policy lifecycle

■ Follow detection-specific best practices

Best practices for authoring policies


This section provides general policy authoring best practices for Symantec Data Loss
Prevention. This section assumes that the reader has general familiarity with policy authoring,
including the configuration, testing, and deployment of policies, detection rules, match
conditions, and policy exceptions.

See “About Data Loss Prevention policies” on page 368.


See “Detecting data loss” on page 381.
Best practices are not intended to provide detailed troubleshooting guidance. Rather, the goal
of this section is to provide best practices that, when followed, help to reduce the need for
policy troubleshooting and support.

Table 22-1 Summary of policy authoring best practices

Best practice | Description

Develop a policy strategy that supports your data security objectives. | See “Develop a policy strategy that supports your data security objectives” on page 451.

Use a limited number of policies to get started. | See “Use a limited number of policies to get started” on page 451.

Use policy templates but modify them to meet your requirements. | See “Use policy templates but modify them to meet your requirements” on page 452.

Use policy groups to manage policy lifecycle. | See “Use policy groups to manage policy lifecycle” on page 457.

Use the appropriate match condition for your data loss prevention objectives. | See “Use the appropriate match condition for your data loss prevention objectives” on page 452.

Test and tune policies to improve match accuracy. | See “Test and tune policies to improve match accuracy” on page 453.

Start with high match thresholds to reduce false positives. | See “Start with high match thresholds to reduce false positives” on page 454.

Use a limited number of exceptions to narrow detection scope. | See “Use a limited number of exceptions to narrow detection scope” on page 455.

Use compound conditions to improve match accuracy. | See “Use compound conditions to improve match accuracy” on page 455.

Author policies to limit the potential effect of two-tier detection. | See “Author policies to limit the potential effect of two-tier detection” on page 456.

Follow detection-specific best practices. | See “Follow detection-specific best practices” on page 457.

Develop a policy strategy that supports your data security objectives

The goal of detection is to achieve accurate results based on true policy matches. Well-authored
policies should accurately detect the data you want to protect with minimal false positives.
Through the use of well-defined policies that implement the right type and combination of rules,
conditions, and exceptions, you can achieve accurate detection results and prevent the loss
of the most critical data in your enterprise.
There are two general approaches to developing a data loss prevention policy strategy:
■ Information-driven – Identify sensitive data and author policies to prevent it from being lost.
■ Regulation-driven – Review government and industry regulations and author policies to
comply with them.
Table 22-2 describes these two approaches in more detail.

Table 22-2 Policy detection approaches

Approach | Description

Information-driven | With this approach you start by identifying specific data items and data combinations you want to protect. Examples of such data may include fields profiled from a database, a list of keywords, a set of users, or a combination of these elements. You then group similar data items together and create policies to identify and protect them. This approach works best when you have limited access to the data or no particular concerns about a given regulation.

Regulation-driven | With this approach you begin with a policy template based on the regulations with which you must comply. Examples of such templates may include HIPAA or FACTA. Also, begin with a large set of data (such as customer or employee data). Use the high-level requirements stipulated by the regulations as the basis for this approach. Then, decide what sensitive data items and documents in your enterprise meet these requirements. These data items become the conditions for the detection rules and exceptions in your policies.

Use a limited number of policies to get started


The policy detection rules you implement are based on your organization's information security
objectives. The actions you take in response to policy violations are based on your organization's
compliance requirements. In general you should start small with policy detection. Enable one
or two policy templates, or a few simple conditions, such as keyword matching. Review the
incidents each policy detects. Tune the results before you implement response rules to take
action.
Generally it is better to have fewer policies that are configured to address specific data loss
prevention objectives rather than many policies that attempt to address all of your security
requirements. Having too many policies can impact the performance of the system and can
lead to too many false positives.
See “Test and tune policies to improve match accuracy” on page 453.

Use policy templates but modify them to meet your requirements

Policy templates provide an excellent starting point for authoring policies. Symantec Data Loss
Prevention provides 65 pre-built policy templates that contain detection rules and conditions
for many different types of use cases, including regulatory compliance, data protection, security
enforcement, and acceptable use scenarios.
You should use the system-provided policy templates as starting points for your policies. Doing
so will save time and help you avoid errors and information gaps in your policies since the
detection methods are predefined. However, for most situations you will want to modify the
policy template and tailor it for your specific environment. Deploying a policy template
out-of-the-box without configuring it for your environment is not recommended.
See “Creating a policy from a template” on page 397.

Use the appropriate match condition for your data loss prevention objectives

To prevent data loss, it is necessary to accurately detect all types of confidential data wherever
that data is stored, copied, or transmitted. To meet your data security objectives, you need to
implement the appropriate detection methods for the type of data you want to protect. The
recommendation is to determine the detection methods that work best for you, and tune the
policies as necessary based on the results of your detection testing.
Table 22-3 describes the primary use case for each type of policy match condition provided
by Data Loss Prevention.

Table 22-3 Match conditions compared

Type of data you want to protect | Condition | Matching

Personally Identifiable Information (PII), such as SSNs, CCNs, and Driver's License numbers | EDM | Exact profiled data
 | Data Identifiers | Described, validated data patterns

Confidential documents, such as Microsoft Word, PowerPoint, PDF, etc. | IDM | Exact file contents; Partial file contents (derivative)
 | VML | Similar file contents

Confidential files and images, such as CAD drawings | IDM | Exact file
 | File Properties | File context (type, name, size)

Words and phrases, such as "Confidential" or "Proprietary" | Keywords | Exact words, phrases, proximity

Characters, strings, text | Regular Expressions | Described text

Network and endpoint communications | Protocol and Endpoint | Protocols, destinations, monitoring

Determined by the identity of the user, sender, recipient | Synchronized DGM | Exact identity from LDAP server
 | Profiled DGM | Exact profiled identity
 | Sender/user, recipient | Described identity patterns

Describes a document, such as author, title, date, etc. | Content-based conditions | File type metadata

Test and tune policies to improve match accuracy


When you create detection policies, there are two common detection problems to avoid. If you
create a policy that is too general or too broad, it generates incidents when no real match has
occurred (false positive). On the other hand, if a policy has rules that are too specific or narrow
about the data it detects, the policy may miss some of the matches you intend to catch (false
negatives). Table 22-4 describes these common problems in more detail.
To reduce false positives and negatives, you need to tune your policies. The best way to tune
detection is to identify a single, specific use case that is a priority, such as protecting source
code for a particular product. You then create a single policy—either from scratch or based
on a template, depending on your DLP strategy—containing one or two detection rules and
test the policy to see how many (quantity) and the types (quality) of incidents the policy
generates. Based on these initial results, you adjust the detection rule(s) as needed. If the
policy generates more false positives than you want, make the detection rule(s) more specific
by fine-tuning the existing match conditions, adding additional match conditions, and creating
policy exceptions. If the policy does not detect some incidents, make the detection condition(s)
less specific.
As your policies mature, it is important to continuously test and tune them to ensure ongoing
accuracy.
See “Follow detection-specific best practices” on page 457.
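
One way to track tuning progress is to label a sample of reviewed incidents and compute the share that were false positives. The following Python sketch is a hypothetical bookkeeping helper, not part of Symantec Data Loss Prevention; the ReviewedIncident record and its fields are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReviewedIncident:
    """A manually reviewed incident (hypothetical record, not a product API)."""
    policy: str
    true_match: bool  # True if the reviewer confirmed real sensitive data


def false_positive_rate(incidents: list[ReviewedIncident], policy: str) -> float:
    """Return the share of a policy's reviewed incidents that were false positives."""
    reviewed = [i for i in incidents if i.policy == policy]
    if not reviewed:
        return 0.0
    false_positives = sum(1 for i in reviewed if not i.true_match)
    return false_positives / len(reviewed)


sample = [
    ReviewedIncident("Source Code", True),
    ReviewedIncident("Source Code", False),
    ReviewedIncident("Source Code", False),
    ReviewedIncident("Source Code", True),
]
rate = false_positive_rate(sample, "Source Code")
print(f"False positive rate: {rate:.0%}")
```

Because false negatives never generate incidents, a calculation like this cannot reveal them. Measuring them would require test content that is known to contain sensitive data.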

Table 22-4 Common detection problems to avoid

Problem | Cause | Description

False positives | Policy rules too general or broad | False positives create high costs in time and resources that are required to investigate and resolve apparent incidents that are not actual incidents. Since many organizations do not have the capacity to manage excess false positives, it is important that your policies define contextual rules to improve accuracy. For example, a policy is designed to protect customer names and generates an incident for anything that contains a first and last name. Since most messages contain a name (in many cases both first and last names), this policy is too broad and general. Although it may catch all instances of customer names being sent outside the network, this policy will return too many false positives by detecting email messages that do not divulge protected information. First and last names require a much greater understanding of context to determine if the data is confidential.

False negatives | Policy rules too tight or narrow | False negatives obscure gaps in security by allowing data loss, the potential for financial losses, legal exposure, and damage to the reputation of an organization. False negatives are especially dangerous because you do not know you have lost sensitive data. For example, a policy that contains a keyword match on the word "confidential" but also contains a condition that excludes all Microsoft Word documents would be too narrow and be susceptible to false negatives because it would likely miss detecting many actual incidents contained in such documents.

See “Start with high match thresholds to reduce false positives” on page 454.
See “Use a limited number of exceptions to narrow detection scope” on page 455.
See “Use compound conditions to improve match accuracy” on page 455.

Start with high match thresholds to reduce false positives

For content-based detection rules, there is a configuration setting that lets you "count all
matches" but only report an incident after a threshold number of matches has been reached.
The general recommendation is to start with high match thresholds for your content-based
detection policies. As you tune your policies you can reduce the match thresholds to be more
precise.
See “Configuring match counting” on page 421.
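
The effect of a match threshold can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the function names are hypothetical, and the regular expression is an illustrative 9-digit pattern, not the product's SSN data identifier or its match-counting implementation.

```python
import re


def count_matches(text: str, pattern: str) -> int:
    """Count non-overlapping matches of a regular expression in the text."""
    return len(re.findall(pattern, text))


def reports_incident(text: str, pattern: str, threshold: int) -> bool:
    """Report an incident only when the match count reaches the threshold."""
    return count_matches(text, pattern) >= threshold


# Simplified pattern used purely for illustration.
PATTERN = r"\b\d{3}-\d{2}-\d{4}\b"
message = "Records: 123-45-6789, 987-65-4321, 555-12-3456"

print(reports_incident(message, PATTERN, threshold=5))  # high starting threshold
print(reports_incident(message, PATTERN, threshold=3))  # lowered during tuning
```

With three matches in the sample message, the high threshold suppresses the incident and the lowered threshold reports it, which mirrors the recommendation to start high and reduce the threshold as you tune.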

Use a limited number of exceptions to narrow detection scope

You can implement exception conditions for any detection rule, except EDM rules. The limited
use of exception conditions can help to reduce false positives by narrowing the scope of policy
detection. However, if you need to use several exceptions in a single policy to achieve the
desired detection results, reconsider the design of the policy. Make sure the policy is
well-defined and uses the proper match conditions.

Caution: Too many compound exceptions in a policy can cause system performance issues.
You should avoid the use of compound exceptions as much as possible.

It is important to understand how exception conditions work so you can use them properly.
Exception conditions disqualify messages from creating incidents. Exception conditions are
checked first by the detection server before match conditions. If the exception condition matches,
the system immediately discards the entire message or message component that met the
exception. There is no support for match-level exceptions. Once the message or message
component is discarded by meeting an exception, the data is no longer available for policy
evaluation.
See “Exception conditions” on page 393.
See “Use compound conditions to improve match accuracy” on page 455.
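
The evaluation order described above can be modeled roughly as follows. This Python sketch is an assumption-laden simplification, not the detection server's actual logic: message components are plain strings, conditions are hypothetical predicate functions, and match conditions are treated as alternatives.

```python
from typing import Callable

Component = str  # a message component such as a body or an attachment (simplified)
Condition = Callable[[Component], bool]


def evaluate(components: list[Component],
             exceptions: list[Condition],
             matches: list[Condition]) -> list[Component]:
    """Return the components that would produce an incident."""
    violating = []
    for component in components:
        # Exceptions are checked first; a hit discards the whole component,
        # so none of its content reaches the match conditions.
        if any(exc(component) for exc in exceptions):
            continue
        if any(match(component) for match in matches):
            violating.append(component)
    return violating


has_keyword = lambda c: "confidential" in c.lower()
is_newsletter = lambda c: c.startswith("[Newsletter]")

parts = [
    "[Newsletter] Our confidential roadmap teaser",
    "Attached is the confidential pricing sheet",
]
print(evaluate(parts, [is_newsletter], [has_keyword]))
```

The first component contains the keyword but is never evaluated for it, because the exception removes the entire component. This is why a broad exception can hide real matches.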

Use compound conditions to improve match accuracy


Compound conditions can help you improve the match accuracy of your policies. Suppose
you are concerned about Microsoft Word documents leaving the network. Initially, you add a
policy that uses an attachment type condition to catch all Word files. You quickly discover that
too many messages contain Word file attachments that do not divulge protected information.
When you examine the incidents more closely, you realize that you are more concerned with
Word files that contain the word CONFIDENTIAL. In this case you can convert the attachment
type condition to a compound rule by adding a keyword rule for the word CONFIDENTIAL.
Such a configuration would achieve more accurate detection results.
See “Compound conditions” on page 394.
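
The Word attachment scenario can be illustrated with a short Python sketch. The Attachment record and both condition functions are hypothetical simplifications (a file extension check and a substring check); they only show how requiring both conditions narrows the result.

```python
from dataclasses import dataclass


@dataclass
class Attachment:
    """Hypothetical attachment record used only for this illustration."""
    filename: str
    text: str


def is_word_file(att: Attachment) -> bool:
    """Attachment type condition (simplified to a file extension check)."""
    return att.filename.lower().endswith((".doc", ".docx"))


def has_keyword(att: Attachment, keyword: str = "CONFIDENTIAL") -> bool:
    """Keyword condition (simplified to a case-sensitive substring check)."""
    return keyword in att.text


def compound_rule(att: Attachment) -> bool:
    """Both conditions must match for the rule to match."""
    return is_word_file(att) and has_keyword(att)


attachments = [
    Attachment("lunch_menu.docx", "Soup of the day"),
    Attachment("merger_plan.docx", "CONFIDENTIAL: draft terms"),
    Attachment("notes.txt", "CONFIDENTIAL"),
]
flagged = [a.filename for a in attachments if compound_rule(a)]
print(flagged)
```

Only the Word file that also contains the keyword is flagged; the attachment type condition alone would have flagged both Word files.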
Author policies to limit the potential effect of two-tier detection

The Exact Data Matching (EDM) and profiled Directory Group Matching (DGM) conditions
require two-tier detection. For these conditions, the DLP Agent must send the data to the
Endpoint Server for evaluation. Indexed Document Matching (IDM) uses two-tier detection if
it is enabled.
See “Two-tier detection for DLP Agents” on page 395.
On the endpoint the DLP Agent executes the least expensive rules first. If you are deploying
a policy to the endpoint that requires two-tier detection, you can author the policy in such a
way to limit the potential effect of two-tier detection.
Table 22-5 provides some considerations for authoring policies to limit the potential effect of
two-tier detection.
See “Detection messages and message components” on page 391.

Table 22-5 Policy configurations for two-tier detection rules

Two-tier match condition | Policy configuration

Exact Data Matching (EDM) | For EDM policies, consider including Data Identifier rules OR'd with EDM rules. For example, for a policy that uses an EDM condition to match social security numbers, you could add a second rule that uses the SSN Data Identifier condition. The Data Identifier does not require two-tier detection and is evaluated locally by the DLP Agent. If the DLP Agent is not connected to the Endpoint Server when the DLP Agent receives the data, the DLP Agent can still perform SSN pattern matching based on the Data Identifier condition. See “Combine Data Identifiers with EDM rules to limit the impact of two-tier detection” on page 610. For example policy configurations, each of the policy templates that provide EDM conditions also provide corresponding Data Identifier conditions. See “Choosing an Exact Data Profile” on page 409.

Indexed Document Matching (IDM) | For IDM policies that match file contents, consider using VML rules OR'd with IDM rules. VML rules do not require two-tier detection and are executed locally by the DLP Agent. If you do not need to match file contents exactly, you may want to use VML instead of IDM. See “Use the appropriate match condition for your data loss prevention objectives” on page 452. If you are only concerned with file matching, not file contents, consider using compound file property rules instead of IDM. File property rules do not require two-tier detection. See “Use compound file property rules to protect design and multimedia files” on page 909.

Directory Group Matching (DGM) | For the synchronized DGM Recipient condition, consider including a Recipient Matches Pattern condition OR'd with the DGM condition. The pattern condition does not require two-tier detection and is evaluated locally by the DLP Agent. See “About two-tier detection for synchronized DGM” on page 936.
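
The benefit of OR'ing a locally evaluated rule with a two-tier rule can be sketched as follows. This Python model is illustrative only: the function names, the simplified pattern, and the profiled value are assumptions, and the sketch does not reproduce how the DLP Agent or Endpoint Server actually evaluates conditions.

```python
import re
from typing import Optional

# Simplified stand-in for a locally evaluated pattern condition; it is not
# the product's SSN Data Identifier.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def local_pattern_match(text: str) -> bool:
    """Rule evaluated on the agent itself (no server round trip)."""
    return SSN_PATTERN.search(text) is not None


def server_exact_match(text: str, connected: bool) -> Optional[bool]:
    """Rule that needs the server; returns None when the server is unreachable."""
    if not connected:
        return None
    profiled_values = {"123-45-6789"}  # hypothetical profiled data
    return any(value in text for value in profiled_values)


def policy_matches(text: str, connected: bool) -> bool:
    """Rules are OR'd: a local hit is enough even if the server rule cannot run."""
    if local_pattern_match(text):  # cheaper, local rule first
        return True
    return bool(server_exact_match(text, connected))


print(policy_matches("SSN 123-45-6789", connected=False))
```

In this model, the disconnected agent still flags the content through the local pattern rule, at the cost of the lower precision of pattern matching compared with exact data matching.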

Use policy groups to manage policy lifecycle


Use policy groups to test policies before using them in production. Create a test policy group
to which only you have access. Then, create policies and add them to the test policy group.
Review the incidents your test policies capture. After you tune the policies and confirm that
they capture the expected incidents, you can rename the policy group and grant the appropriate
roles access to it. You can also use policy groups to manage legacy policies, as well as policies
you want to import or export.
See “Policy groups” on page 372.
See “Removing policies and policy groups” on page 443.

Follow detection-specific best practices


In addition to these general policy authoring considerations, you should be aware of and
keep in mind policy tuning considerations specific to each type of match condition.
Table 22-6 lists detection-specific considerations, with links to topics for more information.
Table 22-6 Best practices for specific detection methods

Detection method | Description

EDM | See “Best practices for using EDM” on page 601.

IDM | See “Best practices for using IDM” on page 648.

VML | See “Best practices for using VML” on page 687.

Data identifiers | See “Best practices for using data identifiers” on page 833.

Keywords | See “Best practices for using keyword matching” on page 849.

Regular expressions | See “Best practices for using regular expression matching” on page 855.

Non-English language detection | See “Best practices for detecting non-English language content” on page 867.

File properties | See “Best practices for using file property matching” on page 909.

Network protocols | See “Best practices for using network protocol matching” on page 914.

Endpoint events | See “Best practices for using endpoint detection” on page 923.

Described identities | See “Best practices for using described identity matching” on page 932.

Synchronized DGM | See “Best practices for using synchronized DGM” on page 941.

Profiled DGM | See “Best practices for using profiled DGM” on page 946.

Metadata detection | See “Best practices for using metadata detection” on page 991.
Chapter 23
Increasing the Inspection Content Size
This chapter includes the following topics:

■ Increasing the inspection content size

Increasing the inspection content size


Data Loss Prevention provides an easier way for you to increase the inspection content size.
The default maximum file inspection size is unchanged (30 MB), but you can easily adjust the
inspection size to higher values. The adjustments can be made using a slider at the System
> Servers and Detectors > Overview > Configure Server page under the Detection tab for
detection servers. Currently, the highest limit for the servers (except Discover Exchange
Crawler and Web Prevent) is 2 GB.
For agent configurations, you make the adjustment using a slider at the System > Agents >
Agent Configuration page under the Settings tab. Currently, the highest limit for the DLP
Agent is 150 MB.
There are different content inspection file size limits for different channels. Table 23-1 lists the
different channels that Symantec has tested and the corresponding supported file size limits.

Table 23-1 Channel-specific content inspection file size limits

Channel | File size limit

Endpoint Prevent | 150 MB

EDAR | 150 MB

Discover |

Discover Exchange Crawler | 150 MB

Discover File System | 2 GB

Discover Sharepoint | 2 GB

Appliance - REST | 1.2 GB; 1.7 GB (Base64 encoded)

Web Prevent |

Web Prevent FTP | 150 MB

Web Prevent HTTPS/HTTP | 100 MB

SMTP Prevent | 150 MB

Increasing the maximum inspection size limit for files means that larger files are inspected.
Inspection of larger files takes longer and requires more memory for the inspection to complete.
Also, timeout limits increase, so the detection engine takes longer to time out in the case of
detection failures.
Depending on the content inspection size you choose, certain advanced settings are
automatically adjusted. The Inspection Content Size feature only shows the inspection size
options that you can enable based on your existing system memory.
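
When you plan a new inspection size, it can help to compare expected file sizes against both the configured size and the tested channel limit. The Python sketch below is a planning aid under stated assumptions: the limits are transcribed from Table 23-1 (the Appliance - REST row is omitted because it lists two values), the unit constants assume binary megabytes, and the function name is hypothetical. It only compares sizes and does not model what the product does with files above the limit.

```python
MB = 1024 * 1024  # assumption: binary megabytes; the guide does not specify
GB = 1024 * MB

# Tested limits transcribed from Table 23-1.
CHANNEL_LIMITS = {
    "Endpoint Prevent": 150 * MB,
    "EDAR": 150 * MB,
    "Discover Exchange Crawler": 150 * MB,
    "Discover File System": 2 * GB,
    "Discover Sharepoint": 2 * GB,
    "Web Prevent FTP": 150 * MB,
    "Web Prevent HTTPS/HTTP": 100 * MB,
    "SMTP Prevent": 150 * MB,
}

DEFAULT_INSPECTION_SIZE = 30 * MB  # default maximum stated in this chapter


def within_limit(channel: str, file_size: int,
                 configured_size: int = DEFAULT_INSPECTION_SIZE) -> bool:
    """Check a file size against the configured size, capped at the channel limit."""
    effective = min(configured_size, CHANNEL_LIMITS[channel])
    return file_size <= effective


print(within_limit("SMTP Prevent", 80 * MB))
print(within_limit("SMTP Prevent", 80 * MB, configured_size=100 * MB))
```

In this example, an 80 MB file exceeds the 30 MB default but fits once the configured size is raised to 100 MB, which is still below the tested 150 MB limit for SMTP Prevent.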

Note: To complete the update, you must restart the service after you have increased the
maximum inspection size limit using the slider or edited any properties files.

The availability and default state of the inspection content size slider depend on several
factors:
■ For a new detection server, the slider is disabled by default and the box is not checked.
■ For a new Agent, the slider is enabled at 30 MB by default and the box is checked.
■ Memory limits on the server are different from memory limits on the agent.
■ You cannot use the slider to increase the maximum inspection size limit if the detection
server is not connected to an Enforce Server.

Note: The maximum inspection size limit for the DLP cloud services is not
customer-configurable. These limits are enumerated in the Service Description for the DLP
cloud services. This feature is only available for detection servers, appliances, and the DLP
Agent.
To customize the inspection content size


1 Go to System > Servers and Detectors > Configure a Server for detection servers or
System > Agents > Agent Configuration > Settings for DLP Agents.
2 Click the Detection tab for detection servers or go to the Settings section for DLP Agents.
3 Click Customize settings, under Inspection Content Size.
Move the slider to the size you want. The values that follow are examples only; you
only see the options that can be enabled based on your system memory.
■ 30 MB, 50 MB, 100 MB, or 150 MB for DLP Agents
■ 30 MB, 100 MB, 150 MB, 500 MB, or 2 GB for detection servers and appliances
When you select a new size, Symantec Data Loss Prevention automatically updates
Advanced Server or Advanced Agent settings to implement your selection. If your settings
are different from the preferred and recommended settings, a link to Preview updated
settings appears.
4 Click Preview updated settings to see the Advanced Setting Name, Current Value,
and Preferred Value.
5 For the detection servers only, if you need to change properties file settings, a Tuning
Guidelines link appears. You can click the link and review the Symantec Support Center
article Guidelines for editing properties files to scan large files. You do not need to edit
properties files for the DLP Agent.
6 Restart the service. To complete the update, you must restart the service after you have
adjusted the maximum inspection size limit using the slider or edited any properties files.

System Event Codes


System events are shown whenever the Advanced Settings are updated. For a list of system
events that you might see after Advanced Settings have been updated, see Table 23-2.

Table 23-2 System Events for changes in Advanced Settings for larger files.

System event code | Description/Message | Server or Agent

5306 | Agent advanced settings update is complete. | Agent

5307 | Agent advanced settings have been updated. | Agent

5308 | Agent advanced settings update has failed. | Agent

5309 | Server advanced settings update is complete. | Server

5310 | Advanced settings have been updated for the server. | Server

5311 | Advanced settings update has failed for the server {0}. | Server

If you choose a setting of 500 MB or greater on the detection server, Symantec recommends
that you enable external storage for incident attachments (blob externalization). To enable
external storage for incident attachments during installation or upgrade, see "External storage
for incident attachments,” in the Symantec Data Loss Prevention Installation Guide and
Symantec Data Loss Prevention Upgrade Guide. You can find the Symantec Data Loss
Prevention Installation Guide at the Symantec Support Center at
https://www.symantec.com/docs/doc9257.html. You can find the Symantec Data Loss
Prevention Upgrade Guide at the Symantec Support Center at
https://www.symantec.com/docs/doc9258.html.
To enable external storage for incident attachments after installation or upgrade, see "About
the incident attachment external storage directory" in the Symantec Data Loss Prevention
System Maintenance Guide. You can find the Symantec Data Loss Prevention System
Maintenance Guide at the Symantec Support Center at
https://www.symantec.com/docs/doc9267.html.
Chapter 24
Installing remote indexers
This chapter includes the following topics:

■ About installing remote indexers

■ Installing a remote indexer on Windows

■ Installing a remote indexer on Linux

■ Configuring a remote indexer on Linux

About installing remote indexers


You install remote indexers on one or more systems where the confidential files you want to
index are stored. The steps to install remote indexers are different depending on the operating
system.

Note: The indexer that is available on the Enforce Server administration console does not
require separate installation. It is installed when you install the Enforce Server.

If you install a remote indexer on Windows, you can perform a Silent Mode installation, or you
can install using the graphical user interface.
See “Installing a remote indexer on Windows” on page 464.
On Linux, you install RPM files, then you configure the installation. You can configure the
installation using the Silent Mode method or by running a command prompt to enter
configuration parameters.
See “Installing a remote indexer on Linux” on page 466.
You can install the Remote EDM, the Remote EMDI, and the Remote IDM Indexer on all
supported Windows and Linux platforms. See the Symantec Data Loss Prevention System
Requirements Guide for platform details.

Note: You must be logged on as administrator (Windows) or root (Linux) to install the remote
indexers. There is an issue with the permissions that are needed to run the remote indexers.
You need to follow a workaround procedure to ensure that users other than administrator or
root can run the remote indexers.

See “Installing a remote indexer on Windows” on page 464.

Installing a remote indexer on Windows


Follow this procedure to install the remote indexer software on a remote indexer computer.
You specify the type of remote indexer during the configuration process that follows this
installation process.

Note: The following instructions assume that the indexer installer (Indexers.msi) has been
copied from the Enforce Server to a local directory on the remote computer. The Indexers.msi
file is included in your software download (DLPDownloadHome) directory. It should have been
copied to a local directory on the Enforce Server during the Enforce Server installation process.

Using the graphical user interface method to install does not generate log information. To
generate log information, run the installation using the following command:
C:\msiexec /i Indexers.msi /L*v c:\indexers_install.log

You can complete the installation using Silent Mode. Enter values with information specific to
your installation for the following:

Table 24-1 Indexer Silent Mode installation parameters for Windows

Command | Description

INSTALLATION_DIRECTORY | Specifies where the remote indexer is installed. The default location is C:\Program Files\Symantec\DataLossPrevention.

DATA_DIRECTORY | Defines where Symantec Data Loss Prevention stores files that are updated while the indexer is running (for example, logs and licenses). The default location is C:\ProgramData\Symantec\DataLossPrevention\Indexer\.

JRE_DIRECTORY | Specifies where the JRE resides.

FIPS_OPTION | Defines whether to disable (Disabled) or enable (Enabled) FIPS encryption.

The following is an example of what the completed command might look like:

msiexec /i Indexers.msi /qn /norestart /L*v Indexers.log
FIPS_OPTION=Disabled
INSTALLATION_DIRECTORY="C:\Program Files\Symantec\DataLossPrevention"
DATA_DIRECTORY="C:\ProgramData\Symantec\DataLossPrevention\Indexer\"
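As a rough illustration of how the parameters in Table 24-1 combine into one command line, the following Python sketch assembles the command text. The function name build_indexer_install_command and its defaults are hypothetical and exist only for this example; the msiexec switches and the property names are taken from the example command above. The sketch does not quote the MSI or log paths, so it assumes that they contain no spaces.

```python
def build_indexer_install_command(
    msi_path="Indexers.msi",
    log_path="Indexers.log",
    fips_option="Disabled",
    installation_directory=None,
    data_directory=None,
    jre_directory=None,
):
    """Return a Silent Mode install command as a single line of text.

    Hypothetical helper: property names follow Table 24-1. Properties
    that are not supplied are left out of the command entirely.
    """
    if fips_option not in ("Disabled", "Enabled"):
        raise ValueError("FIPS_OPTION must be Disabled or Enabled")
    parts = ["msiexec", "/i", msi_path, "/qn", "/norestart", "/L*v", log_path,
             f"FIPS_OPTION={fips_option}"]
    optional = (
        ("INSTALLATION_DIRECTORY", installation_directory),
        ("DATA_DIRECTORY", data_directory),
        ("JRE_DIRECTORY", jre_directory),
    )
    for name, value in optional:
        if value is not None:
            # Quote the value because directory names can contain spaces.
            parts.append(f'{name}="{value}"')
    return " ".join(parts)


print(build_indexer_install_command(
    installation_directory=r"C:\Program Files\Symantec\DataLossPrevention",
))
```

Printing the command rather than running it lets you review the text before you paste it into a command prompt on the remote indexer computer.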

To install a remote indexer on Windows


1 Log on as Administrator to the system on which you intend to install the remote indexer.
2 Go to the folder where you copied the Indexers.msi file.

Note: Using the graphical user interface method to install does not generate log information.
To generate log information, run the installation using the following command:
C:\msiexec /i Indexers.msi /L*v c:\indexer_install.log

3 Double-click Indexers.msi to open the file, and click OK.


4 In the Welcome panel, click Next.
5 After you review the license agreement, select I accept the agreement, and click Next.
6 In the Destination Folder panel, accept the default destination directory, or enter an
alternate directory, and click Next. The default installation directory is:
c:\Program Files\Symantec\DataLossPrevention\

Symantec recommends that you use the default destination directory. References to the
"installation directory" in Symantec Data Loss Prevention documentation are to this default
location.
7 In the JRE Directory panel, accept the default JRE location (or click Browse to locate
it), and click Next.
8 In the FIPS Cryptography Mode panel, select whether to disable or enable FIPS
encryption.
9 Click Next.
10 Click Install.
See “About the Remote EDM Indexer” on page 586.

See “About the Remote EMDI Indexer” on page 505.


See “About the Remote IDM Indexer” on page 655.
See “Installing a remote indexer on Linux” on page 466.

Installing a remote indexer on Linux


Follow this procedure to install the remote indexer software on a remote indexer computer.
You specify the type of remote indexer during the configuration process that follows this
installation process.

Note: The following instructions assume that the Indexers.zip file has been copied into the
/opt/temp/ directory on the server computer.

To install an indexer on Linux


1 Log on as root to the computer on which you intend to install the remote indexer.
2 Copy the remote indexer installer (Indexers.zip) from the Enforce Server to a local
directory on the remote indexer computer. The Indexers.zip file is included in your
software download (DLPDownloadHome) directory. It should have been copied to a local
directory on the Enforce Server during the Enforce Server installation process.
3 Navigate to the directory where you copied the Indexers.zip file (/opt/temp/).
4 Unzip the file to the same directory.
5 Confirm file dependencies for RPM files by running the following command:
rpm -qpR symantec-dlp-15-1-indexers-15.5-1.el6.x86_64.rpm

6 Run the following command to install all RPM files in the folder:
rpm -ivh *.rpm

See “Configuring a remote indexer on Linux” on page 466.

Configuring a remote indexer on Linux


After you install a remote indexer, you configure it by running the Remote Indexer
Configuration Utility.
You can complete the installation using Silent Mode. Table 24-2 lists the installation parameters
you use during the remote indexer Silent Mode installation.

Table 24-2 Indexer Silent Mode installation parameters on Linux

Command        Description

jreDirectory   Specifies where the JRE resides.

fipsOption     Defines whether to disable (Disabled) or enable (Enabled) FIPS
               encryption.

The following is an example of what the completed command might look like:

./IndexersConfigurationUtility -silent
-jreDirectory=/opt/Symantec/DataLossPrevention/Server\ JRE/1.8.0_181/
-fipsOption=Disabled

To configure a remote indexer on Linux


1 Navigate to the installation directory:
/opt/Symantec/DataLossPrevention/Indexers/15.5/Protect/install

2 Run the remote indexer configuration utility. Use the following command to launch the
utility:
./IndexersConfigurationUtility

3 Enter the following information in the Remote Indexer Configuration Utility:

JRE directory     Enter the JRE directory.
                  The default directory is
                  /opt/Symantec/DataLossPrevention/Server JRE/[JRE version].
                  Note: If you install the JRE before running
                  ./IndexersConfigurationUtility, then you do not enter the JRE
                  directory. The Remote Indexer Configuration Utility automatically
                  defines the JRE path.

FIPS encryption   Select whether to disable or enable FIPS encryption.

See “About the Remote EDM Indexer” on page 586.


See “About the Remote EMDI Indexer” on page 505.
See “About the Remote IDM Indexer” on page 655.
Chapter 25
Detecting content using Exact Match Data Identifiers (EMDI)
This chapter includes the following topics:

■ Introducing Exact Match Data Identifiers (EMDI)

■ Configuring Exact Match Data Identifier profiles

■ Using multi-token matching with EMDI

■ Memory requirements for EMDI

■ Remote EMDI indexing

■ Properties file settings for EMDI

■ Best practices for using EMDI

■ EMDI Troubleshooting

Introducing Exact Match Data Identifiers (EMDI)


Exact Match Data Identifier (EMDI) detection is a powerful exact matching detection technology
that enables you to detect structured data, especially personally-identifiable information (PII),
with a high degree of accuracy. You can use EMDI to exactly match indexed records across
all Data Loss Prevention channels. Fast performing and secure, EMDI can help you reduce
false positives when compared to data identifiers and regular expressions. EMDI provides
better matching performance and greater memory efficiency than Exact Data Matching (EDM).

Before you proceed with EMDI, it's important for you to have a good understanding of data
identifiers and how they are used in Symantec Data Loss Prevention.
See “About using EMDI to protect content” on page 469.

About using EMDI to protect content


EMDI works as an additional validation check against data identifier pattern matchers. With
EMDI, Data Loss Prevention doesn't rely on the Credit Card Number data identifier to match
any pattern that looks like a credit card number and passes a Luhn check. Instead, EMDI
enables customers to exactly match only the credit card numbers that are contained within
their index of records. To exactly match, you can use the Credit Card Number and at least
one additional column of identifying information within the index of records, such as the Issuing
Bank Number that corresponds to that record in the data source that the EMDI profile uses.
Since data sources can contain more than two kinds of information, you could also use the
Card Expiration Date as a third field to ensure an accurate match. Both system (built-in) and
custom data identifiers are supported.
EMDI covers every EDM detection use case that involves two or more columns with at least
one column that has highly unique data that matches a highly discriminatory pattern (that is
expressible with a data identifier). These columns are known as "key columns."
EMDI supports up to 4 million rows and 32 columns per index. These larger indexes are always
deployed to detection servers, appliances, and cloud services. Indexes larger than 100 MB
are not distributed to DLP Agents by default, but this maximum limit can be configured. All
existing system data identifiers and most custom data identifiers are supported.
You configure EMDI at Manage > Data Profiles > Exact Data > Add Exact Match Data Identifier
Profile. The following procedure provides the steps you need to take for implementation.
To configure EMDI
1 You identify and prepare the data you want to protect.
2 You create an Exact Match Data Identifier profile and identify data source columns as
Required, Optional, or Ignore to generate a match. Required columns must be mapped
to either a built-in system data identifier or a custom data identifier.
3 You enable the index as an Exact Match Data Identifier validator either inline in a policy
as part of a data identifier condition, or as part of the configuration of the data identifier.
4 When you add an EMDI validator to an existing data identifier validator, EMDI is used
each time the existing validator is used in a policy.
5 You index the structured data source using the Enforce Server administration console,
or remotely using the Remote EMDI Indexer. During the indexing process, the system
indexes record data that is contained within tabular CSV files. You can schedule indexing
on a regular basis to ensure that the EMDI index reflects the current data.

See “About EMDI and key columns” on page 470.

About EMDI and key columns


An important concept for EMDI is the "key column." When using EMDI, you must specify two
or more columns, at least one of which is a "key column" with highly unique and discriminatory
values that match a distinctive pattern (one that is expressible with a data identifier).
In the following examples, the first column listed is the "key" column. Its data is matched by
a data identifier pattern and must be present in a match.
■ Detect two (or more) out of
(Account Number, Routing Number, First Name, Last Name, Last 4 SSN)
■ Detect two (or more) out of
(Driver's License Number, First Name, Last Name, DOB, Address, City, State)
■ Detect two (or more) out of
(Medical Record Number, First Name, Last Name, Last 4 SSN)
■ Detect two (or more) out of
(Credit Card Number, Issuing Bank Name, CVV, Card Expiration Date)
■ Detect both of
(Part Number, Part Description)
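The rule behind these examples can be sketched in a few lines of Python. This is only an illustration of the "key column plus at least one other column" idea and does not describe how the EMDI engine is implemented. The record layout, the row_matches function, and the sample values are invented for the example.

```python
def row_matches(record, key_column, found_values, minimum_columns=2):
    """Return True if the key column value and enough other column values
    of an indexed record appear in the inspected content."""
    if record[key_column] not in found_values:
        return False  # the key column must always be part of a match
    matched = sum(1 for value in record.values() if value in found_values)
    return matched >= minimum_columns


record = {
    "Account Number": "4417-1234-99",
    "First Name": "Dana",
    "Last Name": "Okafor",
}
# Key column value plus one other column value: a match.
print(row_matches(record, "Account Number", {"4417-1234-99", "Okafor"}))  # True
# Two column values but no key column value: not a match.
print(row_matches(record, "Account Number", {"Dana", "Okafor"}))          # False
```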
See “About EMDI policy features” on page 470.

About EMDI policy features


EMDI policy matching includes validation of matching data identifier patterns using an indexed
data source. It searches for indexed content in a given message or file. Then it generates an
incident if a match is found within a proximity window before and after the data identifier match.
A proximity window of 50 tokens before and 50 tokens after the data identifier match is the
default value and maximum value. This value is configurable; you can change it from 1 to 50.
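To make the proximity window concrete, the following Python sketch collects the tokens that fall inside the window around a data identifier match. It assumes that tokens are simply whitespace-separated words, which is a simplification made for this example, and the function name tokens_in_window is hypothetical.

```python
def tokens_in_window(tokens, match_index, window=50):
    """Return the tokens within `window` positions before and after the
    data identifier match located at `match_index`."""
    if not 1 <= window <= 50:
        raise ValueError("window must be between 1 and 50")
    start = max(0, match_index - window)
    before = tokens[start:match_index]
    after = tokens[match_index + 1:match_index + 1 + window]
    return before + after


tokens = "card 4111111111111111 issued by Example Bank expires 12/27".split()
# With a window of 3, only "card", "issued", "by", and "Example" are candidates.
print(tokens_in_window(tokens, 1, window=3))
```

In this sketch, values from other columns of the indexed row would count toward a match only if they appear among the returned tokens.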
Policy matching requirements and features of EMDI include the following:
■ You must specify one required column that can be matched by a highly discriminating data
identifier. This column is referred to as the "key column."
■ The key column must be highly variable (with few repeating values).
■ A minimum of two columns are required for a match; a required "key" column and an
optional column.
■ For highly variable data (with few repeated values in the index) the EMDI algorithm
generates fewer than one false positive per 1000 data identifier matches. Common repeated
values in key or non-key columns may result in higher rates of false positives.

■ The number of rows per index is limited to 4 million.


■ The system provides match highlighting at the incident snapshot screen. Tokens from
matching rows are highlighted, not only the matching data identifier value.
■ EMDI supports single-token and multi-token cell indexing and matching. A multi-token is
a cell that contains two or more words. Since a single CJK (Chinese, Japanese, Korean)
character is regarded as a token, two or more CJK characters are treated as a multi-token.
See “EMDI compared to EDM” on page 471.

EMDI compared to EDM


EMDI relies on a different underlying detection technology than EDM, and is neither a substitute
nor a replacement for EDM. However, one of the advantages of EMDI over EDM is that EMDI
is available as a locally-executed exact matching technology on the DLP Agent. EDM is only
available on the DLP Agent in two-tier detection mode. Table 25-1 lists comparisons between
EMDI and EDM.

Table 25-1 EMDI compared to EDM

EMDI: EMDI can support EDM detection scenarios that involve matching against two or more
columns of a data source when at least one of those columns matches a data identifier. EMDI
supports both system and custom data identifiers.
EDM: There is no requirement that EDM must match against a column that can be represented
by a data identifier.

EMDI: EMDI scans an entire data source, within the stated limits.
EDM: By default, EDM scans only the first 30,000 tokens for inspected content, though this
limit can be increased.

EMDI: EMDI performs matching locally on the DLP Agent, so there is no need to implement
two-tier detection.
EDM: EDM is only available on the DLP Agent in two-tier detection mode.

EMDI: Available on all channels, including detection servers, appliances, the cloud, and DLP
Agents (including disconnected DLP Agents).
EDM: EDM is available on detection servers, appliances, and the cloud. EDM is only available
on the endpoint in two-tier detection mode.

EMDI: Supports blocking, user notification, and encryption on the DLP Agent.
EDM: EDM is only available on the DLP Agent in two-tier detection mode. When operating in
two-tier detection mode, the DLP Agent does not support synchronous response actions such
as blocking, user notification, or encryption.

EMDI: The memory footprint for EMDI is 1/5 of the memory footprint for EDM for the same
indexed data source.
EDM: EDM memory footprint is about 5 times that of the memory footprint for EMDI.

EMDI: EMDI supports up to 4 million rows x 32 columns per index (up to 128 million cells per
index).
EDM: EDM supports hundreds of millions of rows x 32 columns (up to 6 billion cells per
index).

EMDI: EMDI has a stringent security model that makes it suitable for profile deployment on
the DLP Agent.
EDM: EDM profiles are never deployed on the DLP Agent.

EMDI: There is no natural language processing for Chinese, Japanese, and Korean for EMDI
matching.
EDM: EDM supports natural language processing for Chinese, Japanese, and Korean.

You can use either EMDI or EDM for some exact matching cases that have at least two source
columns and where one column has values that can be expressed with a data identifier. The
following recommendations detail when it is better to use EMDI rather than EDM, and vice
versa.

Use EMDI instead of EDM if:


■ You already use data identifiers and you want to improve detection accuracy with exact
matching.
■ You need exact matching and detection-time enforcement on your DLP Agents, such as
blocking, user notification, or encryption.
■ You have a need to be more flexible with the identifier detection. For example, you need
to detect identifiers with nonstandard separator characters (for example, match 123*456
or 123/456 or 123_456).
■ You need to use exact matching in an exception.

Use EDM instead of EMDI if:


■ You need to exclude specific combinations of columns from a match. For example, you
need to match three of the following four columns: Identification Number, Last Name, City,
and Postal Code; but you need to exclude the Last Name, City, and Postal Code
combination.
■ You need to use more discriminating policy features, such as data owner exception and
the where clause.
■ You need to protect against indexes with a large number of rows (greater than 4 million).

See “About the Exact Match Data Identifier profile and index” on page 473.

About the Exact Match Data Identifier profile and index


The Exact Match Data Identifier Profile is the user-defined configuration that you create to
index the data source. The index is a secure file that contains hashes of the exact data values
from each field in your data source, along with information about those data values. The index
does not contain the data values themselves.
The index that is generated consists of one binary source file called EmdiDataSource.rdx. By
default, Symantec Data Loss Prevention stores index files in
C:\ProgramData\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\Protect\index
(on Windows) or in
/var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/Protect/index (on
Linux) on the Enforce Server. Symantec Data Loss Prevention automatically deploys all EMDI
indexes (*.rdx files) to the index directory on all detection servers.
The system deploys the endpoint index (EmdiDataSource.rdx) to each designated Endpoint
Server. When a DLP Agent connects to the Endpoint Server, the DLP Agent downloads the
latest version of the endpoint index; if the agent already has the latest version of the index,
nothing happens. The indexes are saved in an encrypted binary format in the endpoint database.
When an active policy that references an EMDI profile is deployed to a detection server, the
detection server loads the corresponding EMDI index into RAM. If a new detection server is
added after an index has been created, the *.rdx files in the index folder on the Enforce
Server are deployed to the index folder on the new detection server. You cannot manually
deploy index files to detection servers.
See “About the Exact Match Data Identifier source file” on page 473.

About the Exact Match Data Identifier source file


The data source file is a tabular file containing data in a standard delimited format (comma,
semicolon, pipe, or tab). You extract the data from a database, spreadsheet, or other structured
data source. You also cleanse the data for profiling. You upload the data source file to the
Enforce Server when you define the Exact Match Data Identifier Profile. For example, you
can convert an Excel spreadsheet to a comma-separated values (CSV) format. The resulting
*.csv file can be used as the data source for your EMDI profile.
See “Cleanse the EMDI data source file of blank columns and duplicate rows” on page 519.
See “Creating the Exact Match Data Identifier source file” on page 477.
You can use the SQL preindexer to index the data source directly. However, this approach
has limitations because in most cases the data must first be cleansed before it is indexed.
See “Remote EMDI indexing” on page 504.

The data source file must contain at least one key column that contains largely unique values
that can be expressed as a data identifier. The parameters affecting the uniqueness of the
key columns can be edited in the Indexer.properties file located at
\Program Files\Symantec\Data Loss Prevention\EnforceServer\15.5\Protect\config\Indexer.properties
(Windows) or
/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config/Indexer.properties
(Linux).

These parameters are listed in Table 25-2.

Table 25-2 Parameters affecting indexer sensitivity to key-column uniqueness

Parameters in Indexer.properties     Function

EMDI.MaxDuplicateCellsPercentage=1   Maximum percentage of duplicated key column cells
                                     in the index; the default value is 1%.

EMDI.MaxNonMatchingDIPercentage=1    Maximum percentage of key column cells that don’t
                                     match the data identifier that is assigned to this
                                     profile; the default value is 1%.

Non-configurable limits for EMDI: The same value can appear no more than five times in a
key column in a given EMDI index. This is a different number than
EMDI.MaxDuplicateCellsPercentage, which instead indicates the total number of duplicates
in the index.
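Before you index, you may want to estimate whether a key column is likely to stay within these limits. The following Python sketch is one way to do that. The exact way the indexer counts duplicates is not described here, so the sketch assumes that a cell is a duplicate when its value has already appeared earlier in the column. The function name check_key_column and the sample values are hypothetical.

```python
from collections import Counter

MAX_DUPLICATE_CELLS_PERCENTAGE = 1  # mirrors EMDI.MaxDuplicateCellsPercentage
MAX_OCCURRENCES_PER_VALUE = 5       # non-configurable limit described above


def check_key_column(values):
    """Return (duplicate_percentage, over_limit_values) for a key column.

    Assumption: a cell counts as a duplicate when its value has already
    appeared earlier in the column.
    """
    counts = Counter(values)
    duplicates = sum(count - 1 for count in counts.values())
    percentage = 100.0 * duplicates / len(values) if values else 0.0
    over_limit = sorted(
        value for value, count in counts.items()
        if count > MAX_OCCURRENCES_PER_VALUE
    )
    return percentage, over_limit


# 200 key cells, two of which repeat an earlier value.
key_cells = [f"ACCT-{n:05d}" for n in range(198)] + ["ACCT-00000", "ACCT-00001"]
percentage, over_limit = check_key_column(key_cells)
print(f"{percentage:.1f}% duplicated key cells; values over the limit: {over_limit}")
```

A result at or near the configured percentage suggests that the key column needs further cleansing, or that a different column would make a better key.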
See “Best practices for using EMDI” on page 517.

Note: The format for the data source file should be a text-based format using commas,
semicolons, pipes, or tabs as delimiters. You should avoid using a spreadsheet format for the
data source file (such as XLS or XLSX) because such programs use scientific notation to
render numbers.

See “About cleansing the Exact Match Data Identifier source file” on page 474.

About cleansing the Exact Match Data Identifier source file


Once you have created the data source file, you must prepare the data for indexing by cleansing
it. You must cleanse the data source file to ensure that your EMDI policies are as accurate as
possible. You can use tools such as Stream Editor (sed) and awk to cleanse the data source
file. Melissa Data provides tools for normalizing data in the data source, such as addresses.
Table 25-3 provides the steps you must take to cleanse the data source file for indexing.

Table 25-3 Workflow for cleansing the data source file

Step 1
Action: Prepare the data source file for indexing.
Description: See “Preparing the Exact Match Data Identifier source for indexing” on page 478.

Step 2
Action: Ensure that you have specified a key column that can be matched by a highly variable
data identifier. Ensure that the key column contains reasonably unique data.
Description: See “About EMDI and key columns” on page 470.

Step 4
Action: Remove incomplete and duplicate records. Do not fill empty cells with fake data.
Description: See “About cleansing the Exact Match Data Identifier source file” on page 474.

Step 5
Action: Remove improper characters.
Description: See “Remove ambiguous character types from the EMDI data source file” on
page 520.

Step 6
Action: Verify that the data source file is below the error threshold. The error threshold is
the maximum percentage of rows that contain errors before indexing stops.
Description: See “Preparing the Exact Match Data Identifier source for indexing” on page 478.
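The record-removal step of this workflow can be sketched in Python as follows. The sketch assumes that an "incomplete" record is any row with an empty cell, which is an interpretation made for this example, and the function name cleanse_rows and the sample rows are hypothetical.

```python
def cleanse_rows(rows):
    """Drop rows that have an empty cell and rows that repeat an earlier row."""
    seen = set()
    cleaned = []
    for row in rows:
        cells = tuple(cell.strip() for cell in row)
        if any(cell == "" for cell in cells):
            continue  # incomplete record: remove it rather than fill it
        if cells in seen:
            continue  # duplicate record
        seen.add(cells)
        cleaned.append(list(cells))
    return cleaned


rows = [
    ["1001", "Dana", "Okafor"],
    ["1002", "", "Nguyen"],      # incomplete
    ["1001", "Dana", "Okafor"],  # duplicate of the first row
    ["1003", "Ravi", "Patel"],
]
print(cleanse_rows(rows))
```

Rows are removed rather than repaired, in keeping with the guidance not to fill empty cells with fake data.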

See “About EMDI index scheduling” on page 475.

About EMDI index scheduling


After you have indexed an exact data source extract, its schema cannot be changed. If the
data source changes, or the number of columns or data mapping of the exact data source file
changes, you must create a new EMDI index and update the policies that reference the changed
data. In this case you can schedule the indexing to keep the index in sync with the data source.
Here is a typical use case: You extract data from a database to a file and cleanse it to create
your data source file. Using the Enforce Server administration console you define an Exact
Match Data Identifier profile and index the data source file. The system generates the *.rdx
index files and deploys them to one or more detection servers, appliances, cloud services,
and agents. If you know that the data changes frequently, you need to generate a new data
source file regularly to keep up with the changes to the database. In this case, you can use
index scheduling to automate the indexing of the data source file so you do not have to return
to the Enforce Server administration console and reindex the updated data source. Your only
task is to provide an updated and cleansed data source file to the Enforce Server for scheduled
indexing.
See “Configuring Exact Match Data Identifier profiles” on page 476.

Configuring Exact Match Data Identifier profiles


To implement EMDI, you create the Exact Match Data Identifier Profile and index the data
source. You also need to edit an existing data identifier or create a new custom data identifier.
Then, for each data identifier breadth, you must add and configure EMDI as an optional validator
and enable an EMDI validation check during policy creation or on the Manage > Policies >
Data Identifiers page. Table 25-4 details the steps in this process.
See “About the Exact Match Data Identifier profile and index” on page 473.

Table 25-4 Implementing Exact Match Data Identifier matching

Step 1
Action: Create the data source file.
Description: Export the source data from the database (or other data repository) to a tabular
text file with delimited fields.
See “About the Exact Match Data Identifier source file” on page 473.
See “Creating the Exact Match Data Identifier source file” on page 477.

Step 2
Action: Prepare the data source file for indexing.
Description: Cleanse the data source file.
See “Cleanse the EMDI data source file of blank columns and duplicate rows” on page 519.

Step 3
Action: Upload the data source file to the Enforce Server.
Description: You can copy or upload the data source file to the Enforce Server, or access it
remotely.
See “Uploading the Exact Match Data Identifier source files to the Enforce Server” on page 480.

Step 4
Action: Edit an existing data identifier or create a new custom data identifier to add EMDI as
a validator.
Description: See “Adding an EMDI check to a built-in or custom data identifier condition in a
policy” on page 487.

Step 5
Action: Create an Exact Match Data Identifier profile.
Description: An Exact Match Data Identifier profile is required to use Exact Match Data
Identifier matching. The Exact Match Data Identifier profile specifies the data source, data
field types, and the indexing schedule.
See “Adding Exact Match Data Identifier Profiles” on page 482.
See “Creating and modifying the Exact Match Data Identifier profiles” on page 483.

Step 6
Action: Mark each column in the data source as Ignore, Optional, or Required, in the data
source.
Description: Use the slider to mark each index column (data source field) as Ignore, Optional,
or Required. Each index must contain at least one required ("key") column that is mapped to
a system data identifier or custom data identifier. It must also contain at least one optional
column.
See “Adding Exact Match Data Identifier Profiles” on page 482.
See “Creating and modifying the Exact Match Data Identifier profiles” on page 483.

Step 7
Action: Enable the policy as an Exact Match Data Identifier check.
Description: After the policy is created, it must be enabled as an Exact Match Data Identifier
Check for data identifier validation.
See “Adding an EMDI check to a built-in or custom data identifier condition in a policy” on
page 487.

Step 8
Action: Index the data source, or schedule indexing.
Description: Schedule the indexing to keep the index in sync with the data source.
See “About EMDI index scheduling” on page 475.
See “Scheduling EMDI profile indexing” on page 485.

See “Creating the Exact Match Data Identifier source file” on page 477.

Creating the Exact Match Data Identifier source file


The first step in the EMDI indexing process is to create the data source. A data source is a
tabular file containing data in a standard delimited format, with data delimited by commas,
semicolons, pipes, or tabs.
See Table 25-5 for instructions.

Table 25-5 Create the exact match data identifier source file

Step 1
Export the data you want to protect from a database or other tabular data format, such as an
Excel spreadsheet, to a tabular text file. The data source file you create must be a tabular text
file that contains rows of data from the original source. Each row from the original source is
included as a row in the data source file. Delimit columns using a tab, a comma, a semi-colon,
or a pipe. Pipe is preferred. Comma should not be used if your data source fields contain
numbers.
See “About the exact data source file” on page 529.
The data source file cannot exceed 32 columns or 4 million rows. If you plan to upload the data
source file to the Enforce Server, browser capacity limits the data source size to 2 GB. For file
sizes larger than this size you can copy the file to the Enforce Server using FTP/S, SCP, SFTP,
CIFS, or NFS.

Step 2
For all EMDI implementations, make sure that the data source contains at least one column of
unique data values (Required column) and one Optional column. Three or more columns
(including one Required column) are recommended.

Step 3
Prepare the exact match data identifier source file for indexing.
See “Preparing the Exact Match Data Identifier source for indexing” on page 478.
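The export described in step 1 of this table can be sketched in Python. The sketch writes rows as pipe-delimited text and refuses input that exceeds the 32-column or 4-million-row limits. The function name to_pipe_delimited and the sample row are hypothetical, and the sketch assumes that all cell values are already strings.

```python
MAX_COLUMNS = 32
MAX_ROWS = 4_000_000


def to_pipe_delimited(rows):
    """Return rows as pipe-delimited text, enforcing the EMDI size limits."""
    lines = []
    for number, row in enumerate(rows, start=1):
        if number > MAX_ROWS:
            raise ValueError("data source exceeds 4 million rows")
        if len(row) > MAX_COLUMNS:
            raise ValueError(f"row {number} exceeds 32 columns")
        if any("|" in cell for cell in row):
            raise ValueError(f"row {number} has a field that contains the delimiter")
        lines.append("|".join(row))
    return "\n".join(lines) + "\n"


# The comma inside "1,250" is harmless because the delimiter is a pipe.
print(to_pipe_delimited([["1001", "Dana", "1,250"]]), end="")
```

The sample row shows why the pipe is the preferred delimiter: a number formatted with a comma would otherwise be split into two fields.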

See “Preparing the Exact Match Data Identifier source for indexing” on page 478 for
instructions.

Preparing the Exact Match Data Identifier source for indexing


Once you create the Exact Match Data Identifier source file, you must prepare it so that you
can index your data. When you index an EMDI profile, the Enforce Server keeps track of empty
cells and any misplaced data that count as errors.
EMDI is designed to detect combinations of globally unique data fields. Your EMDI index must
include at least one column of data that contains a nearly unique value for each record.
Column data such as account numbers, social security numbers, and credit card numbers
are often highly unique. On the other hand, states or ZIP Codes are not unique, nor are names.
If you do not include at least one column of unique data (a key column) in your index, your
EMDI profile does not accurately detect the data you want to protect.
Table 25-6 describes the various types of unique data to include in your EMDI indexes, as
well as fields that are not unique. You can include the non-unique fields in your EMDI indexes
as long as you have at least one unique column field.

Table 25-6 Examples of unique data for EMDI policies

Unique data for EMDI

The following data fields are often unique:
■ Account number
■ Bank Card number
■ Phone number
■ Social security number
■ Tax ID number
■ Drivers license number
■ Employee number
■ Insurance number

Non-unique data for EMDI

The following data fields are not unique:
■ First name
■ Last name
■ City
■ State
■ ZIP Code
■ Password
■ PIN

When you index an EMDI profile, the Enforce Server keeps track of empty cells and any
misplaced data which count as errors. For example, an error may be a name that appears in
a column for phone numbers. Errors can constitute a certain percentage of the data in the
profile (five percent, by default). If this default error threshold is met, Symantec Data Loss
Prevention stops indexing. It then displays an error to warn you that your data may be
unorganized or corrupted.
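The error threshold can be approximated before you index with a short Python check. This sketch counts a row as an error when it has the wrong number of cells or contains an empty cell. It does not try to recognize misplaced data, such as a name in a phone number column, and the function name exceeds_error_threshold is hypothetical.

```python
def exceeds_error_threshold(rows, expected_columns, threshold_percent=5.0):
    """Return True if the share of rows with errors reaches the threshold.

    Assumption: a row is in error when it has the wrong number of cells
    or contains an empty cell.
    """
    if not rows:
        return False
    errors = sum(
        1 for row in rows
        if len(row) != expected_columns or any(cell.strip() == "" for cell in row)
    )
    return 100.0 * errors / len(rows) >= threshold_percent


good = [["1001", "Dana", "Okafor"]] * 19
bad = [["1002", "", "Nguyen"]]
# 1 error in 20 rows is 5 percent, which meets the default threshold.
print(exceeds_error_threshold(good + bad, expected_columns=3))
```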
To prepare the exact match data identifier source for EMDI indexing
1 Make sure that the data source file is formatted as follows:
■ The data source must have at least two columns and at least one column that can be
mapped to a data identifier. One of the columns should contain unique values. For
example, credit card numbers, driver’s license numbers, or account numbers (as
opposed to first and last names, which are generic).
See “Ensure data source has at least one column of unique data (EDM)” on page 602.
■ Verify that you have delimited the data source using commas, pipes ( | ), tabs, or
semicolons. If the data source file uses commas as delimiters, remove any commas
that do not serve as delimiters.
See “Do not use the comma delimiter if the data source has number fields (EDM)”
on page 605.
■ Verify that data values are not enclosed in quotes.
■ Remove single-character and abbreviated data values from the data source. For
example, remove the column name and all values for a column in which the possible
values are Y and N. You should also remove values such as "CA" for California, or
other abbreviations for states.
■ Remove columns with frequently repeating values.
■ Optionally, remove any columns that contain numeric values with fewer than five digits,
as these can cause false positives in production deployments.
See “Remove ambiguous character types from the data source file (EDM)” on page 604.
■ A field delimiter should not appear in a field value.
■ Eliminate duplicate records.
See “Cleanse the data source file of blank columns and duplicate rows (EDM)”
on page 603.

2 Once you have prepared the exact match data identifier source file, proceed with the next
step in the EMDI process: upload the exact data source file to the Enforce Server for
profiling the data you want to protect.
See “Uploading the Exact Match Data Identifier source files to the Enforce Server” on page 480.
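The formatting rules in step 1 can be spot-checked before you upload the file. The following is a minimal sketch only, not a Symantec utility: the function name check_data_source and the sample rows are hypothetical, and the checks cover only the rules that can be tested mechanically (column count, quoted values, single-character values, short numeric values, and duplicate records).

```python
def check_data_source(text, delimiter="|"):
    """Return (line_number, problem) pairs for a delimited data source.

    Hypothetical helper: it mirrors the preparation checklist above and is
    not part of Symantec Data Loss Prevention.
    """
    problems = []
    seen = set()
    expected = None
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        cells = [cell.strip() for cell in line.split(delimiter)]
        if expected is None:
            # The first row fixes the expected number of columns.
            expected = len(cells)
            if expected < 2:
                problems.append((number, "fewer than two columns"))
        elif len(cells) != expected:
            # Often caused by a delimiter that appears inside a field value.
            problems.append((number, "unexpected number of cells"))
        for cell in cells:
            if len(cell) >= 2 and cell[0] == cell[-1] and cell[0] in "\"'":
                problems.append((number, "value enclosed in quotes"))
            elif len(cell) == 1:
                problems.append((number, "single-character value"))
            elif cell.isdigit() and len(cell) < 5:
                problems.append((number, "numeric value with fewer than five digits"))
        key = tuple(cells)
        if key in seen:
            problems.append((number, "duplicate record"))
        seen.add(key)
    return problems


sample = "\n".join([
    "Stevens|123-45-6789|Bank of America",
    "Lopez|987-65-4321|\"Acme Corp\"",
    "Stevens|123-45-6789|Bank of America",
    "Chen|555-12-3456|Y",
    "Okafor|222-33-4444|Initech|1234",
])

for line_number, problem in check_data_source(sample):
    print(line_number, problem)
```

A clean file returns an empty list. Checks that need judgment, such as identifying columns with frequently repeating values, are deliberately left out of the sketch.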
Detecting content using Exact Match Data Identifiers (EMDI) 480
Configuring Exact Match Data Identifier profiles

Uploading the Exact Match Data Identifier source files to the Enforce
Server
After you have prepared the data source file for indexing, load it to the Enforce Server so the
data source can be indexed.
See “Creating and modifying the Exact Match Data Identifier profiles” on page 483.
Listed here are the options you have for making the data source file available to the Enforce
Server. Consult with your database administrator to determine the best method for your needs.

Table 25-7 Uploading the exact match data identifier source file to the Enforce Server for
indexing

Upload option(s) Use case Description

Upload option: Upload Data Source to Server Now
Use case: Data source file is less than 50 MB.
Description: If you have a smaller data source file (less than 50 MB), upload the data source
file to the Enforce Server using the Enforce Server administration console. When creating the
Exact Match Data Identifier Profile, you can specify the file path or browse to the directory
and upload the data source file.
Note: Due to browser capacity limits, the maximum file size that you can upload is 2 GB.
However, uploading any file over 50 MB is not recommended, since files over this size can take
a long time to upload. If your data source file is over 50 MB, consider copying the data source
file to the datafiles directory using the next option.

Upload option: Reference Data Source on Manager Host
Use case: Data source file is over 50 MB.
Description: If you have a large data source file (over 50 MB), copy it to the datafiles
directory on the host where the Enforce Server is installed.
On Windows this directory is located at
C:\ProgramData\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\Protect\datafiles.
On Linux this directory is located at
/var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/datafiles.
This option is convenient because it makes the data file available through a drop-down list
during configuration of the Exact Match Data Identifier Profile. If it is a large file, use a
third-party solution (such as Secure FTP) to transfer the data source file to the Enforce Server.
Note: Ensure that the Enforce Server user (usually called "protect") has modify permissions
(on Windows) or rw permissions (on Linux) for all files in the datafiles directory.

Upload option: Use This File Name
Use case: Data source file is not yet created.
Description: You may want to create an EMDI profile before you have created the exact match
data identifier source file. In this case you can create a profile template and specify the
name of the data source file you plan to create. This option lets you define EMDI policies
using the EMDI profile template before you index the data source. The policies do not operate
until the data source is indexed.
When you have created the data source file, you place it in the
\ProgramData\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\Protect\datafiles
directory on Windows or
/var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/Protect/datafiles
on Linux, and index the data source immediately on save or schedule indexing.
See “Creating and modifying the Exact Match Data Identifier profiles” on page 483.

Upload options: Use This File Name and Load Externally Generated Index
Use case: Data source is to be indexed remotely and copied to the Enforce Server.
Description: In some environments it may not be secure or feasible to copy or upload the data
source file to the Enforce Server. In this situation you can index the data source remotely
using the Remote EMDI Indexer.
See “Remote EMDI indexing” on page 504.
This utility lets you index an exact match data identifier source on a computer other than the
Enforce Server host. This feature is useful when you do not want to copy the data source file
to the same computer as the Enforce Server. As an example, consider a situation where the
originating department wants to avoid the security risk of copying the data to an
extra-departmental host. In this case you can use the Remote EMDI Indexer.
First you create an EMDI profile template where you choose the Use This File Name and the
Number of Columns options. You must specify the name of the exact match data identifier source
file and the number of columns it contains.
See “Creating an EMDI profile template for remote indexing” on page 508.
You then use the Remote EMDI Indexer to index the data source remotely, copy the index files
to the Enforce Server host, and load the externally generated index. The Load Externally
Generated Index option is only available after you have defined and saved the profile. Remote
indexes are loaded on Windows from this directory:
\ProgramData\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\index
and on Linux from this directory:
/var/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/index
on the Enforce Server host.

See “Uploading the Exact Match Data Identifier source files to the Enforce
Server” on page 480.

See “Adding Exact Match Data Identifier Profiles” on page 482.
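The choice among the upload options in Table 25-7 can be summarized as a small decision function. This is an illustrative sketch only: the function name, its parameters, and the use of binary megabytes are assumptions, and the guide does not say how a file of exactly 50 MB should be treated (the sketch treats it as large).

```python
MB = 1024 * 1024  # assumption: the guide does not say whether MB is binary or decimal


def suggest_upload_option(size_bytes=None, can_copy_to_enforce=True):
    """Suggest an upload option from Table 25-7 (hypothetical helper).

    size_bytes is None when the data source file has not been created yet.
    """
    if not can_copy_to_enforce:
        # The data must stay off the Enforce Server host: index it remotely.
        return "Use This File Name and Load Externally Generated Index"
    if size_bytes is None:
        return "Use This File Name"
    if size_bytes < 50 * MB:
        return "Upload Data Source to Server Now"
    return "Reference Data Source on Manager Host"


print(suggest_upload_option(10 * MB))
print(suggest_upload_option(300 * MB))
print(suggest_upload_option())
print(suggest_upload_option(300 * MB, can_copy_to_enforce=False))
```

Your database administrator may have other constraints (network transfer policy, file ownership) that outweigh file size, so treat the result as a starting point.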

Adding Exact Match Data Identifier Profiles


The Manage > Data Profiles > Exact Data > Add Exact Match Data Identifier Profile screen
is the home page for managing and adding Exact Match Data Identifier profiles. An Exact
Match Data Identifier profile is required to implement data identifier conditions with EMDI
optionally enabled as a validator. An Exact Match Data Identifier profile specifies the data
source, the indexing parameters, and the indexing schedule. Once you have created the EMDI
profile, you index the data source and add the data identifier validation on the Manage >
Policies > Data Identifiers page or on the Manage > Data Profiles > Exact Data > Add
Exact Match Data Identifier Profile page.
See “Creating and modifying the Exact Match Data Identifier profiles” on page 483.

Creating and modifying the Exact Match Data Identifier profiles


See “Configuring Exact Match Data Identifier profiles” on page 476.

Note: If you use the Remote EMDI Indexer to generate the Exact Match Data Identifier profile,
see “Creating an EMDI profile template for remote indexing” on page 508.

To create or modify an Exact Match Data Identifier Profile


1 Make sure that you have created the data source file.
See “Creating the Exact Match Data Identifier source file” on page 477.
2 Make sure that you have prepared the data source file for indexing.
See “Preparing the Exact Match Data Identifier source for indexing” on page 478.
3 In the Enforce Server administration console, navigate to Manage > Data Profiles >
Exact Data.
4 Click Add Exact Match Data Identifier Profile.
5 Enter a unique, descriptive Name for the profile (limited to 256 characters).
For easy reference, choose a name that describes the data content and the index type
(for example, Employee Data EMDI).
If you modify an existing Exact Match Data Identifier profile you can change the profile
name.
6 Select one of the following Data Source options to make the data source file available to
the Enforce Server:
■ Upload Data Source to Server Now
If you want to create a new profile, click Browse and select the data source file, or
enter the full path to the data source file.
If you want to modify an existing profile, select Upload Now.
See “Uploading the Exact Match Data Identifier source files to the Enforce Server”
on page 480.
■ Reference Data Source on Manager Host
If you copied the data source file to the datafiles directory on the Enforce Server, it
appears in the drop-down list for selection.
See “Uploading the Exact Match Data Identifier source files to the Enforce Server”
on page 480.
■ Use This File Name
Select this option if you have not yet created the data source file but want to configure
EMDI policies using a placeholder EMDI profile. Enter the file name of the data source
you plan to create, including the Number of Columns it is to have. When you do
create the data source, you must copy it to the datafiles directory.

Note: Use this option with caution. Be sure to remember to create the data source file
and copy it to the datafiles directory. Name the data source file exactly the same
as the name you enter here and include the exact number of columns you specify
here.

■ Load Externally Generated Index
Select this option if you have created an index on a remote computer using the Remote
EMDI Indexer. This option is only available after you have defined and saved the
profile. Remote indexes are loaded on Windows from the
\ProgramData\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\index
directory and on Linux from the
/var/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/index directory
on the Enforce Server host.

7 If the first row of your data source contains Column Names, select Read first row as
column names.
8 Specify the Error Threshold, which is the maximum percentage of rows that contain
errors before indexing stops.
A data source error is either an empty cell, a cell with the wrong type of data, or extra
cells in the data source. For example, a name in a column for phone numbers is an error.
If errors exceed a certain percentage of the overall data source (by default, 5%), the
system quits indexing and displays an indexing error message. The index is not created
if the data source has more invalid records than the error threshold value allows. Although
you can change the threshold value, more than a small percentage of errors in the data
source can indicate that the data source is corrupt, is in an incorrect format, or cannot be
read. If you have a significant percentage of errors (10% or more), stop indexing and
cleanse the data source.
See “Preparing the Exact Match Data Identifier source for indexing” on page 478.
9 Select the Column Separator Char (delimiter) that you have used to separate the values
in the data source file. The delimiters you can use are tabs, commas, semicolons, or
pipes.
10 Select one of the following encoding values for the content to analyze, which must match
the encoding of your data source:
■ ISO-8859-1 (Latin-1) (default value)
Standard 8-bit encoding for Western European languages using the Latin alphabet.
■ UTF-8
Use this encoding for all languages that use the Unicode 4.0 standard (all single- and
double-byte characters), including those in East Asian languages.
■ UTF-16
Use this encoding for all languages that use the Unicode 4.0 standard (all single- and
double-byte characters), including those in East Asian languages.

Note: Make sure that you select the correct encoding. The system does not prevent you
from creating an EMDI profile using the wrong encoding. The system only reports an error
at run-time when the EMDI policy attempts to match inbound data. To make sure that you
select the correct encoding, after you click Next, verify that the column names appear
correctly. If the column names do not look correct, you chose the wrong encoding.

11 Click Next to go to the second Add Exact Match Data Identifier Profile screen.
See “Scheduling EMDI profile indexing” on page 485.
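To judge in advance whether a data source is likely to stay under the Error Threshold described in step 8, you can estimate the percentage of faulty rows yourself. The following is a rough sketch under stated assumptions: the function names are hypothetical, only empty cells and rows with the wrong number of cells are counted (wrong-type detection depends on the data identifier mapping and is omitted), and indexing is assumed to stop when the percentage exceeds the threshold rather than when it equals it.

```python
def error_rate(rows, expected_columns):
    """Percentage of rows with an empty cell or the wrong number of cells."""
    if not rows:
        return 0.0
    bad = sum(
        1
        for cells in rows
        if len(cells) != expected_columns or any(not cell.strip() for cell in cells)
    )
    return 100.0 * bad / len(rows)


def exceeds_threshold(rows, expected_columns, threshold_percent=5.0):
    """True if the estimated error rate is above the threshold (default 5%)."""
    return error_rate(rows, expected_columns) > threshold_percent


good = ["Stevens", "123-45-6789", "Bank of America"]
rows = [good] * 18 + [["Lopez", "", "Acme"], ["Chen", "1", "2", "3"]]

print(error_rate(rows, 3))
print(exceeds_threshold(rows, 3))
```

If the estimate is near or above 10%, the guidance in step 8 applies: cleanse the data source rather than raise the threshold.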

Scheduling EMDI profile indexing


When you configure an Exact Match Data Identifier profile, you can set a schedule for indexing
the data source (Submit Indexing Job on Schedule).
See “About EMDI index scheduling” on page 475.
Before you set up a schedule, consider the following recommendations:
■ If you update your data sources occasionally (for example, less than once a month), there
is no need to create a schedule. Index the data each time you update the data source.
■ Schedule indexing for times of minimal system use. Indexing affects performance throughout
the Symantec Data Loss Prevention system, and large data sources can take time to index.
■ Index a data source as soon as you add or modify the corresponding exact data profile,
and re-index the data source whenever you update it. For example, consider a scenario
whereby every Wednesday at 2:00 A.M. you update the data source. In this case you
should schedule indexing every Wednesday at 3:00 A.M. Do not index data sources daily
as daily indexing can degrade performance.
■ If you need to update indexes frequently (for example, daily), Symantec recommends that
you use the Remote EMDI Indexer.
■ Monitor results and modify your indexing schedule accordingly. If performance is good and
you want more timely updates, schedule more frequent data updates and indexing.
The Indexing section lets you index the Exact Match Data Identifier profile as soon as you
save it (recommended). You can also index on a regular schedule as follows:

Table 25-8 Scheduling indexing for Exact Match Data Identifier Profiles

Parameter Description

Parameter: Submit Indexing Job on Save
Description: Select this option to index the Exact Match Data Identifier profile when you save it.

Parameter: Submit Indexing Job on Schedule
Description: Select this option to schedule an indexing job. The default option is No Regular
Schedule. If you want to index according to a schedule, select a desired schedule period, as
described.

Parameter: Index Once
Description:
On – Enter the date to index the profile in the format MM/DD/YY. You can also click the date
widget and select a date.
At – Select the hour to start indexing.

Parameter: Index Daily
Description:
At – Select the hour to start indexing.
Until – Select this check box to specify a date in the format MM/DD/YY when the indexing should
stop. You can also click the date widget and select a date.

Parameter: Index Weekly
Description:
Day of the week – Select the day(s) to index the profile.
At – Select the hour to start indexing.
Until – Select this check box to specify a date in the format MM/DD/YY when the indexing should
stop. You can also click the date widget and select a date.

Parameter: Index Monthly
Description:
Day – Enter the number of the day of each month you want the indexing to occur. The number
must be 1 through 28.
At – Select the hour to start indexing.
Until – Select this check box to specify a date in the format MM/DD/YY when the indexing should
stop. You can also click the date widget and select a date.

See “Associating data identifiers with your data source (EMDI)” on page 486.
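When you plan a schedule, it can help to work out the run times on paper first. The sketch below is illustrative and independent of the product: the helper names are hypothetical. It parses a date in the MM/DD/YY format used by the schedule fields, checks the 1 through 28 rule for monthly indexing, and computes the next weekly run for the "update Wednesday at 2:00 A.M., index Wednesday at 3:00 A.M." scenario described above.

```python
from datetime import datetime, timedelta


def parse_schedule_date(text):
    """Parse a date written as MM/DD/YY."""
    return datetime.strptime(text, "%m/%d/%y").date()


def valid_monthly_day(day):
    """Monthly indexing accepts day numbers 1 through 28."""
    return 1 <= day <= 28


def next_weekly_run(now, weekday, hour):
    """Next run strictly after `now` on `weekday` (Monday=0) at `hour`:00."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


WEDNESDAY = 2
# Monday 19 August 2019, 10:00: the next Wednesday 3:00 A.M. run is 21 August.
print(next_weekly_run(datetime(2019, 8, 19, 10, 0), WEDNESDAY, 3))
print(parse_schedule_date("08/21/19"))
print(valid_monthly_day(31))
```

Scheduling the index run one hour after the data source update, as in the example, leaves time for the update to finish; adjust the gap to how long your own update takes.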

Associating data identifiers with your data source (EMDI)


On this screen you associate data identifiers with your data source.
To continue configuring your Exact Match Data Identifier profiles


1 Designate columns in your data source as Required, Optional, or Ignored. You must
associate Required columns with an existing data identifier.
Confirm that the column names in your data source are accurately represented in the
Data Source Field column. If you selected the Read first row as column names option, the
Data Source Field column lists the names in the first row of your data source. If you did not
select that option, the column lists Col 1, Col 2, and so on.
2 In the Indexing section of the screen, select one of the following options:
■ Submit Indexing Job on Save
Select this option to begin indexing the data source when you save the exact data
profile.
■ Submit Indexing Job on Schedule
Select this option to index the data source according to a specific schedule. Make a
selection from the Schedule drop-down list and specify days, dates, and times as
required.
See “Scheduling EMDI profile indexing” on page 485.

3 Click Finish.
After Symantec Data Loss Prevention finishes indexing, it deletes the original data source
from the Enforce Server. After you index a data source, you cannot change its schema.
If you change column designations for a data source after you index it, you must create
a new EMDI profile.
You can add Exact Match Data Identifier validators to existing data identifier policies.
See “Adding an EMDI check to a built-in or custom data identifier condition in a policy”
on page 487.

Adding an EMDI check to a built-in or custom data identifier condition
in a policy

You can add an EMDI validation check to an existing data identifier, or you can create a custom
data identifier that includes an EMDI validation check.
To add an EMDI validation check to an existing policy
1 Go to Manage > Policies > Policy List.
2 Check the box to choose an existing policy.
3 Double-click the policy to begin editing.
4 Rename the policy to indicate that it uses EMDI as a validator.
5 Verify the Wide, Medium, or Narrow breadth.
6 Click Optional Validators.


7 Click Exact Match Data Identifier Check.
8 Select a Profile. When you scroll to view profiles, you only see profiles where the key
column matches the data identifier in use.
9 Select at least one Required column that must be matched.
10 Choose how many other optional columns to match. You must have at least one optional
column.
11 Select the desired Proximity using the slider. The maximum proximity for EMDI is 50
tokens before or after the data identifier or pattern match. You can select a lower level.
12 Verify a Match Counting value. Your options are:
■ Check for existence (don't count multiple matches)
■ Count all matches
■ Count all unique matches
13 Select a value for Only report incidents with at least [n] matches.
14 Click what to match on:
■ Envelope
■ Subject
■ Body
■ Attachments
15 Click OK.
16 Click Save.
You can also create a custom data identifier that includes an EMDI validation check. To review
the steps to create a custom data identifier, see the "Detecting content using data identifiers"
topic in the Symantec Data Loss Prevention Administration Guide or Help. Then follow the
steps to add an EMDI validator. For more information on configuring policies, see the
"Configuring policies" topic in the Symantec Data Loss Prevention Administration Guide or
Help.
See “Using multi-token matching with EMDI” on page 488.
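The three Match Counting options in step 12 and the threshold in step 13 differ only in how the matches found in a message are tallied. The following minimal sketch illustrates the difference; the function names and mode strings are hypothetical labels chosen for this example and are not product settings.

```python
def count_matches(matches, mode):
    """Tally matches according to a Match Counting option.

    mode is one of "existence", "all", or "unique" (labels assumed here).
    """
    if mode == "existence":
        return 1 if matches else 0
    if mode == "all":
        return len(matches)
    if mode == "unique":
        return len(set(matches))
    raise ValueError(f"unknown match counting mode: {mode}")


def reports_incident(matches, mode, minimum):
    """Model 'Only report incidents with at least [n] matches'."""
    return count_matches(matches, mode) >= minimum


found = ["123-45-6789", "123-45-6789", "987-65-4321"]
print(count_matches(found, "existence"))
print(count_matches(found, "all"))
print(count_matches(found, "unique"))
print(reports_incident(found, "unique", 3))
```

With a minimum of 3, the same message produces an incident under Count all matches but not under Count all unique matches, which is why the two settings should be chosen together.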

Using multi-token matching with EMDI


EMDI validation occurs after a pattern in a file or message matches a data identifier. The EMDI
validator then searches within the defined proximity window (by default, plus or minus 50
tokens) for both individual tokens and multi-token strings. It then validates whether any of
those tokens in combination with the matching data identifier pattern correspond to a row in
the EMDI index. If the Required column matches and there are enough Optional column
matches within the proximity window, then an EMDI match is generated.
A multi-token cell is a cell in the index that contains multiple words separated by spaces,
leading or trailing punctuation, or alternating Latin and Chinese, Japanese, or Korean (CJK)
characters. The sub-token parts of a multi-token cell obey the same rules as single-token cells:
they are normalized according to their pattern where normalization can apply. Messages and
files that are inspected must match a multi-token cell exactly, including whitespace and
punctuation (assuming the default settings).
For example, an indexed cell containing the string "Bank of America" is a multi-token comprising
three sub-token parts. During detection, "bank of america" (normalized) matches the multi-token
cell, but "bank america" does not.
See “Characteristics of multi-token cells for EMDI” on page 489.
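The "Bank of America" example can be modeled as a search for a contiguous run of sub-tokens. This is a simplified sketch, not the detection engine: the function names are hypothetical, and normalization is assumed to mean lowercasing and collapsing whitespace only.

```python
def tokens(text):
    """Lowercase the text and split it on runs of whitespace."""
    return text.lower().split()


def contains_multi_token(cell, content):
    """True if every sub-token of `cell` appears in `content`, in order and adjacent."""
    needle = tokens(cell)
    haystack = tokens(content)
    n = len(needle)
    return n > 0 and any(
        haystack[i:i + n] == needle for i in range(len(haystack) - n + 1)
    )


print(contains_multi_token("Bank of America", "wired to bank of america today"))
print(contains_multi_token("Bank of America", "wired to bank america today"))
```

Because the sub-tokens must be adjacent and in order, dropping "of" breaks the match, which is the behavior the example above describes.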

Characteristics of multi-token cells for EMDI


Table 25-9 lists and describes characteristics of multi-token matching.

Table 25-9 Characteristics of multi-tokens

Characteristic Description

Characteristic: The number of tokens in a single cell is limited to 100 tokens.
Description: With CJK tokens, each character is treated as a single token and the number of CJK
characters is limited to 100. If more than 100 tokens are found in a single cell during
indexing, indexing is terminated.

Characteristic: Whitespace in Latin multi-token cells is considered, but multiple white spaces
are normalized to 1.
Description: See “Multi-token with spaces for EMDI” on page 490.

Characteristic: Punctuation immediately preceding and following a token or sub-token is always
ignored.
Description: See “Multi-token with punctuation for EMDI” on page 491.
See “Additional examples for multi-token cells with punctuation for EMDI” on page 492.

Characteristic: You can configure how punctuation within a token or multi-token is treated
during detection. For most cases the default setting (true) is appropriate. With the false
setting, punctuation is treated as whitespace.
Description: Lexer.IncludePunctuationInWords = true
Note: This setting can only be set to false on the server, not on the DLP Agent. On the DLP
Agent, this setting is fixed to true.
See “Configuring Advanced Settings for EDM policies” on page 557.

Characteristic: For proximity range checking the sub-token parts of a multi-token are counted
as single tokens.
Description: See “Proximity matching example for EMDI” on page 496.

See “Multi-token with spaces for EMDI” on page 490.
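The 100-token limit in Table 25-9 can be checked on a cell before indexing. The sketch below is an approximation under stated assumptions: the function names are hypothetical, the Unicode ranges are a simplified stand-in for "CJK characters", each CJK character counts as one token, each run of other non-space characters counts as one token, and punctuation handling is ignored.

```python
import re

# Assumption: a simplified set of CJK ranges (CJK ideographs, kana, hangul).
CJK = r"\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af"
TOKEN = re.compile(rf"[{CJK}]|[^\s{CJK}]+")


def cell_tokens(cell):
    """Split a cell into tokens: one per CJK character, one per other word."""
    return TOKEN.findall(cell)


def exceeds_token_limit(cell, limit=100):
    """True if the cell has more tokens than the limit allows."""
    return len(cell_tokens(cell)) > limit


print(cell_tokens("Bank of America"))
print(cell_tokens("EMDI傠傫"))
print(exceeds_token_limit("傠" * 101))
```

Because indexing terminates when a cell exceeds the limit, scanning for such cells beforehand avoids a failed indexing job.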


Multi-token with spaces for EMDI


Table 25-10 shows examples of multi-tokens with spaces for EMDI.

Table 25-10 Multi-token cell with spaces examples

Description Indexed content Detected content Explanation

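Description: Cell contains space.
Indexed content: Bank of America
Detected content: Bank of America
Explanation: Cell with spaces is multi-token. Multi-token must match exactly.

Description: Cell contains multiple spaces.
Indexed content: Bank of America
Detected content: Bank of America
Explanation: Multiple spaces are normalized to one.

Description: Cells contain space between CJK characters.
Indexed content: 傠傫 傠傫
Detected content: 傠傫 傠傫
傠傫傠傫
Explanation: White spaces between CJK characters are ignored.

Description: Cells contain space between Latin and CJK characters.
Indexed content: EMDI 傠傫
Detected content: EMDI 傠傫
EMDI傠傫
Explanation: White spaces between Latin and CJK characters are ignored.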

See “Multi-token with mixed language characters for EMDI” on page 490.

Multi-token with mixed language characters for EMDI


Table 25-11 shows examples of multi-tokens with mixed Latin and CJK characters.

Table 25-11 Multi-token cell with Latin and CJK characters examples for EMDI

Description Cell content Should match Explanation

Description: Cell includes Latin and CJK characters with no spaces.
Cell content: ABC傠傫
傠傫ABC
Should match: ABC傠傫
傠傫ABC
Also matches with:
ABC 傠傫
傠傫 ABC
Explanation: Mixed Latin-CJK cell is multi-token. Whitespace between Latin and CJK characters
is ignored. EMDI ignores whitespace between the Latin characters and the CJK token.

Description: Cell includes Latin and CJK with one or more spaces.
Cell content: ABC 傠傫
傠傫 ABC
Should match: ABC 傠傫
傠傫 ABC
Also matches with:
ABC傠傫
傠傫ABC
Explanation: Multiple spaces are ignored.

Description: Cell contains Latin or CJK with numbers.
Cell content: 什仁 仂仃 仄仅 仇仈仉 147(什仂仅 51-1)
Should match: 什仁 仂仃 仄仅 仇仈仉 147(什仂仅 51-1)
Explanation: Single-token cell.

See “Multi-token with punctuation for EMDI” on page 491.
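The whitespace rules in Tables 25-10 and 25-11 can be approximated by normalizing both the indexed cell and the inspected content before comparing them. This is a sketch under assumptions, not the product's lexer: the function name is hypothetical and the CJK ranges are a simplified stand-in. It collapses runs of whitespace to one space, then drops any space that touches a CJK character.

```python
import re

# Assumption: a simplified set of CJK ranges (CJK ideographs, kana, hangul).
CJK = r"\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af"


def normalize_spaces(text):
    """Collapse whitespace runs, then remove spaces adjacent to CJK characters."""
    text = re.sub(r"\s+", " ", text.strip())
    return re.sub(rf"(?<=[{CJK}])\s|\s(?=[{CJK}])", "", text)


print(normalize_spaces("Bank  of   America"))
print(normalize_spaces("傠傫 傠傫") == normalize_spaces("傠傫傠傫"))
print(normalize_spaces("EMDI 傠傫") == normalize_spaces("EMDI傠傫"))
```

Spaces between Latin words are kept (as a single space) because they separate sub-tokens that must still match in order.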

Multi-token with punctuation for EMDI


Punctuation is always ignored if it comes at the beginning (leading) or end (trailing) of a token
or multi-token. Whether punctuation that is included in a token or multi-token is required for
matching depends on the Advanced server setting Lexer.IncludePunctuationInWords,
which by default is set to true (enabled). For the DLP Agent, this value is set to true and
cannot be changed.
See “Multi-token punctuation characters (EDM)” on page 569.

Note: For convenience, the Lexer.IncludePunctuationInWords parameter is referred to by the
three-letter acronym "WIP" throughout this section.

The WIP setting operates at detection time to alter how matches are reported. For most EMDI
policies you should not change the WIP setting. For a few limited situations, such as account
numbers or addresses, you may need to set Lexer.IncludePunctuationInWords = false depending
on your detection requirements.
See “Multi-token punctuation characters (EDM)” on page 569.
Table 25-12 lists and explains how multi-token matching works with punctuation.
Table 25-12 Multi-token punctuation table for EMDI

Indexed content  Detected content  WIP setting  Match  Explanation

Indexed content: a.b
Detected content: a.b
WIP TRUE, match: Yes. The indexed content and the detected content are exactly the same.
WIP FALSE, match: No. The detected content is treated as "a b" and is therefore not a match.

Indexed content: a.b
Detected content: a b
WIP TRUE, match: No. The indexed content and the detected content are different.
WIP FALSE, match: No. The indexed content and the detected content are different.

Indexed content: a b
Detected content: a.b
WIP TRUE, match: No. The indexed content and the detected content are different.
WIP FALSE, match: Yes. The detected content is treated as "a b" and is therefore a match.

Indexed content: a b
Detected content: a b
WIP TRUE, match: Yes. The indexed content and the detected content are exactly the same.
WIP FALSE, match: Yes. The indexed content and the detected content are exactly the same.

See “Additional examples for multi-token cells with punctuation for EMDI” on page 492.
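The effect of the WIP setting can be modeled in a few lines. This is a simplified sketch, not the product's lexer: the function names are hypothetical, Python's string.punctuation is used as a stand-in for the punctuation characters that EMDI recognizes, and lowercasing is an assumed normalization. The index always keeps internal punctuation; at detection time, a WIP value of false turns punctuation in the inspected content into whitespace. Leading and trailing punctuation is stripped in both cases.

```python
import string

# Assumption: string.punctuation stands in for EMDI's punctuation characters.
PUNCT = string.punctuation


def index_tokens(cell):
    """Tokens as stored in the index: internal punctuation is retained."""
    stripped = [token.strip(PUNCT).lower() for token in cell.split()]
    return [token for token in stripped if token]


def detect_tokens(content, wip=True):
    """Tokens seen at detection time; WIP false treats punctuation as whitespace."""
    if not wip:
        content = "".join(" " if ch in PUNCT else ch for ch in content)
    return index_tokens(content)


def wip_match(indexed, detected, wip=True):
    """True if the detected content matches the indexed cell exactly."""
    return index_tokens(indexed) == detect_tokens(detected, wip)


for indexed, detected in [("a.b", "a.b"), ("a.b", "a b"), ("a b", "a.b"), ("a b", "a b")]:
    print(indexed, "|", detected, "|",
          wip_match(indexed, detected, True), wip_match(indexed, detected, False))
```

The sketch shows why a cell indexed with internal punctuation (a.b) can never match when WIP is false: the detected content is always split into two tokens, while the index holds one.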

Additional examples for multi-token cells with punctuation for EMDI


Table 25-13 lists and describes some additional examples for multi-token cells with punctuation.
Keep in mind that during indexing, if a token includes punctuation marks between characters,
the punctuation is always retained. This means that EMDI cannot detect that cell if the WIP
setting is false. In other words, if indexed data has a cell that has a token with internal
punctuation, the WIP setting should be set to true.
Table 25-13 Additional use cases for multi-token cells with punctuation for EMDI

Description Indexed content Detected content Explanation

Description: Cell contains a physical address with punctuation.
Indexed content: 346 Guerrero St., Apt. #2
Detected content: 346 Guerrero St., Apt. #2
346 Guerrero St Apt 2
Explanation: The indexed content is a multi-token cell. Both match because the punctuation
comes at the beginning or end of the sub-token parts and is therefore ignored.

Description: Cell contains internal punctuation with no space before or after.
Indexed content: O'NEAL ST.
Detected content: O'NEAL ST
Explanation: The indexed content is a multi-token cell. Internal punctuation is included
(assuming WIP is true), and leading or trailing punctuation is ignored (assuming there is a
space delimiter after the punctuation).

Description: Cell contains Asian language characters (CJK) with indexed internal punctuation.
Indexed content: 傠傫##傠傫
Detected content: 傠傫##傠傫 (if WIP true)
Explanation: The indexed content is a single token cell. During detection, Asian language
characters (CJK) with internal punctuation are affected by the WIP setting. Thus, in this
example 傠傫##傠傫 matches only if the WIP setting is true. If the WIP setting is false,
傠傫##傠傫 is considered a multi-token because the internal punctuation is treated as
whitespace. Thus, no content can match.

Description: Cell contains Asian language characters (CJK) without indexed internal punctuation.
Indexed content: 傠傫 傠傫
Detected content: 傠傫 傠傫
傠傫##傠傫 (if WIP false)
Explanation: The indexed content is a multi-token cell. The detected content matches as
indexed. If the WIP setting is false, the detected content matches 傠傫##傠傫 because internal
punctuation is ignored.

Description: Cell contains mix of Latin and CJK characters with punctuation separating the
Latin characters and Asian characters.
Indexed content: EMDI##傠傫
Detected content: EMDI 傠傫
Explanation: The indexed content is a multi-token cell. A cell with alternating Latin and CJK
characters is always a multi-token. Punctuation between Latin and Asian characters is always
treated as a single white space regardless of the WIP setting.

Description: Cell contains mix of Latin and CJK characters with internal punctuation.
Indexed content: DLP##EMDI 傠傫##傠傥
Detected content: DLP##EMDI##傠傫##傠傥 (if WIP true)
DLP##EMDI 傠傫##傠傥 (if WIP true)
Explanation: The indexed content is a multi-token cell. During detection, punctuation between
the Latin characters and the Asian characters is treated as a single whitespace. Leading and
trailing punctuation is ignored. If the WIP setting is true, the punctuation internal to the
Latin characters and internal to the Asian characters is retained. If the WIP setting is false,
no content can match because internal punctuation is ignored.

Description: Cell contains mix of Latin and CJK characters with internal punctuation.
Indexed content: DLP EMDI 傠傫 傠傥
Detected content: DLP EMDI 傠傫 傠傥
DLP#EMDI 傠傫#傠傥 (if WIP false)
DLP#EMDI##傠傫#傠傥 (if WIP false)
Explanation: The indexed content is a multi-token cell. During detection, punctuation between
the Latin characters and the Asian characters is treated as a single whitespace. Leading and
trailing punctuation is ignored. Thus, it matches as indexed. If the WIP setting is false, it
matches DLP#EMDI##傠傫#傠傥 because internal punctuation is ignored.

See “Multi-token punctuation characters for EMDI” on page 495.

Multi-token punctuation characters for EMDI


In EMDI, a multi-token cell is any indexed cell that contains punctuation, spaces, or
alternating Latin words and CJK characters.
Table 25-14 lists the symbols that are identified and treated as punctuation during EMDI
indexing.

Table 25-14 Characters treated as punctuation for indexing for EMDI

Punctuation name Character representation

Apostrophe '

Tilde ~

Exclamation point !

Ampersand &

Dash -

Single quotation mark '

Double quotation mark "

Period (dot) .

Question mark ?

At sign @

Dollar sign $

Percent sign %

Asterisk *

Caret symbol ^

Open parenthesis (

Close parenthesis )

Open bracket [

Close bracket ]

Open brace {

Close brace }

Forward slash /

Back slash \

Pound sign #

Equal sign =

Plus sign +

See “Proximity matching example for EMDI” on page 496.

Proximity matching example for EMDI


EMDI protects confidential data by correlating uniquely identifiable information, such as a
social security number, with data that is not unique, such as a last name. When you correlate
data, it is important to ensure that terms are related. In natural languages, it is more likely that
when two words appear close together they are used in the same context and are therefore
related.
Based on the premise that word proximity indicates relatedness, EMDI employs a
proximity-matching radius to limit how much free-form content the system examines when it
searches for matches. EMDI proximity matching is designed to reduce false positives by
ensuring that matched terms are proximate.
EMDI supports a proximity of up to 50 tokens before and after the data identifier match. You
can select a lower value during policy creation. The proximity does not depend on the number
of columns in the policy.
Table 25-15 shows a proximity matching example based on the default proximity radius setting.
In this example, the detected content produces one unique token set match, described as
follows:
■ The proximity range window is 100 tokens (50 tokens before and after the matching data
identifier pattern).
■ The total number of tokens from "Stevens" to the first token of "Bank of America" is within
100 tokens.
■ "Bank of America" is a multi-token. Each sub-token part of a multi-token is counted as a
single token for proximity purposes.
■ If a multi-token begins within the proximity window, it is matched even if it ends after the
proximity window. For example, "Bank of America" is matched if "Bank" is in the proximity
window, even if "of America" is not within the window.
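The proximity rules above can be sketched in Python as follows. This is a simplified model and not the product's matching engine: it assumes the content is already split into a list of tokens, that the data identifier match occupies a single token position, and that a multi-token is given as a list of its sub-tokens. All function and parameter names are hypothetical.

```python
def proximity_window(token_count: int, match_index: int, radius: int = 50):
    """Return the half-open index range [start, end) around the data identifier match."""
    start = max(0, match_index - radius)
    end = min(token_count, match_index + radius + 1)
    return start, end


def multi_token_matches(tokens, sub_tokens, match_index, radius=50):
    """Return True if the multi-token begins inside the proximity window.

    Each sub-token counts as one token. The multi-token may extend past the
    end of the window as long as its first sub-token lies inside it.
    """
    start, end = proximity_window(len(tokens), match_index, radius)
    length = len(sub_tokens)
    for i in range(start, end):
        if tokens[i:i + length] == sub_tokens:
            return True
    return False
```

For example, if "Bank" is the fiftieth token after the matched Social Security number, the sketch reports a match for "Bank of America" even though "of" and "America" fall outside the window.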

Table 25-15 Proximity example for EMDI

Indexed data
Last_Name | SSN | Employer
Stevens | 123-45-6789 | Bank of America

Data Identifier Match
Social Security Number

Policy
Match 3 of 3 tokens

Proximity
Radius = 50 (default)

Detected content
Zendrerit inceptos Kathy Stevens lorem ipsum pharetra convallis leo suscipit ipsum sodales
rhoncus, vitae dui nisi volutpat augue maecenas in, luctus id risus magna arcu maecenas
leo quisque. Rutrum convallis tortor urna morbi elementum hac curabitur morbi, nunc dictum
primis elit senectus faucibus convallis surfrent. Aptentnour gravida adipiscing iaculis
himenaeos, 123-45-6789. Dictumst lorem eget ipsum. Hendrerit inceptos other sagittis
quisque. Leo mollis per nisl per felis, nullam cras mattis augue turpis integer pharetra
convallis suscipit hendrerit? Lubilia en mictumst horem eget ipsum. Inceptos urna sagittis
quisque dictum odio hendrerit convallis suscipit ipsum wrdsrf Zendrerit inceptos Kathy
lorem ipsum Bank of America.

See “Memory requirements for EMDI” on page 498.

Memory requirements for EMDI


Using EMDI in a Symantec Data Loss Prevention deployment affects the hardware memory
requirements for Symantec Data Loss Prevention. In particular, EMDI affects the memory
required to index the data source as well as the memory required to load the index on the
detection server, the appliance, and the endpoint.
Once you have established what your specific EMDI memory requirements are, you can
evaluate how those requirements affect the general system requirements for your Data Loss
Prevention deployment. See the Symantec Data Loss Prevention System Requirements and
Compatibility Guide for details about general requirements.
See “EMDI memory configuration and limitations” on page 499.

EMDI memory configuration and limitations


The memory requirements for EMDI are related to several factors, including:
■ Number of indexes you are building
■ Total size of the indexes
■ Number of cells in each index
These size limitations apply to EMDI indexes:
■ The maximum number of rows supported is 4 million. This count does not include invalid
rows.
■ The maximum number of columns is 32.
■ The number of invalid entries allowed is configurable in the Indexer.properties file. The
default is 1%.
■ A specific value in the required column cannot appear more than 5 times. This limit
cannot be configured to a greater value. In addition, the total number of duplicate values
in each required (or "key") column cannot exceed 1% of the values. This percentage is
configurable by editing EMDI.MaxDuplicateCellsPercentage=1 in the properties file.
■ The maximum number of supported cells is 128 million.
If any of these limits is exceeded, index creation is terminated.
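Before you submit a data source for indexing, it can help to check it against the size limits listed above. The following Python sketch is one possible pre-check and is not a Symantec tool. The function and constant names are hypothetical, and the 1% duplicate percentage check is omitted because the exact counting rule is not described here.

```python
from collections import Counter

# Limits taken from the list above; the constant names are assumptions for this sketch.
MAX_ROWS = 4_000_000
MAX_COLUMNS = 32
MAX_CELLS = 128_000_000
MAX_OCCURRENCES_PER_KEY_VALUE = 5


def check_emdi_limits(row_count, column_count, key_values):
    """Return a list of human-readable problems; an empty list means no limit is exceeded.

    key_values is an iterable with the values of one required (key) column.
    """
    problems = []
    if row_count > MAX_ROWS:
        problems.append(f"{row_count} rows exceeds the maximum of {MAX_ROWS}")
    if column_count > MAX_COLUMNS:
        problems.append(f"{column_count} columns exceeds the maximum of {MAX_COLUMNS}")
    if row_count * column_count > MAX_CELLS:
        problems.append(f"{row_count * column_count} cells exceeds the maximum of {MAX_CELLS}")
    repeated = [v for v, n in Counter(key_values).items() if n > MAX_OCCURRENCES_PER_KEY_VALUE]
    if repeated:
        problems.append(
            f"{len(repeated)} key value(s) appear more than {MAX_OCCURRENCES_PER_KEY_VALUE} times"
        )
    return problems
```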
Table 25-16 gives an overview of the steps that you can follow to determine and set memory
requirements for EMDI.

Table 25-16 Workflow for determining memory requirements for EMDI indexes

Step Action For more information

1 Determine the memory that is required to index the data source.
See “Determining requirements for both local indexers and remote indexers for EMDI”
on page 500.

2 Determine the memory that is required to load the index on the detection server or the
endpoint.
See “Detection server memory requirements for EMDI” on page 501.

3 Increase the detection server or endpoint memory according to your calculations.
See “Increasing the memory for the detection server (File Reader) for EMDI” on page 503.
See Properties file settings for EMDI on page 515.

4 Repeat for each EMDI index you want to deploy.

See “Overview of configuring memory and indexing the data source for EMDI” on page 500.

Overview of configuring memory and indexing the data source for EMDI

Table 25-17 provides the steps for determining how much memory is needed to index the data
source.

Table 25-17 Memory requirements for indexing the data source for EMDI

Step Action Details

1 Estimate the memory requirements for the indexer.
See “Determining requirements for both local indexers and remote indexers for EMDI”
on page 500.

2 Increase the indexer memory.
The next step is to increase the memory allocated to the indexer. The procedure for
increasing the indexer memory differs depending on whether you use the EMDI indexer local
to the Enforce Server or the Remote EMDI Indexer.

3 Restart the Symantec DLP Manager service.
You must restart this service after you have changed the memory allocation.

4 Index the data source.
The last step is to index the data source. You need to index before you calculate
remaining memory requirements.

See “Configuring Exact Data profiles for EDM” on page 534.

See “Determining requirements for both local indexers and remote indexers for EMDI”
on page 500.

Determining requirements for both local indexers and remote indexers for EMDI

This topic provides an overview of memory requirements for both the EMDI indexer that is
local to the Symantec Data Loss Prevention Enforce Server and for the Remote EMDI indexer.
You do not need to change the EMDI indexer default value of 2048 MB. Make sure that the
system has enough free additional memory in case of parallel indexing. The additional memory
that is required depends on the number of required and optional columns as well as the number
of cells. The following formula and examples use these variables:
R – Number of required columns
P – Number of optional columns
B – Bytes per cell
The general formula is: B = 4 * R * P / (P + 1)

Example 1
For an index with 5 million cells (1 million rows x 5 columns), 1 required column, and 4 optional
columns:
The formula is: B = 4 * 1 * 4/5 = 3.2 bytes per cell
The total memory that is required for this index = 5 million * 3.2 = 16 MB

Example 2
For an index with 40 million cells (4 million rows x 10 columns), 1 required column, and 9
optional columns:
The formula is: B = 4 * 1 * 9/10 = 3.6 bytes per cell
The total memory that is required for this index = 40 million * 3.6 = 144 MB

Example 3
For an index with 128 million cells (4 million rows x 32 columns), 1 required column, and 31
optional columns:
The formula is: B = 4 * 1 * 31/32 = 3.875 bytes per cell
The total memory that is required for this index = 128 million * 3.875 = 496 MB
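The three examples can be reproduced with a short Python sketch of the general formula. The function names are hypothetical, and the sketch assumes, as the examples do, that 1 MB equals 1,000,000 bytes and that the number of cells equals rows multiplied by the total number of columns.

```python
def bytes_per_cell(required_cols: int, optional_cols: int) -> float:
    """B = 4 * R * P / (P + 1), as given by the general formula."""
    return 4 * required_cols * optional_cols / (optional_cols + 1)


def index_memory_mb(rows: int, required_cols: int, optional_cols: int) -> float:
    """Additional indexer memory in MB for one index (1 MB = 1,000,000 bytes)."""
    cells = rows * (required_cols + optional_cols)
    # Multiply before dividing so that the intermediate values stay exact integers.
    total_bytes = 4 * required_cols * optional_cols * cells / (optional_cols + 1)
    return total_bytes / 1_000_000


if __name__ == "__main__":
    for rows, r, p in [(1_000_000, 1, 4), (4_000_000, 1, 9), (4_000_000, 1, 31)]:
        print(rows, r, p, bytes_per_cell(r, p), index_memory_mb(rows, r, p))
```

If you index several data sources in parallel, add the results for each index to estimate the free memory that the system needs beyond the indexer default.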
See “Detection server memory requirements for EMDI” on page 501.

Detection server memory requirements for EMDI


The detection server should not use more than 60% of the memory of the computer. For
example, if your detection server needs 6 GB of memory to run, make sure that you have 10
GB on that server.

Default configuration for a detection server


The default configuration for a detection server has 4 GB and eight message chains. See the
following formulas and Table 25-18 to determine how to calculate your actual memory
requirements.
To load the index, the detection server needs, on average, 3.5 bytes per cell for system memory
plus 1 GB Java heap memory for each message chain in the detection server. The following
examples show scenarios for a customer who has three indexes that are all under the same
schedule.
For Java heap memory requirements, the formula is:
Java heap memory requirement = the number of message chains * 1 GB.
For system memory requirements, the general formula is:
System memory requirement = number of cells * 3.5 bytes.
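The two formulas, together with the 60% guideline stated earlier in this topic, can be combined in a small Python sketch. The function names are hypothetical. The sketch assumes decimal units (1 MB = 1,000,000 bytes) and uses the average of 3.5 bytes per cell unless you pass a value computed for your own index.

```python
def java_heap_gb(message_chains: int) -> int:
    """Java heap memory requirement: 1 GB for each message chain."""
    return message_chains * 1


def index_system_memory_mb(total_cells: int, bytes_per_cell: float = 3.5) -> float:
    """System memory needed to load the index, in MB (1 MB = 1,000,000 bytes)."""
    return total_cells * bytes_per_cell / 1_000_000


def minimum_host_memory_gb(detection_server_gb: float) -> float:
    """Smallest host memory for which the detection server uses no more than 60%."""
    return detection_server_gb / 0.60


if __name__ == "__main__":
    chains = 4
    cells = 5 * 40_000_000  # five indexes with 40 million cells each
    print("Java heap (GB):", java_heap_gb(chains))
    print("Index system memory (MB):", index_system_memory_mb(cells))
    print("Host memory for a 6 GB detection server (GB):", minimum_host_memory_gb(6))
```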

Detection Server memory settings


The Advanced Server settings property for the number of message chains is:
MessageChain.NumChains.

The Java heap memory settings for a detection server are set in the Enforce Server
administration console at the Server Detail - Advanced Server Settings page, using the
BoxMonitor.FileReaderMemory property. The format is -Xrs -Xms1200M -Xmx4G. You don't
need to change the system memory setting, but make sure that the detection server has
enough free memory available.

Note: When you update this setting, change only the -Xmx value in this property. For example,
change only "4G" to a new value, and leave all other values the same.

The examples in Table 25-18 show the settings for three different situations.

Table 25-18 EMDI detection server Java heap memory settings and additional system
memory examples

Example 1: 2 message chains, a 5 million cell index
Java heap memory requirement: 2 GB (default)
System memory requirement: 5 million * 3.2 = 16 MB
BoxMonitor.FileReaderMemory setting: -Xmx2G
Additional system memory required: 16 MB

Example 2: 4 message chains, five 40 million cell indexes
Java heap memory requirement: 4 * 1 GB = 4 GB
System memory requirement: 5 * 40 million * 3.6 = 720 MB
BoxMonitor.FileReaderMemory setting: -Xmx4G
Additional system memory required: 720 MB

Example 3: 24 message chains, ten 128 million cell indexes
Java heap memory requirement: 24 * 1 GB = 24 GB
System memory requirement: 10 * 128 million * 3.875 = 4960 MB
BoxMonitor.FileReaderMemory setting: -Xmx24G
Additional system memory required: 4.96 GB

See “Increasing the memory for the detection server (File Reader) for EMDI” on page 503.

Increasing the memory for the detection server (File Reader) for EMDI
This topic provides instructions for increasing the File Reader memory allocation for a detection
server. These instructions assume that you have performed the necessary calculations.
See “Determining requirements for both local indexers and remote indexers for EMDI”
on page 500.
To increase the memory for detection server processing
1 In the Enforce Server administration console, navigate to the Server Detail - Advanced
Server Settings screen for the detection server where the EMDI index is deployed or will
be deployed.
2 Locate the following setting: BoxMonitor.FileReaderMemory.
3 Change the -Xmx4G value in the following string to match the calculations you have made.
-Xrs -Xms1200M -Xmx4G -XX:PermSize=128M -XX:MaxPermSize=256M
For example: -Xrs -Xms1200M -Xmx11G -XX:PermSize=128M -XX:MaxPermSize=256M
4 Save the configuration and restart the detection server.
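Because only the -Xmx value should change, it can be useful to prepare the new string programmatically before you paste it into the console. The following Python sketch is a hypothetical helper, not part of the product. It replaces the -Xmx value and leaves every other option untouched.

```python
import re


def set_max_heap(setting: str, new_xmx: str) -> str:
    """Return the setting string with only the -Xmx value replaced.

    new_xmx is a size such as "11G" or "6144M". The helper raises ValueError
    if the size is malformed or if the string does not contain exactly one -Xmx option.
    """
    if not re.fullmatch(r"\d+[MG]", new_xmx):
        raise ValueError(f"unexpected heap size: {new_xmx!r}")
    updated, count = re.subn(r"-Xmx\d+[MGmg]", f"-Xmx{new_xmx}", setting)
    if count != 1:
        raise ValueError("expected exactly one -Xmx option in the setting")
    return updated


if __name__ == "__main__":
    current = "-Xrs -Xms1200M -Xmx4G -XX:PermSize=128M -XX:MaxPermSize=256M"
    print(set_max_heap(current, "11G"))
```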
See “Profile size limitations on the DLP Agent for EMDI ” on page 504.

Profile size limitations on the DLP Agent for EMDI


By default, no profiles larger than 100 MB are sent to the DLP Agent. To change this default,
edit the EMDI.MaxEndpointProfileMemoryInMB property in the Protect.properties file.
See Properties file settings for EMDI on page 515.
There is no limit on the number of 100 MB profiles that are sent to the agent. If you increase
the default value for the index or plan to deploy multiple indexes, you need to provision extra
memory on your DLP Agents to accommodate these increases.

Note: By default, deployment of EMDI profiles to DLP Agents is disabled. To enable EMDI
deployments to DLP Agents, set the EMDI.EnabledOnAgents property in the Protect.properties
file to true for each DLP Agent.

Remote EMDI indexing


An EMDI index maps the data you want to protect to the Exact Match Data Identifier profile.
Here's the typical EMDI workflow for creating the EMDI index:
■ Upload the data source file to the Enforce Server.
■ Create the Exact Match Data Identifier profile.
■ Index the data source.
Instead of uploading the data source file to the Enforce Server for indexing, you can index the
data source locally and securely using the Remote EMDI Indexer.
For example, if copying the confidential data source file to the Enforce Server presents a
potential security or logistical issue, you can use the Remote EMDI Indexer to create the
cryptographic index directly on the data source host before moving the index to the Enforce
Server.
See “About the Remote EMDI Indexer” on page 505.
See “About the SQL Preindexer and EMDI” on page 505.
The Remote EMDI Indexer is a standalone tool that lets you index the data source file directly
on the data source host.
See “System requirements for remote EMDI indexing” on page 505.

About the Remote EMDI Indexer


The Remote EMDI Indexer utility converts a data source file to an EMDI index. The utility is
similar to the local EMDI Indexer that you can use on the Enforce Server. However, the Remote
EMDI Indexer is designed for use on a computer that is not part of the Symantec Data Loss
Prevention server configuration.
The Remote EMDI Indexer has the following advantages over using the EMDI Indexer on the
Enforce Server:
■ It enables the owner of the data, rather than the Symantec Data Loss Prevention
administrator, to index the data. The Symantec Data Loss Prevention administrator does
not need to have access to the original data source that is indexed.
■ It shifts the system load that is required for indexing onto another computer. The CPU and
RAM on the Enforce Server are reserved for other tasks.
See “About the SQL Preindexer and EMDI” on page 505.
See “Workflow for remote EMDI indexing” on page 506.

About the SQL Preindexer and EMDI


You use the SQL Preindexer utility with the Remote EMDI Indexer to run SQL queries against
Oracle databases. Then you pipe the resulting data to the Remote EMDI Indexer for indexing.
See “System requirements for remote EMDI indexing” on page 505.
The SQL Preindexer utility is installed in the
C:\Program Files\Symantec\DataLossPrevention\ServerPlatformCommon\Indexer\15.5\Protect\bin
directory during installation of the Remote EMDI Indexer. The SQL Preindexer utility generates
an index directly from an Oracle SQL database. The SQL Preindexer processes the database
query and passes the result to the standard input of the Remote EMDI Indexer utility.
To use the SQL Preindexer, the data source must be relatively clean because the query result
data is piped directly to the Remote EMDI Indexer.
See “About the Remote EMDI Indexer” on page 505.

System requirements for remote EMDI indexing


The Remote EMDI Indexer runs on the Windows and Linux operating system versions that
are supported for Symantec Data Loss Prevention servers. See the Symantec Data Loss
Prevention System Requirements and Compatibility Guide for more information about operating
system support.
The SQL Preindexer supports Oracle databases and requires a relatively clean data source.
See “About the SQL Preindexer and EMDI” on page 505.
The RAM requirements for using the Remote EMDI Indexer vary according to the size of the
data source being indexed.
See “Memory requirements for EMDI” on page 498.

Workflow for remote EMDI indexing


This section summarizes the steps to index a data file on a remote machine and then use the
index in Symantec Data Loss Prevention.
See “About the Exact Data Profile and index” on page 528.

Table 25-19 Steps to use the Remote EMDI Indexer

Step Action Description

Step 1 Install the Remote EMDI Indexer on a computer that is not part of the Symantec
Data Loss Prevention system.
See “About installing the Remote EMDI indexer” on page 507.

Step 2 Create an Exact Match Data Identifier profile on the Enforce Server to use with
the Remote EMDI Indexer.
On the Enforce Server, generate an EMDI Profile template using the *.emdi file name
extension and specifying the exact number of columns to be indexed.
See “Creating an EMDI profile template for remote indexing” on page 508.

Step 3 Copy the Exact Match Data Identifier Profile file to the computer where the Remote
EMDI Indexer resides.
Download the profile template from the Enforce Server and copy it to the remote data
source host computer.
See “Downloading and copying the EMDI profile file to a remote system” on page 509.

Step 4 Run the Remote EMDI Indexer and create the index files.
If you have a cleansed data source file, use the RemoteEMDIIndexer with the -data,
-profile, and -result options.
If the data source is an Oracle database, use the SqlPreindexer and the RemoteEMDIIndexer
to index the data source directly with the -alias (oracle DB host), -username and
-password credentials, and the -query string or -query_path.
See “Generating remote index files for EDM” on page 591.

Step 5 Copy the index files from the remote machine to the Enforce Server.
Copy the resulting *.pdx and *.rdx files from the remote machine to the Enforce Server
host on Windows at
C:\ProgramData\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\index
or on Linux at
/var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/index.
See “Copying and loading remote EDM index files to the Enforce Server” on page 594.

Step 6 Load the index files into the Enforce Server.
Update the EMDI profile by loading the externally generated index.
Submit the profile for indexing.
See “Copying and loading remote EDM index files to the Enforce Server” on page 594.

Step 7 Troubleshoot any problems that occur during the indexing process.
Verify that indexing is started and completes.
Check the system events for Code 2926 ("Created Exact Data Profile" and "Data source
saved").
The ExternalEmdiDataSource.<name>.pdx and *.rdx files are removed from the index directory
and replaced by the file EmdiDataSource.<tenant id>.<profile id>.<version>.rdx.
See “Troubleshooting remote indexing errors for EDM” on page 599.

Step 8 Create policy with EMDI condition.
You should see the column data for defining the EMDI condition.
See “Configuring the Content Matches Exact Data policy condition for EDM” on page 551.

About installing the Remote EMDI indexer


You install the remote indexer on one or more systems where the confidential files that you
want to protect are stored. The process for installing a remote indexer is the same for EMDI,
EDM, and IDM.
See “About installing remote indexers” on page 589.
You can install the Remote EMDI indexer on all supported Windows and Linux platforms. See
the Symantec Data Loss Prevention System Requirements and Compatibility Guide for platform
details.

Creating an EMDI profile template for remote indexing


The EMDI Indexer uses an Exact Match Data Identifier Profile when it runs to ensure that the
data is correctly formatted. You must create the Exact Match Data Identifier Profile before
you use the Remote EMDI Indexer. The profile is a template that describes the columns that
are used to organize the data. The profile does not need to contain any data. After creating
the profile, copy it to the computer that runs the Remote EMDI Indexer.
To create an EMDI profile for remote indexing
1 From the Enforce Server administration console, navigate to the Manage > Data Profiles
> Exact Data screen.
2 Click Add Exact Match Data Identifier Profile.
3 In the Name field, enter a name for the profile.
4 In the Data Source field, select Use This File Name, and enter the name of the index
file to create with the *.emdi extension.
You must select this option because at this stage you create only the profile template.
Later, you index the data source against the profile using the Remote EMDI Indexer. Enter
the file name of the data source you plan to create for remote EMDI indexing, and be sure
to name the data source file exactly as you enter it here.
After you copy the generated remote index back to the Enforce Server, use the Load
Externally Generated Index option to load the remote index into the profile template.
See “Copying and loading remote EDM index files to the Enforce Server” on page 594.
5 For remote EMDI indexing purposes, you must specify the exact Number of Columns
the index is to have. Be sure that the data source file contains exactly the number of
columns you specify here.
See “Uploading exact data source files for EDM to the Enforce Server” on page 539.
6 If the first row of the data source contains the column names, select the option Read first
row as column names.
7 In the Error Threshold text box, enter the maximum percentage of rows that can contain
errors.
If, during indexing of the data source, the number of rows with errors exceeds the
percentage that you specify here, the indexing operation fails.
8 In the Column Separator Char field, select the type of character that is used in your data
source to separate the columns of data.
9 In the File Encoding field, select the character encoding that is used in your data source.
If Latin characters are used, select the ISO-8859-1 option. For East Asian languages, use
either the UTF-8 or UTF-16 options.
10 Click Next to map the column headings from the data source to the profile.
11 At least one field must be selected as Required and mapped to a Data Identifier. At least
one field must be Optional.
12 Do not select any Indexing option available at this screen, since you intend to index
remotely.
13 Click Finish to complete the profile creation process.

Downloading and copying the EMDI profile file to a remote system


To download and copy the EMDI profile to the remote system
1 Configure an Exact Match Data Identifier Profile.
See “Creating an EDM profile template for remote indexing” on page 589.
2 Download the EMDI profile by selecting the download profile link at the Manage > Data
Profiles > Exact Data screen.
The system prompts you to save the EMDI profile as a file. The file extension is *.emdi.
3 Save the file.
If the data source host computer where you intend to run the Remote EMDI Indexer is
available on the same subnet as the Enforce Server, you can browse to that computer
and select it as the destination. Otherwise, manually copy the profile to the remote system.
4 Use the profile to index the data source using the Remote EMDI Indexer.
See “Generating remote index files for EDM” on page 591.

Generating remote index files for EMDI


You use the command-line Remote EMDI Indexer utility to generate an EMDI index for importing
to the Enforce Server. You can use the Remote EMDI Indexer to index a data source file that
you have generated and cleansed. Or you can pipe the output from the SQL Preindexer to
the standard input of the Remote EMDI Indexer. The SQL Preindexer requires an Oracle DB
data source and clean data.
When the indexing process completes, the Remote EMDI Indexer generates several files in
the specified result directory. These files are named after the data file that was indexed:
one file with the .pdx extension and one or more files with the .rdx extension:
■ ExternalEmdiDataSource.<DataSourceName>.pdx

■ ExternalEmdiDataSource.<DataSourceName>.<EmdiDataSourceID>.rdx
The number of .rdx files depends on how many columns you selected as key columns
when you created the profile.
For example, if you choose two columns, such as the CCN and SSN, you get two .rdx files.

Table 25-20 Options for generating remote EMDI indexes

Use case Description Remarks

Remote EMDI Indexer with data source file.
Description: Specify data source file, EMDI profile, output directory.
Remarks: Use when you have a cleansed data source file; use for upgrading to DLP 15.5.
See “Remote indexing examples using data source file (EDM)” on page 592.

Remote EMDI Indexer with SQL Preindexer.
Description: Query DB and pipe output to stdin of Remote EMDI Indexer.
Remarks: Requires Oracle DB and clean data.
See “Remote indexing examples using SQL Preindexer (EDM)” on page 593.

Remote EMDI indexing examples using data source file


To use the Remote EMDI Indexer to index a flat data source file that you have generated and
cleansed, specify the local data source file name and path (-data), the local EMDI profile file
name and path (-profile), and the output directory for the generated index files (-result).
The syntax for using the Remote EMDI Indexer to generate an index from a cleansed data
source tabular text file is as follows:

RemoteEMDIIndexer -data=<local data source filename and path>
-profile=<local *.emdi profile file name and path>
-result=<local output directory for *.rdx and *.pdx index files>

For example:

RemoteEMDIIndexer -data=C:\EMDIIndexDirectory\CustomerData.dat
-profile=C:\EMDIIndexDirectory\RemoteEMDIProfile.emdi
-result=C:\EMDIIndexDirectory\

This command generates an EMDI index using the local data source tabular text file
CustomerData.dat and the local RemoteEMDIProfile.emdi file that you generated and copied
from the Enforce Server to the remote host, where \EMDIIndexDirectory is the directory for
placing the generated index files.
When the generation of the indexes is successful, the utility displays the message "Successfully
created index" as the last line of output.
The remote EMDI indexer creates one .pdx file and one or more .rdx files:
■ ExternalEmdiDataSource.<DataSourceName>.pdx

■ ExternalEmdiDataSource.<DataSourceName>.<EmdiDataSourceID>.rdx

The number of .rdx files depends on how many columns you selected as key columns
when you created the profile.
For example, if you choose two columns, such as the CCN and SSN, you get two .rdx files.
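Before you copy the results to the Enforce Server, you can confirm that the result directory holds the expected files. The following Python sketch is a hypothetical check that works on a list of file names, such as the output of os.listdir for the -result directory. The numeric IDs in the sample names are invented for illustration.

```python
PREFIX = "ExternalEmdiDataSource."


def split_index_files(file_names):
    """Return (pdx_files, rdx_files) for names that follow the naming pattern above."""
    pdx = sorted(n for n in file_names if n.startswith(PREFIX) and n.endswith(".pdx"))
    rdx = sorted(n for n in file_names if n.startswith(PREFIX) and n.endswith(".rdx"))
    return pdx, rdx


def looks_complete(file_names, key_columns):
    """True if there is exactly one .pdx file and one .rdx file per key column."""
    pdx, rdx = split_index_files(file_names)
    return len(pdx) == 1 and len(rdx) == key_columns


if __name__ == "__main__":
    names = [
        "ExternalEmdiDataSource.CustomerData.pdx",
        "ExternalEmdiDataSource.CustomerData.1.rdx",  # hypothetical ID
        "ExternalEmdiDataSource.CustomerData.2.rdx",  # hypothetical ID
        "CustomerData.dat",
    ]
    print(looks_complete(names, key_columns=2))
```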
See “Remote EDM Indexer command options” on page 597.

Remote EMDI Indexer command options


After installation, the Remote EMDI Indexer utility is available at
\Program Files\Symantec\DataLossPrevention\Indexer\15.5\Protect\bin (Windows) and
/opt/Symantec/DataLossPrevention/Indexer/15.5/Protect/bin (Linux).

If you are on Linux, change users to “SymantecDLP” before running the Remote EMDI Indexer.
The installation program creates the “SymantecDLP” user.
The Remote EMDI Indexer provides a command line interface. The syntax for running the
utility is as follows:

RemoteEMDIIndexer -profile=<file *.emdi> -result=<out_dir> [options]

Note the following about the syntax:


■ The Remote EMDI Indexer requires the -profile and -result arguments.
■ If you use a flat data source file as input, you must specify the file name and local path
using the -data option.
■ The -data option is omitted when you use the SQL Preindexer to pipe the data to the
Remote EMDI Indexer.
See “Remote indexing examples using data source file (EDM)” on page 592.
Table 25-21 describes the command options for the Remote EMDI Indexer.

Table 25-21 Remote EMDI Indexer command options

Option Summary Description

-data Data source to be indexed (stdin). Required if you use a tabular text file.
Specifies the data source to be indexed. If this option is not specified, the utility
reads data from stdin. Required if using data source file and not the SQL Preindexer.

-encoding Character encoding of data to be indexed (ISO-8859-1).
Specifies the character encoding of the data to index. The default is ISO-8859-1.
Use UTF-8 or UTF-16 if the data contains non-English characters.

-ignore_date Ignore expiration date of the EMDI profile.
Overrides the expiration date of the Exact Data Profile if the profile has expired. By
default, an Exact Data Profile expires after 30 days.

-profile File containing the EMDI profile. Required.
Specifies the Exact Match Data Identifier profile to use. This profile is selected by
clicking the download link on the Exact Match Data Identifier screen in the Enforce Server
administration console.

-result Directory to place the resulting indexes. Required.
Specifies the directory where the index files are generated.

-verbose Display verbose output.
Displays a statistical summation of the indexing operation when the index is complete.
See “Troubleshooting preindexing errors for EDM” on page 598.
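If you script the indexing run, the required and optional arguments can be assembled as in the following Python sketch. The helper names are hypothetical. Only options whose syntax appears in the examples in this section (-profile, -result, -data, and -verbose) are used, and the sketch assumes that the utility is on the PATH under the name RemoteEMDIIndexer.

```python
def build_indexer_command(profile, result_dir, data=None, verbose=False,
                          executable="RemoteEMDIIndexer"):
    """Build the argument list for the Remote EMDI Indexer.

    -profile and -result are always included. -data is omitted when the
    data arrives on stdin, for example from the SQL Preindexer.
    """
    command = [executable, f"-profile={profile}", f"-result={result_dir}"]
    if data is not None:
        command.append(f"-data={data}")
    if verbose:
        command.append("-verbose")
    return command


def indexing_succeeded(output: str) -> bool:
    """True if the last non-empty output line reports a successfully created index."""
    lines = [line for line in output.splitlines() if line.strip()]
    return bool(lines) and "Successfully created index" in lines[-1]


if __name__ == "__main__":
    print(build_indexer_command(
        r"C:\EMDIIndexDirectory\RemoteEMDIProfile.emdi",
        "C:\\EMDIIndexDirectory\\",
        data=r"C:\EMDIIndexDirectory\CustomerData.dat",
    ))
```

The resulting list can be passed to subprocess.run, which avoids shell quoting issues with paths that contain spaces.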

Remote EMDI indexing examples using the SQL Preindexer


If your data source is an Oracle DB and has clean data, you can index the data source directly
using the SQL Preindexer with the Remote EMDI Indexer.
The syntax is as follows:

SqlPreindexer -alias=<oracle connect string: //host:port/SID>
-username=<DB user> -password=<DB password> -query=<sql to run> |
RemoteEMDIIndexer -profile=<*.emdi profile file name and path>
-result=<output directory for index files>

For example:

SqlPreindexer -alias=@//myhost:1521/orcl -username=scott -password=tiger
-query="SELECT name, salary FROM employee" |
RemoteEMDIIndexer -profile=C:\ExportEMDIProfile.emdi -result=C:\EMDIIndexDirectory\

With this command the SQL Preindexer utility connects to the Oracle database and runs the
SQL query to retrieve name and salary data from the employee table. The SQL Preindexer
returns the result of the query to stdout. The SQL query must be in quotes. The Remote EMDI
Indexer reads the query result from stdin and indexes the data using the
ExportEMDIProfile.emdi profile, as specified by the profile file name and local file path.

When the generation of the indexes is successful, the utility displays the message "Successfully
created index" as the last line of output.
In addition, the utility places the following generated index files in the EMDIIndexDirectory
-result directory:
■ ExternalEmdiDataSource.<DataSourceName>.pdx
■ ExternalEmdiDataSource.<DataSourceName>.<EmdiDataSourceID>.rdx
The number of .rdx files depends on how many columns you selected as key columns
when you created the profile.
For example, if you choose two columns, such as the CCN and SSN, you get two .rdx files.
Here is an example using SQL Preindexer and Remote EMDI Indexer commands:

SqlPreindexer -alias=@//localhost:1521/CUST -username=cust_user -password=cust_pword
-query="SELECT account_id, amount_owed, available_credit FROM customer_account" -verbose |
RemoteEMDIIndexer -profile=C:\EMDIIndexDirectory\CustomerData.emdi
-result=C:\EMDIIndexDirectory\ -verbose

Here the SQL Preindexer command queries the CUST.customer_account table in the database
for the account_id, amount_owed, and available_credit records. The result is piped to the
Remote EMDI Indexer which generates the index files based on the CustomerData.emdi
profile. The -verbose option is used for troubleshooting.
As an alternative to the -query SQL string, you can use the -query_path option and specify
the path and name of a file that contains the SQL query (*.sql). If you do not specify a query
or a query path, the entire database is queried.

SqlPreindexer -alias=@//localhost:1521/cust -username=cust_user -password=cust_pwrd
-query_path=C:\EMDIIndexDirectory\QueryCust.sql -verbose |
RemoteEMDIIndexer -profile=C:\EMDIIndexDirectory\CustomerData.emdi
-result=C:\EMDIIndexDirectory\ -verbose
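The piped commands above can also be launched from a script. The following Python sketch only assembles the two argument lists from the options shown in this section (-alias, -username, -password, -query, -query_path, -verbose, -profile, -result). The function names are hypothetical, rejecting a call that sets both -query and -query_path is an assumption, and run_pipeline assumes that both utilities are on the PATH.

```python
import subprocess


def build_preindexer_args(alias, username, password,
                          query=None, query_path=None, verbose=False):
    """Assemble SqlPreindexer arguments; -query and -query_path are alternatives."""
    if query and query_path:
        # Assumption: treat the two options as mutually exclusive.
        raise ValueError("Specify either query or query_path, not both")
    args = ["SqlPreindexer", f"-alias={alias}",
            f"-username={username}", f"-password={password}"]
    if query:
        args.append(f"-query={query}")
    elif query_path:
        args.append(f"-query_path={query_path}")
    if verbose:
        args.append("-verbose")
    return args


def build_indexer_args(profile, result, verbose=False):
    """Assemble RemoteEMDIIndexer arguments."""
    args = ["RemoteEMDIIndexer", f"-profile={profile}", f"-result={result}"]
    if verbose:
        args.append("-verbose")
    return args


def run_pipeline(preindexer_args, indexer_args):
    """Pipe the preindexer's stdout into the indexer's stdin."""
    pre = subprocess.Popen(preindexer_args, stdout=subprocess.PIPE)
    idx = subprocess.run(indexer_args, stdin=pre.stdout,
                         capture_output=True, text=True)
    pre.stdout.close()
    pre.wait()
    # This section states that a successful run reports this message.
    return "Successfully created index" in idx.stdout
```

Because the password is passed on the command line, as in the examples above, it can be visible to other users of the machine through process listings; restrict access to the indexing host accordingly.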

Copying and loading EMDI remote index files to the Enforce Server
The system creates one .pdx file and one or more .rdx files in the -result directory when
you remotely index a data source:
■ ExternalEmdiDataSource.<DataSourceName>.pdx

■ ExternalEmdiDataSource.<DataSourceName>.<EmdiDataSourceID>.rdx
One .rdx file is created for every key column. For example, the .rdx file can be
ExternalEmdiDataSource.MyProfile.3.rdx.

After you create the index files on a remote machine, you must copy the files to the Enforce
Server, load them into the previously created remote EMDI profile, and submit the indexing job.
See “Creating an EMDI profile template for remote indexing” on page 508.
To copy and load the files on the Enforce Server
1 Go to the directory where the index files were generated. (This directory is the one specified
in the -result option.)
2 Copy all of the index files with .pdx and .rdx extensions to the index directory on the
Enforce Server. This directory is located at
C:\ProgramData\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\index
(Windows) or /var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/index
(Linux).
3 From the Enforce Server administration console, navigate to the Manage > Policies >
Exact Data screen.
This screen lists all the Exact Match Data Identifier profiles in the system.
4 Click the name of the Exact Match Data Identifier profile you used with the Remote EMDI
Indexer.
5 To load the new index files, go to the Data Source section of the Exact Data Profile and
select Load Externally Generated Index.
6 In the Indexing section, select Submit Indexing Job on Save.
As an alternative to indexing immediately on save, you can set up a job on the remote
machine to run the Remote EMDI Indexer on a schedule. The job should also copy the
generated files to the index directory on the Enforce Server. You can then schedule loading
the updated index files on the Enforce Server from the profile by selecting Load Externally
Generated Index and Submit Indexing Job on Schedule and configuring an indexing
schedule.
See “Use scheduled indexing to automate profile updates (EDM)” on page 607.
7 Click Save.
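Step 2 of the procedure above (copying the .pdx and .rdx files) lends itself to scripting, for example as part of the scheduled job mentioned in step 6. The following Python sketch is one possible approach; the function name is hypothetical, and it assumes that the Enforce Server index directory is reachable from the remote machine as an ordinary file system path (for example, a mounted share).

```python
import shutil
from pathlib import Path


def copy_index_files(result_dir, index_dir):
    """Copy every .pdx and .rdx file from result_dir to index_dir.

    Returns the names of the files that were copied.
    """
    result_dir, index_dir = Path(result_dir), Path(index_dir)
    copied = []
    for path in sorted(result_dir.iterdir()):
        if path.is_file() and path.suffix.lower() in {".pdx", ".rdx"}:
            shutil.copy2(path, index_dir / path.name)
            copied.append(path.name)
    return copied
```

Files with other extensions in the -result directory are left untouched. Loading the index and submitting the indexing job still happen in the Enforce Server administration console, as described in steps 3 through 7.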

Troubleshooting EMDI preindexing errors


If you receive an error that the SQL Preindexer was unable to perform query or failed to prepare
for indexing, verify that the -query string is in quotes. You can test your -query string by
running only the SQL Preindexer command. If the command is correct, the data queried from
the database is displayed to the console as stdout.
You may encounter errors when you index large amounts of data. Often the set of data contains
a data record that is incomplete, inconsistent, or inaccurate. Data rows that contain more
columns than expected or incorrect column data types often cannot be properly indexed and
are unrecognized.
The SQL Preindexer can be configured to provide a summary of information about the indexing
operation when it completes. To do so, specify the -verbose option when running the SQL
Preindexer.
To see the rows of data that the Remote EMDI Indexer did not index, adjust the configuration
in the Indexer.properties file using the following procedure.
To record those data rows that were not indexed
1 Locate the Indexer.properties file at \Program Files\Symantec\Data Loss
Prevention\Indexer\15.5\Protect\config\Indexer.properties (Windows) or
/Symantec/DataLossPrevention/Indexer/15.5/Protect/config/Indexer.properties
(Linux).
2 Open the file in a text editor.
3 Locate the create_error_file property and change the “false” setting to “true.”
4 Save and close the Indexer.properties file.
The Remote EMDI Indexer logs errors in a file with the same name as the data file being
indexed and the .err suffix.
The rows of data that are listed in the error file are not encrypted. Safeguard the error file
to minimize any security risk from data exposure.
See “About the SQL Preindexer for EDM” on page 586.
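If you maintain several indexer machines, the create_error_file change described in the procedure above can be scripted. The following Python sketch assumes that the property appears in the file as a key = value line (for example, create_error_file = false); the function name is hypothetical.

```python
import re
from pathlib import Path


def set_property(path, key, value):
    """Rewrite the `key = ...` line in a properties file.

    Returns True if the key was found and updated, False otherwise.
    Commented-out lines are not touched.
    """
    path = Path(path)
    pattern = re.compile(rf"^(\s*{re.escape(key)}\s*=\s*).*$")
    found = False
    lines = []
    for line in path.read_text().splitlines():
        match = pattern.match(line)
        if match:
            line = match.group(1) + value
            found = True
        lines.append(line)
    if found:
        path.write_text("\n".join(lines) + "\n")
    return found
```

For example, set_property(path_to_indexer_properties, "create_error_file", "true") returns False and leaves the file unchanged if the key is not present. Back up the properties file before you modify it.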

Properties file settings for EMDI


The settings for EMDI in Table 25-22 can be configured in the Indexer.properties,
ProfileIndexConfiguration.properties, and Protect.properties files. These settings
enable EMDI on the DLP Agent, and control other EMDI metrics for columns, cells, log files,
and profile memory usage.
The Protect.properties and the ProfileIndexConfiguration.properties files are available
on the Enforce Server and the detection server.
The Indexer.properties file is available on the Enforce Server, and only if you install the
Remote Indexer for EMDI, IDM, or EDM.
After you edit the properties file settings, make sure that you restart the service to implement
your changes.
Note: The EMDI.MaxEndpointProfileMemoryInMB setting in the Protect.properties file
can be adjusted both on the Enforce Server and on the detection server. The setting on the
Enforce Server is used by the UI to indicate if the profile is too large to be shipped to the DLP
Agent. The setting on the detection server is the actual profile limit. You must keep both settings
identical on the Enforce Server and on the detection servers to avoid confusion.
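A quick way to confirm that the two copies of the setting agree is to compare the values read from both Protect.properties files. The following Python sketch assumes simple key = value lines and treats lines that start with # or ! as comments; the function names are hypothetical.

```python
def read_properties(text):
    """Parse key = value lines, skipping blank lines and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def settings_match(enforce_text, detection_text,
                   key="EMDI.MaxEndpointProfileMemoryInMB"):
    """True if the key is set to the same value in both files."""
    enforce = read_properties(enforce_text).get(key)
    detection = read_properties(detection_text).get(key)
    return enforce is not None and enforce == detection
```

Pass the contents of the Enforce Server file and of each detection server file in turn. A result of False means that the key is either missing on the Enforce Server or set to a different value on the detection server.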

Table 25-22 EMDI parameters configurable in properties files

EMDI parameter and file location | Default | Description

Protect.properties

On the Enforce Server:
C:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\Protect.properties (Windows)
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config/Protect.properties (Linux)

On the detection server:
C:\Program Files\Symantec\DataLossPrevention\DetectionServer\15.5\Protect\config\Protect.properties (Windows)
/opt/Symantec/DataLossPrevention/DetectionServer/15.5/Protect/config/Protect.properties (Linux)

EMDI.EnabledOnAgents | false | EMDI is disabled by default on DLP Agents. To enable EMDI on DLP Agents, set this property to true.
EMDI.MaxEndpointProfileMemoryInMB | 100 | Maximum memory usage, in megabytes, for each EMDI profile on the endpoint. This limit is per profile, not for all profiles combined.

Indexer.properties

On the Enforce Server:
C:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\Indexer.properties (Windows)
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config/Indexer.properties (Linux)

emdi_indexer_log_max_files | 100 | The maximum number of log files for the EMDI indexer.
MaxDuplicateCellsPercentage | 1 | The maximum integer percentage of duplicate cells in an index, as a function of the number of rows.
MaxNonMatchingDIPercentage | 1 | The maximum integer percentage of key column values that don't match a profile data identifier, as a function of the number of rows.

ProfileIndexConfiguration.properties

On the Enforce Server:
C:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\ProfileIndexConfiguration.properties (Windows)
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config/ProfileIndexConfiguration.properties (Linux)

On the detection server:
C:\Program Files\Symantec\DataLossPrevention\DetectionServer\15.5\Protect\config\ProfileIndexConfiguration.properties (Windows)
/opt/Symantec/DataLossPrevention/DetectionServer/15.5/Protect/config/ProfileIndexConfiguration.properties (Linux)

emdi_matcher_log_max_files | 100 | The maximum number of log files for the EMDI matcher.

Best practices for using EMDI


Consider the recommendations in this section when you implement EMDI, to ensure that your
EMDI policies are as accurate as possible. Best practices are not intended to provide detailed
troubleshooting guidance. Following these best practices enables you to create a solid
implementation and reduces the need for troubleshooting and support.
Table 25-23 Summary of EMDI Best Practices

Best Practice | More information
Never use any personally identifiable information (PII) as an optional column. | See “Never use a personal identifier as an optional column in EMDI” on page 519.
Use three or more columns in a match. | See “Use three or more columns in a match for EMDI” on page 519.
Don’t use EMDI validators as both optional and required for a given data identifier in a policy. | See “Don’t use EMDI validators as both optional and required for a given data identifier in a policy” on page 519.
Use additional validators with EMDI where possible. | See “Use additional validators with EMDI where possible” on page 519.
Limit the required number of columns to no more than two or three. | See “Limit the required number of columns to two or three for EMDI” on page 519.
When matching with only a single optional column, avoid adding low-variability values as optional columns. | See “When matching with only a single optional column, avoid adding low-variability values as optional columns with EMDI” on page 519.
Use full disk encryption on endpoint deployments. | See “Use full disk encryption on EMDI endpoint deployments” on page 519.
Eliminate duplicate rows and blank columns before indexing. | See “Cleanse the EMDI data source file of blank columns and duplicate rows” on page 519.
To reduce false positives, avoid single characters, quotes, abbreviations, numeric fields with fewer than 5 digits, and dates. | See “Remove ambiguous character types from the EMDI data source file” on page 520.
Clean up your data source for multi-token cell matching. | See “Clean up your EMDI data source for multi-token matching” on page 521.
Use the pipe (|) character to delimit columns in your data source. | See “Do not use the comma delimiter if the EMDI data source has number fields” on page 521.
Ensure that the EMDI data source is clean for indexing. | See “Ensure that the EMDI data source is clean for indexing” on page 522.
Include the column headers as the first row of the data source file. | See “Include column headers as the first row of the EMDI data source file” on page 522.
Check the system alerts to tune Exact Match Data Identifier profiles. | See “Check the EMDI system alerts to tune profile accuracy” on page 522.
Automate profile updates with scheduled indexing. | See “Use scheduled indexing to automate EMDI profile updates” on page 523.
Never use a personal identifier as an optional column in EMDI


Map any personal identifier as a required column. Never use any personal identifier such as
an SSN, Credit Card Number, or Bank Account Number as an optional column.

Use three or more columns in a match for EMDI


Use three or more columns in a match to minimize false positives.

Don’t use EMDI validators as both optional and required for a given
data identifier in a policy
Do not use an EMDI validator in-line in a policy for a data identifier condition when the data
identifier has already been configured to use an EMDI validator.

Use additional validators with EMDI where possible


Use an additional validator, such as a Luhn check for a Credit Card. These additional validators
are applied before the EMDI lookup; they reduce the number of false positives and improve
performance.

Limit the required number of columns to two or three for EMDI


Try to limit the required number of columns to no more than two or three. The memory used
by a profile grows linearly with the number of required columns.

When matching with only a single optional column, avoid adding
low-variability values as optional columns with EMDI
When matching with a single optional column, avoid adding very low-variability values such
as States or 5-digit ZIP Codes as optional columns. Low variability values increase the likelihood
of false positives.

Use full disk encryption on EMDI endpoint deployments


For endpoint deployments, we recommend full disk encryption on the device.

Cleanse the EMDI data source file of blank columns and duplicate
rows
The data source file should be as clean as possible before you create the EMDI index, otherwise
the resulting profile may create false positives.
When you create the data source file, avoid including empty cells or blank columns. Blank
columns or fields count as errors when you generate the EMDI profile. A data source error is
either an empty cell or a cell with the wrong type of data (a name appearing in a phone number
column). The error threshold is the maximum percentage of rows that contain errors before
indexing stops. If the errors exceed the error threshold percentage for the profile (by default,
5%), the system stops indexing and displays an indexing error message.
The best practice is to remove blank columns and empty cells from the data source file, rather
than increasing the error threshold. Keep in mind that if you have many empty cells, it may
require a 100% error threshold for the system to create the profile. If you specify 100% as the
error threshold, the system indexes the data source without checking for errors.
In addition, do not fill empty cells or blank fields with fake data so that the error threshold is
met. Adding fake or "null" data to the data source file reduces the accuracy of the EMDI profile
and is discouraged. Content you want to monitor should be legitimate and not null.
See “Do not use the comma delimiter if the EMDI data source has number fields” on page 521.
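Before you index, you can estimate how close a data source file is to the error threshold. The following Python sketch counts data rows that contain an empty cell or the wrong number of columns and reports them as a percentage of all data rows. It assumes a pipe-delimited file whose first row holds the column headers, and it does not check data types, so the indexer may count more errors than this estimate. The function name is hypothetical.

```python
def error_rate(lines, delimiter="|"):
    """Return the percentage of data rows with an empty cell
    or a column count that differs from the header row."""
    rows = [line.rstrip("\n").split(delimiter) for line in lines if line.strip()]
    if len(rows) < 2:
        return 0.0
    header, data = rows[0], rows[1:]
    bad = sum(
        1 for row in data
        if len(row) != len(header) or any(not cell.strip() for cell in row)
    )
    return 100.0 * bad / len(data)
```

If the result is above the threshold configured for the profile (5% by default), clean the data source rather than raising the threshold.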

Remove ambiguous character types from the EMDI data source file
The data source file should not contain extraneous spaces, punctuation, or inconsistently
populated fields. You can use tools such as Stream Editor (sed) and AWK to remove these
items from your data source file or files before indexing them.
Table 25-24 lists characters to avoid in the data source file.

Table 25-24 Characters to avoid in the EMDI data source file

Characters to avoid | Explanation
Single characters | Single character fields should be eliminated from the data source file. These are more likely to cause false positives, since a single character appears frequently in normal communications.
Abbreviations | Abbreviated fields should be eliminated from the data source file for the same reason as single characters.
Quotes | Text fields should not be enclosed in quotes.
Small numbers | Indexing numeric fields that contain fewer than 5 digits is not recommended because it likely yields many false positives.
Dates | Date fields are also not recommended. Dates are treated like a string, so if you index a date, such as 12/6/2007, the string has to match exactly. The indexer only matches 12/6/2007, and not any other date formats, such as Dec 6, 2007, 12-6-2007, or 6 Dec 2007. It must be an exact match.
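Most of the items in Table 25-24 can be detected mechanically before indexing. The following Python sketch flags cells that are quoted, a single character, a number with fewer than 5 digits, or a numeric date. The function name and the date pattern are assumptions for illustration, and abbreviations are not detected because they need knowledge of the data.

```python
import re

# Assumption: only numeric dates such as 12/6/2007 or 12-6-2007 are recognized.
DATE_LIKE = re.compile(r"^\d{1,2}[/-]\d{1,2}[/-]\d{2,4}$")


def flag_cell(cell):
    """Return a reason the cell is risky to index, or None if no rule applies."""
    value = cell.strip()
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        return "quoted"
    if len(value) == 1:
        return "single character"
    if value.isdigit() and len(value) < 5:
        return "small number"
    if DATE_LIKE.match(value):
        return "date"
    return None
```

Run flag_cell over every cell of a sample of rows and review the columns that are flagged most often; those columns are candidates for removal or cleanup.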

Clean up your EMDI data source for multi-token matching


An EMDI validator performs a full-text search in a proximity of 50 tokens from a Data Identifier
match, checking each token (except those that are excluded because of ignored columns in
the data source) for potential matches.
If a cell in the data profile contains multiple words that are separated by spaces, punctuation,
or alternative Latin and Chinese, Japanese, and Korean (CJK) language characters, the cell
is a multi-token cell. The sub-token parts of a multi-token cell obey the same rules as
single-token cells: they are normalized according to their pattern where normalization can
apply.
If a cell contains a multi-token, the multi-token must match exactly. For example, a column
field with the value "Joe Brown" is a multi-token cell (assuming that multi-token matching is
enabled). At run-time the processor looks to match the exact string "Joe Brown", including the
space (multiple spaces are normalized to one). The system does not match on "Joe" and
"Brown" if they are detected as single tokens.
Finally, do not change the WIP setting from "true" to "false" unless you are sure that is the
result you want to achieve. You should only set WIP = false when you need to loosen the
matching criteria, such as account numbers where formatting may change across messages.
Make sure that you test detection results to ensure that you get the matches that you expect.
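The exact-match rule for multi-token cells can be pictured with a small Python sketch. It is an illustration only, not the product's matching engine: it splits on whitespace, is case-sensitive, and ignores punctuation handling, pattern normalization, and hashing. The function names are hypothetical.

```python
import re


def normalize_spaces(text):
    """Collapse runs of whitespace to a single space."""
    return re.sub(r"\s+", " ", text).strip()


def multi_token_matches(cell, message):
    """True only if the whole multi-token cell appears
    as consecutive tokens in the message."""
    cell_tokens = normalize_spaces(cell).split(" ")
    msg_tokens = normalize_spaces(message).split(" ")
    n = len(cell_tokens)
    return any(
        msg_tokens[i:i + n] == cell_tokens
        for i in range(len(msg_tokens) - n + 1)
    )
```

With the cell value "Joe Brown", a message that contains "Joe   Brown" (several spaces) matches after normalization, while a message in which "Joe" and "Brown" are separated by other words does not.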

Note: For the sake of brevity, the Lexer.IncludePunctuationInWords parameter is referred
to by the three-letter acronym "WIP."

Do not use the comma delimiter if the EMDI data source has number
fields
Of the four types of column delimiters that you can choose from for separating the fields in the
data source file (pipe, tab, semicolon, or comma), the pipe, semicolon, or tab (default) are
recommended. The comma delimiter is ambiguous and should not be used, especially if one
or more fields in your data source contain numbers. If you use a comma-delimited data source
file, make sure there are no commas in the data set other than those used as column delimiters.

Note: The system also treats the pound sign, equals sign, plus sign, and colon characters as
separators, but you should not use these because, like the comma, their meaning is ambiguous.
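One symptom of an ambiguous delimiter is that rows split into different numbers of fields. The following Python sketch counts fields per row for a given delimiter; the function names are hypothetical. A consistent count does not prove that the file is clean, but an inconsistent count shows that the delimiter also occurs inside the data.

```python
from collections import Counter


def field_count_profile(lines, delimiter):
    """Count how many non-blank rows split into each number of fields."""
    return Counter(
        len(line.rstrip("\n").split(delimiter))
        for line in lines if line.strip()
    )


def delimiter_is_unambiguous(lines, delimiter):
    """True if every non-blank row splits into the same number of fields."""
    return len(field_count_profile(lines, delimiter)) == 1
```

For example, the row Smith,1,250.00 splits into three fields in a two-column comma-delimited file, whereas Smith|1,250.00 splits cleanly into two fields with the pipe delimiter.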

Ensure that the EMDI data source is clean for indexing


The following list summarizes a cleansed data source that is ready for indexing:
■ It contains at least one Required (key) column and one Optional column.
■ It is not a single-column data source; it has two or more columns.
■ Empty cells and rows and blank columns are removed.
■ Incomplete and duplicate records are removed.
■ The number of faulty cells is below the default error rate (5%) for indexing.
■ Fake data is not used to fill in blank cells or rows.
■ Improper and ambiguous characters are removed.
■ Multi-tokens comply with space and memory requirements.
■ Column fields are validated against the system-defined patterns that are available.
■ Mappings are validated against policy templates where applicable.

Include column headers as the first row of the EMDI data source file
When you extract the source data to the data source file, you should include the column
headers as the first row in the data source file. Including the column headers makes it easier
for you to identify the data you want to use in your policies.
The column names reflect the column mappings that were created when the exact data profile
was added. If there is an unmapped column, it is called Col X, where X is the column number
(starting with 1) in the original data profile.

Check the EMDI system alerts to tune profile accuracy


You should always review the system alerts after creating the Exact Match Data Identifier
profile. The system alerts provide very specific information about problems you encounter
when you create the profile. For example, an SSN in an address field affects accuracy.
Use scheduled indexing to automate EMDI profile updates


When you configure an Exact Match Data Identifier Profile, you can set a schedule for
indexing the data source file. Index scheduling lets you decide when you want to index the
data source file. For example, instead of indexing the data source at the same time that you
define the profile, you can schedule it for a later date. Alternatively, if you need to reindex the
data source on a regular basis, you can schedule indexing to occur on a regular basis. Before
you set up an index schedule, consider the following:
■ If you update your data sources occasionally (for example, less than once a month),
generally there is no need to create a schedule. Index the data each time you update the
data source.
■ Schedule indexing for times of minimal system use. Indexing affects performance throughout
the Symantec Data Loss Prevention system, and large data sources can take time to index.
■ Index a data source as soon as you add or modify the corresponding exact data profile,
and re-index the data source whenever you update it. For example, consider a scenario
whereby every Wednesday at 2:00 P.M. you generate an updated data source file. In this
case you could schedule indexing every Wednesday at 3:00 P.M. This would give you
enough time to cleanse the data source file and copy it to the Enforce Server.
■ Do not index data sources daily. Daily indexing can degrade performance.
■ Monitor results and modify your indexing schedule accordingly. For example, if performance
is good and you want more timely updates, schedule more frequent data updates and
indexing.

EMDI Troubleshooting
Scan the following problems and solutions before you call Symantec support. Also, follow
EMDI Best Practices to avoid problems in your EMDI deployment.
See “Best practices for using EMDI” on page 517.

The EMDI index doesn’t get published to the Endpoint Agent


Solution: Verify that the parameter EMDI.EnabledOnAgents = true in the Protect.properties
file on each endpoint server.

The EMDI index doesn’t get published to the Endpoint Agent and
the EnabledOnAgents setting is true
Solution: Verify that the EMDI.MaxEndpointProfileMemoryInMB parameter in the
Protect.properties file on each endpoint server is set to a value larger than the index size.
A key column that is in an EMDI index doesn’t generate an incident


Solution: If the Data Identifier in the key (required) column is associated with other validators,
make sure that the value passes these validators. Disable the validation against the EMDI
profile to see if an incident is generated against the same file or message.

EMDI generates an unexpectedly high number of false positives


Solution: Increase the minimum number of optional columns required for a match or remove
any optional columns that contain a large number of repeated values (for example, state or
ZIP Code).
Chapter 26
Detecting content using
Exact Data Matching (EDM)
This chapter includes the following topics:

■ Introducing Exact Data Matching (EDM)

■ Configuring Exact Data profiles for EDM

■ Configuring EDM policies

■ Using multi-token matching with EDM

■ Updating EDM indexes to the latest version

■ Memory requirements for EDM

■ Remote EDM indexing

■ Best practices for using EDM

Introducing Exact Data Matching (EDM)


Exact Data Matching (EDM) is designed to protect your most sensitive content. You can use
EDM to detect structured, tabular data, including personally identifiable information (PII). EDM
is designed to find records that are part of an indexed data source in either structured or
unstructured targets. Some examples are social security numbers, bank account numbers,
and credit card numbers. You can also detect confidential customer and employee records,
price list entries, parts from a parts list, and other confidential data stored in a structured data
source, such as a database, directory server, or a structured data file such as CSV or
spreadsheet.
To implement EDM policies, you identify and prepare the data you want to protect. You create
an Exact Data Profile and index the structured data source using the Enforce Server
administration console, or remotely using the Remote EDM Indexer. During the indexing
process, the system indexes the data by accessing and extracting the text-based content,
normalizing it, and securing it using a nonreversible hash. You can schedule indexing on a
regular basis after you have pulled current data from the data source to ensure that the EDM
index reflects the current data.
Once you have profiled the data, you configure the Content Matches Exact Data condition
to match individual pieces of the indexed data. For increased accuracy you can configure the
condition to match combinations of data fields from a particular record. The EDM policy condition
matches on data coming from the same row or record of data. For example, you can configure
the EDM policy condition to look for any three of First Name, Last Name, SSN, Account Number,
or Phone Number occurring together in a message and corresponding to a record from your
customer database.
Once the policy is deployed to one or more detection servers, cloud detection services, or
appliances, the system can detect the data fields (or records) that you have profiled in either
structured or unstructured format. For example, you could deploy the EDM policy to a Network
Discover Server and scan data repositories for confidential data matching data records in the
index. You could also deploy the EDM policy to a Network Prevent for Email Server to detect
records in email communications and attachments, such as Microsoft Word files. If the
attachment is a spreadsheet, such as Microsoft Excel, the EDM policy can detect the presence
of confidential records there as well.
See “About the Exact Data Profile and index” on page 528.

About using EDM to protect content


To understand how EDM works, consider the following example. Your company maintains an
employee database that contains the following column fields:
■ First Name
■ Last Name
■ SSN
■ Date of Hire
■ Salary
In a structured data format such as a database, each row represents one record, with each
record containing values for each column data field. In this example, each row in the database
contains information for one employee, and you can use EDM to protect each record. For
example, one row in the data source file contains the following pipe ("|") delimited record:
First Name | Last Name | SSN | Date of hire | Salary
Bob | Smith | 123-45-6789 | 05/26/99 | $42500
You create an Exact Data Profile and index the data source file. When you configure the profile,
you map the data field columns to system-defined patterns and validate the data. You then
configure the EDM policy condition that references the Exact Data Profile. In this example, the
condition matches if a message contains all five data fields.
The detection server reports a match if it detects the following in any inbound message:
Bob Smith 123-45-6789 05/26/99 $42500
But, a message containing the following does not match because that record is not in the
index:
Betty Smith 000-00-0000 05/26/99 $42500
If you limited the condition to matching only the Last Name, SSN, and Salary column fields,
the following message is a match because it meets the criteria:
Robert, Smith, 123-45-6789, 05/29/99, $42500
Finally, the following message contents do not match because the value for the SSN is not
present in the profile:
Bob, Smith, 415-789-0000, 05/26/99, $42500
See “Configuring Exact Data profiles for EDM” on page 534.
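The row-based matching in the example above can be pictured with a short Python sketch. It is a conceptual illustration only: the SHA-256 hash, the lowercasing, the tokenization on commas and spaces, and the function names are all assumptions, and the sketch ignores proximity, pattern validation, and multi-token cells. It is not the algorithm that the product uses.

```python
import hashlib


def h(value):
    """Hash a normalized value (illustrative scheme, not the product's)."""
    return hashlib.sha256(value.strip().lower().encode("utf-8")).hexdigest()


def build_index(rows, columns):
    """Store, per row, only the hashes of the selected columns."""
    return [{col: h(row[col]) for col in columns} for row in rows]


def message_matches(index, message, required):
    """True if a single indexed row has at least `required`
    column hashes among the hashed message tokens."""
    tokens = {h(tok) for tok in message.replace(",", " ").split()}
    return any(
        sum(1 for value in row.values() if value in tokens) >= required
        for row in index
    )
```

With the Bob Smith record indexed on Last Name, SSN, and Salary and three matches required, the message "Robert, Smith, 123-45-6789, 05/29/99, $42500" matches, and "Bob, Smith, 415-789-0000, 05/26/99, $42500" does not, because only two of the three indexed values from that row are present.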

EDM policy features


EDM policy matching involves searching for indexed content in a given message or file and
generating an incident if a match is found within the defined proximity range. The proximity
range can be changed by editing the EDM.SimpleTextProximityRadius Advanced Server
setting.
Policy matching features of EDM include the following:
■ You can select any number of columns to be matched from a given data source.
■ You can define excluded combinations so that matches against those combinations are
not reported.
■ When the system creates the index, it provides pattern validation for social security numbers,
credit card numbers, U.S. and Canada phone numbers and ZIP codes, email and IP
addresses, numbers, percents, and fields containing other values.
■ You can use an editable stopword dictionary to prevent single-token stopwords from
matching, which keeps EDM from treating articles and prepositions as possible field
matches. Stopwords are common words, such as articles and prepositions. Stopwords are
not indexed.
■ The system provides match highlighting at the incident snapshot screen: tokens from
matching rows are highlighted.
■ You can use a WHERE clause in the EDM rule and matches that do not satisfy the WHERE
clause are ignored. For example, you can use a WHERE clause to only match on records
where the customer's country is the United States.
■ You can use Data Owner Exception to ignore detection based on the sender or recipient's
email address or domain. Data owner exception lets you tag or authorize a specific field
in an Exact Data Profile as the data owner. At run-time if the sender or recipient of the data
is authorized as a data owner, the condition does not trigger a match and the data is sent
or received by the data owner.
■ You can use profiled Directory Group Matching (DGM) to match on senders or recipients
of data based on email address or Windows user name.
■ The proximity matching range is proportional to the number of required matches set in the
policy condition.
■ Full support for single- and multi-token cell indexing and matching. A multi-token is a cell
that is indexed that contains two or more words. Since a single CJK (Chinese, Japanese,
Korean) character is regarded as a token, two or more CJK characters are regarded as a
multi-token.

About the Exact Data Profile and index


The Exact Data Profile is the user-defined configuration that you create in order to index
the data source. The index is a set of secure files that contain hashes of the exact data
values from each field in your data source, along with information about those data values.
The index does not contain the data values themselves.
The index that is generated consists of 19 binary DataSource.rdx files, each with space to fit
into random access memory (RAM) on the detection server(s). By default, Symantec Data
Loss Prevention stores index files in
C:\ProgramData\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\Protect\index
(on Windows) or in
/var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/Protect/index (on
Linux) on the Enforce Server.
Symantec Data Loss Prevention automatically deploys all EDM indexes (*.rdx files) to the
index directory on all detection servers. When an active policy that references an EDM profile
is deployed to a detection server, the detection server loads the corresponding EDM index
into RAM. If a new detection server is added after an index has been created, the *.rdx files
in the index folder on the Enforce Server are deployed to the index folder on the new detection
server. You cannot manually deploy index files to detection servers.
At run-time during detection, the system converts extracted content into hashed data values
using the same algorithm it employs for indexes. It then compares data values from input
content to those in the appropriate index file(s), identifying matches.
See “Creating and modifying Exact Data Profiles for EDM” on page 541.
See “Memory requirements for EDM” on page 579.
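The following Python sketch illustrates the general idea of matching on hashes instead of raw values. It is not the product's implementation: the hashing algorithm (SHA-256), the normalization step, and all function names are assumptions chosen for illustration only.

```python
import hashlib


def hash_value(value):
    """Hash one normalized data value (SHA-256 is an assumption)."""
    normalized = value.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_index(rows):
    """Return a set of hashes for every non-empty cell; raw values are not kept."""
    return {hash_value(cell) for row in rows for cell in row if cell.strip()}


def find_matches(tokens, index):
    """Hash each extracted token the same way and look it up in the index."""
    return [token for token in tokens if hash_value(token) in index]


index = build_index([["Jane Doe", "123-45-6789"], ["John Roe", "987-65-4321"]])
matches = find_matches(["hello", "123-45-6789", "world"], index)
```

Because both sides use the same hash function, a lookup succeeds only for identical normalized values, and the index itself never holds the original data.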

About the exact data source file


The data source file is a tabular file containing data in a standard delimited format (comma,
semicolon, pipe, or tab) that has been extracted from a database, spreadsheet, or other
structured data source, and cleansed for profiling. You upload the data source file to the Enforce
Server when you are defining the Exact Data Profile. For example, you can convert an Excel
spreadsheet to a comma-separated values (CSV) format and the resulting *.csv file can be
used as the data source for your EDM profile.
See “About cleansing the exact data source file for EDM” on page 530.
See “Creating the exact data source file for EDM” on page 535.
You can use the SQL pre-indexer to index the data source directly. However, this approach
has limitations because in most cases the data must first be cleansed before it is indexed.
See “Remote EDM indexing” on page 585.
The data source file must contain at least one unique column field. A unique column field is a
column that has mostly unique values. It can have duplicate values, but not more than the
number set in term_commonority_threshold. The default value for this setting is 10. Some
examples of unique column fields include social security number, driver's license number, and
credit card number.
See “Best practices for using EDM” on page 601.
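Before indexing, you may want to check which columns qualify as unique column fields. The sketch below assumes that the threshold means "no single value may occur more than 10 times in the column"; that interpretation, and the function names, are assumptions and not a description of how the indexer evaluates the setting.

```python
from collections import Counter


def most_repeated(values):
    """Return (value, count) for the most frequent non-empty value, or None."""
    counts = Counter(v.strip() for v in values if v.strip())
    return counts.most_common(1)[0] if counts else None


def is_unique_column(values, threshold=10):
    """True if no single value occurs more than `threshold` times (assumed rule)."""
    top = most_repeated(values)
    return top is not None and top[1] <= threshold
```

For example, `is_unique_column(account_numbers)` returns False as soon as any single account number appears 11 or more times.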
The maximum number of columns for a single data source file is 32. If the data source file has
more than 32 columns, the Enforce Server administration console produces an error message
at the profile screen, and the data source file is not indexed. The maximum number of rows
is 4,294,967,294 and the total number of cells in a single data source file cannot exceed 6
billion cells. If your data source file is larger than this, split it into multiple files and index each
separately.
Table 26-1 summarizes size limitations for EDM data source files.

Note: The format for the data source file should be a text-based format using commas,
semicolons, pipes, or tabs as delimiters. In general you should avoid using a spreadsheet
format for the data source file (such as XLS or XLSX) because such programs use scientific
notation to render numbers.

Table 26-1 EDM data source file size limitations

Data source file   Limit            Description

Columns            32               The data source file cannot have more than 32 columns. If it
                                    does, the system does not index it.

Cells              6 billion        The data source file cannot have more than 6 billion data
                                    cells. If it does, the system does not index it.

Rows               4,294,967,294    The maximum number of rows supported is 4,294,967,294.
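The three limits interact, so it can help to check them together before you attempt to index. The following is a minimal sketch; the constant and function names are illustrative and are not part of the product.

```python
MAX_COLUMNS = 32
MAX_ROWS = 4_294_967_294
MAX_CELLS = 6_000_000_000


def check_limits(row_count, column_count):
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if column_count > MAX_COLUMNS:
        problems.append(f"{column_count} columns exceeds the limit of {MAX_COLUMNS}")
    if row_count > MAX_ROWS:
        problems.append(f"{row_count} rows exceeds the limit of {MAX_ROWS}")
    cells = row_count * column_count
    if cells > MAX_CELLS:
        problems.append(f"{cells} cells exceeds the limit of {MAX_CELLS}")
    return problems
```

Note that the cell limit is usually reached first: a 32-column file reaches 6 billion cells at 187,500,000 rows, well below the row limit.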

About cleansing the exact data source file for EDM


Once you have created the data source file, you must prepare the data for indexing by cleansing
it. It is critical that you cleanse the data source file to ensure that your EDM policies are as
accurate as possible. You can use tools such as Stream Editor (sed) and awk to cleanse the
data source file. Melissa Data provides good tools for normalizing data in the data source,
such as addresses.
Table 26-2 provides the workflow for cleansing the data source file for indexing.

Table 26-2 Workflow for cleansing the data source file for EDM

Step  Action                                    Description

1     Prepare the data source file for          See “Preparing the exact data source file for indexing
      indexing.                                 for EDM” on page 537.

2     Ensure that the data source has at        See “Ensure data source has at least one column
      least one column that is unique data.     of unique data (EDM)” on page 602.

3     Remove incomplete and duplicate           See “Cleanse the data source file of blank columns
      records. Do not fill empty cells with     and duplicate rows (EDM)” on page 603.
      bogus data.

4     Remove improper characters.               See “Remove ambiguous character types from the
                                                data source file (EDM)” on page 604.

5     Verify that the data source file is       See “Preparing the exact data source file for indexing
      below the error threshold. The error      for EDM” on page 537.
      threshold is the maximum percentage
      of rows that contain errors before
      indexing stops.
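Step 3 of the workflow can be scripted. The sketch below drops incomplete rows and exact duplicate rows from data that has already been split into cells. The function name and the rule that a row is "incomplete" when any cell is empty or the cell count is wrong are assumptions for illustration; adjust them to your own data.

```python
def cleanse_rows(rows, expected_columns):
    """Drop incomplete rows and exact duplicates, keeping the first occurrence."""
    seen = set()
    cleaned = []
    for row in rows:
        cells = [cell.strip() for cell in row]
        # Incomplete: wrong number of cells or at least one empty cell.
        if len(cells) != expected_columns or any(cell == "" for cell in cells):
            continue
        key = tuple(cells)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(cells)
    return cleaned
```

Rows are removed rather than padded, in line with the guidance not to fill empty cells with bogus data.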

About using System Fields for data source validation with EDM
Column headings in your data source are useful for visual reference. However, they do not
tell Symantec Data Loss Prevention what kind of data the columns contain. To do this, you
use the Field Mappings section of the Exact Data Profile to specify mappings between fields
in your data source. You can also use field mappings to specify fields that the system recognizes
in the system-provided policy templates. The Field Mappings section also gives you advanced
options for specifying custom fields and validating the data in those fields.
See “Mapping Exact Data Profile fields for EDM” on page 545.
Consider the following example use of field mappings. Your company wants to protect employee
data, including employee social security numbers. You create a Data Loss Prevention policy
based on the Employee Data Protection template. The policy requires an exact data index
with fields for social security numbers and other employee data. You prepare your data source
and then create the Exact Data Profile. To validate the data in the social security number
field, you map this column field in your index to the "Social Security Number" system field
pattern. The system then validates all data in that field using the Social Security Number
validator to ensure that each data item is a social security number.
Using the system-defined field patterns to validate your data is critical to the accuracy of your
EDM policies. If there is no system-defined field pattern that corresponds to one or more data
fields in your index, you can define custom fields and choose the appropriate validator to
validate the data.
See “Map data source column to system fields to leverage validation (EDM)” on page 605.
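To get a feel for what field validation catches, you can run a rough pre-check on a column before you map it. The pattern below is a simplified stand-in and not the Social Security Number validator that the product uses: it only checks for nine digits, optionally separated consistently by dashes or spaces. All names are hypothetical.

```python
import re

# Nine digits in a 3-2-4 grouping; the separator (dash, space, or none)
# must be the same in both positions.
SSN_PATTERN = re.compile(r"\d{3}([- ]?)\d{2}\1\d{4}")


def looks_like_ssn(value):
    """Rough format check only; does not prove the number is a real SSN."""
    return SSN_PATTERN.fullmatch(value.strip()) is not None


def invalid_cells(column):
    """Return (row_number, value) pairs for cells that fail the format check."""
    return [(i, v) for i, v in enumerate(column, start=1) if not looks_like_ssn(v)]
```

Cells reported by `invalid_cells` are candidates for cleansing before you index, because misplaced data of this kind counts toward the error threshold.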

About index scheduling for EDM


After you have indexed an exact data source extract, its schema cannot be changed because
the *.rdx index file is binary. If the data source changes, or the number of columns or data
mapping of the exact data source file changes, you must create a new EDM index and update
the policies that reference the changed data. In this case you can schedule the indexing to
keep the index in sync with the data source.
The typical use case is as follows. You extract data from a database to a file and cleanse it to
create your data source file. Using the Enforce Server administration console you define an
Exact Data Profile and index the data source file. The system generates the *.rdx index files
and deploys them to one or more detection servers. However, if you know that the data changes
frequently, you need to generate a new data source file weekly or monthly to keep up with the
changes to the database. In this case, you can use index scheduling to automate the indexing
of the data source file so you do not have to return to the Enforce Server administration console
and reindex the updated data source. Your only task is to drop an updated and cleansed data
source file to the Enforce Server for scheduled indexing.

Note: You must reindex after upgrading to the latest version of Symantec Data Loss Prevention.

See “Configuring Exact Data profiles for EDM” on page 534.


See “Scheduling Exact Data Profile indexing for EDM” on page 548.
See “Use scheduled indexing to automate profile updates (EDM)” on page 607.
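If you automate the step of dropping the updated data source file, one cautious approach is to copy the file under a temporary name and then rename it, so that a scheduled indexing run does not encounter a half-written file. This guide does not state how the indexer treats partially copied files, so treat this as a general precaution. The function name, the `.part` suffix, and the sample file names are assumptions.

```python
import os
import shutil
import tempfile


def drop_data_source(source_path, datafiles_dir, target_name):
    """Copy to a temporary name first, then rename into place in one step."""
    final_path = os.path.join(datafiles_dir, target_name)
    partial_path = final_path + ".part"
    shutil.copyfile(source_path, partial_path)
    os.replace(partial_path, final_path)
    return final_path


# Demonstration in a throwaway directory (stands in for the datafiles directory).
with tempfile.TemporaryDirectory() as work:
    datafiles = os.path.join(work, "datafiles")
    os.makedirs(datafiles)
    source = os.path.join(work, "export.txt")
    with open(source, "w", encoding="utf-8") as handle:
        handle.write("name|account\nJane Doe|1234567890\n")
    placed = drop_data_source(source, datafiles, "employees.txt")
    placed_name = os.path.basename(placed)
    leftovers = sorted(os.listdir(datafiles))
```

The target name must be the same file name that the Exact Data Profile references.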

About the Content Matches Exact Data From condition for EDM
The Content Matches Exact Data From an Exact Data Profile condition is the detection
component you use to implement EDM policy conditions. When you define this condition, you
select the EDM profile on which the condition is based. You also select the columns you want
to use in your condition, as well as any WHERE clause limitations.

Note: Symantec Data Loss Prevention does not support the use of the Content Matches Exact
Data From an Exact Data Profile condition as a policy exception.

See “Configuring the Content Matches Exact Data policy condition for EDM” on page 551.

About Data Owner Exception for EDM


Although EDM does not support the explicit use of match exceptions in policies, EDM does
support criteria-based matching exceptions. This feature of EDM is known as Data Owner
Exception. Data owner exception lets you tag or authorize a specific field in an Exact Data
Profile as the data owner. At run-time if the sender or recipient of the data is authorized as a
data owner, the condition does not trigger a match and the data is sent or received by the data
owner.
You implement data owner exception by including either the email address field or domain
address field in your Exact Data Profile. In the EDM policy condition, you specify the field as
either the sender or recipient data owner. An authorized data owner, identified by email address
or a domain address, who is a sender can send confidential information without triggering an
EDM match or incident. This means that the sender can send any information that is contained
in the row where the sender's email address or domain is specified. Authorized data owner
recipients can be specified individually or all recipients in the list can be allowed to receive the
data without triggering a match.
As a policy author, data owner exception gives you the flexibility to allow data owners to use
their own data legitimately. For example, if data owner exception is enabled, an employee can
send an email containing their confidential information (such as an account number) without
triggering a match or an incident. Similarly, if data owner exception is configured for a recipient,
the system does not trigger an EDM match or incident if the data owner receives their own
information, such as when someone outside the company sends an email to the data owner
containing the data owner's account number.
See “About upgrading EDM deployments” on page 534.
See “Creating the exact data source file for Data Owner Exception for EDM” on page 536.
See “Configuring Data Owner Exception for EDM policy conditions” on page 554.
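Conceptually, data owner exception filters out the matched rows that belong to the sender (or recipient). The following sketch models that idea for the sender case using an email address field. It is an illustration of the behavior described above, not the detection engine's logic; the row layout, field name, and addresses are made up.

```python
def apply_data_owner_exception(matched_rows, sender, owner_field="email"):
    """Keep only the matched rows whose owner field is not the sender."""
    sender = sender.strip().lower()
    return [
        row for row in matched_rows
        if row.get(owner_field, "").strip().lower() != sender
    ]


matched = [
    {"email": "jane@example.com", "account": "1234567890"},
    {"email": "john@example.com", "account": "5555555555"},
]
remaining = apply_data_owner_exception(matched, "jane@example.com")
```

Here Jane can send her own account number without a match, while the row that belongs to John still counts.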

About profiled Directory Group Matching (DGM) for EDM


Profiled Directory Group Matching (DGM) is a specialized implementation of EDM that is used
to detect the exact identity of a message user, sender, or recipient that has been profiled from
a directory server or database.
Profiled DGM leverages EDM technology to detect identities that you have indexed from your
database or directory server using an Exact Data Profile. For example, you can use profiled
DGM to identify network user activity or to analyze content associated with particular users,
senders, or recipients. Or, you can exclude certain email addresses from analysis. Or, you
might want to prevent certain people from sending confidential information by email.
To implement profiled DGM, your exact data source file must contain one or more of the
following fields:
■ Email address
■ IP address
■ Windows user name
■ IM name
If you include the email address field in the DGM profile, the field appears in the Directory
EDM drop-down list at the incident snapshot screen in the Enforce Server administration
console, which facilitates remediation.
See “Creating the exact data source file for profiled DGM for EDM” on page 537.
See “Include an email address field in the Exact Data Profile for profiled DGM (EDM)”
on page 610.
See “Use profiled DGM for Network Prevent for Web identity detection (EDM)” on page 611.

About two-tier detection for EDM on the endpoint


The EDM index is server-based. If you deploy a policy containing an EDM condition to the
DLP Agent on the endpoint, the system uses two-tier detection to evaluate data for matching.
The EDM detection condition is not evaluated locally by the DLP Agent. Instead, the DLP
Agent sends the data to the Endpoint Server for evaluation against the index. If the endpoint
is offline, the message cannot be sent until the server is available, which can affect endpoint
performance. In addition, two-tier detection has no ability to block, encrypt, or notify. Symantec
does not recommend two-tier detection.
See “Two-tier detection for DLP Agents” on page 395.
To check if you are using two-tier detection, read the
C:\ProgramData\Symantec\DataLossPrevention\DetectionServer\15.5\logs\debug\FileReader.log
on the Endpoint Server to see if any EDM indexes are loaded. Look for the line "loaded database
profile."
See “Troubleshooting policies” on page 445.
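A small script can search the log for you. The sketch below scans lines of text for the phrase quoted above. The sample lines are invented placeholders, because the actual layout of FileReader.log entries is not shown in this guide, and the function name is hypothetical.

```python
def find_loaded_profiles(log_lines, marker="loaded database profile"):
    """Return the log lines that contain the marker phrase (case-insensitive)."""
    return [line.strip() for line in log_lines if marker in line.lower()]


# Invented sample lines; real log entries will look different.
sample_log = [
    "INFO starting file reader",
    "INFO loaded database profile (placeholder entry)",
    "INFO policy set updated",
]
hits = find_loaded_profiles(sample_log)
```

To scan the real file, open it and pass the file object, for example `find_loaded_profiles(open(path, encoding="utf-8", errors="replace"))`.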

About upgrading EDM deployments


To take advantage of the latest EDM enhancements, you must upgrade your servers to the
latest version of Symantec Data Loss Prevention and you must reindex your EDM
data sources using the latest version of the EDM Indexer. Reindexing should be done after
you upgrade all of your servers. In that case, the old detection servers can continue to work
with the old indexes while you upgrade.
See “About Data Owner Exception for EDM” on page 532.
See “Updating EDM indexes to the latest version” on page 574.
See “Memory requirements for EDM” on page 579.
See “EDM index out-of-date error codes” on page 578.

Configuring Exact Data profiles for EDM


To implement EDM, you create the Exact Data Profile, index the data source, and define one
or more Content Matches Exact Data conditions to match profiled data exactly.
See “About the Exact Data Profile and index” on page 528.

Table 26-3 Implementing Exact Data Matching with EDM

Step  Action                            Description

1     Create the data source file.      Export the source data from the database (or other data
                                        repository) to a tabular text file with delimited fields.

                                        If you want to except data owners from matching, you need to
                                        include specific data items in the data source file.

                                        See “About the exact data source file” on page 529.

                                        If you want to match identities for profiled Directory Group
                                        Matching (DGM), you need to include specific data items in
                                        the data source files.

                                        See “Creating the exact data source file for EDM” on page 535.

                                        See “Creating the exact data source file for profiled DGM for
                                        EDM” on page 537.

2     Prepare the data source file      Cleanse the data source file.
      for indexing.
                                        See “Preparing the exact data source file for indexing for
                                        EDM” on page 537.

3     Upload the data source file to    You can copy or upload the data source file to the Enforce
      the Enforce Server.               Server, or access it remotely.

                                        See “Uploading exact data source files for EDM to the Enforce
                                        Server” on page 539.

4     Create an Exact Data Profile.     An Exact Data Profile is required to implement Exact Data
                                        Matching (EDM) policies. The Exact Data Profile specifies the
                                        data source, data field types, and the indexing schedule.

                                        See “Creating and modifying Exact Data Profiles for EDM”
                                        on page 541.

5     Map and validate the data         You map the source data fields to system or custom data types
      fields.                           that the system validates. For example, a social security
                                        number data field needs to be nine digits.

                                        See “About using System Fields for data source validation
                                        with EDM” on page 530.

                                        See “Mapping Exact Data Profile fields for EDM” on page 545.

6     Index the data source, or         Schedule the indexing to keep the index in sync with the data
      schedule indexing.                source.

                                        See “About index scheduling for EDM” on page 531.

                                        See “Scheduling Exact Data Profile indexing for EDM”
                                        on page 548.

7     Configure and tune one or more    See “Configuring the Content Matches Exact Data policy
      Content Matches Exact Data        condition for EDM” on page 551.
      policy conditions.

Creating the exact data source file for EDM


The first step in the EDM indexing process is to create the data source. A data source is a
tabular file containing data in a standard delimited format, where data is delimited by commas,
semicolons, pipes, or tabs.
If you plan to use a policy template, review it before creating the data source file to see which
data fields the policy uses. For relatively small data sources, include as many suggested fields
in your data source as possible. However, note that the more fields you include, the more
memory the resulting index requires. This consideration is important if you have a large data
source. When you create the data profile, you can confirm how well the fields in your data
source match against the suggested fields for the template.

See Table 26-4 on page 536.

Table 26-4 Create the exact data source file

Step Description

1 Export the data you want to protect from a database or other tabular data format, such as an Excel
spreadsheet, to a tabular text file. The data source file you create must be a tabular text file that contains
rows of data from the original source. Each row from the original source is included as a row in the data
source file. Delimit columns using a tab, a comma, or a pipe. Pipe is preferred. Comma should not be
used if your data source fields contain numbers.

See “About the exact data source file” on page 529.

You must maintain all the structured data that you exported from the source database table or table-like
format in one data source file. You cannot split the data source across multiple files.

The data source file cannot exceed 32 columns, 4,294,967,294 rows, or 6 billion cells. If you plan to
upload the data source file to the Enforce Server, browser capacity limits the data source size to 2 GB.
For file sizes larger than this size you can copy the file to the Enforce Server using FTP/S, SCP, SFTP,
CIFS, or NFS.

2 Include required data fields for specific EDM implementations:

■ Unique data
For all EDM implementations, make sure that the data source contains at least one column of unique
data.
See “Ensure data source has at least one column of unique data (EDM)” on page 602.
■ Data Owner Exception
Make sure that the data source contains the email address field or domain field, if you plan to use
data owner exceptions.
See “Creating the exact data source file for Data Owner Exception for EDM” on page 536.
■ Directory Group Matching
Make sure that the data source includes one or more sender/recipient identifying fields.
See “Creating the exact data source file for profiled DGM for EDM” on page 537.

3 Prepare the data source file for indexing.

See “Preparing the exact data source file for indexing for EDM” on page 537.

Creating the exact data source file for Data Owner Exception for
EDM
To implement Data Owner Exception and exclude data owners from detection, you must explicitly
include each user's email address or domain address in the Exact Data Profile. Each expected
domain (for example, symantec.com) must be explicitly added to the Exact Data Profile. The
system does not automatically match on subdomains (for example, support.symantec.com).
Each subdomain must be explicitly added to the Exact Data Profile.
To implement the data owner exception feature, you must include either or both of the following
fields in your data source file:
■ Email address, such as [email protected]
■ Domain address, such as symantec.com
See “About Data Owner Exception for EDM” on page 532.
See “Configuring Data Owner Exception for EDM policy conditions” on page 554.
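The subdomain rule is easy to overlook when you assemble the domain column. The sketch below models exact domain comparison, so that an address at support.symantec.com is not covered by an entry for symantec.com. The function names and the sample mailbox names are hypothetical; only the two domain names come from the text above.

```python
def domain_of(address):
    """Return the lowercase domain part of an email address."""
    return address.rsplit("@", 1)[-1].strip().lower()


def is_profiled_domain(address, profiled_domains):
    """Exact comparison only: subdomains are not matched implicitly."""
    return domain_of(address) in {d.strip().lower() for d in profiled_domains}
```

With `["symantec.com"]` in the profile, `is_profiled_domain("bob@support.symantec.com", ...)` is False until support.symantec.com is added as its own entry.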

Creating the exact data source file for profiled DGM for EDM
Profiled DGM leverages Exact Data Matching (EDM) technology to precisely detect identities.
Identity-related attributes may include an IP address, email address, user name, business
unit, department, manager, title, or employment status. Other attributes may be whether that
employee has provided consent to be monitored, or whether the employee has access to
sensitive information. To implement profiled DGM, you must include at least one required data
field in your data source.
See “About the Exact Data Profile and index” on page 528.
Table 26-5 lists the required fields for profiled DGM. The data source file must contain at least
one of these fields.

Table 26-5 Profiled DGM data source fields for EDM

Field Description

Email address If you use an email address column field in the data source file, the email address appears in
the Directory EDM drop-down list at the incident snapshot screen.

IP address For example: 172.24.56.33

Windows user name If you use a Windows user name field in your data source, the data must be in the following
format: domain\user; for example: ACME\john_smith.

AOL IM name IM screen name

Skype name For example: myscreenname123

Microsoft Office
Communicator name
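You can check the two fields that have a stated format before you index. The following sketch validates IP addresses with the Python standard library and checks that Windows user names follow the domain\user form. The regular expression is a simplified assumption (one backslash, no whitespace), not the product's own validation rule.

```python
import ipaddress
import re

# Simplified assumption: exactly one backslash separating two non-empty parts.
WINDOWS_USER = re.compile(r"[^\\/\s]+\\[^\\/\s]+")


def valid_windows_user(value):
    """True if the value looks like domain\\user, for example ACME\\john_smith."""
    return WINDOWS_USER.fullmatch(value.strip()) is not None


def valid_ip(value):
    """True if the value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value.strip())
        return True
    except ValueError:
        return False
```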

Preparing the exact data source file for indexing for EDM
Once you create the exact data source file, you must prepare it so that you can efficiently index
the data you want to protect.
When you index an exact data profile, the Enforce Server keeps track of empty cells and any
misplaced data which count as errors. For example, an error may be a name that appears in
a column for phone numbers. Errors can constitute a certain percentage of the data in the
profile (five percent, by default). If this default error threshold is met, Symantec Data Loss
Prevention stops indexing. It then displays an error to warn you that your data may be
unorganized or corrupt.
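You can estimate the error rate yourself before indexing. The sketch below counts rows that contain an empty cell or the wrong number of cells and compares the result with the default threshold of five percent. It does not detect wrong-type data (that requires field validators), and whether a rate exactly equal to the threshold stops indexing is not modeled. The function names are illustrative.

```python
def row_has_error(row, expected_columns):
    """An error row has a missing/extra cell or at least one empty cell."""
    return len(row) != expected_columns or any(cell.strip() == "" for cell in row)


def error_percentage(rows, expected_columns):
    """Percentage of rows that contain at least one error."""
    if not rows:
        return 0.0
    errors = sum(1 for row in rows if row_has_error(row, expected_columns))
    return 100.0 * errors / len(rows)


def indexing_would_stop(rows, expected_columns, threshold_percent=5.0):
    """True if the estimated error rate is above the threshold."""
    return error_percentage(rows, expected_columns) > threshold_percent
```

For example, a file of 100 two-column rows in which 6 rows have an empty cell has an estimated error rate of 6 percent, which is above the default threshold.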
To prepare the exact data source for EDM indexing
1 Make sure that the data source file is formatted as follows:
■ If the data source has more than 200,000 rows, verify that it has at least two columns
of data. One of the columns should contain unique values. For example, credit card
numbers, driver’s license numbers, or account numbers (as opposed to first and last
names, which are generic).
See “Ensure data source has at least one column of unique data (EDM)” on page 602.
■ Verify that you have delimited the data source using pipes ( | ) or tabs. If the data
source file uses commas as delimiters, remove any commas that do not serve as
delimiters.
See “Do not use the comma delimiter if the data source has number fields (EDM)”
on page 605.
■ Verify that data values are not enclosed in quotes.
■ Remove single-character and abbreviated data values from the data source. For
example, remove the column name and all values for a column in which the possible
values are Y and N.
■ Optionally, remove any columns that contain numeric values with fewer than five digits,
as these can cause false positives in production.
See “Remove ambiguous character types from the data source file (EDM)” on page 604.
■ Verify that numbers, such as credit card or social security numbers, are delimited internally by
dashes or spaces, or are not delimited at all. Make sure that you do not use a data-field delimiter
such as a comma as an internal delimiter in any such numbers. For example:
123-45-6789, or 123 45 6789, or 123456789 are valid, but not 123,45,6789.
See “Do not use the comma delimiter if the data source has number fields (EDM)”
on page 605.
■ Eliminate duplicate records, which can cause duplicate incidents in production.
See “Cleanse the data source file of blank columns and duplicate rows (EDM)”
on page 603.
■ Do not index common values. EDM works best with values that are unique. Think
about the data you want to index (and thus protect). Is this data truly valuable? If the
value is something common, it is not useful as an EDM value. For example, suppose
that you want to look for "US states." Since there are only 50 states, if your exact data
profile has 300,000 rows, the result is a lot of duplicates of common values. Symantec
Data Loss Prevention indexes all values in the exact data profile, regardless of whether the
data is used in a policy. It is good practice to use values that are less common
and preferably unique to get the best results with EDM.
See “Ensure data source has at least one column of unique data (EDM)” on page 602.

2 Once you have prepared the exact data source file, proceed with the next step in the EDM
process: upload the exact data source file to the Enforce Server for profiling the data you
want to protect.
See “Uploading exact data source files for EDM to the Enforce Server” on page 539.
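Several of the formatting checks in step 1 can be handled in one pass when the export is a quoted CSV file. The sketch below uses the Python csv module to parse the quoted values and rewrites each row with pipe delimiters and no enclosing quotes. Replacing any pipe character found inside a value with a space is an assumption made so that the output stays well formed; the function name is hypothetical.

```python
import csv
import io


def to_pipe_delimited(csv_text):
    """Convert quoted, comma-delimited text to unquoted, pipe-delimited text."""
    reader = csv.reader(io.StringIO(csv_text))
    out_lines = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        # A literal pipe inside a value would be read as a delimiter later.
        cells = [cell.strip().replace("|", " ") for cell in row]
        out_lines.append("|".join(cells))
    return "\n".join(out_lines)


sample = 'name,ssn,address\n"Doe, Jane",123-45-6789,"1 Main St, Apt 2"\n'
converted = to_pipe_delimited(sample)
```

After conversion, commas that remain inside a value (such as in an address) no longer collide with the field delimiter.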

Uploading exact data source files for EDM to the Enforce Server
After you have prepared the data source file for indexing, load it to the Enforce Server so the
data source can be indexed.
See “Creating and modifying Exact Data Profiles for EDM” on page 541.
Listed here are the options you have for making the data source file available to the Enforce
Server. Consult with your database administrator to determine the best method for your needs.

Table 26-6 Uploading the data source file for EDM to the Enforce Server for indexing

Upload option(s)        Use case                Description

Upload Data Source      Data source file is     If you have a smaller data source file (less than 50 MB),
to Server Now           less than 50 MB.        upload the data source file to the Enforce Server using the
                                                Enforce Server administration console (web interface). When
                                                creating the Exact Data Profile, you can specify the file
                                                path or browse to the directory and upload the data source
                                                file.

                                                Note: Due to browser capacity limits, the maximum file size
                                                that you can upload is 2 GB. However, uploading any file
                                                over 50 MB is not recommended since files over this size can
                                                take a long time to upload. If your data source file is over
                                                50 MB, consider copying the data source file to the
                                                datafiles directory using the next option.

Reference Data          Data source file is     If you have a large data source file (over 50 MB), copy it to
Source on Manager       over 50 MB.             the datafiles directory on the host where Enforce is
Host                                            installed.

                                                ■ On Windows this directory is located at
                                                  C:\ProgramData\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\Protect\datafiles.
                                                ■ On Linux this directory is located at
                                                  /var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/datafiles.

                                                This option is convenient because it makes the data file
                                                available through a drop-down list during configuration of
                                                the Exact Data Profile. If it is a large file, use a
                                                third-party solution (such as Secure FTP) to transfer the
                                                data source file to the Enforce Server.

                                                Note: Ensure that the Enforce user (usually called
                                                "protect") has modify permissions (on Windows) or rw
                                                permissions (on Linux) for all files in the datafiles
                                                directory.

Use This File Name      Data source file is     You may want to create an EDM profile before you have created
                        not yet created.        the data source file. In this case you can create a profile
                                                template and specify the name of the data source file you
                                                plan to create. This option lets you define EDM policies
                                                using the EDM profile template before you index the data
                                                source. The policies do not operate until the data source is
                                                indexed. When you have created the data source file you
                                                place it in the
                                                \ProgramData\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\Protect\datafiles
                                                directory (Windows) or
                                                /var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/Protect/datafiles
                                                (Linux) and index the data source immediately on save or
                                                schedule indexing.

                                                See “Creating and modifying Exact Data Profiles for EDM”
                                                on page 541.

Use This File Name      Data source is to be    In some environments it may not be secure or feasible to copy
and                     indexed remotely and    or upload the data source file to the Enforce Server. In this
Load Externally         copied to the           situation you can index the data source remotely using
Generated Index         Enforce Server.         Remote EDM Indexer.

                                                See “Remote EDM indexing” on page 585.

                                                This utility lets you index an exact data source on a
                                                computer other than the Enforce Server host. This feature is
                                                useful when you do not want to copy the data source file to
                                                the same computer as the Enforce Server. As an example,
                                                consider a situation where the originating department wants
                                                to avoid the security risk of copying the data to an
                                                extra-departmental host. In this case you can use the Remote
                                                EDM Indexer.

                                                First you create an EDM profile template where you choose
                                                the Use this File Name and the Number of Columns options.
                                                You must specify the name of the data source file and the
                                                number of columns it contains.

                                                See “Creating an EDM profile template for remote indexing”
                                                on page 589.

                                                You then use the Remote EDM Indexer to remotely index the
                                                data source and copy the index files to the Enforce Server
                                                host and load the externally generated index. The Load
                                                Externally Generated Index option is only available after
                                                you have defined and saved the profile. Remote indexes are
                                                loaded from the
                                                \Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\index
                                                directory on the Enforce Server host.

                                                See “Copying and loading remote EDM index files to the
                                                Enforce Server” on page 594.
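The size guidance in Table 26-6 can be summarized as a small decision helper. The sketch below is only a reading aid: the function names are hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption, because the guide does not say which definition it uses.

```python
MB = 1024 * 1024            # assumption: binary megabytes
GB = 1024 * MB
RECOMMENDED_UPLOAD_LIMIT = 50 * MB
BROWSER_UPLOAD_LIMIT = 2 * GB


def can_upload_in_browser(size_bytes):
    """Browser capacity limits uploads to 2 GB."""
    return size_bytes <= BROWSER_UPLOAD_LIMIT


def suggest_upload_option(size_bytes, can_copy_to_enforce=True):
    """Map file size and data-handling constraints to an option in the table."""
    if not can_copy_to_enforce:
        return "Use This File Name and Load Externally Generated Index"
    if size_bytes < RECOMMENDED_UPLOAD_LIMIT:
        return "Upload Data Source to Server Now"
    return "Reference Data Source on Manager Host"
```

For example, `suggest_upload_option(200 * MB)` points to the datafiles directory option, because uploads over 50 MB are not recommended even though the browser allows up to 2 GB.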

Creating and modifying Exact Data Profiles for EDM


The Manage > Data Profiles > Exact Data > Add Exact Data Profile screen is the home
page for managing and adding Exact Data Profiles. An Exact Data Profile is required to
implement an instance of the Content Matches Exact Data condition. An Exact Data Profile
specifies the data source, the indexing parameters, and the indexing schedule. Once you have
created the EDM profile, you index the data source and configure one or more Content Matches
Exact Data conditions that can be added to rules to use the profile and detect exact content
matches.
See “Configuring Exact Data profiles for EDM” on page 534.

Note: If you are using the Remote EDM Indexer to generate the Exact Data Profile, refer to
the following topic.

To create or modify an Exact Data Profile


1 Make sure that you have created the data source file.
See “Creating the exact data source file for EDM” on page 535.
2 Make sure that you have prepared the data source file for indexing.
See “Preparing the exact data source file for indexing for EDM” on page 537.
3 Make sure that the data source contains the email address field or domain field, if you
plan to use data owner exceptions.
See “About Data Owner Exception for EDM” on page 532.
4 In the Enforce Server administration console, navigate to Manage > Data Profiles >
Exact Data.
5 Click Add Exact Data Profile.
6 Enter a unique, descriptive Name for the profile (limited to 256 characters).
For easy reference, choose a name that describes the data content and the index type
(for example, Employee Data EDM).
If you modify an existing Exact Data Profile you can change the profile name.
7 Select one of the following Data Source options to make the data source file available to
the Enforce Server:
■ Upload Data Source to Server Now
If you are creating a new profile, click Browse and select the data source file, or enter
the full path to the data source file.
If you are modifying an existing profile, select Upload Now.
See “Uploading exact data source files for EDM to the Enforce Server” on page 539.
■ Reference Data Source on Manager Host
If you copied the data source file to the datafiles directory on the Enforce Server, it
appears in the drop-down list for selection.
See “Uploading exact data source files for EDM to the Enforce Server” on page 539.
■ Use This File Name
Select this option if you have not yet created the data source file but want to configure
EDM policies using a placeholder EDM profile. Enter the file name of the data source
you plan to create, including the Number of Columns it is to have. When you do
create the data source, you must copy it to the datafiles directory.
See “Uploading exact data source files for EDM to the Enforce Server” on page 539.

Note: Use this option with caution. Be sure to remember to create the data source file
and copy it to the datafiles directory. Name the data source file exactly the same
as the name you enter here and include the exact number of columns you specify
here.

■ Load Externally Generated Index


Select this option if you have created an index on a remote computer using the Remote
EDM Indexer. This option is only available after you have defined and saved the profile.
Profiles are loaded from the \Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\index
directory (Windows) or the
/var/Symantec/DataLossPrevention/EnforceServer/15.5/index directory (Linux)
on the Enforce Server host.
See “Uploading exact data source files for EDM to the Enforce Server” on page 539.

8 If the first row of your data source contains Column Names, select Read first row as
column names.
9 Specify the Error Threshold, which is the maximum percentage of rows that contain
errors before indexing stops.
A data source error is either an empty cell, a cell with the wrong type of data, or extra
cells in the data source. For example, a name in a column for phone numbers is an error.
If errors exceed a certain percentage of the overall data source (by default, 5%), the
system quits indexing and displays an indexing error message. The index is not created
if the data source has more invalid records than the error threshold value allows. Although
you can change the threshold value, more than a small percentage of errors in the data
source can indicate that the data source is corrupt, is in an incorrect format, or cannot be
read. If you have a significant percentage of errors (10% or more), stop indexing and
cleanse the data source.
See “Preparing the exact data source file for indexing for EDM” on page 537.
10 Select the Column Separator Char (delimiter) that you have used to separate the values
in the data source file. The delimiters you can use are tabs, commas, or pipes.
11 Select one of the following encoding values for the content to analyze, which must match
the encoding of your data source:
■ ISO-8859-1 (Latin-1) (default value)
Standard 8-bit encoding for Western European languages using the Latin alphabet.
■ UTF-8
Use this encoding for all languages that use the Unicode 4.0 standard (all single- and
double-byte characters), including those in East Asian languages.
■ UTF-16
Use this encoding for all languages that use the Unicode 4.0 standard (all single- and
double-byte characters), including those in East Asian languages.

Note: Make sure that you select the correct encoding. The system does not prevent you
from creating an EDM profile using the wrong encoding. The system only reports an error
at run-time when the EDM policy attempts to match inbound data. To make sure that you
select the correct encoding, after you click Next, verify that the column names appear
correctly. If the column names do not look correct, you chose the wrong encoding.

12 Click Next to go to the second Add Exact Data Profile screen.


13 The Field Mappings section displays the columns in the data source and the field to
which each column is mapped in the Exact Data Profile. Field mappings in existing Exact
Data Profiles are fixed and, therefore, are not editable.
See “About using System Fields for data source validation with EDM” on page 530.
See “Mapping Exact Data Profile fields for EDM” on page 545.
Confirm that the column names in your data source are accurately represented in the
Data Source Field column. If you selected the Column Names option, the Data Source
Field column lists the names in the first row of your data source. If you did not select the
Column Names option, the column lists Col 1, Col 2, and so on.
14 In the System Field column, select a field from the drop-down list for each data source
field. This step is required if you use a policy template, or if you want to check for errors
in the data source.
For example, for a data source field that is called SOCIAL_SECURITY_NUMBER, select
Social Security Number from the corresponding drop-down list. The values in the System
Field drop-down lists include all suggested fields for all policy templates.
15 Optionally, specify and name any custom fields (that is, the fields that are not pre-populated
in the System Field drop-down lists). To do so, perform these steps in the following order:
■ Click Advanced View to the right of the Field Mappings heading. This screen displays
two additional columns (Custom Name and Type).
■ To add a custom system field name, go to the appropriate System Field drop-down
list. Select Custom, and type the name in the corresponding Custom Name text field.
■ To specify a pattern type (for purposes of error checking), go to the appropriate Type
drop-down list and select the wanted pattern. To see descriptions of all available pattern
types, click Description at the top of the column.

16 Check your field mappings against the suggested fields for the policy template you plan
to use. To do so, go to the Check Mappings Against drop-down list, select a template,
and click Check now on the right.
The system displays a list of all template fields that you have not mapped. You can go
back and map these fields now. Alternatively, you may want to expand your data source
to include as many expected fields as possible, and then re-create the exact data profile.
Symantec recommends that you include as many expected data fields as possible.
17 In the Indexing section of the screen, select one of the following options:
■ Submit Indexing Job on Save
Select this option to begin indexing the data source when you save the exact data
profile.
■ Submit Indexing Job on Schedule
Select this option to index the data source according to a specific schedule. Make a
selection from the Schedule drop-down list and specify days, dates, and times as
required.
See “About index scheduling for EDM” on page 531.
See “Scheduling Exact Data Profile indexing for EDM” on page 548.

18 Click Finish.
After Symantec Data Loss Prevention finishes indexing, it deletes the original data source
from the Enforce Server. After you index a data source, you cannot change its schema.
If you change column mappings for a data source after you index it, you must create a
new exact data profile.
After the indexing process is complete you can create new Content Matches Exact Data
conditions that can be added to a rule that references the Exact Data Profile you have
created.
See “Configuring the Content Matches Exact Data policy condition for EDM” on page 551.
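Before you upload a data source, it can help to estimate whether it would exceed the Error Threshold described in step 9. The following is a minimal sketch, not the indexer's actual logic: it counts rows that contain an empty cell or the wrong number of cells and compares the result to the default 5% threshold. The function name error_rate and the sample data are hypothetical, and the sketch does not check for cells that contain the wrong type of data.

```python
import csv
import io

def error_rate(text, delimiter=",", has_header=True):
    """Return the fraction of data rows with an empty cell or a wrong cell count."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return 0.0
    expected = len(rows[0])  # column count is taken from the first row
    data = rows[1:] if has_header else rows
    if not data:
        return 0.0
    bad = sum(
        1 for row in data
        if len(row) != expected or any(cell.strip() == "" for cell in row)
    )
    return bad / len(data)

# Hypothetical pipe-delimited data source with a header row.
SAMPLE = (
    "first|last|ssn\n"
    "Ana|Lee|123-45-6789\n"
    "Bo||987-65-4321\n"            # empty cell
    "Cy|Ng|111-22-3333|extra\n"    # extra cell
    "Di|Wu|222-33-4444\n"
)

rate = error_rate(SAMPLE, delimiter="|")
exceeds_default_threshold = rate > 0.05  # 5% is the documented default
```

If the estimated rate is well above the threshold, cleanse the data source before you create the profile rather than raising the threshold.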

Mapping Exact Data Profile fields for EDM


After you have added and configured the data source file and settings, the Manage > Data
Profiles > Exact Data > Add Exact Data Profile screen lets you map the fields from the data
source file to the Exact Data Profile you configure.
To enable error checking on a field in a data source or to use the index with a policy template
that uses a system field, you must map the field in the data source to the system field. The
Field Mappings section lets you map the columns in the original data source to system fields
in the Exact Data Profile.

Table 26-7 Field mapping options

Field Description

Data Source Field If you selected the Column Names option at the Add Exact Data Profile screen, this column
lists the values that are found in the first row from the data source. If you did not select this
option, this column lists the columns by generic names (such as Col 1, Col 2, and so on).
Note: If you implement a data owner exception, you must map either or both the email address
and domain fields.

See “Configuring the Content Matches Exact Data policy condition for EDM” on page 551.

System Field Select the system field for each column.

A system field value (except None Selected) cannot be mapped to more than one column.

Some system fields have system patterns associated with them (such as social security
number) and some do not (such as last name).

See “Using system-provided pattern validators for EDM profiles” on page 547.

Check mappings against policy template Select a policy template from the drop-down list to compare the field mappings against and
then click Check now.

All policy templates that implement EDM appear in the drop-down menu, including any you
have imported.

See “Choosing an Exact Data Profile” on page 409.

If you plan to use more than one policy template, select one and check it, and then select
another and check it, and so on.

If there are any fields in the policy template for which no data exists in the data source, a
message appears listing the missing fields. You can save the profile anyway or use a different
Exact Data Profile.

Advanced View If you want to customize the schema for the exact data profile, click Advanced View to display
the advanced field mapping options.

Table 26-8 lists and describes the additional columns you can specify in the Advanced View
screen.

Indexing Select one of the indexing options.

See “Scheduling Exact Data Profile indexing for EDM” on page 548.

Finish Click Finish when you are done configuring the Exact Data Profile.

From the Advanced View you map the system and data source fields to system patterns.
System patterns map the specified structure to the data in the Exact Data Profile and enable
efficient error checking and hints for the indexer.

Table 26-8 Advanced View options for EDM

Field Description

Custom Name If you select Custom for a System Field, enter a unique name for it in Custom Name and then select a
value for Type. The name is limited to 60 characters.

Type If you select a value other than Custom for a System Field, some data types automatically
select a value for Type. For example, if you select Birth Date for the System Field, Date is
automatically selected as the Type. You can accept it or change it.

Some data types do not automatically select a value for Type. For example, if you select
Account Number for the System Field, the Type remains unselected. You can specify the
data type of your particular account numbers.

See “Using system-provided pattern validators for EDM profiles” on page 547.

Description Click the link (description) beside the Type column header to display a pop-up window
containing the available system data types.

See “Using system-provided pattern validators for EDM profiles” on page 547.

Simple View Click Simple View to return to the Simple View (with the Custom Name and Type columns
hidden).

See “Creating and modifying Exact Data Profiles for EDM” on page 541.

Using system-provided pattern validators for EDM profiles


Table 26-9 lists and describes the system-provided data validators for EDM profiles.

Table 26-9 System-provided data validators for EDM profiles

Type Description

Credit Card Number The Credit Card pattern is built around knowledge about various international credit cards,
their registered prefixes, and the number of digits in account numbers. The following types of
credit card patterns are validated: MasterCard, Visa, American Express, Diners Club, Discover,
Enroute, and JCB.

Optional spaces in designated areas within credit card numbers are recognized. Only spaces in
generally accepted locations (for example, after every 4th digit in MC/Visa) are recognized,
and the possible location of spaces differs for different card types. Credit
card numbers are validated using a checksum algorithm. If a number looks like a credit card
number (that is, it has the correct number of digits and the correct prefix) but does not pass the
checksum algorithm, it is not considered a credit card number, but just a number.

Email Email is a sequence of characters that looks like the following: string@string.tld, where
string may contain letters, digits, underscore, dash, and dot, and 'tld' is one of the approved
DNS top-level generic domains, or any two letters (for country domains).

IP Address IP Address is a collection of 4 sequences of between 1 and 3 digits, separated by dots.

Number Number is either float or integer, either by itself or in round brackets (parentheses).

Percent Percent is a number immediately followed by the percent sign ("%"). No space is allowed
between a number and a percent sign.

Phone Only US and Canadian telephone numbers are recognized. The phone number must start
with any digit but 1, with the exception of numbers that include a country code.
Phone number can be one of the following formats:

■ 7 digits (no spaces or dashes)


■ Same as above, preceded by 3 digits, or by 3 digits in round brackets, followed by spaces
or dashes
■ 3 digits, followed by optional spaces or dashes, followed by 4 digits
■ Same as above, preceded by the number 1, followed by spaces or dashes

All of these cases can be optionally followed by an extension number, preceded by spaces or
dashes. The extension number is 2 to 5 digits preceded by any of the following (case
insensitive): 'x' 'ex' 'ext' 'exten' 'extens' 'extensions' optionally followed by a dot and spaces.
Note: The system does not recognize the pattern XXX-XXX-XXXX as a valid phone number
format because this format is frequently used in other forms of identification. If your data source
contains a column of phone numbers in that format, select None Selected to avoid confusion
between phone numbers and other data.

Postal Code Only US ZIP codes and Canadian Postal Codes are recognized. The US ZIP code is a sequence
of 5 digits, optionally followed by dash, followed by another 4 digits. The Canadian Postal
Code is a sequence like K2B 8C8, that is, "letter-digit-letter-space-digit-letter-digit" where
space(s) in the middle is optional.

Social Security Number Only US Social Security Numbers are recognized. A Social Security Number is 3
digits, optionally followed by spaces or dashes, followed by 2 digits, optionally followed by
spaces or dashes, followed by 4 digits.
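
The validator descriptions in Table 26-9 can be approximated outside the product to pre-check a column before you assign it a Type. The following sketch is an interpretation of those descriptions, not the product's implementation. It assumes that the credit card checksum is the Luhn algorithm used by payment card numbers (the table does not name the algorithm), and it reads the ZIP code description as five digits with an optional dash and four more digits. All names are hypothetical.

```python
import re

def luhn_ok(number):
    """Luhn checksum over the decimal digits of `number`; other characters are ignored."""
    digits = [int(c) for c in number if c in "0123456789"]
    if len(digits) < 2:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

# Patterns written from the table text; they check shape only.
IP_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")          # 4 groups of 1 to 3 digits
PERCENT_RE = re.compile(r"\d+(\.\d+)?%")              # no space before the percent sign
US_ZIP_RE = re.compile(r"\d{5}(-\d{4})?")             # assumed reading of the ZIP rule
CA_POSTAL_RE = re.compile(r"[A-Za-z]\d[A-Za-z] *\d[A-Za-z]\d")
SSN_RE = re.compile(r"\d{3}[ -]*\d{2}[ -]*\d{4}")

def matches(pattern, value):
    """True if the whole value fits the pattern."""
    return pattern.fullmatch(value) is not None
```

Note that the IP Address pattern, like the description in the table, accepts any group of one to three digits, so a value such as 999.999.999.999 passes the shape check.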

Scheduling Exact Data Profile indexing for EDM


When you configure an Exact Data Profile, you can set a schedule for indexing the data source
(Submit Indexing Job on Schedule).
See “About index scheduling for EDM” on page 531.
Before you set up a schedule, consider the following recommendations:
■ If you update your data sources occasionally (for example, less than once a month), there
is no need to create a schedule. Index the data each time you update the data source.
■ Schedule indexing for times of minimal system use. Indexing affects performance throughout
the Symantec Data Loss Prevention system, and large data sources can take time to index.
■ Index a data source as soon as you add or modify the corresponding exact data profile,
and re-index the data source whenever you update it. For example, consider a scenario
whereby every Wednesday at 2:00 A.M. you update the data source. In this case you
should schedule indexing every Wednesday at 3:00 A.M. Do not index data sources daily
as this can degrade performance.
■ If you need to update indexes frequently (for example, daily), Symantec recommends that
you use the Remote EDM Indexer.
■ Monitor results and modify your indexing schedule accordingly. For example, if performance
is good and you want more timely updates, schedule more frequent data updates and
indexing.
The Indexing section lets you index the Exact Data Profile as soon as you save it
(recommended) or on a regular schedule as follows:

Table 26-10 Scheduling indexing for Exact Data Profiles for EDM

Parameter Description

Submit Indexing Job on Save Select this option to index the Exact Data Profile when you click Save.

Submit Indexing Job on Schedule Select this option to schedule an indexing job. The default option is No Regular Schedule. If you
want to index according to a schedule, select a desired schedule period, as described.

Index Once On – Enter the date to index the document profile in the format MM/DD/YY. You can also click the
date widget and select a date.

At – Select the hour to start indexing.

Index Daily At – Select the hour to start indexing.

Until – Select this check box to specify a date in the format MM/DD/YY when the indexing should
stop. You can also click the date widget and select a date.

Index Weekly Day of the week – Select the day(s) to index the document profile.

At – Select the hour to start indexing.

Until – Select this check box to specify a date in the format MM/DD/YY when the indexing should
stop. You can also click the date widget and select a date.

Index Monthly Day – Enter the number of the day of each month you want the indexing to occur. The number
must be 1 through 28.

At – Select the hour to start indexing.

Until – Select this check box to specify a date in the format MM/DD/YY when the indexing should
stop. You can also click the date widget and select a date.
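
The monthly option accepts only days 1 through 28, which guarantees that the chosen day exists in every month. The following sketch illustrates that rule and the MM/DD/YY date format by computing the next monthly run time. The function names are hypothetical, and treating Until as the last date on which indexing may run is an assumption; the product's scheduler may behave differently.

```python
from datetime import datetime

def parse_until(text):
    """Parse an Until date written as MM/DD/YY."""
    return datetime.strptime(text, "%m/%d/%y").date()

def next_monthly_run(now, day, hour, until=None):
    """Return the next run at `hour`:00 on `day` of the month, or None if past `until`."""
    if not 1 <= day <= 28:
        raise ValueError("day must be 1 through 28")
    candidate = now.replace(day=day, hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # This month's slot has passed, so move to the same day next month.
        if now.month == 12:
            candidate = candidate.replace(year=now.year + 1, month=1)
        else:
            candidate = candidate.replace(month=now.month + 1)
    if until is not None and candidate.date() > until:
        return None
    return candidate
```

For example, with the current time set to 10:00 A.M. on 19 August 2019, a schedule for day 15 at 3:00 A.M. yields 15 September 2019, and a schedule for day 28 at 3:00 A.M. yields 28 August 2019.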

See “Mapping Exact Data Profile fields for EDM” on page 545.
See “Creating and modifying Exact Data Profiles for EDM” on page 541.

Managing and adding Exact Data Profiles for EDM


You manage and create Exact Data Profiles for EDM at the Manage > Data Profiles > Exact
Data screen. Once a profile has been created, the Exact Data screen lists all Exact Data
Profiles configured in the system.
See “About the Exact Data Profile and index” on page 528.

Table 26-11 Exact Data screen actions for EDM

Action Description

Add EDM profile Click Add Exact Data Profile to define a new Exact Data Profile.

See “Configuring Exact Data profiles for EDM” on page 534.

Edit EDM profile To modify an existing Exact Data Profile, click the name of the profile, or click the pencil icon
at the far right of the profile row.

See “Creating and modifying Exact Data Profiles for EDM” on page 541.

Remove EDM profile Click the red X icon at the far right of the profile row to delete the Exact Data Profile from the
system. A dialog box confirms the deletion.
Note: You cannot edit or remove a profile if another user is currently modifying that profile, or if a
policy exists that depends on that profile.

Download EDM profile Click the download profile link to download and save the Exact Data Profile.

This is useful for archiving and sharing profiles across environments. The file is in the binary
*.edm format.

Refresh EDM profile status Click the refresh arrow icon at the upper right of the Exact Data screen to fetch the latest status
of the indexing process.

If you are in the process of indexing, the system displays the message "Indexing is starting."
The system does not automatically refresh the screen when the indexing process completes.

Table 26-12 Exact Data screen details for EDM

Column Description

Exact Data Profile The name of the exact data profile.

Last Active Version The version of the exact data profile and the name of the detection server that runs the profile.

Status The current status of the exact data profile, which can be any of the following:
■ Next scheduled indexing (if it is not currently indexing)
■ Sending an index to a detection server
■ Indexing
■ Deploying to servers

In addition, the current status of the indexing process for each detection server, which can be
any of the following:

■ Completed, including a completion date


■ Pending index completion (waiting for the Enforce Server to finish indexing the exact data
source file)
■ Replicating indexing
■ Creating index (internally)
■ Building caches

Error messages The Exact Data screen displays any error messages in red.

For example, if the Exact Data Profile is corrupt or does not exist, the system displays an error
message.

Configuring EDM policies


This section describes how to configure EDM policy conditions.
See “Configuring the Content Matches Exact Data policy condition for EDM” on page 551.
See “Configuring Data Owner Exception for EDM policy conditions” on page 554.
See “Configuring the Sender/User based on a Profiled Directory policy condition for EDM”
on page 554.
See “Configuring the Recipient based on a Profiled Directory policy condition for EDM”
on page 555.
See “Configuring Advanced Settings for EDM policies” on page 557.

Configuring the Content Matches Exact Data policy condition for EDM


Once you have defined the Exact Data Profile and indexed the data source, you configure one
or more Content Matches Exact Data conditions in policy rules.
See “About the Content Matches Exact Data From condition for EDM” on page 532.

Table 26-13 Configure the Content Matches Exact Data policy condition for EDM

Steps Action Description

1 Configure an EDM policy detection rule. Create a new EDM detection rule in a policy, or modify an existing EDM rule.
See “Configuring policies” on page 413.

See “Configuring policy rules” on page 417.

Match Data Rows when All of these match

2 Select the fields to match. The first thing you do when configuring the EDM condition is select each data
field that you want the condition to match. You can select all or deselect all fields
at once. The system displays all the fields or columns that were included in the
index. You do not have to select all the fields, but you should select at least 2 or
3, one of which must be unique, such as social security number, credit card
number, and so forth.

See “Best practices for using EDM” on page 601.

3 Choose the number of selected fields to match. Choose the number of the selected fields to match from the drop-down menu.
This number is how many of the selected fields must be present
in a message to trigger a match. You must check at least as many data fields
as the number you choose here. For example, if you choose 2 of the
selected fields from the menu, you must have checked at least two fields.

See “Ensure data source has at least one column of unique data (EDM)”
on page 602.

4 Select the WHERE clause to enter specific field values to match (optional). The WHERE clause option matches on the specified field value. You specify a
WHERE clause value by selecting an exact data field from the menu and by
entering a value for that field in the adjacent text box. If you enter more than one
value, separate the values with commas.

See “Use a WHERE clause to detect records that meet specific criteria (EDM)”
on page 609.

For example, consider an Exact Data Profile for "Employees" with a "State" field
containing state abbreviations. In this example, to implement the WHERE clause,
you select (check) WHERE, choose "State" from the drop-down list, and enter
CA,NV in the text box. This WHERE clause then limits the detection server to
matching messages that contain either CA or NV as the value for the State field.
Note: You cannot specify a field for WHERE that is the same as one of the
selected matched fields.

Ignore Data Rows when Any of these match

5 Ignore data owners (optional). Selecting this option implements Data Owner Exception.
See “Configuring Data Owner Exception for EDM policy conditions” on page 554.

6 Exclude data field combinations (optional). You can use the exclude data field combinations to specify combinations of data
values that are exempted from detection. If the data appears in exempted pairs
or groups, it does not cause a match. Excluded combinations are only available
when matching 2 or 3 fields. To enable this option, you must select 2 or 3 fields
to match from the _ of the selected fields menu at the top of the condition
configuration.

See “Leverage exception tuples to avoid false positives (EDM)” on page 609.

To implement excluded combinations, select an option from each Field N column
that appears. Then click the right-arrow icon to add the field combination to the
Excluded Combinations list. To remove a field from the list, select it and click
the left-arrow icon.
Note: Hold down the Ctrl key to select more than one field in the right-most
column.

Additional match condition parameters

7 Select an incident minimum. Enter or modify the minimum number of matches required for the condition to
report an incident.

For example, consider a scenario where you specify 1 of the selected fields for
a social security number field and an incident minimum of 5. In this situation the
engine must detect at least five matching social security numbers in a single
message to trigger an incident.
See “Match count variant examples (EDM)” on page 570.

8 Select components to match on. Select one or more message components to match on:
■ Envelope – The header of the message.
■ Subject – (Not available for EDM.)
■ Body – The content of the message.
■ Attachments – The content of any files attached to or transported by the
message.

See “Selecting components to match on” on page 423.

9 Select one or more conditions to also match. Select this option to create a compound condition. All conditions must match for
the rule to trigger an incident.

You can Add any available condition from the list.

See “Configuring compound match conditions” on page 429.

10 Test and troubleshoot the policy. See “Test and tune policies to improve match accuracy” on page 453.

See “Troubleshooting policies” on page 445.
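
The way the settings in Table 26-13 combine can be easier to follow as a small model. The following sketch is an illustration only, not the detection engine: it treats the index as a list of rows, the message as a set of extracted values, and reads the WHERE clause as restricting which rows are eligible to match. The function names, field names, and sample data are all hypothetical.

```python
def row_matches(row, message_values, selected, required, where=None):
    """True if at least `required` of the `selected` fields of one row occur in the message."""
    if where is not None:
        field, allowed = where
        if row[field] not in allowed:   # WHERE limits which rows may match
            return False
    hits = sum(1 for name in selected if row[name] in message_values)
    return hits >= required

def incident(rows, message_values, selected, required, minimum=1, where=None):
    """True if the number of matching rows reaches the incident minimum."""
    matched = [
        row for row in rows
        if row_matches(row, message_values, selected, required, where)
    ]
    return len(matched) >= minimum

# Hypothetical indexed rows and values found in one message.
ROWS = [
    {"first": "Ana", "last": "Lee", "ssn": "123-45-6789", "state": "CA"},
    {"first": "Bo", "last": "Kim", "ssn": "987-65-4321", "state": "TX"},
]
MESSAGE = {"Ana", "123-45-6789", "Bo", "987-65-4321"}
SELECTED = ["first", "last", "ssn"]
```

With this data, requiring 2 of the selected fields reports an incident, requiring 3 does not, and adding a WHERE clause on state with the values CA,NV together with an incident minimum of 2 does not, because only one eligible row matches.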

Configuring Data Owner Exception for EDM policy conditions


To except data owners from detection, you must include in your Exact Data Profile either an
email address or a domain address field (for example, symantec.com). Once Data Owner
Exception (DOE) is enabled, if the sender or recipient of confidential information is the data
owner (by email address or domain), the detection server allows the data to be sent or received
without generating an incident.
To configure DOE for an EDM policy condition
1 When you are configuring the Content Matches Exact Data condition, select the Ignore
data owners option.
2 Select one of the following options:
■ Sender matches — Select this option to EXCLUDE the data sender from detection.
■ Any or All Recipient matches — Select one of these options to EXCLUDE any or
all data recipient(s) from detection.

Note: When you configure DOE for the EDM condition, you cannot select a value for Ignore
Sender/Recipient that is the same as one of the matched fields.

See “About Data Owner Exception for EDM” on page 532.
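
The three DOE options can be modeled as a check of the sender or recipient addresses against the email address or domain stored in the matched row. The following is a minimal sketch under that assumption. The field names email and domain, the mode strings, and the case-insensitive comparison are hypothetical choices for illustration, not documented product behavior.

```python
def is_owner(address, row):
    """True if the address equals the row's email or belongs to the row's domain."""
    address = address.lower()
    email = row.get("email", "").lower()
    domain = row.get("domain", "").lower()
    if email and address == email:
        return True
    return bool(domain) and address.endswith("@" + domain)

def excepted(row, sender, recipients, mode):
    """True if the matched row should be ignored under the chosen DOE option."""
    if mode == "sender":
        return is_owner(sender, row)
    checks = [is_owner(r, row) for r in recipients]
    if mode == "any_recipient":
        return any(checks)
    if mode == "all_recipients":
        return bool(checks) and all(checks)
    raise ValueError("unknown mode: " + mode)
```

In this model, a message sent to one owner and one outside address is excepted under the any-recipient option but not under the all-recipients option.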

Configuring the Sender/User based on a Profiled Directory policy condition for EDM

The Sender/User based on a Directory from detection rule lets you create detection rules
based on sender identity or (for endpoint incidents) user identity. This condition requires an
Exact Data Profile.
See “Creating the exact data source file for profiled DGM for EDM” on page 537.
After you select the Exact Data Profile, when you configure the rule, the directory you selected
and the sender identifier(s) appear at the top of the page.
Table 26-14 describes the parameters for configuring the Sender/User based on a Directory
from an EDM Profile condition.

Table 26-14 Configuring the Sender/User based on a Directory from an EDM Profile condition

Parameter Description

Where Select this option to have the system match on the specified field values. Specify the values by
selecting a field from the drop-down list and typing the values for that field in the adjacent text box.
If you enter more than one value, separate the values with commas.

For example, for an Employees directory group profile that includes a Department field, you would
select Where, select Department from the drop-down list, and enter Marketing,Sales in the text
box. If the condition is implemented as a rule, in this example a match occurs only if the sender or
user works in Marketing or Sales (as long as the other input content meets all other detection criteria).
If the condition is implemented as an exception, in this example the system excludes from matching
any message from a sender or user who works in Marketing or Sales.

Is Any Of Enter or modify the information you want to match. For example, if you want to match any sender
in the Sales department, select Department from the drop-down list, and then enter Sales in this
field (assuming that your data includes a Department column). Use a comma-separated list if you
want to specify more than one value.

Configuring the Recipient based on a Profiled Directory policy condition for EDM

The Recipient based on a Directory from condition lets you create detection methods based
on the identity of the recipient. This method requires an Exact Data Profile.
See “Creating the exact data source file for profiled DGM for EDM” on page 537.
After you select the Exact Data Profile, when you configure the rule, the directory you selected
and the recipient identifier(s) appear at the top of the page.
Table 26-15 describes the parameters for configuring Recipient based on a Directory from
an EDM profile condition.

Table 26-15 Configuring the Recipient based on a Directory from an EDM profile condition

Parameter Description

Where Select this option to have the system match on the specified field values. Specify the values by
selecting a field from the drop-down list and typing the values for that field in the adjacent text box.
If you enter more than one value, separate the values with commas.

For example, for an Employees directory group profile that includes a Department field, you would
select Where, select Department from the drop-down list, and enter Marketing, Sales in the text
box. For a detection rule, this example causes the system to capture an incident only if at least one
recipient works in Marketing or Sales (as long as the input content meets all other detection criteria).
For an exception, this example prevents the system from capturing an incident if at least one recipient
works in Marketing or Sales.

Is Any Of Enter or modify the information you want to match. For example, if you want to match any recipient
in the Sales department, select Department from the drop-down list, and then enter Sales in this
field (assuming that your data includes a Department column). Use a comma-separated list if you
want to specify more than one value.
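
The Where example above amounts to an "at least one recipient" test. The following Python sketch only illustrates that logic and is not how the product implements it. The DIRECTORY dictionary, the e-mail addresses, the function name, and the case-insensitive comparison are all assumptions made for the example.

```python
# Hypothetical directory profile: recipient identifier -> field values.
DIRECTORY = {
    "ann@example.com": {"Department": "Marketing"},
    "bob@example.com": {"Department": "Engineering"},
    "cho@example.com": {"Department": "Sales"},
}

def where_matches(recipients, field, values_csv):
    """Return True if at least one recipient has one of the listed values in the field."""
    # Split the comma-separated list typed into the text box.
    wanted = {value.strip().lower() for value in values_csv.split(",")}
    for recipient in recipients:
        record = DIRECTORY.get(recipient.lower())
        if record and record.get(field, "").lower() in wanted:
            return True
    return False

# One recipient works in Sales, so the condition is met.
print(where_matches(["bob@example.com", "cho@example.com"], "Department", "Marketing, Sales"))
# No recipient works in Marketing or Sales, so the condition is not met.
print(where_matches(["bob@example.com"], "Department", "Marketing, Sales"))
```

In a detection rule a True result would allow an incident (if all other criteria are met); in an exception it would suppress one.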

About configuring natural language processing for Chinese, Japanese, and Korean for EDM policies

Introducing EDM token matching


Symantec Data Loss Prevention detection servers support natural language processing for
Chinese, Japanese, and Korean (CJK) in policies that use Exact Data Matching (EDM)
detection. When natural language processing for CJK languages is enabled, the detection
server validates CJK tokens before reporting a match, which improves matching accuracy.

EDM token matching examples for CJK languages


Table 26-16 provides EDM token matching examples for Chinese, Japanese, and Korean
languages. All examples assume that the keyword condition is configured to match on whole
words only.
If token verification is enabled, the message must contain enough text for the token verifier to
recognize its language. For example, the message “東京都市部の人口” is too short for the token
verification process to recognize the language of the message. The following message is long
enough for token verification processing:
今朝のニュースによると東京都市部の人口は増加傾向にあるとのことでした。 全国的な人口
減少の傾向の中、東京への一極集中を表しています。

Table 26-16 EDM token matching examples for CJK

Language | Keyword | Matches on server with token validation ON | Matches on server with token validation OFF
Chinese | 通信 | 数字无线通信 | 数字无线通信; 交通信息 网站
Japanese | 京都市 | 京都府京都市左京区 | 京都府京都市左京区; 東京都市部の人口
Korean | 정부 | 정부의 방침 | 정부의 방침; 의정부 경전철
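
The difference between the two columns can be pictured as substring matching versus whole-token matching. The following Python sketch is a simplified illustration only. The segmentations in SEGMENTED are hand-written assumptions, not the output of the product's tokenizer, and the function names are hypothetical. The sketch also does not model the requirement that the message be long enough for language recognition.

```python
# Hand-written segmentations (an assumption made for this example).
SEGMENTED = {
    "数字无线通信": ["数字", "无线", "通信"],
    "交通信息 网站": ["交通", "信息", "网站"],
    "京都府京都市左京区": ["京都府", "京都市", "左京区"],
    "東京都市部の人口": ["東京", "都市部", "の", "人口"],
}

def matches_without_validation(keyword, text):
    """Token validation OFF: any occurrence of the keyword characters counts."""
    return keyword in text

def matches_with_validation(keyword, text):
    """Token validation ON: the keyword must coincide with a whole token."""
    return keyword in SEGMENTED[text]

for keyword, text in [
    ("通信", "数字无线通信"),
    ("通信", "交通信息 网站"),
    ("京都市", "京都府京都市左京区"),
    ("京都市", "東京都市部の人口"),
]:
    print(keyword, text,
          "OFF:", matches_without_validation(keyword, text),
          "ON:", matches_with_validation(keyword, text))
```

With validation off, all four texts match. With validation on, only the texts in which the keyword is a token of its own match, which mirrors the table.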


Enabling and using CJK token verification for EDM


To use token verification for Chinese, Japanese, and Korean (CJK) languages you must enable
it on each detection server by setting the advanced server setting EDM.TokenVerifierEnabled
to true. In addition, there must be a sufficient amount of message text for the system to
recognize the language.
Table 26-17 lists and describes the detection server parameter that lets you enable token
verification for CJK languages.

Table 26-17 EDM token verification parameter

Setting | Default | Description
EDM.TokenVerifierEnabled | false | Default is disabled (false). If enabled (true), the server validates tokens for Chinese, Japanese, and Korean language keywords.

The following procedure describes how to enable and use token verification for CJK in EDM
policies. For keyword conditions, see “Enable keyword token verification for CJK” on page 848.
Enable EDM token verification for CJK
1 Log on to the Enforce Server as an administrative user.
2 Navigate to the System > Servers and Detectors > Overview > Server/Detector Detail
- Advanced Settings screen for the detection server you want to configure.
See “Advanced server settings” on page 285.
3 Locate the parameter EDM.TokenVerifierEnabled.
4 Change the value from false (the default) to true.
Setting the server parameter EDM.TokenVerifierEnabled = true enables token validation
for CJK token detection.
5 Save the detection server configuration.
6 Recycle the detection server.

Configuring Advanced Settings for EDM policies


EDM has various advanced settings available at the System > Servers and Detectors >
Overview > Server/Detector Detail - Advanced Settings screen for the chosen detection
server. Use caution when modifying these settings on a server. Check with Symantec Data
Loss Prevention Support before changing any of the settings on this screen. Changes to these
settings do not take effect until after the server is restarted.
See “Advanced server settings” on page 285.
Detecting content using Exact Data Matching (EDM) 558
Configuring EDM policies

Table 26-18 Advanced Settings for EDM indexing and detection

EDM parameter | Default | Description

EDM.MatchCountVariant | 3 | Specifies how matches are counted. 1: counts the number of token sets matched, regardless of use of the same tokens across several matches. 2: counts the number of unique token sets. 3 (default): counts the number of unique supersets of token sets. See “Match count variant examples (EDM)” on page 570.

EDM.MaximumNumberOfMatchesToReturn | 100 | Defines an upper limit on the number of matches returned from each RAM index search. For multi-file indices, this limit is applied to each sub-index search independently before the search results are combined. As a result, the number of actual matches can exceed this limit for multiple file indices.

EDM.RunProximityLogic | true | If true (default), this setting runs the token proximity check. The free-form text proximity is defined by the setting EDM.SimpleTextProximityRadius. The tabular text proximity is defined by belonging to the same table row. Note: Disabling proximity is not recommended because it can negatively affect the performance of the system.

EDM.SimpleTextProximityRadius | 35 | Provides the baseline range for proximity checking a matched token. This value is multiplied by the number of required matches to equal the complete proximity check range. To keep the same "required match density," the proximity check range behaves like a moving window in a text page. D is defined as the proportionality factor for the window and is set in the policy condition by choosing how many fields to match on for the EDM condition. N is the SimpleTextProximityRadius value. A number of tokens are in the proximity range if the first token is within N x D words from the last token. The proximity check range is directly proportional to the number of matches by a factor of D. See “Proximity matching example for EDM” on page 572. Note: Increasing the radius value higher than the default can negatively affect system performance and is not recommended.

EDM.TokenVerifierEnabled | false | Default is disabled (false). If enabled (true), the server validates tokens for Chinese, Japanese, and Korean language keywords.
Lexer.IncludePunctuationInWords | true | If true, during detection punctuation characters are considered as part of a token. If false, during detection punctuation within a token or multi-token is treated as white space. See “Multi-token with punctuation (EDM)” on page 563. Note: This setting applies to detection content, not to indexed content.

Lexer.MaximumNumberOfTokens | 30000 | Maximum number of tokens extracted from each message component for detection. Applicable to all detection technologies where tokenization is required (EDM, profiled DGM, and the system patterns supported by those technologies). Increasing the default value may cause the detection server to run out of memory and restart.

Lexer.Validate | true | If true, performs system pattern-specific validation during indexing. Setting this to false is not recommended. See “Using system-provided pattern validators for EDM profiles” on page 547.

MessageChain.NumChains | Varies | The number of messages that the filereader processes in parallel. The default varies by detection server type and is either 4 or 8. Setting this number higher than 8 (with the other default settings) is not recommended. A higher setting does not substantially increase performance and there is a much greater risk of running out of memory. Setting this to less than 8 (in some cases 1) helps when processing big files, but it may slow down the system considerably.

Note: The maximum number of tokens per multi-token is calculated, and stopwords are evaluated,
during indexing. The Lexer.MaxTokensPerMultiToken and Lexer Stopword Languages Advanced
Server settings are no longer necessary. The stopword language on Enforce is specified in
the indexer.properties file at
C:\Program Files\Symantec\Data Loss Prevention\Indexer\15.5\Protect\config\Indexer.properties.
For English, the property is stopword_languages = en.
Using multi-token matching with EDM


EDM policy matching is based on tokens in the index. For languages based on the Latin
alphabet, a token is a word or string of alphanumeric characters delimited by spaces. For
Chinese, Japanese, and Korean languages, a token is determined by other means. Tokens
are normalized so that formatting and case are ignored. At run-time the server performs a
full-text search against an inbound message, checking each word against the index for potential
matches. The matching algorithm compares each word in the message with the contents of
each token in the index.
A multi-token cell is a cell in the index that contains multiple words separated by spaces,
leading or trailing punctuation, or alternative Latin and Chinese, Japanese, or Korean language
characters. The sub-token parts of a multi-token cell obey the same rules as single-token cells:
they are normalized according to their pattern where normalization can apply. Inbound message
data must match a multi-token cell exactly, including whitespace, punctuation, and stopwords
(assuming the default settings).
For example, an indexed cell containing the string "Bank of America" is a multi-token comprising
3 sub-token parts. During detection, the inbound message "bank of america" (normalized)
matches the multi-token cell, but "bank america" does not.
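
The "Bank of America" example can be sketched as a search for a contiguous run of normalized sub-tokens. The following Python sketch is an illustration under simplifying assumptions: it normalizes only case and whitespace, it ignores punctuation and CJK handling, and the function name is hypothetical.

```python
def contains_multi_token(cell, message):
    """Return True if the message contains the cell's sub-tokens as a contiguous run."""
    cell_tokens = cell.lower().split()      # split() also collapses repeated spaces
    message_tokens = message.lower().split()
    width = len(cell_tokens)
    return any(
        message_tokens[start:start + width] == cell_tokens
        for start in range(len(message_tokens) - width + 1)
    )

# All three sub-token parts are present and in order.
print(contains_multi_token("Bank of America", "my employer is bank of america today"))
# The stopword "of" is missing, so the multi-token does not match.
print(contains_multi_token("Bank of America", "my employer is bank america today"))
```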
Multi-token matching is enabled by default. Multi-token cells are more computationally expensive
than single-token cells. If the index includes multi-token cells, you must verify that you have
enough memory to index, load, and process the EDM profile.
See “Characteristics of multi-token cells (EDM)” on page 560.
See “Memory requirements for EDM” on page 579.

Characteristics of multi-token cells (EDM)


Table 26-19 lists and describes characteristics of multi-token matching.
See “Using multi-token matching with EDM” on page 560.

Table 26-19 Characteristics of multi-tokens for EDM

Characteristic | Description
The number of tokens in a single cell is limited to 200 tokens. | The number of characters is not limited. In the case of a CJK token, each character is treated as a single token and the number of CJK characters is limited to 200 characters.
Whitespace in Latin multi-token cells is considered, but multiple whitespaces are normalized to 1. | See “Multi-token with spaces (EDM)” on page 561.
Punctuation immediately preceding and following a token or sub-token is always ignored. | See “Multi-token with punctuation (EDM)” on page 563. See “Additional examples for multi-token cells with punctuation (EDM)” on page 564.
You can configure how punctuation within a token or multi-token is treated during detection. For most cases the default setting ("true") is appropriate. If set to "false," punctuation is treated as whitespace. | Lexer.IncludePunctuationInWords = true. See “Configuring Advanced Settings for EDM policies” on page 557.
For proximity range checking the sub-token parts of a multi-token are counted as single tokens. | See “Proximity matching example for EDM” on page 572.
The system does not consider stopwords when matching multi-tokens. In other words, stopwords are not excluded. | See “Multi-token with stopwords (EDM)” on page 562.
Multi-tokens are more computationally expensive than single tokens and require additional memory for indexing, loading, and processing. | See “Memory requirements for EDM” on page 579.

Multi-token with spaces (EDM)


Table 26-20 shows examples of multi-tokens with spaces.

Table 26-20 Multi-token cell with spaces examples

Description | Indexed content | Detected content | Explanation
Cell contains space | Bank of America | Bank of America | Cell with spaces is multi-token. Multi-token must match exactly.
Cell contains multiple spaces | Bank of America | Bank of America | Multiple spaces are normalized to one.
Cell contains space between CJK characters | 傠傫 傠傫 | 傠傫 傠傫; 傠傫傠傫 | White spaces between CJK characters are ignored.
Cell contains space between Latin and CJK characters | EDM 傠傫 | EDM 傠傫; EDM傠傫 | White spaces between Latin and CJK characters are ignored.
Multi-token with stopwords (EDM)


Stopwords are common words, such as articles and prepositions. When creating single-tokens,
the EDM indexing process ignores words found in the EDM stopword list
(\ProgramData\Symantec\DataLossPrevention\EnforceServer\15.5\config\stopwords), as well
as single letters. However, when creating multi-tokens, stopwords and single letters are not
ignored. Instead, they are part of the multi-token.
Table 26-21 shows multi-token matches with stopwords, single letters, and single digits.

Table 26-21 Cell contains stopwords or single letter or single digit (EDM)

Description | Cell content | Should match | Explanation
Cell contains stopword. | throw other ball | throw other ball | Common word ("other") is filtered out during indexing but not when it is part of a multi-token.
Cell contains single letter. | throw a ball | throw a ball | Single letter ("a") is filtered out, but not when it is part of a multi-token.
Cell contains single digit. | throw 1 ball | throw 1 ball | Unlike single-letter words that are stopwords, single digits are never ignored.
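
The stopword rules in Table 26-21 can be summarized in a few lines of Python. This is a minimal sketch, not the indexer's actual logic. The STOPWORDS set is a small hypothetical sample rather than the contents of the product's stopword file, and the function name is an assumption.

```python
# Hypothetical sample; the real list lives in the product's stopword file.
STOPWORDS = {"other", "the", "of", "and"}

def index_cell(cell):
    """Return the sub-tokens to index for a cell, or None if the cell is ignored."""
    tokens = cell.lower().split()
    if len(tokens) == 1:
        token = tokens[0]
        # Single-token cells: stopwords and single letters are ignored.
        if token in STOPWORDS or (len(token) == 1 and token.isalpha()):
            return None
    # Multi-token cells keep stopwords, single letters, and single digits.
    return tuple(tokens)

print(index_cell("other"))             # single stopword: ignored
print(index_cell("throw other ball"))  # stopword kept inside a multi-token
print(index_cell("throw a ball"))      # single letter kept inside a multi-token
print(index_cell("1"))                 # single digits are never ignored
```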

Multi-token with mixed language characters (EDM)


Table 26-22 shows examples of multi-tokens with mixed Latin and CJK characters.

Table 26-22 Multi-token cell with Latin and CJK characters examples (EDM)

Description | Cell content | Should match | Explanation
Cell includes Latin and CJK characters with no spaces. | ABC傠傫; 傠傫ABC | ABC傠傫; 傠傫ABC. Also matches with: ABC 傠傫; 傠傥 ABC | Mixed Latin-CJK cell is multi-token. Whitespace between Latin and CJK characters is ignored. EDM ignores whitespace between the Latin characters and the CJK token.
Cell includes Latin and CJK with one or more spaces. | ABC 傠傫; 傠傥 ABC | ABC 傠傫; 傠傥 ABC. Also matches with: ABC傠傫; 傠傫ABC | Multiple spaces are ignored.
Cell contains Latin or CJK with numbers. | 什仁 仂仃 仄仅 仇仈仉 147(什仂仅 51-1) | 什仁 仂仃 仄仅 仇仈仉 147(什仂仅 51-1) | Single-token cell.
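
One way to picture why whitespace between Latin and CJK characters does not affect matching is to split a cell into sub-tokens at every script boundary. The following Python sketch assumes that each CJK character is its own sub-token (as stated in Table 26-19) and that a Latin run is one sub-token. The character ranges and the function name are assumptions for illustration, and punctuation and numbers are not given the special handling the product applies.

```python
import re

# Assumed ranges: kana, CJK unified ideographs, and Hangul syllables.
CJK = "\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af"
# Either a single CJK character, or a run of non-space, non-CJK characters.
TOKEN = re.compile(f"[{CJK}]|[^\\s{CJK}]+")

def sub_tokens(cell):
    """Split a cell into lowercase sub-tokens, discarding all whitespace."""
    return [token.lower() for token in TOKEN.findall(cell)]

print(sub_tokens("ABC傠傫"))
print(sub_tokens("ABC   傠傫"))
print(sub_tokens("傠傫 傠傫") == sub_tokens("傠傫傠傫"))
```

Because the first two calls produce the same sub-token list, a cell indexed with or without the spaces would compare equal under this model.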

Multi-token with punctuation (EDM)


Punctuation is always ignored if it comes at the beginning (leading) or end (trailing) of a token
or multi-token. Whether punctuation included in a token or multi-token is required for matching
depends on the Advanced Server Setting Lexer.IncludePunctuationInWords, which by
default is set to true (enabled).
See “Multi-token punctuation characters (EDM)” on page 569.

Note: For convenience, the Lexer.IncludePunctuationInWords parameter is referred to by the
three-letter acronym "WIP" throughout this section.

The WIP setting operates at detection time to alter how matches are reported. For most EDM
policies you should not change the WIP setting. For a few limited situations, such as account
numbers or addresses, you may need to set Lexer.IncludePunctuationInWords = false depending
on your detection requirements.
Table 26-23 lists and explains how multi-token matching works with punctuation.

Table 26-23 Multi-token punctuation table (EDM)

Indexed content | Detected content | WIP setting | Match | Explanation
a.b | a.b | TRUE | Yes | The indexed content and the detected content are exactly the same.
a.b | a.b | FALSE | No | The detected content is treated as "a b" and is therefore not a match.
a.b | ab | TRUE | No | The indexed content and the detected content are different.
a.b | ab | FALSE | No | The indexed content and the detected content are different.
ab | a.b | TRUE | No | The indexed content and the detected content are different.
ab | a.b | FALSE | Yes | The detected content is treated as "a b" and is therefore a match.
ab | ab | TRUE | Yes | The indexed content and the detected content are exactly the same.
ab | ab | FALSE | Yes | The indexed content and the detected content are exactly the same.
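
The punctuation rules can be approximated with a small model: leading and trailing punctuation is always stripped, internal punctuation is always kept on the indexed side, and on the detected side it is kept only when WIP is true. The following Python sketch is a simplified illustration and not the product's lexer. It treats every character in Python's string.punctuation as punctuation (an assumption that is broader than Table 26-26), it does not model system-recognized patterns such as email addresses or phone numbers, and the function names are hypothetical.

```python
import re
import string

INTERNAL_PUNCT = re.compile(f"[{re.escape(string.punctuation)}]+")

def sub_tokens(text, wip):
    """Normalize text into sub-tokens under the given WIP setting."""
    tokens = []
    for raw in text.lower().split():
        core = raw.strip(string.punctuation)  # leading/trailing punctuation is always ignored
        if not core:
            continue
        if wip:
            tokens.append(core)               # internal punctuation stays part of the token
        else:
            # internal punctuation is treated as whitespace
            tokens.extend(part for part in INTERNAL_PUNCT.split(core) if part)
    return tokens

def cell_matches(indexed, detected, wip):
    """Indexed content always retains internal punctuation; WIP affects detection only."""
    return sub_tokens(indexed, True) == sub_tokens(detected, wip)

print(cell_matches("a.b", "a.b", wip=True))    # same content, punctuation retained
print(cell_matches("a.b", "a.b", wip=False))   # detected content becomes "a b"
print(cell_matches("346 Guerrero St., Apt. #2", "346 Guerrero St Apt 2", wip=True))
```

The last call shows why the address in Table 26-24 matches with or without its punctuation: every punctuation mark sits at the start or end of a sub-token.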

Additional examples for multi-token cells with punctuation (EDM)


Table 26-24 lists and describes some additional examples for multi-token cells with punctuation.
In these examples, keep in mind that during indexing, if a token includes punctuation marks
between characters, the punctuation is always retained. This means that EDM cannot detect that
cell if the WIP setting is false. In other words, if the indexed data has a cell that contains a
token with internal punctuation, the WIP setting should be set to true.

Table 26-24 Additional use cases for multi-token cells with punctuation (EDM)

Description | Indexed content | Detected content | Explanation
Cell contains a physical address with punctuation. | 346 Guerrero St., Apt. #2 | 346 Guerrero St., Apt. #2; 346 Guerrero St Apt 2 | The indexed content is a multi-token cell. Both match because the punctuation comes at the beginning or end of the sub-token parts and is therefore ignored.
Cell contains internal punctuation with no space before or after. | O'NEAL ST. | O'NEAL ST | The indexed content is a multi-token cell. Internal punctuation is included (assuming WIP is true), and leading or trailing punctuation is ignored (assuming there is a space delimiter after the punctuation).
Cell contains Asian language characters (CJK) with indexed internal punctuation. | 傠傫##傠傫 | 傠傫##傠傫 (if WIP true) | The indexed content is a single token cell. During detection, Asian language characters (CJK) with internal punctuation are affected by the WIP setting. Thus, in this example 傠傫##傠傫 matches only if the WIP setting is true. If the WIP setting is false, 傠傫##傠傫 is considered a multi-token because the internal punctuation is treated as whitespace. Thus, no content can match.
Cell contains Asian language characters (CJK) without indexed internal punctuation. | 傠傫 傠傫 | 傠傫 傠傫; 傠傫##傠傫 (if WIP false) | The indexed content is a multi-token cell. The detected content matches as indexed. If the WIP setting is false, the detected content matches 傠傫##傠傫 because internal punctuation is ignored.
Cell contains mix of Latin and CJK characters with punctuation separating the Latin and Asian characters. | EDM##傠傫 | EDM 傠傫 | The indexed content is a multi-token cell. A cell with alternate Latin and CJK characters is always a multi-token and punctuation between Latin and Asian characters is always treated as a single white space regardless of the WIP setting.
Cell contains mix of Latin and CJK characters with internal punctuation. | DLP##EDM 傠傫##傠傥 | DLP##EDM##傠傫##傠傥 (if WIP true); DLP##EDM 傠傫##傠傥 (if WIP true) | The indexed content is a multi-token cell. During detection, punctuation between the Latin and Asian characters is treated as a single whitespace and leading and trailing punctuation is ignored. If the WIP setting is true the punctuation internal to the Latin characters and internal to the Asian character is retained. If the WIP setting is false, no content can match because internal punctuation is ignored.
Cell contains mix of Latin and CJK characters with internal punctuation. | DLP EDM 傠傫 傠傥 | DLP EDM 傠傫 傠傥; DLP#EDM 傠傫#傠傥 (if WIP false); DLP#EDM##傠傫#傠傥 (if WIP false) | The indexed content is a multi-token cell. During detection, punctuation between the Latin and Asian characters is treated as a single whitespace and leading and trailing punctuation is ignored. Thus, it matches as indexed. If the WIP setting is false, it matches DLP;EDM##傠傫#傠傥 because internal punctuation is ignored.

Some special use cases for system-recognized data patterns (EDM)


EDM provides validation for and recognition of the following special data patterns:
■ Credit card number
■ Email address
■ IP address
■ Number
■ Percent
■ Phone number (US, Canada)
■ Postal code (US, Canada)
■ Social security number (US SSN)
See “Using system-provided pattern validators for EDM profiles” on page 547.

Note: It is a best practice to always validate your index against the recognized system patterns
when the data source includes one or more such column fields. See “Map data source column
to system fields to leverage validation (EDM)” on page 605.

The general rule for system-recognized patterns is that the WIP setting does not apply during
detection. Instead, the rules for that particular pattern apply. In other words, if the pattern is
recognized during detection, the WIP setting is not checked. This is always true if the pattern
is a string of characters such as an email address, and if the cell contains a number that
conforms to one of the recognized number patterns (such as CCN or SSN).
In addition, even if the pattern is a generic number such as account number that does not
conform to one of the recognized number patterns, the WIP setting may still not apply. To
ensure accurate matching for generic numbers that do not conform to one of the
system-recognized patterns, you should not include punctuation in these number cells. If the
cell contents conform to one of the system-recognized patterns, the punctuation rules for that
pattern apply and the WIP setting does not.
See “Do not use the comma delimiter if the data source has number fields (EDM)” on page 605.
Table 26-25 lists and describes examples for detecting system-recognized data patterns.

Caution: This list is not exhaustive. It is provided for informational purposes only to ensure that
you are aware that data that matches system-defined patterns takes precedence and the WIP
setting is ignored. Before deploying your EDM policies into production, you must test detection
accuracy and adjust the index accordingly to ensure that the data that you have indexed
matches as expected during detection.

Table 26-25 Some special use cases for system-recognized data patterns (EDM)

Description | Indexed content | Detected content | Explanation
Cell contains an email address. | person@example.com | person@example.com | An email address is indexed and detected as a single-token regardless of the WIP setting. It must match exactly as indexed. If you were to set WIP to false, "person example com" would not match as a multi-token and does not match the indexed single-token.
Cell contains a 10-digit account number. | ########## | ##########; (###) ### ####; (###) ###-#### | The WIP setting is ignored because the number conforms to the phone number pattern and its rules take precedence.
Cell contains a 10-digit account number. | ## ###### ## | ## ###### ## | Must match exactly. The pattern ##-######-## does not match even if WIP is set to false.
Cell contains a 10-digit account number. | ### #### ### | ### #### ### | Must match exactly. The pattern ###-####-### does not match even if WIP is set to false.

Multi-token punctuation characters (EDM)


In EDM, a multi-token cell is any indexed cell that contains punctuation (as well as spaces or
alternative Latin words and CJK characters).
Table 26-26 lists the symbols that are identified and treated as punctuation during EDM
indexing.

Table 26-26 Characters treated as punctuation for indexing (EDM)

Punctuation name Character representation

Apostrophe '

Tilde ~

Exclamation point !

Ampersand &

Dash -

Single quotation mark '

Double quotation mark "

Period (dot) .

Question mark ?

At sign @

Dollar sign $

Percent sign %

Asterisk *

Caret symbol ^

Open parenthesis (

Close parenthesis )

Open bracket [

Close bracket ]

Open brace {

Close brace }

Forward slash /

Back slash \

Pound sign #

Equal sign =

Plus sign +

Match count variant examples (EDM)


The default value for the Advanced Server setting EDM.MatchCountVariant eliminates the
matches that consist of the same set of tokens from some other match. Rarely is there a need
to change the default value, but if necessary you can configure how EDM matches are counted
using this parameter.
See “Advanced server settings” on page 285.
Table 26-27 provides examples for match counting. All examples assume that the policy is
set to match three out of four column fields and that the profile index contains the following
cell contents:
Kathy | Stevens | 123-45-6789 | 1111-1111-1111-1111
Kathy | Stevens | 123-45-6789 | 2222-2222-2222-2222
Kathy | Stevens | 123-45-6789 | 3333-3333-3333-3333

Table 26-27 Match count variant examples (EDM)

Inbound message contents: Kathy Stevens 123-45-6789
Match count variant 1: 3 matches. Records matched in the profile: first name, last name, and SSN.
Match count variant 2: 1 match. Number of unique token sets matched.
Match count variant 3: 1 match. Number of unique supersets of token sets.

Inbound message contents: Kathy Stevens 123-45-6789 1111-1111-1111-1111 Kathy Stevens 123-45-6789
Match count variant 1: 3 matches.
Match count variant 2: 1 match if EDM.HighlightAllMatchesInProximity = false (default); 2 matches if EDM.HighlightAllMatchesInProximity = true.
Match count variant 3: 1 match.
Explanation: If EDM.HighlightAllMatchesInProximity = false, EDM matches the left-most tokens for each profile data row. The token set for each row is as follows:
Row # 1: Kathy Stevens 123-45-6789
Row # 2: Kathy Stevens 123-45-6789
Row # 3: Kathy Stevens 123-45-6789
If EDM.HighlightAllMatchesInProximity = true, EDM matches all tokens within the proximity window. The token set for each row is as follows:
Row # 1: Kathy Stevens 123-45-6789 1111-1111-1111-1111 Kathy Stevens 123-45-6789
Row # 2: Kathy Stevens 123-45-6789 Kathy Stevens 123-45-6789
Row # 3: Kathy Stevens 123-45-6789 Kathy Stevens 123-45-6789

Inbound message contents: 1111-1111-1111-1111 Kathy Stevens 123-45-6789
Match count variant 1: 3 matches.
Match count variant 2: 2 matches.
Match count variant 3: 2 matches if EDM.HighlightAllMatchesInProximity = false (default); 1 match if EDM.HighlightAllMatchesInProximity = true.
Explanation: If EDM.HighlightAllMatchesInProximity = false, EDM matches the left-most tokens for each profile data row. The token set for each row is as follows:
Row # 1: 1111-1111-1111-1111 Kathy Stevens
Row # 2: Kathy Stevens 123-45-6789
Row # 3: Kathy Stevens 123-45-6789
If EDM.HighlightAllMatchesInProximity = true, EDM matches all tokens within the proximity window. The token set for each row is as follows:
Row # 1: 1111-1111-1111-1111 Kathy Stevens 123-45-6789
Row # 2: Kathy Stevens 123-45-6789
Row # 3: Kathy Stevens 123-45-6789
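
The three counting variants can be reproduced from the per-row token sets listed in Table 26-27. The following Python sketch is an interpretation of the variant descriptions, not the product's code. The function name is hypothetical, and "unique supersets" is read here as "unique token sets that are not a proper subset of another matched token set".

```python
def count_matches(token_sets, variant):
    """Count matches from one token set per matched profile row."""
    sets = [frozenset(tokens) for tokens in token_sets]
    if variant == 1:
        return len(sets)              # every matched row counts
    unique = set(sets)
    if variant == 2:
        return len(unique)            # identical token sets count once
    if variant == 3:
        # drop any set that is fully contained in a larger matched set
        return sum(1 for s in unique if not any(s < other for other in unique))
    raise ValueError("variant must be 1, 2, or 3")

SSN, CCN = "123-45-6789", "1111-1111-1111-1111"

# Message "1111-1111-1111-1111 Kathy Stevens 123-45-6789",
# EDM.HighlightAllMatchesInProximity = false: left-most tokens per row.
rows_false = [[CCN, "Kathy", "Stevens"], ["Kathy", "Stevens", SSN], ["Kathy", "Stevens", SSN]]
# Same message with EDM.HighlightAllMatchesInProximity = true.
rows_true = [[CCN, "Kathy", "Stevens", SSN], ["Kathy", "Stevens", SSN], ["Kathy", "Stevens", SSN]]

for variant in (1, 2, 3):
    print(variant, count_matches(rows_false, variant), count_matches(rows_true, variant))
```

Under this reading the sketch yields 3, 2, and 2 matches for the false case and 3, 2, and 1 for the true case, which agrees with the last example in the table.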

Proximity matching example for EDM


EDM protects confidential data by correlating uniquely identifiable information, such as SSN,
with data that is not unique, such as last name. When correlating data, it is important to ensure
that terms are related. In natural languages, it is more likely that when two words appear close
together they are being used in the same context and are therefore related.
Based on the premise that word proximity indicates relatedness, EDM employs a
proximity-matching radius or range to limit how much freeform content the system will examine
when searching for matches. EDM proximity matching is designed to reduce false positives
by ensuring that matched terms are proximate.
The proximity range is proportional to the policy definition. The proximity range is determined
by the proximity radius multiplied by the number of matches required by the EDM policy
condition. The radius is set by the Advanced Server Setting parameter
EDM.SimpleTextProximityRadius. The default value is 35. In addition, proximity matching
applies to both free-form text and tabular data. There is no distinction at run-time between the
two. Thus, tabular data is treated the same as free text data and the proximity check is
performed beyond the scope of the length of the row contents.
For example, assuming the default radius of 35 and a policy set to match 3 out of 4 column
fields, the proximity range is 105 tokens (3 x 35). If the policy matches 2 out of 3 the proximity
range is 70 tokens (35 x 2).
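
The arithmetic above is a single multiplication. As a minimal Python sketch (the function name is hypothetical, and the default of 35 mirrors EDM.SimpleTextProximityRadius):

```python
def proximity_range(required_matches, radius=35):
    """Proximity range in tokens: the radius multiplied by the number of required matches."""
    return radius * required_matches

print(proximity_range(3))  # policy matches 3 of 4 fields: 105 tokens
print(proximity_range(2))  # policy matches 2 of 3 fields: 70 tokens
```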

Warning: While you can decrease the value of the proximity radius, Symantec does not
recommend increasing this value beyond the default (35). Doing so may cause performance
issues. See “Configuring Advanced Settings for EDM policies” on page 557.

Table 26-28 shows a proximity matching example based on the default proximity radius setting.
In this example, the detected content produces 1 unique token set match, described as follows:
■ The proximity range window is 105 tokens (35 x 3).
■ The proximity range window starts at the leftmost match ("Stevens") and ends at the
rightmost match ("123-45-6789").
■ The total number of tokens from "Stevens" to the SSN (including both) is 105 tokens.
■ The stopwords "other" and "a" are counted for proximity range purposes.
■ "Bank of America" is a multi-token. Each sub-token part of a multi-token is counted as a
single token for proximity purposes.

Table 26-28 Proximity example for EDM

Indexed data:
Last_Name | Employer | SSN
Stevens | Bank of America | 123-45-6789

Policy: Match 3 of 3

Proximity: Radius = 35 tokens (default)

Detected content:
Zendrerit inceptos Kathy Stevens lorem ipsum pharetra convallis leo suscipit ipsum sodales
rhoncus, vitae dui nisi volutpat augue maecenas in, luctus id risus magna arcu maecenas leo
quisque. Rutrum convallis tortor urna morbi elementum hac curabitur morbi, nunc dictum primis
elit senectus faucibus convallis surfrent. Aptentnour gravida adipiscing iaculis himenaeos,
himenaeos a porta etiam viverra. Class torquent uni other tristique cubilia in Bank of America.
Dictumst lorem eget ipsum. Hendrerit inceptos other sagittis quisque. Leo mollis per nisl per
felis, nullam cras mattis augue turpis integer pharetra convallis suscipit hendrerit? Lubilia en
mictumst horem eget ipsum. Inceptos urna sagittis quisque dictum odio hendrerit convallis
suscipit ipsum wrdsrf 123-45-6789.
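
The window test in this example can be sketched as a span check between the leftmost and rightmost matched tokens. The following Python sketch is a simplification made for illustration: it uses the first occurrence of each term, it represents the multi-token "Bank of America" by its first sub-token, and it builds a synthetic message from filler tokens rather than the passage in Table 26-28. The function name is hypothetical.

```python
def within_proximity(tokens, matched_terms, radius=35):
    """True if all terms occur and the span from leftmost to rightmost fits the range."""
    positions = [tokens.index(term) for term in matched_terms if term in tokens]
    if len(positions) < len(matched_terms):
        return False                                 # a required term is missing
    span = max(positions) - min(positions) + 1       # both end tokens are counted
    return span <= radius * len(matched_terms)

terms = ["stevens", "bank", "123-45-6789"]

# 1 + 50 + 3 + 50 + 1 = 105 tokens from "stevens" to the SSN, inclusive.
message = ["stevens"] + ["x"] * 50 + ["bank", "of", "america"] + ["x"] * 50 + ["123-45-6789"]
print(within_proximity(message, terms))   # span is exactly 105, so it fits

# One extra filler token pushes the span to 106 tokens.
too_long = ["stevens"] + ["x"] * 51 + ["bank", "of", "america"] + ["x"] * 50 + ["123-45-6789"]
print(within_proximity(too_long, terms))  # span exceeds 105, so it does not fit
```

Note that the three sub-token parts of "bank of america" each occupy one position, which reflects the rule that sub-tokens are counted individually for proximity purposes.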
Updating EDM indexes to the latest version


When you upgrade to the latest version of Symantec Data Loss Prevention, you must update
each Exact Data profile by reindexing the data source using the latest EDM Indexer. You need
to verify the amount of memory that is required for indexing the data source, and loading and
processing the index at run-time on the detection server.
See “About upgrading EDM deployments” on page 534.
See “Memory requirements for EDM” on page 579.
If you do not reindex the data source file, the system presents error messages indicating that
the Exact Data profile is out-of-date. You must reindex the Exact Data profile, and re-calculate
memory requirements.
See “EDM index out-of-date error codes” on page 578.
Two primary upgrade scenarios exist for EDM:
■ You use the Remote EDM Indexer to create indexes remotely and copy them to the Enforce
Server.
See “Update process using the Remote EDM Indexer” on page 574.
■ You already have a data source file that is current and cleansed that you can copy to the
upgraded Enforce Server for indexing.
See “Update process using the Enforce Server for EDM” on page 576.

Update process using the Remote EDM Indexer


You can use the following procedure for upgrading your EDM deployments to the latest version
of Symantec Data Loss Prevention. This procedure assumes that you can remotely index the
data source and copy the index file to the Enforce Server.
See “Remote EDM indexing” on page 585.
If remote indexing is not possible, the other option for upgrade is to copy the data source file
to the Enforce Server.
See “Update process using the Enforce Server for EDM” on page 576.

Table 26-29 Update process using the Remote EDM Indexer

Step 1
Action: Upgrade the Enforce Server to the latest version.
Description: Refer to the Symantec Data Loss Prevention Upgrade Guide at
http://www.symantec.com/docs/DOC9258 for details.
Do not upgrade the EDM detection server(s) now.
The latest Enforce Server can continue to receive incidents from older detection servers
during the upgrade process. Policies and other data cannot be pushed out to older detection
servers. There is one-way communication only between the latest version of Enforce and
previous versions of detection servers.

Step 2
Action: Create a newly-generated remote EDM profile template.
Description: Using the latest Enforce Server administration console, create a new EDM
profile template for remote EDM indexing.
See “Creating an EDM profile template for remote indexing” on page 589.
Download the *.edm profile template and copy it to the remote data source host system.
See “Downloading and copying the EDM profile file to a remote system” on page 591.

Step 3
Action: Install the latest version of the Remote EDM Indexer on the remote data source host.
Description: Install the latest version of the Symantec Data Loss Prevention Remote EDM
Indexer on the remote data source host so that you can index the data source.
See “Remote EDM indexing” on page 585.

Step 4
Action: Calculate the memory that is required to index the data source and adjust the
indexer memory setting.
Description: Calculate the memory that is required for indexing before you attempt to index
the data source. The Remote EDM Indexer is allocated sufficient memory to index most data
sources. If you have a very large index you may have to allocate more memory.
See “Memory requirements for EDM” on page 579.

Step 5
Action: Index the data source using the latest Remote EDM Indexer.
Description: The result of this process is multiple latest-version compatible *.rdx files
that you can load into the latest version of the Enforce Server.
If you have a data source file prepared, run the Remote EDM Indexer and index it.
See “Remote indexing examples using data source file (EDM)” on page 592.
If the data source is an Oracle database and the data is clean, use the SQL Preindexer to
pipe the data to the Remote EDM Indexer.
See “Remote indexing examples using SQL Preindexer (EDM)” on page 593.

Step 6
Action: Calculate the memory that is required to load and process the index and adjust the
detection server memory setting for each EDM detection server host.
Description: You need to calculate how much RAM the detection server requires to load and
process the index at run-time. These calculations are required for each EDM index you want
to deploy.
See “Memory requirements for EDM” on page 579.

Step 7
Action: Update the EDM profile by loading the latest version of the index.
Description: Copy the *.pdx and *.rdx files from the remote host to the latest Enforce
Server host file system.
Load the index into the EDM profile you created in Step 2.
See “Copying and loading remote EDM index files to the Enforce Server” on page 594.

Step 8
Action: Upgrade one or more EDM detection servers to the latest version.
Description: Once you have created the latest-version compliant EDM profiles and upgraded
the Enforce Server, you can then upgrade the detection servers.
Refer to the Symantec Data Loss Prevention Upgrade Guide at
http://www.symantec.com/docs/DOC9258 for details.
Make sure that you have calculated and verified the memory requirements for loading and
processing multi-token indexes on the detection server.
See “Memory requirements for EDM” on page 579.

Step 9
Action: Test and verify the updated index.
Description: To test the upgraded system and updated index, you can create a new policy
that references the updated index.

Step 10
Action: Remove out-of-date EDM indexes.
Description: Once you have verified the new EDM index and policy, you can retire the legacy
EDM index and policy.

Update process using the Enforce Server for EDM


Use the following index update procedure if remote indexing is not possible and you have a
current data source file that you can copy to the Enforce Server.

Table 26-30 Update process using the Enforce Server

Step 1
Action: Upgrade the Enforce Server to the latest version.
Description: Refer to the Symantec Data Loss Prevention Upgrade Guide at
http://www.symantec.com/docs/DOC9258 for details.
Do not upgrade the EDM detection servers now.
The Enforce Server can continue to receive incidents from older detection servers during
the upgrade process. Policies and other data cannot be pushed out to older detection
servers (one-way communication only between the current version of Enforce and older
detection servers).

Step 2
Action: Create, prepare, and copy the data source file to the 15.5 Enforce Server host.
Description: Copy the data source file to the
opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/datafiles (Linux) or
ProgramData\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\Protect\datafiles
(Windows) directory on the upgraded 15.5 Enforce Server host file system.
See “Creating the exact data source file for EDM” on page 535.
See “Preparing the exact data source file for indexing for EDM” on page 537.
See “Uploading exact data source files for EDM to the Enforce Server” on page 539.

Step 3
Action: Calculate the memory that is required to index the data source and update the
indexer memory setting.
Description: Calculate the memory that is required for indexing before you attempt to index
the data source.
See “Memory requirements for EDM” on page 579.

Step 4
Action: Create a new latest-version-compliant EDM profile and index the data source file.
Description: Create a new EDM profile using the latest version of the Enforce Server
administration console.
Choose the option Reference Data Source on Manager Host for uploading the data source file
(assuming that you copied it to the /datafiles directory).
Index the data source file on save of the profile.
See “Creating and modifying Exact Data Profiles for EDM” on page 541.

Step 5
Action: Calculate the memory that is required to load and process the index at run-time.
Adjust the memory settings for each EDM detection server host.
Description: You need to calculate how much RAM the detection server requires to load and
process the index at run-time. These calculations are required for each EDM index you want
to deploy and the memory adjustments are cumulative.
See “Memory requirements for EDM” on page 579.

Step 6
Action: Upgrade the EDM detection servers to the latest version.
Description: Once you have created the latest-version-compliant EDM profile you can then
upgrade the detection servers.
Refer to the Symantec Data Loss Prevention Upgrade Guide at
http://www.symantec.com/docs/DOC9258 for details.
Make sure that you have calculated and verified the memory requirements for loading and
processing multi-token indexes on the detection server.
See “Memory requirements for EDM” on page 579.

Step 7
Action: Test and verify the updated index.
Description: To test the upgraded system and updated index, you can create a new policy
that references the updated index.

Step 8
Action: Remove out-of-date EDM indexes.
Description: Once you have verified the new EDM index and policy, you can retire the legacy
EDM index and policy.
Note: Indexes that are created for versions earlier than 14.0 do not work with version
14.5 and later.

See “Remote EDM indexing” on page 585.

EDM index out-of-date error codes


The latest version of Symantec Data Loss Prevention provides several enhancements for
EDM. To use them, you must reindex the data source for each Exact Data profile using the
latest EDM Indexer.
If your EDM index is not compliant with the current version, the system returns error codes.
These error codes are listed in Table 26-31.

Table 26-31 Error messages for non-compliant Exact Data Profiles

Error message type: Enforce Server error event
Error code: 2928
Error message: One or more profiles are out of date and must be reindexed.
See “Updating EDM indexes to the latest version” on page 574.
See “Memory requirements for EDM” on page 579.

Error message type: Enforce Server error event detail
Error code: 2928
Error message: Check the Manage > Data Profiles > Exact Data page for more details. The
following EDM profiles are out of date: Profile X, Profile XY, and so on.

Error message type: System Event error
Error code: 2928
Error message: One or more profiles are out of date and must be reindexed.

Error message type: Exact Data Profile error
Error code: N/A
Error message: This profile is out of date, and must be reindexed.

Memory requirements for EDM


Using EDM affects the hardware memory requirements of a Symantec Data Loss Prevention
deployment. In particular, EDM affects the memory that is required to index the data source
as well as the memory that is required to load the index on the detection server.
Once you have established what your specific EDM memory requirements are, you can evaluate
how those requirements affect the general system requirements for your Data Loss Prevention
deployment. See the Symantec Data Loss Prevention System Requirements and Compatibility
Guide for details about general requirements and potential EDM deployment impact.

About memory requirements for EDM


The memory requirements for EDM are related to several factors, including:
■ Number of indexes you are building
■ Total size of the indexes
■ Number of cells in each index
■ Number of message chains
These size limitations apply to EDM indexes:
■ The maximum number of rows supported is 4,294,967,294.
■ The maximum number of supported cells is 6 billion.
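
Before you index a data source, you can check it against these limits. The following is a minimal sketch, assuming that the cell count equals rows multiplied by columns (as in the guide's own example of 10 columns by 100 million rows giving 1 billion cells); the function name check_index_limits is hypothetical and is not part of the product.

```python
MAX_ROWS = 4_294_967_294    # maximum supported rows per EDM index
MAX_CELLS = 6_000_000_000   # maximum supported cells per EDM index


def check_index_limits(rows: int, columns: int) -> list[str]:
    """Return a list of limit violations; an empty list means the size is supported."""
    problems = []
    cells = rows * columns  # assumption: every row has the same number of columns
    if rows > MAX_ROWS:
        problems.append(f"{rows:,} rows exceeds the {MAX_ROWS:,} row limit")
    if cells > MAX_CELLS:
        problems.append(f"{cells:,} cells exceeds the {MAX_CELLS:,} cell limit")
    return problems


print(check_index_limits(rows=100_000_000, columns=10))  # 1 billion cells: no problems
print(check_index_limits(rows=700_000_000, columns=10))  # 7 billion cells: over the cell limit
```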
Table 26-32 gives an overview of the steps that you can follow to determine and set memory
requirements for EDM.

Table 26-32 Workflow for determining memory requirements for EDM indexes

Step 1
Action: Determine the memory that is required to index the data source.
For more information: See “Overview of configuring memory and indexing the data source for
EDM” on page 580.

Step 2
Action: Increase the indexer memory according to your calculations.
For more information: See “Determining requirements for both local and remote indexers for
EDM” on page 580.

Step 3
Action: Determine the memory that is required to load the index on the detection server.
For more information: See “Detection server memory requirements for EDM” on page 582.

Step 4
Action: Increase the detection server memory according to your calculations.
For more information: See “Increasing the memory for the detection server (File Reader) for
EDM” on page 584.

Step 5
Action: Repeat for each EDM index you want to deploy.

Overview of configuring memory and indexing the data source for EDM

Table 26-33 provides the steps for determining how much memory is needed to index the data
source.

Table 26-33 Memory requirements for indexing the data source for EDM

Step 1
Action: Estimate the memory requirements for the indexer.
Details: See “Determining requirements for both local and remote indexers for EDM” on
page 580.

Step 2
Action: Increase the indexer memory.
Details: The next step is to increase the memory allocated to the indexer. The procedure for
increasing the indexer memory differs depending on whether you are using the EDM indexer
local to the Enforce Server or the Remote EDM Indexer.

Step 3
Action: Restart the Symantec DLP Manager service.
Details: You must restart this service after you have changed the memory allocation.

Step 4
Action: Index the data source.
Details: The last step is to index the data source. You need to do this before you calculate
remaining memory requirements.
See “Configuring Exact Data profiles for EDM” on page 534.

Determining requirements for both local and remote indexers for EDM

This topic provides an overview of memory requirements for both the EDM indexer that is local
to the Symantec Data Loss Prevention Enforce Server and for the Remote EDM Indexer.
With the default settings, both EDM indexers can index any data source with 500 million cells
or fewer. For any data source with more than 500 million cells, an additional 3 bytes of
memory is needed for each cell beyond the first 500 million.

You can schedule indexing for multiple indexes serially (at different times) or in parallel (at the
same time). When indexing serially, you need to allocate memory to accommodate the indexing
of the biggest index. When indexing in parallel, you need to allocate memory to accommodate
the indexing of all indexes that you are creating at that time.

Serial indexing
If you create the indexes serially (no two are created in parallel), you size the memory for
the biggest index. For example, if the biggest index has 2 billion cells, the additional
memory requirement is:
(2 billion cells – 0.5 billion default) x 3 bytes = 4.5 GB, rounded to 5 GB additional memory.
The total memory requirement is 7 GB: the 2 GB (2048 MB) default memory for the Enforce
Server plus the 5 GB additional system memory.
Table 26-34 provides examples for how the data source size affects indexer memory
requirements for serial indexes.

Table 26-34 Examples for indexer memory requirements-serial indexing for EDM

Data source size: 100 million cells
Indexer memory requirement: 2048 MB (default)
Description: No additional RAM is needed for the indexer.

Data source size: 500 million cells
Indexer memory requirement: 2048 MB (default)
Description: No additional RAM is needed for the indexer.

Data source size: 1 billion cells
Indexer memory requirement: 4 GB
Description: If you have a single data source with 1 billion cells (for example, 10 columns
by 100 million rows), you need extra system memory for 0.5 billion cells (1 billion cells –
0.5 billion default): 0.5 billion x 3 bytes, or 1.5 GB of RAM (rounded to 2 GB) to index the
data source. This amount is added to the default indexer RAM allotment.

Data source size: 2 billion cells
Indexer memory requirement: 7 GB
Description: If you have a single data source with 2 billion cells (for example, 10 columns
by 200 million rows), you need extra system memory for 1.5 billion cells (2 billion cells –
0.5 billion default): 1.5 billion x 3 bytes, or 4.5 GB of RAM (rounded to 5 GB) to index the
data source. This amount is added to the default indexer RAM allotment.
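
The serial indexing arithmetic can be expressed as a short sketch. The function names are illustrative and not part of the product. The sketch assumes decimal gigabytes (10^9 bytes) and rounding up to the next whole gigabyte, because that reproduces the figures in Table 26-34.

```python
DEFAULT_CELLS = 500_000_000   # cells covered by the default indexer memory
BYTES_PER_EXTRA_CELL = 3      # additional bytes for each cell beyond the default
DEFAULT_MEMORY_GB = 2         # default indexer memory (2048 MB)
GB = 10**9                    # assumption: decimal gigabytes, as in the table's examples


def additional_indexer_gb(cells: int) -> int:
    """Extra memory in whole GB needed to index one data source."""
    extra_cells = max(0, cells - DEFAULT_CELLS)
    extra_bytes = extra_cells * BYTES_PER_EXTRA_CELL
    return -(-extra_bytes // GB)  # ceiling division: 4.5 GB rounds up to 5 GB


def serial_indexer_memory_gb(index_sizes: list[int]) -> int:
    """Total indexer memory when indexes are built one at a time:
    size the memory for the biggest index only."""
    return DEFAULT_MEMORY_GB + additional_indexer_gb(max(index_sizes))


sizes = [100_000_000, 500_000_000, 1_000_000_000, 2_000_000_000]
for cells in sizes:
    print(f"{cells:,} cells -> {DEFAULT_MEMORY_GB + additional_indexer_gb(cells)} GB")
print("serial total:", serial_indexer_memory_gb(sizes), "GB")
```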

Parallel indexing with EDM


If you index the four data sources in Table 26-34 simultaneously (in parallel), you are
indexing 3.6 billion cells in total, which is more than the 500 million cells that the default
memory supports. The additional memory required is as follows:
3.6 billion cells – 0.5 billion cells provided by default = 3.1 billion cells
3.1 billion cells x 3 bytes = 9.3 GB, rounded to 10 GB additional memory.

As explained in detail later, you set wrapper.java.maxmemory to 12 GB. This memory
requirement includes the 2048 MB default memory for the Enforce Server and the additional
10 GB of system memory from the additional memory calculation above.

Note: For CJK language indexes, or indexes that are predominantly multi-token, these formulas
should use a multiplier of 4 bytes instead of 3 bytes. In both of these cases, a 350-million cell
data source is supported by default.
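
The parallel calculation and the CJK or multi-token variant can be sketched in one function. This is an illustration under stated assumptions rather than a product formula: the function and parameter names are hypothetical, gigabytes are decimal (10^9 bytes), and the result is rounded up to a whole gigabyte as in the examples above.

```python
def parallel_indexer_memory_gb(
    index_sizes: list[int],
    bytes_per_cell: int = 3,             # use 4 for CJK or predominantly multi-token indexes
    default_cells: int = 500_000_000,    # use 350_000_000 for CJK or multi-token indexes
    default_memory_gb: int = 2,          # default indexer memory (2048 MB)
) -> int:
    """Total indexer memory in GB when all listed indexes are built at the same time."""
    total_cells = sum(index_sizes)                    # parallel: every index counts
    extra_cells = max(0, total_cells - default_cells)
    extra_gb = -(-(extra_cells * bytes_per_cell) // 10**9)  # round up to whole GB
    return default_memory_gb + extra_gb


# The four data sources from Table 26-34, indexed in parallel.
standard = parallel_indexer_memory_gb(
    [100_000_000, 500_000_000, 1_000_000_000, 2_000_000_000]
)
print(standard)  # 12

# A single 1-billion-cell CJK data source: (1.0 - 0.35) billion x 4 bytes = 2.6 GB -> 3 GB extra.
cjk = parallel_indexer_memory_gb(
    [1_000_000_000], bytes_per_cell=4, default_cells=350_000_000
)
print(cjk)  # 5
```

The first result corresponds to the 12 GB value given for wrapper.java.maxmemory in the parallel example.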

Detection server memory requirements for EDM


The detection server should not use more than 60% of the memory of the computer. For
example, if your detection server needs 6 GB memory to run, make sure you have 10 GB on
that server.

Default configuration for a detection server


The default configuration for a detection server is 4 GB of memory and 8 message chains. See
the following formulas and Table 26-35 to determine how to calculate your actual memory
requirements. In addition, you can use the spreadsheet provided at the Symantec Support
Center at http://www.symantec.com/docs/DOC8255.html to determine your actual memory
requirements.
See “Using the EDM Memory Requirements Spreadsheet” on page 585.
To load the index, the detection server needs 13 bytes per cell for system memory plus 1 GB
Java heap memory for each message chain in the detection server. The following examples
show scenarios for a customer who has three indexes that are all under the same schedule.
For Java heap memory requirements, the formula is:
Java heap memory requirement = the number of message chains * 1 GB.
For system memory requirements, the general formula is:
System memory requirement = number of cells * 13 bytes.
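
The two formulas and the 60% guideline can be combined in a short sketch. The function names are illustrative only. The sketch assumes binary gigabytes (2^30 bytes) for system memory, because that unit reproduces the figures in the examples that follow (for instance, 5 billion cells x 13 bytes is about 60.5 GB).

```python
GIB = 2**30            # assumption: binary gigabytes match the guide's example figures
BYTES_PER_CELL = 13    # system memory needed per cell to load an index


def java_heap_gb(message_chains: int) -> int:
    """Java heap requirement: 1 GB for each message chain."""
    return message_chains * 1


def system_memory_gb(cells_per_index: list[int]) -> float:
    """System memory needed to load all listed indexes (cumulative)."""
    total_bytes = sum(cells * BYTES_PER_CELL for cells in cells_per_index)
    return total_bytes / GIB


def minimum_host_memory_gb(detection_server_gb: float) -> float:
    """Host memory so that the detection server uses no more than 60% of it."""
    return detection_server_gb / 0.60


print(java_heap_gb(24))                                  # 24
print(round(system_memory_gb([5_000_000_000]), 1))       # 60.5
print(round(minimum_host_memory_gb(6), 1))               # 10.0
```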

Detection Server memory settings for EDM


The Advanced Server Settings property for the number of message chains is:
MessageChain.NumChains.

The Java heap memory settings for a detection server are set in the Enforce Server
administration console at the Server Detail - Advanced Server Settings page, using the
BoxMonitor.FileReaderMemory property. The format is -Xrs -Xms1200M -Xmx4G. You do not
need to change the system memory setting, but make sure that the detection server has
enough free memory available.

Note: When you update this setting, only change the -Xmx value in this property. For example,
only change "4G" to a new value, and leave all other values the same.

The examples in Table 26-35 show the settings for five different situations.

Table 26-35 EDM detection server Java heap memory settings and additional system memory
examples

Example 1: Single small index with 2 million cells to load
Calculation:
Java heap memory requirement is: 2 * 1 GB = 2 GB
System memory requirement is: 2 million * 13 bytes = 25 MB
BoxMonitor.FileReaderMemory setting: -Xmx6G
Additional system memory required: 25 MB

Example 2: 3 indexes when running 24 chains:
■ Index 1: 100 million cells
■ Index 2: 1 billion cells
■ Index 3: 2 billion cells
Calculation:
Java heap memory requirement is: 24 * 1 GB = 24 GB
System memory requirement is:
For 100 million cells index: 100 million * 13 bytes = 1.2 GB
For 1 billion cells index: 1 billion * 13 bytes = 12 GB
For 2 billion cells index: 2 billion * 13 bytes = 24 GB
Total system memory requirement is: 1.2 GB + 12 GB + 24 GB = 37.2 GB
BoxMonitor.FileReaderMemory setting: -Xmx28G
Additional system memory required: 37.2 GB

Example 3: One single index with 5 billion cells and 24 message chains
Calculation:
Java heap memory requirement is: 24 * 1 GB = 24 GB
System memory requirement is: 5 billion * 13 bytes = 60.5 GB
BoxMonitor.FileReaderMemory setting: -Xmx28G
Additional system memory required: 60.5 GB

Example 4: One single index with 1.6 billion cells and 24 message chains
Calculation:
Java heap memory requirement is: 24 * 1 GB = 24 GB
System memory requirement is: 1.6 billion * 13 bytes = 19.3 GB
BoxMonitor.FileReaderMemory setting: -Xmx28G
Additional system memory required: 19.3 GB

Example 5: One single index with 500 million cells and 8 message chains
Calculation:
Java heap memory requirement is: 8 * 1 GB = 8 GB
System memory requirement is: 500 million * 13 bytes = 6.1 GB
BoxMonitor.FileReaderMemory setting: -Xmx12G
Additional system memory required: 6.1 GB

Increasing the memory for the detection server (File Reader) for EDM
This topic provides instructions for increasing the File Reader memory allocation for a detection
server. These instructions assume that you have performed the necessary calculations.
To increase the memory for detection server processing
1 In the Enforce Server administration console, navigate to the Server Detail - Advanced
Server Settings screen for the detection server where the EDM index is deployed or to
be deployed.
2 Locate the following setting: BoxMonitor.FileReaderMemory.

3 Change the -Xmx4G value in the following string to match the calculations you have made.
-Xrs -Xms1200M -Xmx4G -XX:PermSize=128M -XX:MaxPermSize=256M
For example: -Xrs -Xms1200M -Xmx11G -XX:PermSize=128M -XX:MaxPermSize=256M
4 Save the configuration and restart the detection server.
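
If you prepare the new setting string outside the console (for example, in a change record), a small helper can make sure that only the -Xmx value changes. This is a minimal sketch; the function name set_max_heap is hypothetical and the helper does not talk to the Enforce Server in any way.

```python
import re


def set_max_heap(setting: str, new_heap_gb: int) -> str:
    """Replace only the -Xmx value, leaving every other option untouched."""
    updated, count = re.subn(r"-Xmx\S+", f"-Xmx{new_heap_gb}G", setting, count=1)
    if count == 0:
        raise ValueError("no -Xmx option found in the setting string")
    return updated


current = "-Xrs -Xms1200M -Xmx4G -XX:PermSize=128M -XX:MaxPermSize=256M"
print(set_max_heap(current, 11))
# -Xrs -Xms1200M -Xmx11G -XX:PermSize=128M -XX:MaxPermSize=256M
```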

Using the EDM Memory Requirements Spreadsheet


The EDM Memory Requirements Spreadsheet is a tool that you can use to determine the
additional system memory needed on the detection server to run your indexes. It is available
as an Excel spreadsheet on the Symantec Support Center at:
https://support.symantec.com/en_US/article.DOC8255.html
Figure 26-1 shows an example of the spreadsheet with four message chains and three indexes.

Figure 26-1 EDM Memory Requirements Spreadsheet

To compute the additional system memory required to run your indexes, enter the following
information:
1. Obtain the number of cells in each index (you can specify up to 10 indexes).
2. Enter that number into # of cells in Index.
When you change any value, the spreadsheet updates the Required RAM field.
The value in the Required RAM field is the additional system memory that is required to run
the indexes specified.

Remote EDM indexing


An EDM index maps the data you want to protect to the Exact Data profile. The typical EDM
workflow for creating the EDM index is to upload the data source file to the Enforce Server,
create the Exact Data profile, and index the data source. Instead of uploading the data source
file to the Enforce Server for indexing, you can index the data source locally and securely using
the Remote EDM Indexer.
See “About the Exact Data Profile and index” on page 528.

For example, if copying the confidential data source file to the Enforce Server presents a
potential security or logistical issue, you can use the Remote EDM Indexer to create the
cryptographic index directly on the data source host before moving the index to the Enforce
Server. If you are upgrading to the latest Symantec Data Loss Prevention version you may
want to use the Remote EDM Indexer to update your existing EDM indexes.
See “About the Remote EDM Indexer” on page 586.
See “About the SQL Preindexer for EDM” on page 586.
The Remote EDM Indexer is a standalone tool that lets you index the data source file directly
on the data source host.
See “System requirements for remote EDM indexing” on page 587.

About the Remote EDM Indexer


The Remote EDM Indexer utility converts a data source file to an EDM index. The utility is
similar to the local EDM Indexer used by the Enforce Server. However, the Remote EDM
Indexer is designed for use on a computer that is not part of the Symantec Data Loss Prevention
server configuration.
Using the Remote EDM Indexer to index a data source on a remote machine has the following
advantages over using the EDM Indexer on the Enforce Server:
■ It enables the owner of the data, rather than the Symantec Data Loss Prevention
administrator, to index the data.
■ It shifts the system load that is required for indexing onto another computer. The CPU and
RAM on the Enforce Server remain available for other tasks.
See “About the SQL Preindexer for EDM” on page 586.
See “Workflow for remote EDM indexing” on page 587.

About the SQL Preindexer for EDM


You use the SQL Preindexer utility with the Remote EDM Indexer to run SQL queries against
Oracle databases and pipe the resulting data to the Remote EDM Indexer for indexing.
See “System requirements for remote EDM indexing” on page 587.
The SQL Preindexer utility is installed in the
C:\Program Files\Symantec\DataLossPrevention\ServerPlatformCommon\Indexer\15.1\Protect\bin
directory during installation of the Remote EDM Indexer. Together with the Remote EDM
Indexer, the SQL Preindexer utility generates an index directly from an Oracle SQL database.
The SQL Preindexer processes the database query and passes the resulting data to the
standard input of the Remote EDM Indexer utility.
To use the SQL Preindexer the data source must be relatively clean since the query result
data is piped directly to the Remote EDM Indexer.

See “About the Remote EDM Indexer” on page 586.

System requirements for remote EDM indexing


The Remote EDM Indexer runs on the Windows and Linux operating system versions that are
supported for Symantec Data Loss Prevention servers. See the Symantec Data Loss Prevention
System Requirements and Compatibility Guide for more information about operating system
support.
The SQL Preindexer supports Oracle databases and requires a relatively clean data source.
See “About the SQL Preindexer for EDM” on page 586.
The RAM requirements for using the Remote EDM Indexer vary according to the size of the
data source being indexed and the number of multi-token columns in the data source.
See “Memory requirements for EDM” on page 579.

Workflow for remote EDM indexing


This section summarizes the steps to index a data file on a remote machine and then use the
index in Symantec Data Loss Prevention.
See “About the Exact Data Profile and index” on page 528.

Table 26-36 Steps to use the Remote EDM Indexer

Step 1
Action: Install the Remote EDM Indexer on a computer that is not part of the Symantec Data
Loss Prevention system.
Description: See “Installing the Remote EDM Indexer” on page 588.

Step 2
Action: Create an Exact Data Profile on the Enforce Server to use with the Remote EDM
Indexer.
Description: On the Enforce Server, generate an EDM Profile template using the *.edm file
name extension and specifying the exact number of columns to be indexed.
See “Creating an EDM profile template for remote indexing” on page 589.

Step 3
Action: Copy the Exact Data Profile file to the computer where the Remote EDM Indexer
resides.
Description: Download the profile template from the Enforce Server and copy it to the remote
data source host computer.
See “Downloading and copying the EDM profile file to a remote system” on page 591.

Step 4
Action: Run the Remote EDM Indexer and create the index files.
Description: If you have a cleansed data source file, use the RemoteEDMIndexer with the
-data, -profile and -result options.
If the data source is an Oracle database, use the SqlPreindexer and the RemoteEDMIndexer to
index the data source directly with the -alias (oracle DB host), -username and -password
credentials, and the -query string or -query_path.
See “Generating remote index files for EDM” on page 591.

Step 5
Action: Copy the index files from the remote machine to the Enforce Server.
Description: Copy the resulting *.pdx and *.rdx files from the remote machine to the Enforce
Server host at C:\ProgramData\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\index.
See “Copying and loading remote EDM index files to the Enforce Server” on page 594.

Step 6
Action: Load the index files into the Enforce Server.
Description: Update the EDM profile by loading the externally generated index.
Submit the profile for indexing.
See “Copying and loading remote EDM index files to the Enforce Server” on page 594.

Step 7
Action: Troubleshoot any problems that occur during the indexing process.
Description: Verify that indexing is started and completes.
Check the system events for Code 2926 ("Created Exact Data Profile" and "Data source
saved").
The ExternalDataSource.<name>.rdx and *.pdx files are removed from the index directory and
replaced by the file DataSource.<profile id>.<version>.rdxver.
See “Troubleshooting remote indexing errors for EDM” on page 599.

Step 8
Action: Create policy with EDM condition.
Description: You should see the column data for defining the EDM condition.
See “Configuring the Content Matches Exact Data policy condition for EDM” on page 551.

Installing the Remote EDM Indexer


You install the Remote EDM Indexer on one or more systems where the confidential files you
want to index are stored. The process for installing a remote indexer is the same for EMDI,
EDM, and IDM.

See “About installing remote indexers” on page 589.


You can install the Remote EDM Indexer on all of the supported Windows and Linux platforms.
See the Symantec Data Loss Prevention System Requirements Guide for platform details.

Creating an EDM profile template for remote indexing


The EDM Indexer uses an Exact Data Profile when it runs to ensure that the data is correctly
formatted. You must create the Exact Data Profile before you use the Remote EDM Indexer.
The profile is a template that describes the columns that are used to organize the data. The
profile does not need to contain any data. After creating the profile, copy it to the computer
that runs the Remote EDM Indexer.
See “About the Exact Data Profile and index” on page 528.
To create an EDM profile for remote indexing
1 From the Enforce Server administration console, navigate to the Manage > Data Profiles
> Exact Data screen.
2 Click Add Exact Data Profile.
3 In the Name field, enter a name for the profile.
4 In the Data Source field, select Use This File Name, and enter the name of the index
file to create with the *.edm extension.
You must select this option since you are only creating the profile template at this point.
Later, you index the data source for this profile using the Remote EDM Indexer.
Enter the file name of the data source you plan to create for remote EDM indexing. Be
sure to name the data source file exactly the same as the name you enter here.
See “Uploading exact data source files for EDM to the Enforce Server” on page 539.
Once you have copied the generated remote index back to the Enforce Server, you use
the Load Externally Generated Index option to load the remote index into the profile
template.
See “Copying and loading remote EDM index files to the Enforce Server” on page 594.
5 In the Number of Columns text box, specify the number of columns in the data source
to be indexed.
For remote EDM indexing purposes you must specify the exact Number of Columns the
index is to have. Be sure to include the exact number of columns you specify here in the
data source file.
See “Uploading exact data source files for EDM to the Enforce Server” on page 539.
6 If the first row of the data source contains the column names, select the option Read first
row as column names.

7 In the Error Threshold text box, enter the maximum percentage of rows that can contain
errors.
If, during indexing of the data source, the number of rows with errors exceeds the
percentage that you specify here, the indexing operation fails.
8 In the Column Separator Char field, select the type of character that is used in your data
source to separate the columns of data.
9 In the File Encoding field, select the character encoding that is used in your data source.
If Latin characters are used, select the ISO-8859-1 option. For East Asian languages, use
either the UTF-8 or UTF-16 options.
10 Click Next to map the column headings from the data source to the profile.
11 In the Field Mappings section, map the Data Source Field to the System Field for each
column by selecting the column name from the System Field drop-down list.
The Data Source Field lists the number of columns you specified at the previous screen.
The System Field contains a list of standard column headings. If any of the column
headings in your data source match the choices available in the System Field list, map
each accordingly. Be sure that you match the selection in the System Field column to its
corresponding numbered column in the Data Source Field.
For example, for a data source that you have specified in the profile as having three
columns, the mapping configuration may be:

Data Source Field    System Field
Col 1                First Name
Col 2                Last Name
Col 3                Social Security Number

12 If a Data Source Field does not map to a heading value in the options available from the
System Field column, click the Advanced View link.
In the Advanced View the system displays a Custom Name column beside the System
Field column.
Enter the correct column name in the text box that corresponds to the appropriate column
in the data source.
Optionally, you can specify the data type for the Custom Name you entered by selecting
the data type from the Type drop-down list. These data types are system-defined. Click
the description link beside the Type name for details on each system-defined data type.
13 If you intend to use the Exact Data Profile to implement a policy template that contains
one or more EDM rules, you can validate your profile mappings for the template. To do
this, select the template from the Check mappings against policy template drop-down
list and click Check now. The system indicates any unmapped fields that the template
requires.
14 Do not select any Indexing option available at this screen, since you intend to index
remotely.
15 Click Finish to complete the profile creation process.

Downloading and copying the EDM profile file to a remote system


Download and copy the EDM profile to the remote system
1 Configure an Exact Data Profile.
See “Creating an EDM profile template for remote indexing” on page 589.
2 Download the EDM profile by selecting the download profile link at the Manage > Data
Profiles > Exact Data screen.
The system prompts you to save the EDM profile as a file. The file extension is *.edm.
3 Save the file.
If the data source host computer where you intend to run the Remote EDM Indexer is
available on the same subnet as the Enforce Server you can browse to that computer
and select it as the destination. Otherwise, manually copy the profile to the remote system.
4 Use the profile to index the data source using the Remote EDM Indexer.
See “Generating remote index files for EDM” on page 591.

Generating remote index files for EDM


You use the command-line Remote EDM Indexer utility to generate an EDM index for importing
to the Enforce Server. You can use the Remote EDM Indexer to index a data source file that
you have generated and cleansed. Or you can pipe the output from the SQL Preindexer to
the standard input of the Remote EDM Indexer. The SQL Preindexer requires an Oracle DB
data source and clean data.
When the indexing process completes, the Remote EDM Indexer generates several files in
the specified result directory. These files are named after the data file that was indexed:
one file with the .pdx extension and 12 files with the .rdx extension, named
ExternalDataSource.<DataSourceName>.rdx.0 through
ExternalDataSource.<DataSourceName>.rdx.11.

Table 26-37 Options for generating remote EDM indexes

Use case                       Description                      Remarks

Remote EDM Indexer with data   Specify data source file, EDM    Use when you have a cleansed data source file;
source file                    profile, output directory.       use for upgrading to the latest version.
                                                                See “Remote indexing examples using data source
                                                                file (EDM)” on page 592.

Remote EDM Indexer with SQL    Query DB and pipe output to      Requires Oracle DB and clean data.
Preindexer                     stdin of Remote EDM Indexer.     See “Remote indexing examples using SQL
                                                                Preindexer (EDM)” on page 593.

Remote indexing examples using data source file (EDM)


To use the Remote EDM Indexer to index a flat data source file you have generated and
cleansed, you specify the local data source file name and path (-data), the local EDM profile
file name and path (-profile), and the output directory for the generated index files (-result).
The syntax for using the Remote EDM Indexer to generate an index from a cleansed data
source tabular text file is as follows:

RemoteEDMIndexer -data=<local data source filename and path>
-profile=<local *.edm profile file name and path>
-result=<local output directory for *.rdx and *.pdx index files>

For example:

RemoteEDMIndexer -data=C:\EDMIndexDirectory\CustomerData.dat
-profile=C:\EDMIndexDirectory\RemoteEDMProfile.edm
-result=C:\EDMIndexDirectory\

This command generates an EDM index using the local data source tabular text file
CustomerData.dat and the local RemoteEDMProfile.edm file that you generated and copied
from the Enforce Server to the remote host, where \EDMIndexDirectory is the directory for
placing the generated index files.
When the generation of the indexes is successful, the utility displays the message "Successfully
created index" as the last line of output.
In addition, the following index files are created and placed in the -result directory:
■ ExternalDataSource.CustomerData.pdx
■ ExternalDataSource.CustomerData.rdx
Twelve files, named ExternalDataSource.<DataSourceName>.rdx.0 through
ExternalDataSource.<DataSourceName>.rdx.11, are always generated. Copy these files to
the Enforce Server and update the EDM profile using the remote index.
See “Remote EDM Indexer command options” on page 597.
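
Before copying anything to the Enforce Server, it can help to confirm that the result directory holds the full set of index files. The following is a minimal sketch, not part of the product: the function names are hypothetical, and it assumes the data source name is the data file name without its extension (for example, CustomerData for CustomerData.dat) and that the file set is one .pdx file plus .rdx.0 through .rdx.11.

```python
from pathlib import Path

# The guide states that .rdx.0 through .rdx.11 are always generated.
RDX_PART_COUNT = 12


def expected_index_files(data_source_name: str) -> list[str]:
    """Return the index file names expected for one data source."""
    base = f"ExternalDataSource.{data_source_name}"
    return [f"{base}.pdx"] + [f"{base}.rdx.{i}" for i in range(RDX_PART_COUNT)]


def missing_index_files(result_dir: str, data_source_name: str) -> list[str]:
    """List expected index files that are absent from the -result directory."""
    directory = Path(result_dir)
    return [name for name in expected_index_files(data_source_name)
            if not (directory / name).is_file()]
```

An empty list from missing_index_files suggests that the index set is complete and ready to copy.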

Remote indexing examples using SQL Preindexer (EDM)


If your data source is an Oracle DB and has clean data you can index the data source directly
using the SQL Preindexer with the Remote EDM Indexer.
The syntax is as follows:

SqlPreindexer -alias=<oracle connect string: @//host:port/SID>
-username=<DB user> -password=<DB password> -query=<sql to run> |
RemoteEDMIndexer -profile=<*.edm profile file name and path>
-result=<output directory for index files>

For example:

SqlPreindexer -alias=@//myhost:1521/orcl -username=scott -password=tiger
-query="SELECT name, salary FROM employee" |
RemoteEDMIndexer -profile=C:\ExportEDMProfile.edm -result=C:\EDMIndexDirectory\

With this command the SQL Preindexer utility connects to the Oracle database and runs the
SQL query to retrieve name and salary data from the employee table. The SQL Preindexer
writes the result of the query to stdout. The SQL query must be in quotes. The pipe (|)
passes that output to the Remote EDM Indexer, which reads the query result from stdin
and indexes the data using the ExportEDMProfile.edm profile as specified by the profile
file name and local file path.
When the generation of the indexes is successful, the utility displays the message "Successfully
created index" as the last line of output.
In addition, the utility places the following generated index files in the EDMIndexDirectory
-result directory:
■ ExternalDataSource.CustomerData.pdx
■ ExternalDataSource.CustomerData.rdx
Here is another example using SQL Preindexer and Remote EDM Indexer commands:

SqlPreindexer -alias=@//localhost:1521/CUST -username=cust_user -password=cust_pword
-query="SELECT account_id, amount_owed, available_credit FROM customer_account" -verbose |
RemoteEDMIndexer -profile=C:\EDMIndexDirectory\CustomerData.edm
-result=C:\EDMIndexDirectory\ -verbose

Here the SQL Preindexer command queries the CUST.customer_account table in the database
for the account_id, amount_owed, and available_credit records. The result is piped to the
Remote EDM Indexer, which generates the index files based on the CustomerData.edm profile.
The -verbose option is used for troubleshooting.
As an alternative to the -query SQL string you can use the -query_path option and specify
the file path and name for the SQL query (*.sql). If you do not specify a query or query path
the entire DB is queried.

SqlPreindexer -alias=@//localhost:1521/cust -username=cust_user -password=cust_pwrd
-query_path=C:\EDMIndexDirectory\QueryCust.sql -verbose |
RemoteEDMIndexer -profile=C:\EDMIndexDirectory\CustomerData.edm
-result=C:\EDMIndexDirectory\ -verbose

See “SQL Preindexer command options (EDM)” on page 595.
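
If you drive the two utilities from a script rather than a console, the pipe can be reproduced with two child processes. The following is a minimal sketch under stated assumptions: both utilities are on the PATH and can be started directly by name, and the helper names (preindexer_args, indexer_args, run_pipeline) are hypothetical. Because the arguments are passed as a list, the script does not add shell quotes around the query.

```python
import subprocess


def preindexer_args(alias, username, password, query):
    """Argument list for SqlPreindexer using the options shown in the examples."""
    return ["SqlPreindexer", f"-alias={alias}", f"-username={username}",
            f"-password={password}", f"-query={query}"]


def indexer_args(profile, result_dir):
    """Argument list for RemoteEDMIndexer; -data is omitted because input is piped."""
    return ["RemoteEDMIndexer", f"-profile={profile}", f"-result={result_dir}"]


def run_pipeline(pre_args, idx_args):
    """Connect SqlPreindexer stdout to RemoteEDMIndexer stdin, as `|` does in a shell."""
    pre = subprocess.Popen(pre_args, stdout=subprocess.PIPE)
    idx = subprocess.Popen(idx_args, stdin=pre.stdout,
                           stdout=subprocess.PIPE, text=True)
    pre.stdout.close()  # the indexer now holds the only read end of the pipe
    output, _ = idx.communicate()
    pre.wait()
    return idx.returncode, output
```

Passing the password on the command line exposes it to other users of the host, so treat this form as suitable for testing only.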

Copying and loading remote EDM index files to the Enforce Server
The following files are created in the -result directory when you remotely index a data source:
■ ExternalDataSource.<DataSourceName>.pdx

■ ExternalDataSource.<DataSourceName>.rdx.0 -
ExternalDataSource.<DataSourceName>.rdx.11

After you create the index files on a remote machine, the files must be copied to the Enforce
Server, loaded into the previously created remote EDM profile, and indexed.
See “Creating an EDM profile template for remote indexing” on page 589.
To copy and load the files on the Enforce Server
1 Go to the directory where the index files were generated. (This directory is the one specified
in the -result option.)
2 Copy all of the index files with .pdx and .rdx extensions to the index directory on the
Enforce Server. This directory is located at
C:\ProgramData\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\Index
(Windows) or /var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/index
(Linux).
3 From the Enforce Server administration console, navigate to the Manage > Data Profiles >
Exact Data screen.
This screen lists all the Exact Data Profiles in the system.
4 Click the name of the Exact Data Profile you used with the Remote EDM Indexer.
5 To load the new index files, go to the Data Source section of the Exact Data Profile and
select Load Externally Generated Index.
6 In the Indexing section, select Submit Indexing Job on Save.


As an alternative to indexing immediately on save, consider scheduling a job on the remote
machine to run the Remote EDM Indexer on a regular basis. The job should also copy
the generated files to the index directory on the Enforce Server. You can then schedule
loading the updated index files on the Enforce Server from the profile by selecting Load
Externally Generated Index and Submit Indexing Job on Schedule and configuring
an indexing schedule.
See “Use scheduled indexing to automate profile updates (EDM)” on page 607.
7 Click Save.
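
A scheduled job on the remote machine needs a copy step after the Remote EDM Indexer finishes. The following is a minimal sketch of that step only. It assumes the Enforce Server index directory is reachable from the remote machine as an ordinary file path (for example, through a mounted share), which your environment may not allow; the function name copy_index_files is hypothetical.

```python
import shutil
from pathlib import Path


def copy_index_files(result_dir, index_dir, data_source_name):
    """Copy the .pdx file and every .rdx file for one data source; return their names."""
    prefix = f"ExternalDataSource.{data_source_name}."
    copied = []
    for src in sorted(Path(result_dir).iterdir()):
        if not (src.is_file() and src.name.startswith(prefix)):
            continue
        suffix = src.name[len(prefix):]  # for example "pdx" or "rdx.7"
        if suffix == "pdx" or suffix.startswith("rdx"):
            shutil.copy2(src, Path(index_dir) / src.name)
            copied.append(src.name)
    return copied
```

The job scheduler itself (Task Scheduler, cron, or similar) is outside the scope of this sketch.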

SQL Preindexer command options (EDM)


After installation, the SQL Preindexer utility is available at
C:\Program Files\Symantec\Data Loss Prevention\Indexer\15.1\Protect\bin (Windows) and
/opt/Symantec/DataLossPrevention/Indexer/15.1/Protect/bin (Linux).

The SQL Preindexer provides a command-line interface. The syntax for running the utility is
as follows:

SqlPreindexer -alias=<@//oracle_host:port/SID> -username=<DB_user> [options]

Note the following about the arguments:


■ The SQL Preindexer requires the -alias and -username arguments.
■ If you omit the -password option, the user is prompted to enter it.
■ If you use the -query option, the SQL query string must be in quotes.
■ If you omit the -query option, the utility indexes the entire database.
■ To query using wildcards, use the -query_path option. The SQL Preindexer does not
support the use of wildcards from the command line using the -query option. For example:
"select * from CUST_DATA" does not work with -query; you must query each individual
column field: "select cust_ID, cust_Name, cust_SSN from CUST_DATA." The query "select
* from CUST_DATA" works using the -query_path option.
See “Remote indexing examples using SQL Preindexer (EDM)” on page 593.
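
The wildcard rule above can be checked in a script before the command is assembled. The following is a rough sketch: the helper names are hypothetical, the wildcard test is a simple pattern match rather than a SQL parser, and sql_file is assumed to be a *.sql file that already contains the same query.

```python
import re


def uses_wildcard(sql: str) -> bool:
    """True if the SELECT list contains a bare * (for example "select * from T")."""
    match = re.search(r"select\s+(.*?)\s+from\s", sql,
                      flags=re.IGNORECASE | re.DOTALL)
    if not match:
        return False
    columns = [c.strip() for c in match.group(1).split(",")]
    return any(c == "*" or c.endswith(".*") for c in columns)


def query_option(sql: str, sql_file: str) -> str:
    """Pick -query for explicit column lists and -query_path for wildcard queries."""
    if uses_wildcard(sql):
        return f"-query_path={sql_file}"
    return f'-query="{sql}"'
```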
Table 26-38 lists the command options for the SQL Preindexer.

Table 26-38 SQL Preindexer command options (EDM)

Option        Summary                       Description

-alias        Oracle DB connect string      Specifies the database alias that is used to connect to the
              Required                      database in the following format:
                                            @//oracle_DB_host:port/SID
                                            For example:
                                            -alias=@//myhost:1521/ORCL
                                            -alias=@//localhost:1521/CUST

-driver       Oracle JDBC driver class      Specifies the JDBC driver class, for example:
                                            oracle.jdbc.driver.OracleDriver.

-encoding     Character encoding            Specifies the character encoding of the data to index. The
              (iso-8859-1)                  default is iso-8859-1.
                                            Data with non-English characters should use UTF-8 or UTF-16.

-password     Oracle DB password            Specifies the password to the database.
                                            If this option is not specified, the password is read from
                                            stdin.

-query        SQL query                     Specifies the SQL query to perform. The statement must be
                                            enclosed in quotes.
                                            If you omit the -query option the utility indexes the entire
                                            database.

-query_path   SQL script                    Specifies the file name and local path that contains a SQL
                                            query to run. Must be full path.
                                            This option can be used as an alternative to the -query
                                            option when the query is a long SQL statement.

-separator    Output column separator       Specifies whether the output column separator is a comma,
              (tab)                         pipe, or tab. The default separator is a tab.
                                            To specify a comma separator or pipe separator, enclose the
                                            separator character in quotation marks: "," or "|".

-subprotocol  Oracle thin driver            Specifies the JDBC connect string subprotocol (for example,
                                            oracle:thin).

-username     Oracle DB user                Specifies the name of the database user.
              Required

-verbose      Print verbose output for      Displays a statistical summation of the operation when it is
              debugging.                    complete.

See “Troubleshooting preindexing errors for EDM” on page 598.


Remote EDM Indexer command options


After installation, the Remote EDM Indexer utility is available at
\Program Files\Symantec\Data Loss Prevention\Indexer\15.1\Protect\bin (Windows) and
/opt/Symantec/DataLossPrevention/Indexer/15.1/Protect/bin (Linux).

If you are on Linux, change users to the “SymantecDLP” user before running the Remote EDM
Indexer. (The installation program creates the “SymantecDLP” user.)
The Remote EDM Indexer provides a command line interface. The syntax for running the utility
is as follows:

RemoteEDMIndexer -profile=<file *.edm> -result=<out_dir> [options]

Note the following about the syntax:


■ The Remote EDM Indexer requires the -profile and -result arguments.
■ If you use a flat data source file as input, you must specify the file name and local path
using the -data option.
■ The -data option is omitted when you use the SQL Preindexer to pipe the data to the
Remote EDM Indexer.
See “Remote indexing examples using data source file (EDM)” on page 592.
Table 26-39 describes the command options for the Remote EDM Indexer.

Table 26-39 Remote EDM Indexer command options

Option        Summary                       Description

-data         Data source to be indexed     Specifies the data source to be indexed. If this option is
              (stdin)                       not specified, the utility reads data from stdin.
              Required if you use a         Required if using data source file and not the SQL
              tabular text file             Preindexer.

-encoding     Character encoding of data    Specifies the character encoding of the data to index. The
              to be indexed (ISO-8859-1)    default is ISO-8859-1.
                                            Use UTF-8 or UTF-16 if the data contains non-English
                                            characters.

-ignore_date  Ignore expiration date of the Overrides the expiration date of the Exact Data Profile if
              EDM profile                   the profile has expired. (By default, an Exact Data Profile
                                            expires after 30 days.)

-profile      File containing the EDM       Specifies the Exact Data Profile to be used. This profile is
              profile                       the one that is selected by clicking the “download link” on
              Required                      the Exact Data screen in the Enforce Server management
                                            console.

-result       Directory to place the        Specifies the directory where the index files are generated.
              resulting indexes
              Required

-verbose      Display verbose output        Displays a statistical summation of the indexing operation
                                            when the index is complete.
                                            See “Troubleshooting preindexing errors for EDM”
                                            on page 598.
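
When the Remote EDM Indexer is called from a script, the required and optional arguments can be assembled in one place and the console output checked for the success message. The following is a minimal sketch, not a supported wrapper: the function names are hypothetical, only the options that appear in the examples in this chapter (-profile, -result, -data, -verbose) are used, and the success check relies on the "Successfully created index" message being the last line of output.

```python
SUCCESS_MESSAGE = "Successfully created index"


def remote_indexer_args(profile, result_dir, data=None, verbose=False):
    """Build a RemoteEDMIndexer argument list; -profile and -result are required."""
    args = ["RemoteEDMIndexer", f"-profile={profile}", f"-result={result_dir}"]
    if data is not None:  # omit -data when input is piped from the SQL Preindexer
        args.append(f"-data={data}")
    if verbose:
        args.append("-verbose")
    return args


def indexing_succeeded(console_output: str) -> bool:
    """Check whether the last non-empty output line reports success."""
    lines = [line.strip() for line in console_output.splitlines() if line.strip()]
    return bool(lines) and SUCCESS_MESSAGE in lines[-1]
```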

Troubleshooting preindexing errors for EDM


If you receive an error that the SQL Preindexer was unable to perform query or failed to prepare
for indexing, verify that the -query string is in quotes. You can test your -query string by
running only the SQL Preindexer command. If the command is correct the data queried from
the database is displayed to the console as stdout.
You may encounter errors when you index large amounts of data. Often the set of data contains
a data record that is incomplete, inconsistent, or inaccurate. Data rows that contain more
columns than expected or incorrect column data types often cannot be properly indexed and
are unrecognized.
The SQL Preindexer can be configured to provide a summary of information about the indexing
operation when it completes. To do so, specify the -verbose option when running the SQL
Preindexer.
To see the rows of data that the Remote EDM Indexer did not index, adjust the configuration
in the Indexer.properties file using the following procedure.
To record those data rows that were not indexed
1 Locate the Indexer.properties file at
\Program Files\Symantec\Data Loss Prevention\Indexer\15.1\Protect\config\Indexer.properties
(Windows) or
/opt/Symantec/DataLossPrevention/Indexer/15.1/Protect/config/Indexer.properties
(Linux).
2 Open the file in a text editor.
3 Locate the create_error_file property and change the “false” setting to “true.”
4 Save and close the Indexer.properties file.
The Remote EDM Indexer logs errors in a file with the same name as the data file being
indexed and the .err suffix.
The rows of data that are listed in the error file are not encrypted. Safeguard the error file
to minimize any security risk from data exposure.
See “About the SQL Preindexer for EDM” on page 586.
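
If you maintain several indexer hosts, the property change in the procedure above can be scripted. The following is a minimal sketch that assumes the property is stored on its own line in the usual key = value (or key: value) properties form; check the actual file before relying on it. The function name enable_error_file is hypothetical.

```python
import re


def enable_error_file(properties_text: str) -> str:
    """Return the properties text with create_error_file changed from false to true."""
    pattern = re.compile(
        r"^([ \t]*create_error_file[ \t]*[=:][ \t]*)false[ \t]*$",
        flags=re.MULTILINE | re.IGNORECASE,
    )
    return pattern.sub(lambda m: m.group(1) + "true", properties_text)
```

The function only transforms text. Reading the file, writing it back, and keeping a backup copy are left to the calling script.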

Troubleshooting remote indexing errors for EDM


The Remote EDM Indexer displays a message that indicates whether the indexing operation
was successful or not. If the Remote EDM Indexer successfully creates the index, the console
displays the message "Successfully created index" as the last line of output. In addition, *.pdx
and *.rdx files are created in the -result directory.
The result depends on the error threshold that you specify in the EDM profile. If the
percentage of rows with errors stays under the threshold, the indexing operation completes
successfully. Detailed information about the indexing operation is available with the
-verbose option.
See “Remote EDM Indexer command options” on page 597.
If the index generation is not successful, try these troubleshooting tips:

Table 26-40 Remote Indexer troubleshooting tips for EDM

Error                       Symptom                       Description

Index files not generated   Use the -verbose option in    Specifying the -verbose option when running the Remote EDM
                            the command to reveal error   Indexer provides a statistical summary of information about
                            message.                      the indexing operation after it completes. This information
                                                          includes the number of errors and where the errors occurred.

"Failed to create index"    Verify file and path names.   Verify that you included the full path and proper file name
"Cannot compute index"                                    for the -data file and the -profile file (*.edm). The paths
"Unable to generate index"                                must be local to the host.

"Destination is not a       Directory path not correct.   Verify that you properly entered the full path to the
directory"                                                destination directory for the required -result argument.

*.idx file instead of       Did not use -data argument    The -data option is required if you are using a data source
*.rdx file                                                file and not the SQL Preindexer. In other words, the only
                                                          time you do not use the -data argument is when you are using
                                                          the SQL Preindexer.

                                                          If you run the Remote EDM Indexer without the -data option
                                                          and no SQL Preindexer query, you get an *.idx and *.rdx file
                                                          that cannot be used for the EDM index. Rerun the index using
                                                          the -data option or a SQL Preindexer -query or -query_path.

In addition, you may encounter errors when you index large amounts of data. Often the set of
data contains a data record that is incomplete, inconsistent, or incorrectly formatted. Data rows
that contain more columns than expected or incorrect data types often cannot be properly
indexed and are unrecognized during indexing. The rows of data with errors cannot be indexed
until those errors are corrected and the Remote EDM Indexer is rerun. Symantec provides a
couple of ways to get information about any errors and the ultimate success of the indexing
operation.
To see the actual rows of data that the Remote EDM Indexer failed to index, modify the
Indexer.properties file.

To modify the Indexer.properties file and view remote indexing errors


1 Locate the Indexer.properties file at \Program Files\Symantec\Data Loss
Prevention\Indexer\15.1\Protect\config\Indexer.properties (Windows) or
/opt/Symantec/DataLossPrevention/Indexer/15.1/Protect/config/Indexer.properties
(Linux).
2 To edit the file, open it in a text editor.
3 Locate the create_error_file property parameter and change the “false” value to “true.”
4 Save and close the Indexer.properties file.
The Remote EDM Indexer logs errors in a file with the same name as the indexed data
file and with an .err extension. This error file is created in the logs directory.
The rows of data that are listed in the error file are not encrypted. Encrypt the error file to
minimize any security risk from data exposure.

Best practices for using EDM


EDM is the most accurate form of detection. It is also the most complex to set up and maintain.
To ensure that your EDM policies are as accurate as possible, consider the recommendations
in this section when you are implementing your EDM profiles and policies.
The following table provides a summary of the EDM policy considerations discussed in this
chapter, with links to individual topics for more details.

Table 26-41 Summary of EDM best practices

Best practice                                                 Description

Ensure that the data source file contains at least one        See “Ensure data source has at least one column of unique
column of unique data.                                        data (EDM)” on page 602.

Eliminate duplicate rows and blank columns before             See “Cleanse the data source file of blank columns and
indexing.                                                     duplicate rows (EDM)” on page 603.

To reduce false positives, avoid single characters,           See “Remove ambiguous character types from the data
quotes, abbreviations, numeric fields with fewer than 5       source file (EDM)” on page 604.
digits, and dates.

Understand multi-token indexing and clean up as               See “Understand how multi-token cell matching functions
necessary.                                                    (EDM)” on page 604.

Use the pipe (|) character to delimit columns in your data    See “Do not use the comma delimiter if the data source
source.                                                       has number fields (EDM)” on page 605.

Review an example cleansed data source file.                  See “Ensure that the data source is clean for indexing
                                                              (EDM)” on page 605.

Map data source column to system fields to leverage           See “Map data source column to system fields to leverage
validation during indexing.                                   validation (EDM)” on page 605.

Leverage EDM policy templates whenever possible.              See “Leverage EDM policy templates when possible”
                                                              on page 606.

Include the column headers as the first row of the data       See “Include column headers as the first row of the data
source file.                                                  source file (EDM)” on page 606.

Check the system alerts to tune Exact Data Profiles.          See “Check the system alerts to tune profile accuracy
                                                              (EDM)” on page 607.

Use stopwords to exclude common words from matching.          See “Use stopwords to exclude common words from
                                                              detection (EDM)” on page 607.

Automate profile updates with scheduled indexing.             See “Use scheduled indexing to automate profile updates
                                                              (EDM)” on page 607.

Match on two or three columns in an EDM rule.                 See “Match on 3 columns in an EDM condition to increase
                                                              detection accuracy” on page 608.

Leverage exception tuples to avoid false positives.           See “Leverage exception tuples to avoid false positives
                                                              (EDM)” on page 609.

Use a WHERE clause to detect records that meet specific       See “Use a WHERE clause to detect records that meet
criteria.                                                     specific criteria (EDM)” on page 609.

Use the minimum matches field to fine tune EDM rules.         See “Use the minimum matches field to fine tune EDM
                                                              rules” on page 610.

Consider using Data Identifiers in combination with EDM       See “Combine Data Identifiers with EDM rules to limit the
rules.                                                        impact of two-tier detection” on page 610.

Include an email address field in the Exact Data Profile      See “Include an email address field in the Exact Data
for profiled DGM.                                             Profile for profiled DGM (EDM)” on page 610.

Use profiled DGM for Network Prevent for Web identity         See “Use profiled DGM for Network Prevent for Web
detection.                                                    identity detection (EDM)” on page 611.

Ensure data source has at least one column of unique data (EDM)
EDM is designed to detect combinations of data fields that are globally unique. Your EDM
index must include at least one column of data that contains a unique value for each
record (row). Column data such as account number, social security number, and credit
card number are inherently unique, whereas state or zip code are not unique, nor are names.
If you do not include at least one column of unique data in your index, your EDM profile will
not accurately detect the data you want to protect.
A unique column field is a column that has mostly unique values. It can have duplicate values,
but not more than the number set in term_commonority_threshold. The default value for this
setting is 10.
Table 26-42 describes the various types of unique data to include in your EDM indexes, as
well as fields that are not unique. You can include the non-unique fields in your EDM indexes
as long as you have at least one column field that is unique.

Table 26-42 Examples of unique data for EDM policies

Unique data for EDM                             Non-unique data

The following data fields are usually unique:   The following data fields are not unique:
■ Account number                                ■ First name
■ Bank Card number                              ■ Last name
■ Phone number                                  ■ City
■ Email address                                 ■ State
■ Social security number                        ■ Zip code
■ Tax ID number                                 ■ Password
■ Drivers license number                        ■ PIN number
■ Employee number
■ Insurance number
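
To get a rough idea of whether a column qualifies as unique before you index it, you can count how often each value repeats. The following is a minimal sketch of one reading of the rule above, in which no single value may occur more often than the threshold (10 by default, per the term_commonority_threshold setting). It is not the product's own algorithm, and the function name is hypothetical.

```python
from collections import Counter


def is_unique_column(values, threshold=10):
    """True if no non-empty value in the column occurs more than `threshold` times."""
    counts = Counter(v.strip() for v in values if v.strip())
    return bool(counts) and max(counts.values()) <= threshold
```

For example, a state column in which "CA" appears on hundreds of rows fails this check, while an account number column with a handful of repeated values passes.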

Cleanse the data source file of blank columns and duplicate rows
(EDM)
The data source file should be as clean as possible before you create the EDM index, otherwise
the resulting profile may create false positives.
When you create the data source file, avoid including empty cells or blank columns. Blank
columns or fields count as “errors” when you generate the EDM profile. A data source error is
either an empty cell or a cell with the wrong type of data (a name appearing in a phone number
column). The error threshold is the maximum percentage of rows that contain errors before
indexing stops. If the errors exceed the error threshold percentage for the profile (by default,
5%), the system stops indexing and displays an indexing error message.
The best practice is to remove blank columns and empty cells from the data source file, rather
than increasing the error threshold. Keep in mind that if you have many empty cells, it may
require a 100% error threshold for the system to create the profile. If you specify 100% as the
error threshold, the system indexes the data source without checking for errors.
In addition, do not fill empty cells or blank fields with bogus data so that the error threshold is
met. Adding fictitious or "null" data to the data source file will reduce the accuracy of the EDM
profile and is strongly discouraged. Content you want to monitor should be legitimate and not
null.
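
Before indexing, you can estimate how close a data source file is to the error threshold. The following is a minimal sketch that treats a row as an error when it has an empty cell or an unexpected number of columns. It does not check data types, so the product may count more errors than this estimate. The function name and the sample data are hypothetical.

```python
import csv
import io


def error_percentage(text, delimiter="|", expected_columns=3):
    """Percentage of rows with an empty cell or an unexpected column count."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    if not rows:
        return 0.0
    bad = sum(1 for r in rows
              if len(r) != expected_columns or any(not c.strip() for c in r))
    return 100.0 * bad / len(rows)


sample = "Joe|Brown|123456789\nAnn||987654321\nBob|Lee|555667777\n"
over_default_threshold = error_percentage(sample) > 5.0  # 5% is the default threshold
```

In the sample, one of three rows has an empty cell, so the file would exceed the default threshold and should be cleansed rather than indexed with a higher threshold.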
See “About cleansing the exact data source file for EDM” on page 530.
See “Preparing the exact data source file for indexing for EDM” on page 537.
See “Ensure that the data source is clean for indexing (EDM)” on page 605.

Remove ambiguous character types from the data source file (EDM)
The data source file should not contain extraneous spaces, punctuation, or inconsistently
populated fields. You can use tools such as Stream Editor (sed) and AWK to remove these
items from your data source file or files before indexing them.

Table 26-43 Characters to avoid in the data source file

Characters to avoid   Explanation

Single characters     Single character fields should be eliminated from the data source file. These are
                      more likely to cause false positives, since a single character is going to appear
                      frequently in normal communications.

Abbreviations         Abbreviated fields should be eliminated from the data source file for the same reason
                      as single characters.

Quotes                Text fields should not be enclosed in quotes.

Small numbers         Indexing numeric fields that contain fewer than 5 digits is not recommended because
                      it will likely yield many false positives.

Dates                 Date fields are also not recommended. Dates are treated like a string, so if you are
                      indexing a date, such as 12/6/2007, the string will have to match exactly. The indexer
                      will only match 12/6/2007, and not any other date formats, such as Dec 6, 2007,
                      12-6-2007, or 6 Dec 2007. It must be an exact match.
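
As an alternative to sed or AWK, a short script can flag cells that fall into the categories in Table 26-43. The following is a rough sketch: the function name is hypothetical, the date pattern covers only numeric forms such as 12/6/2007 and 12-6-2007, and abbreviations are not detected because they cannot be recognized reliably by pattern.

```python
import re


def cell_warnings(cell: str) -> list[str]:
    """Return the Table 26-43 categories that a cell value appears to fall into."""
    value = cell.strip()
    warnings = []
    if len(value) == 1:
        warnings.append("single character")
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        warnings.append("quoted text")
    if value.isdigit() and len(value) < 5:
        warnings.append("small number")
    if re.fullmatch(r"\d{1,2}[/-]\d{1,2}[/-]\d{2,4}", value):
        warnings.append("date")
    return warnings
```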

Understand how multi-token cell matching functions (EDM)


An EDM rule performs a full-text search against the message, checking each word (except
those that are excluded by way of the columns you choose to match in the policy) for potential
matches. The matching algorithm compares each individual word in the message with the
contents of each token in the data profile.
If a cell in the data profile contains multiple words separated by spaces, punctuation, or
alternative Latin and Chinese, Japanese, and Korean (CJK) language characters, the cell is
a multi-token cell. The sub-token parts of a multi-token cell obey the same rules as single-token
cells: they are normalized according to their pattern where normalization can apply.
If a cell contains a multi-token, the multi-token must match exactly. For example, a column
field with the value "Joe Brown" is a multi-token cell (assuming multi-token matching is enabled).
At run-time the processor looks to match the exact string "Joe Brown", including the space
(multiple spaces are normalized to one). The system does not match on "Joe" and "Brown" if
they are detected as single tokens.
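
The following sketch illustrates the behavior described above in simplified form. It is not the product's matching algorithm: it only normalizes runs of whitespace to a single space and then requires the whole cell value to appear in the message as one phrase. The function names are hypothetical.

```python
import re


def normalize_spaces(text: str) -> str:
    """Collapse runs of whitespace to one space and trim the ends."""
    return re.sub(r"\s+", " ", text.strip())


def multi_token_match(cell: str, message: str) -> bool:
    """True only if the whole cell value appears in the message as one phrase."""
    phrase = re.escape(normalize_spaces(cell))
    pattern = rf"(?<!\w){phrase}(?!\w)"
    return re.search(pattern, normalize_spaces(message)) is not None
```

With this sketch, "Joe   Brown" in a message matches the cell "Joe Brown", but a message that mentions "Joe" and "Brown" separately does not.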
In addition, multi-token cells are more computationally expensive than single-token cells. If
the index includes multi-token cells, you must verify that you have enough memory to index,
load, and process the EDM profile.
If multi-token matching is enabled, any punctuation that is next to a space is ignored. Therefore,
punctuation before and after a space is ignored.
Lastly, do not change the WIP setting from "true" to "false" unless you are sure that is the
result you want to achieve. You should only set WIP = false when you need to loosen the
matching criteria, such as account numbers where formatting may change across messages.
Make sure you test detection results to ensure you are getting the matches you expect.
See “Memory requirements for EDM” on page 579.

Do not use the comma delimiter if the data source has number fields
(EDM)
Of the column delimiters that you can choose for separating the fields in
the data source file (pipe, tab, semicolon, or comma), the pipe, semicolon, or tab (default) is
recommended. The comma delimiter is ambiguous and should not be used, especially if one
or more fields in your data source contain numbers. If you use a comma-delimited data source
file, make sure there are no commas in the data set other than those used as column delimiters.

Note: Although the system also treats the pound sign, equals sign, plus sign, semicolon, and
colon characters as separators, you should not use these because, like the comma, their
meaning is ambiguous.
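The following short Python sketch shows how a numeric field that contains a comma becomes ambiguous in a comma-delimited file. The sample record is made up for illustration.

```python
# One record with three fields; the second field is a number with a comma in it.
record = ["Joe Brown", "1,250.00", "Springfield"]

def parse(line: str, delimiter: str) -> list:
    """Split a data source line into fields on the given delimiter."""
    return line.split(delimiter)

comma_line = ",".join(record)  # Joe Brown,1,250.00,Springfield
pipe_line = "|".join(record)   # Joe Brown|1,250.00|Springfield

print(len(parse(comma_line, ",")))  # 4 fields: the number was split in two
print(len(parse(pipe_line, "|")))   # 3 fields: the record is intact
```

With the pipe delimiter the record round-trips unchanged, which is why a delimiter that cannot occur inside the data is the safer choice.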

Map data source column to system fields to leverage validation (EDM)


When you create the Exact Data Profile, you can validate how well the fields in your data
source match against system-defined patterns for that field. For example, if you map a field
to the credit card system pattern, the system will validate that the data matches the credit card
system pattern. If it does not, the system will create an error for every record that contains an
invalid credit card number. Mapping data source fields in your index to system-defined field
patterns helps you ensure that the fields in your index meet the data type criteria.
If there is no corresponding system field to map to a data source column, consider creating a
custom field to map data source column data. You can use the description field to annotate
both system and custom fields.
See “Mapping Exact Data Profile fields for EDM” on page 545.
See “Creating and modifying Exact Data Profiles for EDM” on page 541.
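As an illustration of the kind of check that field validation performs, the following Python sketch flags records whose credit card column fails the standard Luhn checksum. This guide does not describe how the system-defined patterns are implemented, so the Luhn check, the function names, and the 13-digit minimum length are assumptions.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:  # assumed minimum length for a card number
        return False
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

def invalid_records(rows: list, column: int) -> list:
    """Return the 1-based row numbers whose card column fails validation."""
    return [n for n, row in enumerate(rows, start=1)
            if not luhn_valid(row[column])]

rows = [
    ["Joe Brown", "4111 1111 1111 1111"],  # widely used test number
    ["Ann Smith", "4111 1111 1111 1112"],  # last digit altered
]
print(invalid_records(rows, column=1))
```

Each row number returned corresponds to one of the per-record errors that the paragraph above describes.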

Ensure that the data source is clean for indexing (EDM)


The following list summarizes a cleansed data source that is ready for indexing:
■ It contains at least one unique column field.
■ It is not a single-column data source; it has two or more columns.
■ Empty cells and rows and blank columns are removed.
■ Incomplete and duplicate records are removed.
■ The number of faulty cells is below the default error rate (5%) for indexing.
■ Bogus data is not used to fill in blank cells or rows.
■ Improper and ambiguous characters are removed.
■ Multi-tokens comply with space and memory requirements.
■ Column fields are validated against the system-defined patterns that are available.
■ Mappings are validated against policy templates where applicable.
See “Ensure data source has at least one column of unique data (EDM)” on page 602.
See “Cleanse the data source file of blank columns and duplicate rows (EDM)” on page 603.
See “Remove ambiguous character types from the data source file (EDM)” on page 604.
See “Understand how multi-token cell matching functions (EDM)” on page 604.
See “Map data source column to system fields to leverage validation (EDM)” on page 605.
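Several items on the checklist above can be verified with a short pre-indexing script. The following Python sketch is one possible approach under stated assumptions: the function names are hypothetical, and it counts faulty rows rather than faulty cells, so treat its result as a rough indicator only.

```python
def cleanse(rows: list, expected_columns: int):
    """Drop blank and duplicate rows; count incomplete rows as faulty."""
    seen = set()
    clean, faulty = [], 0
    for row in rows:
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue  # completely empty row: remove it
        if len(cells) != expected_columns or not all(cells):
            faulty += 1  # incomplete record: remove it and count it
            continue
        key = tuple(cells)
        if key in seen:
            continue  # duplicate record: remove it
        seen.add(key)
        clean.append(cells)
    return clean, faulty

def within_error_rate(faulty: int, total: int, max_rate: float = 0.05) -> bool:
    """Compare the faulty share against the default 5% error rate."""
    return total > 0 and faulty / total < max_rate

rows = [["a", "1"], ["a", "1"], ["", ""], ["b", ""], ["c", "3"]]
clean, faulty = cleanse(rows, expected_columns=2)
print(clean, faulty)
```

The script does not fill blank cells with placeholder data; it removes the incomplete record instead, in keeping with the checklist.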

Leverage EDM policy templates when possible


Symantec Data Loss Prevention provides several policy templates that implement EDM rules.
The general recommendation is to use policy templates whenever possible when implementing
EDM. If you do use a policy template for EDM, you should validate the index against the
template when you configure the Exact Data Profile.
See “Creating and modifying Exact Data Profiles for EDM” on page 541.

Include column headers as the first row of the data source file (EDM)
When you extract the source data to the data source file, you should include the column
headers as the first row in the data source file. Including the column headers will make it easier
for you to identify the data you want to use in your policies.
The column names reflect the column mappings that were created when the exact data profile
was added. If there is an unmapped column, it is called Col X, where X is the column number
(starting with 1) in the original data profile.
If the Exact Data Profile is to be used for DGM, the file must have a column with a heading of
email, or the DGM will not appear in the Directory EDM drop-down list (at the remediation
page).

Check the system alerts to tune profile accuracy (EDM)


You should always review the system alerts after creating the Exact Data Profile. The system
alerts provide very specific information about problems encountered when creating the profile,
such as an SSN in an address field, which will affect accuracy.

Use stopwords to exclude common words from detection (EDM)


During indexing, words found in stopword files are ignored. Stopwords are common words
that are excluded from matching. For example, the stopwords file contains common words
such as articles and prepositions. You can adjust the stopwords file by adding words to or
removing words from the file. It is recommended that you back up the original before changing
it.
Stopword files are located in the following directory where the detection server running the
index is installed:
\Program Data\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\stopwords
By default, the system uses the stopwords_en.txt file, which is the English language version.
Other language stopword files are also located in this same directory. You can change the
default stopword language file by updating the stopword_languages = en property in the
following file on the Enforce Server:
C:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\Indexer.properties
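The effect of a stopword file can be pictured with the following Python sketch. The word list is a small made-up sample, not the contents of stopwords_en.txt, and the one-word-per-line file layout is an assumption.

```python
SAMPLE_STOPWORDS = {"a", "an", "the", "of", "in", "to", "and"}  # hypothetical sample

def parse_stopword_file(contents: str) -> set:
    """Read stopwords from file contents, assuming one word per line."""
    return {line.strip().lower() for line in contents.splitlines() if line.strip()}

def index_tokens(text: str, stopwords: set = SAMPLE_STOPWORDS) -> list:
    """Return the words that remain once the stopwords are ignored."""
    return [word for word in text.lower().split() if word not in stopwords]

print(index_tokens("The Bank of Springfield"))
```

Adding a word to the stopword set removes it from consideration, and removing a word makes it eligible again, which mirrors the effect of editing the file.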

Use scheduled indexing to automate profile updates (EDM)


When you configure an Exact Data Profile, you can set a schedule for indexing the data
source file. Index scheduling lets you decide when you want to index the data source file. For
example, instead of indexing the data source at the same time that you define the profile, you
can schedule it for a later date. Alternatively, if you need to reindex the data source on a regular
basis, you can schedule indexing to occur on a regular basis.
Before you set up an index schedule, consider the following:
■ If you update your data sources occasionally (for example, less than once a month),
generally there is no need to create a schedule. Index the data each time you update the
data source.
■ Schedule indexing for times of minimal system use. Indexing affects performance throughout
the Symantec Data Loss Prevention system, and large data sources can take time to index.
■ Index a data source as soon as you add or modify the corresponding exact data profile,
and re-index the data source whenever you update it. For example, consider a scenario
whereby every Wednesday at 2:00 P.M. you generate an updated data source file. In this
case you could schedule indexing every Wednesday at 3:00 P.M., giving you enough time
to cleanse the data source file and copy it to the Enforce Server.
■ Do not index data sources daily as this can degrade performance.
■ Monitor results and modify your indexing schedule accordingly. If performance is good and
you want more timely updates, for example, schedule more frequent data updates and
indexing.
Consider using scheduled indexing with remote EDM indexing to keep an EDM profile up to
date. For example, you can schedule a cron job on the remote machine to run the Remote
EDM Indexer on a regular basis. The job can also copy the generated index files to the index
directory on the Enforce Server. You can then configure the Enforce Server to load the externally
generated index and submit it for indexing on a scheduled basis.
See “About index scheduling for EDM” on page 531.
See “Scheduling Exact Data Profile indexing for EDM” on page 548.
See “Copying and loading remote EDM index files to the Enforce Server” on page 594.

Match on 3 columns in an EDM condition to increase detection accuracy
In a structured data format such as a database, each row represents one record, with each
record containing related values for each column data field. Thus, for an EDM policy rule
condition to match, all the data must come from the same row or record of data. When you
define an EDM rule, you must select the fields that must be present for a match. Although
you can select any number of columns to match in a row (up to the total
number of columns in the index, which is a maximum of 32), it is recommended that you match
on at least 2 or 3 columns, one of which must be unique. Generally, matching on 3 fields is
preferred, but if one of the columns contains a unique value such as an SSN or credit card
number, 2 columns may be used.
Consider the following example. You want to create an EDM policy condition based on an
Exact Data Profile that contains the following 5 columns of indexed data:
■ First Name
■ Last Name
■ Social security number (SSN)
■ Phone Number
■ Email Address
If you select all 5 columns to be included in the policy, consider the possible results based on
the number of fields you require for each match.
If you choose "1 of the selected fields" to match, the policy will undoubtedly generate a large
number of false positives because the record will not be unique enough. (Even if the condition
only matches the SSN field, there may still be false positives because there are other types
of nine-digit numbers that may trigger a match.)
If you choose "2 of the selected fields" to match, the policy will still produce false positives
because there are potentially worthless combinations of data: First Name + Last Name, Phone
Number + Email Address, or First Name + Phone Number.
If you choose to match on 4 or all 5 of the column fields, you will not be able to exclude certain
data field combinations because that option is only available for matches on 2 or 3 fields.
See “Leverage exception tuples to avoid false positives (EDM)” on page 609.
In this example, to ensure that you generate the most accurate match, the recommendation
is that you choose "3 of the selected fields" to match. In this way you can reduce the number
of false positives while using one or more exceptions to exclude the combinations that do not
present a concern, such as First Name + Last Name + Phone Number.
Whatever number of fields you choose to match, ensure that you include the column
with the most unique data, and that you match on at least 2 column fields.
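The "N of the selected fields" idea can be sketched in Python as follows. This is a simplified model, not the detection algorithm: the field names and sample record are made up, and each field is treated as a single token.

```python
from itertools import combinations

FIELDS = ["first_name", "last_name", "ssn", "phone", "email"]

def matched_fields(record: dict, message_tokens: set) -> set:
    """Return the fields of ONE record whose values appear in the message."""
    return {field for field in FIELDS if record[field] in message_tokens}

def row_matches(record: dict, message_tokens: set, required: int) -> bool:
    """True when at least `required` fields of the same row are present."""
    return len(matched_fields(record, message_tokens)) >= required

record = {"first_name": "joe", "last_name": "brown", "ssn": "123-45-6789",
          "phone": "555-0100", "email": "joe@example.com"}

# Number of distinct field combinations for each setting.
print(len(list(combinations(FIELDS, 2))))  # 10 two-field combinations
print(len(list(combinations(FIELDS, 3))))  # 10 three-field combinations

print(row_matches(record, {"joe", "brown"}, required=3))
print(row_matches(record, {"joe", "brown", "123-45-6789"}, required=3))
```

Because all matched values must come from the same record, a message that contains one person's name and another person's SSN does not satisfy the condition in this model.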

Leverage exception tuples to avoid false positives (EDM)


The EDM policy condition lets you define exception tuples to exclude combinations of data.
You must select 2 or 3 columns to match to leverage exception tuples.
EDM allows detection based on any combination of columns in a given row of data (that is, N
of M fields from a given record). It can trigger on "tuples," or specified sets of data types. For
example, a combination of the first name and SSN fields could be acceptable, but a combination
of the last name and SSN fields would not. EDM also allows more complex rules such as
looking for N of M fields, but excluding specified tuples. For example, this type of rule definition
is required to identify incidents in violation of state data privacy laws, such as California SB
1386, which requires a first name and last name in combination with any of the following: SSN,
bank account number, credit card number, or driver's license number.
While exception tuples can help you reduce false positives, if you are using several exception
tuples, it may be a sign your index is flawed. In this case, consider redoing your index so you
do not have to use so many excluded combinations to achieve the desired matches.
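The following Python sketch shows one way to think about N-of-M matching with excluded combinations. The semantics are an assumption made for illustration: a match is suppressed only when the set of matched fields is exactly one of the excluded tuples. The field names are hypothetical.

```python
def is_incident(matched: set, required: int, exception_tuples: set) -> bool:
    """Apply an N-of-M rule, then suppress excluded field combinations."""
    matched = frozenset(matched)
    if len(matched) < required:
        return False  # too few fields from the record were found
    return matched not in exception_tuples

# Exclude the combination that does not present a concern.
EXCEPTIONS = {frozenset({"first_name", "last_name", "phone"})}

print(is_incident({"first_name", "last_name", "phone"}, 3, EXCEPTIONS))
print(is_incident({"first_name", "last_name", "ssn"}, 3, EXCEPTIONS))
```

If the exception set grows long, that is the signal described above that the index itself may need to be redesigned.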

Use a WHERE clause to detect records that meet specific criteria (EDM)
Another configuration parameter of the EDM policy condition is the "Where" clause option.
This option matches on the exact value you specify for the field you select. You can enter
multiple values by separating each with commas. Using a WHERE clause to detect records
that meet specific criteria helps you improve the accuracy of your EDM policies.
For example, if you wanted to match only on an Exact Data Profile for "Employees" with a
"State" field containing certain states, you could configure the match where "State" equals
"CA,NV". This rule then causes the detection engine to match a message that contains either
CA or NV as content.
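Conceptually, the "Where" option narrows the set of records that are eligible to match. The following Python sketch models that idea with a made-up "Employees" data set; the function names are hypothetical and the sketch is not the product's implementation.

```python
def parse_where_values(raw: str) -> set:
    """Split a comma-separated value list such as 'CA,NV'."""
    return {value.strip() for value in raw.split(",") if value.strip()}

def where_filter(records: list, field: str, raw_values: str) -> list:
    """Keep only the records whose field equals one of the listed values."""
    allowed = parse_where_values(raw_values)
    return [record for record in records if record[field] in allowed]

employees = [
    {"name": "Joe Brown", "State": "CA"},
    {"name": "Ann Smith", "State": "NY"},
    {"name": "Raj Patel", "State": "NV"},
]
print([r["name"] for r in where_filter(employees, "State", "CA,NV")])
```

The comparison is an exact match on each listed value, so "CA" does not match "Ca" or "California" in this sketch.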

Use the minimum matches field to fine tune EDM rules


The minimum matches field is useful for fine-tuning the sensitivity of an EDM rule. For example,
one employee's first and last name in an outgoing email may be acceptable. However, 100
employees' first and last names is a serious breach. Another example might be a last name
and social security number policy. The policy might allow an employee to send information to
a doctor, but the sending of two last names and social security numbers is suspicious.
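The threshold behavior can be sketched as follows. The sketch assumes that the minimum matches value counts distinct matched records in a message; the function name is hypothetical.

```python
def rule_triggers(matched_record_ids, minimum_matches: int) -> bool:
    """True when enough distinct records were matched in one message."""
    return len(set(matched_record_ids)) >= minimum_matches

print(rule_triggers([17], minimum_matches=100))        # one employee's name
print(rule_triggers(range(100), minimum_matches=100))  # 100 employees' names
print(rule_triggers([17, 42], minimum_matches=2))      # two name + SSN records
```

A higher threshold makes the rule less sensitive, and a threshold of 2 is enough to separate a single personal record from a list of records.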

Combine Data Identifiers with EDM rules to limit the impact of two-tier
detection
When implementing EDM policies, it is recommended that you combine Data Identifier (DI)
rules with the EDM condition to form compound policies. For reference, note that all
system-provided policy templates that implement EDM rules also implement Data Identifier
rules in the same policy.
Data Identifiers and EDM are both designed to protect personally identifiable information (PII).
Including Data Identifiers with your EDM rules makes your policies more robust and reusable
across detection servers because, unlike EDM rules, Data Identifiers are executed on the
endpoint and do not require two-tier detection. Thus, if an endpoint is off the network, the Data
Identifier rules can still protect PII such as SSNs.
Data Identifier rules are also useful to use in your EDM policies while you are gathering and
preparing your confidential data for EDM indexing. For example, a policy might contain the
US SSN Data Identifier and an EDM rule for as yet unindexed or unknown SSNs.
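The following Python sketch illustrates why a compound policy helps while the index is still incomplete. The regular expression is only a rough stand-in for a pattern-based rule and is not the US SSN Data Identifier; the indexed value and function name are made up.

```python
import re

# Rough stand-in for a pattern-based rule (not the product's Data Identifier).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical set of SSNs that have already been indexed for EDM.
INDEXED_SSNS = {"123-45-6789"}

def evaluate(message: str) -> dict:
    """Report which rule of the compound policy would fire for a message."""
    found = SSN_PATTERN.findall(message)
    return {
        "pattern_rule": bool(found),
        "edm_rule": any(ssn in INDEXED_SSNS for ssn in found),
    }

print(evaluate("Employee SSN: 123-45-6789"))
print(evaluate("New hire SSN: 987-65-4321"))
```

The second message contains an SSN that is not yet in the index, so only the pattern-based rule catches it.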

Include an email address field in the Exact Data Profile for profiled
DGM (EDM)
You must include the appropriate fields in the Exact Data Profile to implement profiled DGM.
See “Creating the exact data source file for profiled DGM for EDM” on page 537.
If you include the email address field in the Exact Data Profile for profiled DGM and map it to
the email data validator, email address will appear in the Directory EDM drop-down list (at
the remediation page).

Use profiled DGM for Network Prevent for Web identity detection
(EDM)
If you want to implement DGM for Network Prevent for Web, use one of the profiled DGM
conditions to implement identity matching. For example, you may want to use identity matching
to block all web traffic for specific users. For Network Prevent for Web, you cannot use
synchronized DGM conditions for this use case.
See “Creating the exact data source file for profiled DGM for EDM” on page 537.
See “Configuring the Sender/User based on a Profiled Directory condition” on page 944.
Chapter 27
Detecting content using Indexed Document Matching (IDM)
This chapter includes the following topics:

■ Introducing Indexed Document Matching (IDM)

■ Configuring IDM profiles and policy conditions

■ Best practices for using IDM

■ Remote IDM indexing

Introducing Indexed Document Matching (IDM)


You use Indexed Document Matching (IDM) to protect confidential information that is stored
as unstructured data in documents and files. For example, you can use IDM to detect financial
report data stored in Microsoft Office documents, merger and acquisition information stored
in PDF files, and source code stored in text files. You can also use IDM to detect binary files,
such as JPEG images, CAD designs, and multimedia files. In addition, you can use IDM to
detect derived content such as text that has been copied from a source document to another
file.
See “Supported forms of matching for IDM” on page 613.
See “About the Indexed Document Profile” on page 615.

About using IDM


To use IDM you collect the documents and files that you want to protect and index the files
and documents using the Enforce Server. During the indexing process the system uses an
algorithm to fingerprint each file or file contents. You then create a policy that contains one or
more IDM conditions that reference the index. The system then checks files against the index
for matches.
For example, consider a document source you have collected that includes several confidential
Microsoft Office documents (Word, Excel, PowerPoint) and image files (JPEG, BMP). You
create an Indexed Document Profile and index the documents and files. You then configure
the Content Matches Document Signature policy condition with a Minimum Document
Exposure setting of 50%. The IDM policy and index are deployed to a detection server.
In production the detection server checks inbound files against the index for matches. If an
inbound text-based file that the system can extract the contents from contains 50% or more
of content indexed from one of the source documents, the system records a match. And, if an
inbound image file has the same binary signature as one of the files that has been indexed,
the system records a match. The server and agent perform exact file matching automatically
on binary (non-extractable) files even though the policy condition is configured for partial
matching.
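The Minimum Document Exposure setting can be pictured with the following Python sketch. It is a conceptual model only: the real passage selection and hashing method is not documented here, so the fixed five-word window, the MD5 hash, and the function names are all assumptions.

```python
import hashlib

def passage_hashes(text: str, size: int = 5) -> set:
    """Hash every run of `size` consecutive words (an assumed passage model)."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) <= size:
        windows = [words]
    else:
        windows = [words[i:i + size] for i in range(len(words) - size + 1)]
    return {hashlib.md5(" ".join(w).encode("utf-8")).hexdigest() for w in windows}

def exposure(source_text: str, inbound_text: str, size: int = 5) -> float:
    """Fraction of the source document's passages found in the inbound file."""
    source = passage_hashes(source_text, size)
    if not source:
        return 0.0
    return len(source & passage_hashes(inbound_text, size)) / len(source)

def is_match(source_text: str, inbound_text: str, minimum_exposure: float = 0.5) -> bool:
    """Record a match when the exposure meets the configured minimum."""
    return exposure(source_text, inbound_text) >= minimum_exposure

source = " ".join(f"w{i}" for i in range(10))  # 10-word source document
partial = " ".join(f"w{i}" for i in range(8))  # first 8 words copied
print(exposure(source, partial), is_match(source, partial))
```

In this model four of the six source passages appear in the copied text, so the exposure is about 67% and the 50% threshold is met.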

Note: The Mac Agent is substantially the same as the Windows Agent, except that the Mac
Agent does not support two-tier detection, and different channels are supported on the Mac
Agent and Windows Agent. See “Overview of Mac agent detection technologies and policy
authoring features” on page 2280.

See “Types of IDM detection” on page 614.


See “About the Indexed Document Profile” on page 615.

Supported forms of matching for IDM


IDM supports three forms of matching: exact file, exact file contents, and partial file contents.
Detection servers support all three forms of matching. The DLP Agent supports exact file and
partial file contents matching locally on the endpoint.
Table 27-1 summarizes the forms of matching by the platforms that IDM supports.

Table 27-1 Forms of matching for IDM

Type of matching | Description | Platform

Partial file contents | Match of discrete passages of extracted and normalized file contents. See “Using IDM to detect exact and partial file contents” on page 621. | Detection server, DLP Agent

Exact file | Match is based on the binary signature of the file. See “Using IDM to detect exact files” on page 620. | Detection server, DLP Agent

Exact file contents | Match is an exact match of the extracted and normalized file contents. See “Using IDM to detect exact and partial file contents” on page 621. | Detection server. Note: Symantec recommends that you use partial file contents matching rather than exact file contents matching.

Types of IDM detection


There are three types of IDM detection implementations: agent, server, and two-tier. The type
you choose is based on your data loss prevention requirements.
Table 27-2 summarizes the three types of IDM detection.

Table 27-2 Types of IDM detection

Type | Description | Details

Agent IDM | The DLP Agent supports partial contents matching in addition to exact file matching locally on the endpoint. | See “Agent IDM detection” on page 614.

Server IDM | The detection server performs exact file matching, exact file contents matching, and partial file contents matching. | See “Server IDM detection” on page 615.

Two-tier IDM | The DLP Agent sends the data to the detection server for policy evaluation. | See “Two-tier IDM detection” on page 615.

Agent IDM detection


With Agent IDM detection the DLP Agent evaluates documents locally in real time for partial
file contents and exact file matches. Agent IDM lets you use the block, notify, and user cancel
response rules on the endpoint with IDM policies. Symantec Data Loss Prevention also supports
detection on stream-based channels such as Printing or Copying/Pasting from the Clipboard.
See “Supported forms of matching for IDM” on page 613.
Agent IDM is enabled by default for a newly installed Endpoint Server. Agent IDM for macOS
is enabled by default for newly installed Endpoint Servers, but disabled if you upgrade. In the
case of all upgrades, if you want to use agent IDM you must enable it and reindex your IDM
profiles so that the endpoint index is generated and made available for download by DLP
Agents.

Server IDM detection


With server IDM detection, the IDM index is deployed to one or more detection servers and
all detection processing occurs on the server or servers. You can use server IDM to perform
exact file matching and file contents matching. For file contents matching, you can choose to
match file contents exactly or partially (10% to 90%) according to the Minimum Document
Exposure set for the IDM condition.
See “Supported forms of matching for IDM” on page 613.

Two-tier IDM detection


Two-tier is a method of detection that requires communication and data transfer between the
DLP Agent and the Endpoint Server to detect incidents. It is recommended only if you have
very large indexes and the agents do not have enough space to support the profiles. Two-tier
detection has more latency than local detection and requires substantially more network
bandwidth. As a result, it does not support inline response rules for blocking or pop-up
notifications.
With two-tier IDM the DLP Agent sends the data to the Endpoint Server for matching against
the server index. If two-tier detection is enabled for IDM, the server supports all forms of
matching, including exact file, exact file contents, and partial file contents.

Note: Two-tier detection is not supported on agents running on macOS endpoints.

If you use two-tier detection for IDM on the Windows endpoint, make sure that you understand
the performance implications of two-tier detection.
See “Two-tier detection for DLP Agents” on page 395.

About the Indexed Document Profile


The Indexed Document Profile is the user-defined configuration for creating and generating
IDM indexes. You define an Indexed Document Profile using the Enforce Server administration
console. You reference the profile in one or more IDM policy rules or exceptions. The profile
is reusable across policies: you can create one document profile and reference it in multiple
policies. When you create the Indexed Document Profile, you have the option of indexing
the document source immediately on save of the profile or at a scheduled time. However, you
must index the document source before you can detect policy violations.
See “Creating and modifying Indexed Document Profiles” on page 629.


For example, consider a scenario where you want to create an IDM index to detect when exact
versions of certain documents are found, or when passages or sections of the documents are
exposed. When you define the Indexed Document Profile, you can upload the documents
to the Enforce Server, or you can index the documents using the Remote IDM Indexer. You
can also use file name and file size filters in the document profile to include or ignore certain
files during indexing.
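The effect of file name and file size filters can be sketched in Python as follows. The wildcard syntax, parameter names, and size units shown here are assumptions for illustration and may differ from the filter fields in the Indexed Document Profile.

```python
from fnmatch import fnmatch

def should_index(name: str, size_bytes: int, include=("*",), exclude=(),
                 min_size: int = 0, max_size=None) -> bool:
    """Decide whether a file passes the name and size filters."""
    if not any(fnmatch(name, pattern) for pattern in include):
        return False  # name is not on the include list
    if any(fnmatch(name, pattern) for pattern in exclude):
        return False  # name is explicitly ignored
    if size_bytes < min_size:
        return False  # file is too small
    return max_size is None or size_bytes <= max_size

print(should_index("report.docx", 2048, include=("*.docx", "*.pdf")))
print(should_index("notes.tmp", 2048, exclude=("*.tmp",)))
print(should_index("huge.pdf", 50_000_000, max_size=10_000_000))
```

Filtering before indexing keeps temporary or oversized files out of the index, which also reduces the number of documents counted against the index limits.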

About the document data source


The document data source is the collection of documents you want to index and detect using
IDM. The indexing algorithm uses a fixed amount of memory per document, so it is bound by
the number of documents, rather than their total size. With a profile using 2 GB when loaded
in memory, approximately 1,000,000 documents can be indexed. The exact number of
documents the system permits depends on how many documents have text that can be
extracted.
See “Preparing the document data source for indexing” on page 625.
For smaller document sets (50 MB or less), you can upload the source files to the Enforce
Server using a ZIP file. For larger document sets (up to 2 GB), you can copy the source files
to the host file system where the Enforce Server is installed, either encapsulated within a single
ZIP file or as individual files. You can use FTP/S to transfer the files to the Enforce Server.
Alternatively, you can use the Remote IDM Indexer to remotely index documents.
See “About indexing remote documents” on page 617.
The document data source can contain any file type and any combination of files. If the system
can extract the contents of the file, IDM detects file contents, either exactly or partially depending
on the platform and the policy configuration. If the system cannot extract the contents of the
file, IDM detects the exact file.
See “Supported forms of matching for IDM” on page 613.

About the indexing process


The IDM indexer is a separate process that installs with and runs on the Enforce Server. Partial
matching is disabled by default on the Agent, and enabled by default on the Detection Server.
See “Configure endpoint partial content matching” on page 632.
You can index up to 1,000,000 documents on the server and up to 30,000 documents on the
agent. These values are based on the initial default index size limits of 2 GB (server) and
60 MB (agent). You can change the 60 MB limit on the Configure Partial Matching page. While it
is possible to reconfigure the 2 GB limit by changing the value of
com.vontu.profiles.documents.maxIndexSize in
\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\indexer.properties,
Symantec recommends that you contact Symantec Support before reconfiguring properties
files.
During indexing, the system stores the document source in
\Program Files\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\Protect\documentprofiles
(on Windows) or
/var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/documentprofiles
(on Linux).
The result of the indexing process is four separate indexes: one for detection servers (the
server index) and three for DLP Agents (the endpoint indexes). All indexes are generated
regardless of whether or not you are licensed for Endpoint Prevent or Endpoint Discover. On
the Enforce Server, the system stores the indexes in
\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\index (on Windows)
or /var/Symantec/DataLossPrevention/EnforceServer/15.5/index (on Linux).
See “About the server index files and the agent index files” on page 618.
For most IDM deployments there is no need to configure the indexer. If necessary you can
configure key settings for the indexer using the file
\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\Indexer.properties.

Note: Symantec recommends that you contact Symantec Support for guidance if you decide
to modify a properties file. Modifying properties incorrectly can cause serious issues with the
operation of Symantec Data Loss Prevention.

About indexing remote documents


IDM indexing can be done on the Enforce Server or remotely, using the Remote IDM Indexer.
See “Creating and modifying Indexed Document Profiles” on page 629.
Using the CIFS protocol you can remotely index documents that are stored on one or more
file shares in a Microsoft Windows-networked environment. You provide the Universal Naming
Convention (UNC) path to a shared network folder resource and index the documents that are
stored in that folder or subfolders depending on the level of permission granted.
See “Using the remote SMB share option to index file shares” on page 637.
WebDAV provides extensions to the HTTP 1.1 protocol that enable collaborative editing and
management of files that are stored on remote web servers. You can index such documents
remotely by exposing them to the Enforce Server using WebDAV. For example, you can use
the remote SMB option with a UNC address and a WebDAV client to index Microsoft SharePoint
or OpenText Livelink documents.
See “Using the remote SMB share option to index SharePoint documents” on page 637.

Note: To index documents on a SharePoint server using the Remote SMB Share option, you
must deploy the Enforce Server to a supported Windows Server operating system host. Data
Loss Prevention depends on Windows NTLM services to mount a WebDAV server.

About the server index files and the agent index files
When you create an Indexed Document Profile and index a document data source, the
system generates four index files, one for the server and three for the endpoint. The indexes
are generated regardless of whether or not you are licensed for a particular detection server
or the DLP Agent.
See “About index deployment and logging” on page 619.
The server index is a binary file named DocSource.rdx. The server index supports exact file,
exact file contents, and partial file contents matching. If the document data source is large,
the server index may span multiple *.rdx files.
The endpoint index is comprised of one secure binary file, either EndpointDocSource.rdx or
LegacyEndpointDocSource.rdx for backward compatibility with 14.0 and 12.5 Agents. The
endpoint index supports exact file and partial file contents matching. EncryptedDocSource.rdx
is for endpoint partial matching.
See “Supported forms of matching for IDM” on page 613.
To create the index entries for exact file and exact file contents matching, the system uses the
MD5 message-digest algorithm. This algorithm is a one-way hash function that takes as input
a message of arbitrary length and produces as output a 128-bit message-digest or "fingerprint"
of the input. If the message input is a text-based document that the system can extract contents
from, such as a Microsoft Word file, the system extracts all of the file content, normalizes it by
removing whitespace, punctuation, and formatting, and creates a cryptographic hash. Otherwise,
if the message input is a file that the system cannot extract the contents from, such as an
image file, small file, or unsupported file type, the system creates a cryptographic hash based
on the binary signature of the file.
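The two fingerprinting paths described above can be sketched with the MD5 implementation in Python's standard library. The normalization shown here (dropping ASCII whitespace and punctuation) is a simplified assumption; the product's normalization and content extraction steps are not reproduced, and the function names are hypothetical.

```python
import hashlib
import string

def normalize(text: str) -> str:
    """Drop whitespace and punctuation so formatting changes do not matter."""
    drop = set(string.whitespace) | set(string.punctuation)
    return "".join(ch for ch in text if ch not in drop)

def content_fingerprint(text: str) -> str:
    """MD5 over the extracted, normalized text (exact file contents)."""
    return hashlib.md5(normalize(text).encode("utf-8")).hexdigest()

def file_fingerprint(data: bytes) -> str:
    """MD5 over the raw bytes (exact file match for non-extractable files)."""
    return hashlib.md5(data).hexdigest()

print(content_fingerprint("Hello,  World!") == content_fingerprint("Hello World"))
print(len(content_fingerprint("Hello World")) * 4)  # digest length in bits
print(file_fingerprint(b"\x89PNG") == file_fingerprint(b"\x89PNG\x00"))
```

The content fingerprint survives spacing and punctuation changes, whereas the binary fingerprint changes if even a single byte of the file differs.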

Note: To improve accuracy across different versions of the Enforce Server and DLP Agent,
only binary MD5 matching is supported on the agent, whether or not the file contains text.

See “Using IDM to detect exact files” on page 620.


See “Using IDM to detect exact and partial file contents” on page 621.
In addition, for file formats the system can extract the contents from, the indexer creates hashes
for discrete sections of content or text passages. These hashes are used for partial matching
for both server and agent indexes. The system uses a selection method to store hashed
sections of partial content so that not all extractable text is indexed. The hash function ensures
that the server index does not contain actual document content. Table 27-3 summarizes the
types of matching supported by the endpoint and server indexes.

Table 27-3 Types of matching supported by the endpoint and server indexes

Message input | Output | Matches | Included in index file

Text-based file that the system can extract the contents from | A single cryptographic hash derived from all of the extracted and normalized file contents | Exact file contents | DocSource.rdx, LegacyEndpointDocSource.rdx

Text-based file that the system can extract the contents from | One or more rolling hashes based on discrete passages of extracted and normalized content using a selection method | Partial file contents (10% to 90%) | DocSource.rdx, EndpointDocSource.rdx, EncryptedDocSource.rdx

Binary file, custom file, small file, encapsulated file. Agent only: Text-based file that the system can extract the contents from. | A single cryptographic hash based on the binary signature of the file | Exact file binary | DocSource.rdx, EndpointDocSource.rdx, LegacyEndpointDocSource.rdx

About index deployment and logging


The Enforce Server is responsible for deploying the IDM server and endpoint indexes to the
detection and Endpoint Servers. You cannot manually deploy the indexes.
The system deploys the server index to each designated detection server in the folder \Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\index (on Windows)
or /var/Symantec/DataLossPrevention/EnforceServer/15.5/index (on Linux). At run-time,
the detection server loads the server index into random access memory (RAM) when an active
IDM policy that references that index is deployed to that detection server.
The system deploys the endpoint index (either EndpointDocSource.rdx or
LegacyEndpointDocSource.rdx) to each designated Endpoint Server. When a DLP Agent
connects to the Endpoint Server, the DLP Agent downloads the endpoint index. Assuming
agent IDM is enabled, the DLP Agent loads the endpoint index into memory when the index
is required by an active local policy.
See “Estimating endpoint memory use for agent IDM” on page 646.
You cannot manually deploy either the server or endpoint index files by copying the *.rdx file
or files from the Enforce Server to a detection server. The detection server does not monitor
the index destination folder for new index files; the detection server must be notified by the
Enforce Server that an index has been deployed. If a detection server is offline during the
index deployment process, the Enforce Server stops trying to deploy the index. When the
detection server comes back online the Enforce Server deploys the index to the detection
server. The same is true for DLP Agents. There is no way to manually copy the endpoint index
to the endpoint host and have the DLP Agent recognize the index.
Table 27-4 summarizes how IDM indexes are deployed and the log files to check to
troubleshoot index deployment.

Table 27-4 IDM index deployment and logging

Platform: Server
Index file: DocSource.rdx
Deployment: Sent automatically by the Enforce Server to each designated detection server after the index is generated. Loaded by the detection server into RAM at run-time.
Logged: detection_operational.log. Use to identify if the index profile was deployed to the detection server.
Logged: FileReader.log. Use to determine if the index profile is loaded into memory.

Platform: Agent
Index file: EndpointDocSource.rdx or LegacyEndpointDocSource.rdx
Deployment: Both of these files are sent by the Enforce Server to each designated Endpoint Server. The agent selects the appropriate file, based on the version of the agent. LegacyEndpointDocSource.rdx is for backward compatibility with 14.0 and 12.5 Agents. Downloaded by the DLP Agent based on the agent connection interval. Loaded into RAM at run-time when a local, active policy requires the index.
Logged: endpoint_server_operational.log. Use to identify if the index profile was deployed to the Endpoint Server. Pull the agent logs to see if the index profile is loaded into memory.

Using IDM to detect exact files


The system performs exact file matching automatically on all binary files. In addition, if the file
format is text-based but the system is unable to extract the contents from the file, the system
performs exact file matching. This behavior is true even if you select a Minimum Document
Exposure percentage for the IDM condition that is less than Exact. The DLP Agent performs
exact file matching on all files, both binary files and files with extractable text.
See “About the server index files and the agent index files” on page 618.
For example, an IDM rule with a minimum document exposure set to 50% automatically
attempts to match a binary file exactly because the Minimum Document Exposure setting
only applies to files that the system can extract the contents from. In addition, the system
performs exact file matching for files containing a very small amount of text, as well as files
that were encapsulated when indexed, even if text-based.
As an optimization for exact file type matching in Endpoint IDM detection, the system checks
the byte size of the file before computing the run-time hash for comparison against the index.
If the byte size does not match the size of the indexed file, there is no need to compute the exact
file hash. The system does not consider the file format when creating the exact file fingerprint.
Table 27-5 summarizes exact file type matching behavior.

Table 27-5 Requirements for using IDM to detect files

File format: File format from which the system cannot extract the contents
Example: Proprietary or non-supported document format
Description: If the system cannot extract the contents from the file format, you can use IDM to detect that specific file using exact binary matching.
See “Do not compress files in the document source” on page 649.

File format: Binary file
Example: GIF, MPG, AVI, CAD design, JPEG files, audio/video files
Description: You can use IDM to detect binary file types from which you cannot extract the contents, such as images, graphics, JPEGs, etc. Binary file detection is not supported on stream-based channels.

File format: File containing a small amount of text
Example: CAD files and Visio diagrams
Description: A file containing a small amount of text is treated as a binary file even if the contents are text-based and can be extracted.
See “Using IDM to detect exact and partial file contents” on page 621.

File format: Encapsulated file
Example: Any file that is encapsulated when indexed (even if it is text-based and its contents can be extracted); for example, a Microsoft Word file archived in a ZIP file
Description: If a document data source file is encapsulated in an archive file, the file contents of the subfile cannot be extracted and only the binary signature of the file can be fingerprinted. This does not apply to document archives that are indexed.
See “About the document data source” on page 616.
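The byte-size optimization described before Table 27-5 can be sketched in Python as follows. This is an illustration only: the index layout (a dictionary keyed by file size) and the function names are assumptions, not the format of the .rdx files.

```python
import hashlib
from typing import Dict, Iterable, Set


def build_exact_index(files: Iterable[bytes]) -> Dict[int, Set[str]]:
    """Group MD5 digests of indexed files by their size in bytes."""
    index: Dict[int, Set[str]] = {}
    for data in files:
        index.setdefault(len(data), set()).add(hashlib.md5(data).hexdigest())
    return index


def matches_exact_file(data: bytes, index: Dict[int, Set[str]]) -> bool:
    """Check the size first; hash only when some indexed file has that size."""
    candidates = index.get(len(data))
    if not candidates:
        return False  # size mismatch, so the hash is never computed
    return hashlib.md5(data).hexdigest() in candidates
```

The size lookup is cheap, so most non-matching files are rejected without reading them through a hash function.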

Using IDM to detect exact and partial file contents


The primary use case for IDM is to detect file contents (as distinguished from binary files, such
as audio or video files, for example). On both the server and the endpoint, you can use IDM
to match files exactly or partially (10% to 90%). Additionally, on the server, file contents can
be matched exactly. Symantec recommends that you use partial content match because it is
much more reliable than exact content match. File contents include text-based content of any
document type the system can extract the file contents from, such as Microsoft Office documents
(Word, Excel, PowerPoint), PDF, and many more.
See “Supported formats for content extraction” on page 980.
An exact file contents match means that the normalized extracted content from the file matches
exactly the content of a file that has been indexed. With partial matching on the endpoint, using
a 90% threshold generates 90% to 100% content matches. These are less strict than the
previous exact content matches and may, in some cases, match even if there are some minor
differences between the scanned file and the indexed file.
The system does not consider the file format or file size when creating the cryptographic hash
for the index or when checking for an exact file contents match against the index. A document
might contain much more content, but the system detects only the file contents that are indexed
as part of the Indexed Document Profile. For example, consider a situation where you index
a one-page document, and that one-page document is included as part of a 100-page document.
The 100-page document is considered an exact match because its content matches the
one-page document exactly.
See “About the server index files and the agent index files” on page 618.
For text-based files from which you can extract the contents, in addition to creating the MD5
fingerprint for exact file contents matching, the system uses a rolling hash algorithm to register
discrete sections or passages of content. In this case the system uses a selection method to
store hashed sections of content; not all text is hashed in the index. The index does not contain
actual document content.
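The guide does not document the rolling hash or the selection method, so the following Python sketch only illustrates the general idea. The window length, the polynomial hash, and the "keep hashes divisible by keep_every" rule are all assumptions chosen for the example.

```python
BASE = 257
MOD = (1 << 61) - 1


def normalize(text: str) -> str:
    """Keep only letters and digits (assumed normalization)."""
    return "".join(ch for ch in text if ch.isalnum())


def selected_passage_hashes(text: str, window: int = 30, keep_every: int = 4) -> set:
    """Hash every `window`-character passage with a rolling polynomial hash
    and keep only a deterministic subset of the hashes."""
    s = normalize(text)
    if len(s) < window:
        return set()
    high = pow(BASE, window - 1, MOD)  # weight of the character leaving the window
    h = 0
    for ch in s[:window]:
        h = (h * BASE + ord(ch)) % MOD
    selected = {h} if h % keep_every == 0 else set()
    for i in range(window, len(s)):
        h = (h - ord(s[i - window]) * high) % MOD  # drop the oldest character
        h = (h * BASE + ord(s[i])) % MOD           # append the newest character
        if h % keep_every == 0:
            selected.add(h)
    return selected
```

Because selection depends only on the hash value, a passage that was stored at indexing time is selected again whenever the same passage appears in a scanned file. Only hashes are stored, never the passage text.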
Table 27-6 lists the requirements to match file contents using IDM.

Table 27-6 Requirements for using IDM to detect content

Requirement Description

File formats from The system must be able to recognize the file format and extract the file content. Data Loss
which you can extract Prevention supports content extraction for over 100 file types.
the contents
See “Supported formats for content extraction” on page 980.

Unencapsulated file To match file contents, the source file cannot be encapsulated in an archive file when the
source file is indexed. If a file in the document source is encapsulated in an archive file, the
system does not index the file contents of the encapsulated file. Any encapsulated file is
considered for exact matches only, like image files and other unsupported file formats.

See “Do not compress files in the document source” on page 649.
Note: The exception to this is the main ZIP file that contains the document data source, for
those upload methods that use an archive file. See “Creating and modifying Indexed Document
Profiles” on page 629.

Minimum amount of For exact file contents matching, the source file must contain at a minimum 50 characters of
text normalized text before the extracted content is indexed. Normalization involves
the removal of punctuation and whitespace. A normalized character therefore is either a
number or a letter. This size is set by the min_normalized_size=50 parameter in the file
\Program Files\Symantec\DataLossPrevention
\EnforceServer\15.5\Protect\config\Indexer.properties. If a file contains fewer
than 50 normalized characters, the system performs an exact file match against the file binary.
Note: Symantec advises that you consult with Symantec Support for guidance if you need to
change an advanced setting or edit a properties file. Incorrectly updating a properties file can
have unintended consequences.

For partial file contents matching, there must be at least 300 normalized characters. However,
the exact length is variable depending on the file contents and encoding.

See “Do not index empty documents” on page 649.

Maximum amount of The default maximum size of the document that can be processed for content extraction at
text run-time is 30,000,000 bytes. If your document is over 30,000,000 bytes you need to increase
the default maximum size in Advanced server settings. Contact Symantec Support for
assistance when changing Advanced server settings, to avoid any unintended consequences.
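The minimum-text thresholds in Table 27-6 can be read as a simple decision rule. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, and the 300-character figure is treated as a fixed cutoff even though the guide says the exact length varies with file contents and encoding.

```python
from typing import List, Optional

MIN_EXACT_CONTENT_CHARS = 50     # default min_normalized_size from the text
MIN_PARTIAL_CONTENT_CHARS = 300  # approximate; the real cutoff is variable


def matching_modes(extracted_text: Optional[str]) -> List[str]:
    """Return the kinds of IDM matching a source file would qualify for.

    `extracted_text` is None when the contents cannot be extracted
    (binary, unsupported, or encapsulated file).
    """
    if extracted_text is None:
        return ["exact file binary"]
    normalized_length = sum(1 for ch in extracted_text if ch.isalnum())
    if normalized_length < MIN_EXACT_CONTENT_CHARS:
        return ["exact file binary"]
    modes = ["exact file contents"]
    if normalized_length >= MIN_PARTIAL_CONTENT_CHARS:
        modes.append("partial file contents")
    return modes
```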

About using the Content Matches Document Signature policy condition

You use the IDM condition Content Matches Document Signature From to implement IDM
detection rules and exceptions in your policies.
See “Configuring the Content Matches Document Signature policy condition” on page 646.
When you configure this condition, you specify the IDM index to use and how the condition
should match against the index using the Minimum Document Exposure setting. You can
select either Exact or partial between 10% and 90%. For example, if you select 70% for the
Minimum Document Exposure, a match occurs only if 70% or more of the hashed file contents
are detected.
See “Use parallel IDM rules to tune match thresholds” on page 654.
If a file is not text-based, its content is not extractable, is very small, or is encapsulated in an
archive file, the file is matched exactly based on its binary signature. This form of matching is
performed automatically by the system, regardless of what configuration option you choose
for the Minimum Document Exposure setting. This setting only applies to partial file contents
matching.
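As a rough illustration of how a Minimum Document Exposure percentage could be applied to hashed passages, consider the Python sketch below. The function name and the use of plain sets of hashes are assumptions; the product's actual scoring is not described in this guide.

```python
def is_partial_match(indexed_hashes: set, scanned_hashes: set,
                     min_exposure_percent: int) -> bool:
    """True if at least `min_exposure_percent` of the indexed passage
    hashes also appear in the scanned file."""
    if not 10 <= min_exposure_percent <= 90:
        raise ValueError("partial matching uses a threshold from 10 to 90")
    if not indexed_hashes:
        return False
    found = len(indexed_hashes & scanned_hashes)
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return found * 100 >= min_exposure_percent * len(indexed_hashes)
```

With ten indexed passages and seven of them found in the scanned file, a 70% setting matches and an 80% setting does not.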
See “Using IDM to detect exact files” on page 620.


Table 27-7 describes the matching supported by the Content Matches Document Signature
From policy condition.

Table 27-7 Minimum document exposure settings for the IDM condition

Configuration setting: Exact file matching
File contents: File contents
Match: All of the extracted and normalized file contents, if the file is text-based and from which the content is not extractable
Example: Microsoft Word
See “Using IDM to detect exact and partial file contents” on page 621.

Configuration setting: Exact content matching
Match: The endpoint performs binary matching on all files.
Example: Microsoft Word, JPG, MP3

Configuration setting: Partial content matching
File contents: File contents
Match: Discrete passages of text
Example: Microsoft Word
See “Using IDM to detect exact and partial file contents” on page 621.

About white listing partial file contents


Often sensitive documents contain standard boilerplate text that does not require protection,
including front matter, headers, and footers. Information contained in document headers and
footers is likely to cause false positives. Likewise, boilerplate text, such as standard language
and non-proprietary corporate content that is repeated across confidential documents, can
cause false positives.
See “White listing file contents to exclude from partial matching” on page 627.
Removing non-sensitive boilerplate or header/footer content before indexing is usually not
feasible, especially if you have a large document data set. In this case you can configure the
system to exclude ("whitelist") non-sensitive text. You do this by adding the text to ignore to
the whitelist file. During indexing, any whitelisted content found in the source files is ignored.
At run-time the content does not cause false positives because it has been excluded.
See “Use white listing to exclude non-sensitive content from partial matching” on page 651.

Note: White listing only applies to partial file contents matching; it does not apply to exact file
contents matching. The white listing file is not checked at run-time when the system computes
the cryptographic hashes for exact file contents matching.
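Conceptually, white listing removes boilerplate passages from the set of passages that partial matching can hit. The Python sketch below shows that idea only; the fixed 20-character window, the MD5 passage hashes, and the function names are assumptions and do not describe the indexer's internals.

```python
import hashlib


def passage_hashes(text: str, window: int = 20) -> set:
    """Hash every `window`-character passage of the normalized text."""
    s = "".join(ch for ch in text if ch.isalnum())
    return {
        hashlib.md5(s[i:i + window].encode("utf-8")).hexdigest()
        for i in range(len(s) - window + 1)
    }


def index_with_whitelist(document_text: str, whitelist_text: str) -> set:
    """Passage hashes for a document, minus any passage found in the whitelist."""
    return passage_hashes(document_text) - passage_hashes(whitelist_text)
```

A scanned file that contains only the boilerplate then shares no indexed passages with the document, so it cannot reach a partial-match threshold on boilerplate alone.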

Configuring IDM profiles and policy conditions


Table 27-8 provides the workflow for creating IDM profiles and configuring IDM policies.
Complete the steps to ensure that your IDM rules are properly implemented and are as accurate
and efficient as possible.

Table 27-8 Implementing IDM

Step Action Description

1 Identify the content you want to protect and See “Using IDM to detect exact and partial file contents”
collect the documents that contain this on page 621.
content.
See “Using IDM to detect exact files” on page 620.

2 Prepare the documents for indexing. See “Preparing the document data source for indexing”
on page 625.

3 Whitelist headers, footers, and boilerplate See “White listing file contents to exclude from partial
text. matching” on page 627.

4 Create an Indexed Document Profile and See “Creating and modifying Indexed Document Profiles”
specify the document source. on page 629.

5 Configure any document source filters. See “Filtering documents by file name” on page 640.

6 Schedule indexing as necessary. See “Scheduling document profile indexing” on page 643.

7 Configure one or more IDM policy conditions See “Configuring the Content Matches Document Signature
or exceptions. policy condition” on page 646.

8 Test and troubleshoot your IDM See “Troubleshooting policies” on page 445.
implementation.

Preparing the document data source for indexing


You must collect and prepare the documents you want to index. These documents are known
as the document data source.
See “About the document data source” on page 616.
A document data source is a ZIP archive file that contains the documents to index. It can also
be the files stored in a file share on a local or remote computer. A document data source ZIP
file can contain any file type and any combination of files. If you have a file share that already
contains the documents you want to protect, you can reference this share in the document
profile.

Table 27-9 Preparing the document source for indexing

Step Action Description

1 Collect all of the documents Collect all of the documents you want to index and put them in a folder.
you want to protect.
See “About the document data source” on page 616.

2 Uncompress all the files you The files you index should be in their unencapsulated, uncompressed state.
want to index. Check the document collection to make sure none of the files are
encapsulated in an archive file, such as ZIP, TAR, or RAR. If a file is
embedded in an archive file, extract the source file from the archive file and
remove the archive file.

See “Using IDM to detect exact and partial file contents” on page 621.

3 Separate the documents if To protect a large amount of content and files, create separate collections
you have more than for each set of documents over 1,000,000 files in size, with all files in their
1,000,000 files to index. unencapsulated, uncompressed state. For example, if you have 1,500,000
documents you want to index, separate the files by folders, one folder
containing 750,000 files, and another folder containing the remaining 750,000
files. Or, you can change the value of
com.vontu.profiles.documents.maxIndexSize in the
Indexer.properties file to accommodate larger data sets. The rule of thumb is
2 GB/1 million documents.

See “Create separate profiles to index large document sources” on page 653.

4 Decide how you are going to The indexing process is a separate process that runs on the Enforce Server.
make the document source To index the document source you must make the files accessible to the
files available to the Enforce Enforce Server. You have several options. Decide which one works best
Server. for your needs and proceed accordingly.

See “Uploading a document archive to the Enforce Server” on page 633.

See “Referencing a document archive on the Enforce Server” on page 634.

See “Using local path on Enforce Server” on page 636.

See “Using the remote SMB share option to index file shares” on page 637.

5 Configure the document The next step is to configure the document profile, or, alternatively, if you
profile. want to exclude specific document content from detection, whitelist it.

See “Creating and modifying Indexed Document Profiles” on page 629.

See “White listing file contents to exclude from partial matching” on page 627.
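Steps 2 and 3 of Table 27-9 lend themselves to a small pre-indexing check. The Python sketch below is a hypothetical helper, not a Symantec tool: it flags files that look like archives by extension (the list is illustrative, not exhaustive) and splits a large file list into evenly sized collections that stay within the 1,000,000-file guideline.

```python
import math
from pathlib import Path
from typing import List, Sequence

ARCHIVE_SUFFIXES = {".zip", ".tar", ".rar"}  # examples named in the text
MAX_FILES_PER_COLLECTION = 1_000_000


def find_archives(paths: Sequence[str]) -> List[str]:
    """Return the paths whose extension suggests an archive file."""
    return [p for p in paths if Path(p).suffix.lower() in ARCHIVE_SUFFIXES]


def split_into_collections(paths: Sequence[str],
                           max_files: int = MAX_FILES_PER_COLLECTION) -> List[list]:
    """Split the file list into the fewest evenly sized collections
    that each hold at most `max_files` files."""
    if not paths:
        return []
    count = math.ceil(len(paths) / max_files)
    size = math.ceil(len(paths) / count)
    return [list(paths[i:i + size]) for i in range(0, len(paths), size)]
```

With the default limit, 1,500,000 paths are split into two collections of 750,000 files each, one per Indexed Document Profile.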

White listing file contents to exclude from partial matching


You use white listing to exclude unimportant or noncritical content, such as standard boilerplate
text, document headers and footers, from the IDM index. White listing such content helps to
reduce false positives.
See “About white listing partial file contents” on page 624.
See “Use white listing to exclude non-sensitive content from partial matching” on page 651.
To exclude content from matching, you copy the content you want to exclude to a text file and
save the file as Whitelisted.txt. By default, the file must contain at least 300 non-whitespace
characters to have its content fingerprinted for white listing purposes. When you index the
document source, the Enforce Server or the Remote IDM Indexer looks for the
Whitelisted.txt file.

See “Use white listing to exclude non-sensitive content from partial matching” on page 651.
Table 27-10 describes the process for excluding document content using white listing.

Table 27-10 White listing non-sensitive content

Step Action Description

1 Copy the content you want to Copy only noncritical content you want to exclude, such as standard
exclude from matching into a text boilerplate text and document headers and footers, to the text file. By
file. default, for file contents matching the file to be indexed must contain
at least 300 characters. This default setting applies to the
Whitelisted.txt file as well. For whitelisted text you can change
this default setting.

See “Changing the default indexer properties” on page 644.

2 Save the text file as The Whitelisted.txt file is the source file for storing content you
Whitelisted.txt. want to exclude from matching.

3 Save the file to the whitelisted directory on the Enforce Server host file system.
Save the file to \Program Files\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\Protect\documentprofiles\whitelisted (on Windows)
or /var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/documentprofiles/whitelisted (on Linux).

4 Configure the Indexed When you index the document data source, the Enforce Server looks
Document Profile and generate for the Whitelisted.txt file. If the file exists, the Enforce Server
the index. copies it to Whitelisted.x.txt, where x is a unique identification
number corresponding to the Indexed Document Profile. Future
indexing of the profile uses the profile-specific Whitelisted.x.txt
file, not the generic Whitelisted.txt file.

See “Creating and modifying Indexed Document Profiles” on page 629.
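Before saving Whitelisted.txt, it can help to confirm that the text meets the default minimum length described in step 1. The following Python sketch is a hypothetical check; it assumes the default of 300 non-whitespace characters has not been changed in the indexer properties.

```python
def whitelist_is_long_enough(text: str, minimum: int = 300) -> bool:
    """True if the text has at least `minimum` non-whitespace characters."""
    return sum(1 for ch in text if not ch.isspace()) >= minimum
```

For example, `whitelist_is_long_enough(Path("Whitelisted.txt").read_text(encoding="utf-8"))` (with `from pathlib import Path`) checks a file on disk; the file name comes from the text, while the UTF-8 encoding is an assumption.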

Manage and add Indexed Document Profiles


The Manage > Data Profiles > Indexed Documents screen lists all configured Indexed
Document Profiles in the system. From this screen you can manage existing profiles and
add new ones.

Table 27-11 Indexed Documents screen actions

Action Description

Add IDM profile Click Add Document Profile to create a new Indexed Document Profile.

See “Configuring IDM profiles and policy conditions” on page 625.

Edit IDM profile Click the name of the Document Profile, or click the pencil icon to the far right of the profile, to
modify an existing Document Profile.

See “Creating and modifying Indexed Document Profiles” on page 629.

Remove IDM profile Click the red X icon next to the far right of the document profile row to delete that profile from
the system. A dialog box confirms the deletion.
Note: You cannot edit or remove a profile if another user is currently modifying that profile, or if a
policy exists that depends on that profile.

Refresh IDM profile Click the refresh arrow icon at the upper right of the Indexed Documents screen to fetch the
status latest status of the indexing process. If you are in the process of indexing, the system displays
the message "Indexing is starting." The system does not automatically update the screen when
the indexing process is complete.

Table 27-12 Indexed Documents screen details

Column Description

Document Profile The name of the Indexed Document Profile.



Detection server The name of the detection server that indexes the Document Profile and the Document Profile
version.

Click the triangle icon beside the Document Profile name to display this information. It appears
beneath the name of the Document Profile.

Location The location of the file(s) on the Enforce Server that the system has profiled and indexed.

Documents The number of documents that the system has indexed for the document profile.

Status The current status of the document indexing process, which can be any of the following:

■ Next scheduled indexing (if it is not currently indexing)


■ Sending an index to a detection server
■ Indexing
■ Deploying to a detection server

In addition, beneath the status of the indexing process, the system displays the status of each
detection server, which can be any of the following:

■ Completed, including a completion date


■ Pending index completion (that is, waiting for the Enforce Server to finish indexing a file)
■ Replicating indexing
■ Creating index (internally)

Error messages The Indexed Document screen also displays any error messages in red (for example, if the
document profile is corrupted or does not exist).

See “Data Profiles” on page 375.


See “Scheduling document profile indexing” on page 643.
See “Configuring the Content Matches Document Signature policy condition” on page 646.

Creating and modifying Indexed Document Profiles


You define and configure an Indexed Document Profile at the screen Manage > Data Profiles
> Indexed Documents > Configure Document Profile. The document profile specifies the
document data source, the indexing parameters, and the indexing schedule. You must define
a document profile to implement IDM detection.
See “About the Indexed Document Profile” on page 615.
Table 27-13 describes the steps for creating and modifying IDM profiles.

Table 27-13 Configuring a document profile

Step Action Description

1 Navigate to the screen Manage You must be logged on to the Enforce Server administration console
> Data Profiles > Indexed as an administrator or policy author.
Documents.
See “Policy authoring privileges” on page 375.

2 Click Add Document Profile. Alternatively, select an existing Indexed Document Profile to edit it.

See “Manage and add Indexed Document Profiles” on page 628.

3 Enter a Name for the Document Choose a name that describes the data content and the index type
Profile. (for example, "Research Docs IDM"). The name is limited to 255
characters.

See “Input character limits for policy configuration” on page 431.



4 Select the Document Source Select one of the five options for indexing the document data source,
method for indexing. depending on how large your data source is and how you have
packaged it.

See “About the document data source” on page 616.


Options for making the data source available to the Enforce Server.

■ Upload Document Archive to Server Now


To use this method, you Browse and select a ZIP file containing
the documents to be indexed. The maximum size of the ZIP file
is 50 MB.
See “Uploading a document archive to the Enforce Server”
on page 633.
■ Reference Archive on Enforce Server
Use this method if you have copied the ZIP file to the file system
host where the Enforce Server is installed. The maximum size of
the ZIP file is 2 GB. This ZIP file is available for selection in the
drop-down field.
See “Referencing a document archive on the Enforce Server”
on page 634.
■ Use Local Path on Enforce Server
This method lets you index individual files that are local to the
Enforce Server. With this method the files to be indexed cannot
be archived in a ZIP file.
See “Using local path on Enforce Server” on page 636.
■ Use Remote SMB Share
See “About indexing remote documents” on page 617.
See “Using the remote SMB share option to index SharePoint
documents” on page 637.
■ Import from a remotely created IDM profile
The Remote IDM Indexer is a standalone tool that lets you index
your confidential documents and files locally on the systems where
these files are stored. See “About the Remote IDM Indexer” on
page 655 for more information.

5 Optionally, configure any Filters. You can specify file name and file size filters in the document profile.
The filters tell the system which files to include or ignore during
indexing.

See “Filter documents from indexing to reduce false positives” on page 652.

Enter files to include in the File Name Include Filters field, or enter
files to exclude in the File Name Exclude Filters field.

See “Filtering documents by file name” on page 640.

Select file sizes to ignore, either Ignore Files Smaller Than or Ignore
Files Larger Than.

See “Filtering documents by file size” on page 642.

6 Select one of the Indexing As part of creating a document profile, you can set up a schedule for
options. indexing the document source.
You do not have to select an indexing option to create a profile that
you can reference in a policy, but you must select an indexing option
to generate the index and actually detect matches using an IDM policy.

■ Select Submit Indexing Job on Save to index the document
source immediately on save of the Document Profile.
■ Select Submit Indexing Job on Schedule to display schedule
options so that you can schedule indexing at a later time.
See “Scheduling document profile indexing” on page 643.

7 Click Save. You must save the document profile.

Configure endpoint partial content matching


You can enable or disable Endpoint partial content matching for IDM profiles on the Enforce
Server administration console at Manage > Data Profiles > Indexed Documents > Configure
Endpoint Partial Matching. This page displays a snapshot in time of all deployed profiles
with their estimated current size. When you click Save, the profiles that you have selected
have partial matching enabled.
Table 27-14 describes the steps for configuring partial content matching on the endpoint.

Table 27-14 Configuring endpoint partial content matching

Step Action Description

1 Navigate to the Manage > Data Profiles > Indexed Documents screen.

2 Click Configure Partial The Configure Partial Content Matching page displays a
Matching. snapshot of all profiles that are deployed at the time you
access the page, along with their estimated current size.
Note: The Configure Partial Content Matching page is not
accessible while any IDM profile is being indexed.

3 Click the checkbox under Endpoint Partial Matching for all profiles that you want to enable for partial matching.
Note: If a profile starts re-indexing when you are on this page, and the profile size changes significantly, and if the
profile is also selected for partial matching, the list of selected profiles might be affected.

4 Click Save.
Note: The sum of all deployed profiles on the endpoint cannot
exceed the value of Endpoint Total Profile Size (MB), which
is set to a default 60 MB. To change this value, enter a
different value in the Endpoint Total Profile Size (MB) box.

After you click Save, the profiles that you have selected have
partial matching enabled. Click Refresh to ensure that you
have the latest status of the indexing operation.
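The size constraint in step 4 amounts to a simple sum. The Python sketch below is illustrative only: the profile names and sizes are made up, and the 60 MB default comes from the note above.

```python
from typing import Dict

DEFAULT_ENDPOINT_TOTAL_PROFILE_SIZE_MB = 60.0


def fits_endpoint_limit(profile_sizes_mb: Dict[str, float],
                        limit_mb: float = DEFAULT_ENDPOINT_TOTAL_PROFILE_SIZE_MB) -> bool:
    """True if the selected profiles together stay within the endpoint limit."""
    return sum(profile_sizes_mb.values()) <= limit_mb
```

For instance, hypothetical profiles of 25 MB and 30 MB fit within the default, while adding a third 10 MB profile would require raising the Endpoint Total Profile Size (MB) value or deselecting a profile.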

Uploading a document archive to the Enforce Server


The Upload Document Archive to Server Now option lets you upload a ZIP file with a
maximum size of 50 MB to the Enforce Server and index its contents. To use this method of
indexing, the document source must meet the requirements described in Table 27-15.
The procedure "To upload the document archive to Enforce Server" describes the process for
using the Upload Document Archive to Server Now method of indexing.

To upload the document archive to Enforce Server


1 Navigate to the screen Manage > Data Profiles > Indexed Documents > Configure
Document Profile.
2 Select the option Upload Document Archive to Server Now.
Click Browse and select the ZIP file. The ZIP file can be anywhere on the same network
as the Enforce Server.
Optionally, you can type the full path and the file name if the ZIP file is local to the Enforce
Server, for example: c:\Documents\Research.zip.
3 Specify one or more file name or file size filters (optional).
See “Filtering documents by file name” on page 640.
4 Select one of the indexing options (optional).
See “Scheduling document profile indexing” on page 643.
5 Click Save.

Table 27-15 Requirements for using the Upload Document Archive to Server Now option

Requirement Description

ZIP file only The document archive must be a ZIP file; no other encapsulation formats are supported
for this option.

50 MB or less You cannot use this option if the document archive ZIP file is more than 50 MB because
files exceeding that size limit can take too long to upload and slow the performance of the
Enforce Server. If the document archive ZIP file is over 50 MB, use the Reference Archive
on Enforce Server method instead.

UTF-8 file names only The IDM indexing process fails (and presents you with an "unexpected error") if the
document archive (ZIP file) contains non-ASCII file names in encodings other than UTF-8.
If the ZIP file contains files with non-ASCII file names, use one of the following options
instead to make the files available to the Enforce Server for indexing:

■ Use the Remote IDM Indexer.
■ Use Local Path on Enforce Server
■ Use Remote SMB Share
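
The requirements in Table 27-15 can be pre-checked before you upload an archive. The following Python sketch uses only the standard zipfile module. The function name is hypothetical, the 50 MB limit is assumed to mean 50 x 1,048,576 bytes, and the file name check assumes that a non-ASCII name stored without the ZIP UTF-8 flag is in some other encoding. It does not reproduce the checks that the Enforce Server itself performs.

```python
import os
import zipfile

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes
UTF8_NAME_FLAG = 0x800  # general-purpose bit 11 of a ZIP entry: name is UTF-8


def check_upload_archive(path, max_bytes=MAX_UPLOAD_BYTES):
    """Return a list of problems found; an empty list means none were detected."""
    if not zipfile.is_zipfile(path):
        return ["not a ZIP file"]
    problems = []
    if os.path.getsize(path) > max_bytes:
        problems.append("archive is larger than the upload limit")
    with zipfile.ZipFile(path) as archive:
        for info in archive.infolist():
            # A non-ASCII name without the UTF-8 flag uses some other encoding.
            if not info.filename.isascii() and not (info.flag_bits & UTF8_NAME_FLAG):
                problems.append(f"non-UTF-8 file name: {info.filename!r}")
    return problems
```

For example, check_upload_archive(r"c:\Documents\Research.zip") returns an empty list when the sketch finds nothing to report.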

Referencing a document archive on the Enforce Server


You use the Reference Archive on Enforce Server option to create an IDM index based on
a ZIP file that is local to the Enforce Server. You use this option to index source documents
that are archived in a ZIP file that is larger than 50 MB.
See “About the document data source” on page 616.

Note: If the ZIP file is less than 50 MB, you can use the Upload Document Archive to Server
Now option instead. See “Uploading a document archive to the Enforce Server” on page 633.

To use the Reference Archive on Enforce Server option, you copy the ZIP file to the \Program
Files\Symantec\DataLossPrevention\EnforceServer\Protect\documentprofiles folder
on the Enforce Server file system host. Once you have copied the ZIP file to the Enforce
Server, you can select the document source from the pull-down menu at the Add Document
Profile screen. See “Creating and modifying Indexed Document Profiles” on page 629.
The procedure "To reference the document archive on the Enforce Server" describes how to
use the Reference Archive on Enforce Server option.
To reference the document archive on the Enforce Server
1 Copy the ZIP file to the Enforce Server.
■ On Windows, copy the ZIP file to directory \Program
Files\Symantec\DataLossPrevention\ServerPlatformCommon\15.1\Protect\documentprofiles
■ On Linux, copy the ZIP file to directory
/var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/documentprofiles

See Table 27-16 on page 636.

Note: The system deletes the document data source file after the indexing process
completes.

2 Log on to the Enforce Server administration console.


3 Navigate to the screen Manage > Data Profiles > Indexed Documents > Configure
Document Profile.
4 Select the file from the Reference Archive on Enforce Server pull-down menu.

Note: A document source currently referenced by another Indexed Document Profile
does not appear in the list.

5 Specify one or more file name or file size filters (optional).


See “Filtering documents by file name” on page 640.
6 Select one of the indexing options (optional).
See “Scheduling document profile indexing” on page 643.
7 Click Save to save the document profile.

Table 27-16 Requirements to use the option Reference Archive on Enforce Server

Requirement Description

ZIP file only The document archive must be a ZIP file; no other encapsulation formats are supported
for this option.

The ZIP file can be at most 2 GB. Consider using a third-party solution (such as Secure
FTP) to copy the ZIP file securely to the Enforce Server.

See “About the document data source” on page 616.

Subfiles not archived Make sure that the files in the ZIP file are not themselves encapsulated in an
archive (other than the top-level profile archive).

See “Do not compress files in the document source” on page 649.

See “Do not index empty documents” on page 649.

UTF-8 file names only Do not use this method if any of the files you are indexing have non-ASCII
file names.
Use one of the following options instead:

■ Use the Remote IDM Indexer.
■ Use Local Path on Enforce Server
See “Using local path on Enforce Server” on page 636.
■ Use Remote SMB Share
See “Using the remote SMB share option to index file shares” on page 637.

Using local path on Enforce Server


The Use Local Path on Enforce Server method lets you index individual files that are local
to the Enforce Server. With this method the files to be indexed cannot be archived in a ZIP
file.
See “Creating and modifying Indexed Document Profiles” on page 629.
To use the Use Local Path on Enforce Server method of making the document source
available to the Enforce Server for indexing, you enter the local path to the directory that
contains the documents to index. For example, if you copied the files to the file system at
directory C:\Documents, you would enter C:\Documents in the field for the Use Local Path
on Enforce Server option. You must specify the exact path, not a relative path. Do not include
the actual file names in the path.

Note: If the files you index include a file that is more than 2 GB in size, the system indexes all
the files except the oversized file. This only applies to the Use Local Path on Enforce Server
option. It does not apply to the Reference Archive on Enforce Server option.
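
To see in advance which files the 2 GB note would affect, you can scan the directory before you index it. The following Python sketch is an illustration under stated assumptions: the function name is hypothetical, 2 GB is taken to mean 2 x 1024^3 bytes, and the scan descends into subdirectories, which may or may not match how the indexer treats the local path.

```python
import os

TWO_GB = 2 * 1024 ** 3  # assumption: the limit is 2 GiB


def find_oversized_files(root, limit_bytes=TWO_GB):
    """List files under root that are larger than limit_bytes."""
    if not os.path.isabs(root):
        raise ValueError("specify the exact (absolute) path, not a relative path")
    oversized = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if os.path.getsize(full_path) > limit_bytes:
                oversized.append(full_path)
    return sorted(oversized)
```

Calling find_oversized_files(r"C:\Documents") on the Enforce Server host lists the files that exceed the limit, so that you can handle them separately.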

Using the remote SMB share option to index file shares


The Use Remote SMB Share method lets you index documents remotely using the Common
Internet File System (CIFS) protocol. To use this method of making the document source
available to the Enforce Server, you enter the Universal Naming Convention (UNC) path for
the Server Message Block (SMB) share that contains the documents to index.
See “About indexing remote documents” on page 617.
“To index remote documents on file shares using CIFS” on page 637 provides the steps
for using CIFS to index remote documents.

Note: Symantec Data Loss Prevention does not delete documents after indexing when you
use the Use Remote SMB Share option.

To index remote documents on file shares using CIFS


1 Log on to the Enforce Server administration console.
2 Navigate to the screen Manage > Data Profiles > Indexed Documents > Configure
Document Profile.
3 Select the option Use Remote SMB Share.
4 Enter the UNC Path for the SMB share that contains the documents to index.
A UNC path consists of a server name, a share name, and an optional file path, for
example: \\server\share\file_path.
5 Enter a valid user name and password for the share, and then re-enter the password.
The user you specify must have general access to the shared drive and read permissions
for the constituent files.
Optionally, you can Use Saved Credentials, in which case the credentials are available
from the pull-down menu.
See “About the credential store” on page 160.
6 Complete the configuration of the Indexed Document Profile.
See “Creating and modifying Indexed Document Profiles” on page 629.
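
Step 4 describes a UNC path as a server name, a share name, and an optional file path. The following Python sketch checks a string against that shape before you paste it into the UNC Path field. The regular expression and function name are illustrative assumptions, not the validation that the Enforce Server applies.

```python
import re

# \\server\share followed by zero or more \segment parts and an optional trailing backslash
UNC_PATTERN = re.compile(
    r"^\\\\(?P<server>[^\\/]+)\\(?P<share>[^\\/]+)(?P<path>(?:\\[^\\/]+)*)\\?$"
)


def parse_unc(unc):
    """Return (server, share, file_path), or None if the string is not a UNC path."""
    match = UNC_PATTERN.match(unc)
    if match is None:
        return None
    return match.group("server"), match.group("share"), match.group("path").lstrip("\\")


print(parse_unc(r"\\server\share\file_path"))
print(parse_unc(r"C:\Documents"))
```

The first call yields the three components; the second yields None because a drive-letter path is not a UNC path.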

Using the remote SMB share option to index SharePoint documents


To remotely index files on SharePoint, you expose the remote file share using WebDAV. Once
you have enabled WebDAV for SharePoint, you use the Use Remote SMB Share option and
enter the UNC path to index the remote documents. Symantec Data Loss Prevention supports
remote IDM indexing using WebDAV for SharePoint 2007 and SharePoint 2010 instances.
See “About indexing remote documents” on page 617.

Note: To index documents on a SharePoint server using the Remote SMB Share option, you
must deploy the Enforce Server to a supported Windows Server operating system host. Data
Loss Prevention depends on Windows NTLM services to mount a WebDAV server.

Table 27-17 provides the procedure for remotely indexing SharePoint documents using WebDAV.

Table 27-17 Indexing of SharePoint documents

Step Task Description

1 Enable WebDAV for See “Enabling WebDAV for Microsoft IIS” on page 639.
SharePoint.

2 Start the WebClient service. From the computer where the Enforce Server is installed, start the WebClient
service using the "Services" console. If this service is "disabled," right-click it
and select Properties. Enable the service, set it to Manual, then Start it.
Note: You must have administrative privileges to enable this service.

3 Access the SharePoint From the computer where your Enforce Server is installed, access SharePoint
instance. using your browser and the following address format:

http://<server_name>:port

For example: http://protect-x64:80

4 Log on to SharePoint as an authorized user. You do not need to have SharePoint administrative privileges.

5 Locate the documents to In SharePoint, navigate to the documents you want to scan. Often SharePoint
scan. documents are stored at the Home > Shared Documents screen. Your
documents may be stored in a different location.

6 Find the UNC path for the In SharePoint for the documents you want to scan, select the option Library
documents. > Open with Explorer. Windows Explorer should open a window and display
the documents. Look in the Address field for the path to the documents. This
address is the UNC path you need to scan the documents remotely. For
example: \\protect-x64\Shared Documents. Copy this path to the
Clipboard or a text file.

7 Create the IDM Index. See “Creating and modifying Indexed Document Profiles” on page 629.

8 Configure the SharePoint remote indexing source. To configure the remote indexing source:
■ For the Document Source field, select the Use Remote SMB Share option.
■ For the UNC Path, paste (or enter) the address you copied from the previous
step. For example: \\protect-x64\Shared Documents.
■ For the User Credentials, enter your SharePoint user name and password,
or select the same from the Saved Credentials drop-down list.
■ Select the option Submit Indexing on Save and click Save.

9 Verify success. At the Manage > Data Profiles > Indexed Documents screen you should see
that the index was successfully created. Check the "Status" and the number
of documents indexed. If the index was successfully created you can now use
it to create IDM policies.

See “Troubleshooting SharePoint document indexing” on page 640.

Enabling WebDAV for Microsoft IIS


There are various methods for enabling WebDAV for IIS. The following steps provide one
approach, in this case for Windows Server 2008 R2. This approach is provided as an example
only. Your approach and environment may differ.
Microsoft IIS deployments that host SharePoint instances can be enabled to accept WebDAV
connections from web clients.
See “Using the remote SMB share option to index SharePoint documents” on page 637.
Enable WebDAV for SharePoint
1 Log on to the SharePoint system where you want to enable WebDAV.
2 Open the Internet Information Services (IIS) Manager console.
3 Select the server name in the IIS tree.
4 Expand the tree, click the Web Sites folder and expand it.
5 Select the SharePoint instance from the list.
6 Right-click the SharePoint instance and select New > Virtual Directory.
7 The Virtual Directory Creation Wizard appears. Click Next.
8 Enter a name in the Alias field (such as "WebDAV") and click Next.
9 Enter a directory path in the Web Site Content Directory field. It can be any directory
path as long as it exists. Click Next.
10 Select Read access and click Next.

11 Click Finish.
12 Right-click the virtual directory that you created and select Properties.
13 In the Virtual Directory tab, select the option "A redirection to a URL" and click Create.
The alias name is populated in the Application Name field.
14 Enter the SharePoint site URL in the "Redirect to" field and click OK. WebDAV is now
enabled for this SharePoint instance.

Troubleshooting SharePoint document indexing


If you cannot connect the Enforce Server computer to the SharePoint Server computer after
enabling WebDAV, make sure that you have started the WebClient service on the Enforce
Server computer. You must start this service and test the WebDAV connection before you
configure IDM indexing.
See “Using the remote SMB share option to index SharePoint documents” on page 637.
If you plan to re-index SharePoint documents periodically as they are updated, it may be useful
to map the remote network resource to the local computer where the Enforce Server is installed.
You can use the "net use" Windows command to map SharePoint using the UNC path. Enclose
a UNC path that contains spaces in quotation marks. For example:
■ net use
This command without parameters retrieves and displays a list of network connections.
■ net use s: "\\sharepoint_server\Shared Documents"
This command assigns (maps) the SharePoint share to the local "S" drive.
■ net use * "\\sharepoint_server\Shared Documents"
This command assigns (maps) the SharePoint share to the next available drive letter.
■ net use s: /delete
This command removes the network mapping to the specified drive.

Filtering documents by file name


When you configure an Indexed Document Profile, you have the option of using filters to include
or exclude documents in your data source from being indexed. There are two types of file
name filters: File Name Include Filters and File Name Exclude Filters. Symantec recommends
that if you choose to use file name filters you select either inclusion filters or exclusion filters,
but not both.
See “Filter documents from indexing to reduce false positives” on page 652.
Table 27-18 describes the differences between the include and exclude filters for file names.

Table 27-18 File name filters distinguished

Filter Description

File Name Include Filters If the File Name Include Filters field is empty, matching is performed on all documents
in the document profile. If you enter anything in the File Name Include Filters field, it is
treated as an inclusion filter. In this case the document is indexed only if it matches the
filter you specify.

For example, if you enter *.docx in the File Name Include Filters field, the system
indexes only the *.docx files in the document source.

File Name Exclude Filters The Exclude Filters field lets you specify the documents to exclude in the matching
process.

If you leave the Exclude Filters field empty, the system performs matching on all
documents in the ZIP file or file share. If you enter any values in the field, the system
scans only those documents that do not match the filter.

The system treats forward slashes (/) and backslashes (\) as equivalent. The system ignores
whitespace at the beginning or end of the pattern. File name filtering does not support escape
characters, so you cannot match on literal question marks, commas, or asterisks.
Table 27-19 describes the syntax accepted by the File Name Filters feature. The syntax for
the Include and Exclude filters is the same.

Table 27-19 File name filtering syntax

Operator Description

Asterisk (*) Represents any number of characters.

Question mark (?) Represents a single character.

Comma (,) and newline Represents a logical OR.

Table 27-20 provides sample filters and descriptions of behavior if you enter them in the File
Name Include Filters field:

Table 27-20 File name filter examples

Filter string Description

*.txt,*.docx The system indexes only .txt and .docx files in the ZIP file or file share, ignoring
everything else.

?????.docx The system indexes only files that have the .docx extension and a five-character
name, such as hello.docx and stats.docx, but not good.docx or
marketing.docx.

*/documentation/*,*/specs/* The system indexes only files in two subdirectories below the root directory, one
called "documentation" and the other called "specs."

Example with wildcards and sub-directories: *\scan_dir\l*.txt
IDM indexing fails or ignores the filter setting if the File Name Include or Exclude
filter string starts with an alphanumeric character and includes a wildcard, for
example: l*.txt. The workaround is to configure the include or exclude filter with
the filter string as indicated in this example, that is, *\scan_dir\l*.txt.

For example, the filter l*.txt does not work for the file path
\\dlp.symantec.com\scan_dir\lincoln-LyceumAddress.txt. However,
if the filter is configured as *\scan_dir\l*.txt, the indexer acknowledges the
filter and indexes the file.
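
The syntax in Table 27-19 and the examples in Table 27-20 can be approximated with a few lines of Python. This is a sketch, not the product's matcher. It assumes that a pattern must match the whole path string, that an asterisk can span directory separators, and that matching is case-sensitive. The function names are hypothetical.

```python
import re


def compile_filter(filter_text):
    """Turn a comma- or newline-separated filter string into compiled patterns."""
    patterns = []
    for raw in re.split(r"[,\n]", filter_text):
        # Surrounding whitespace is ignored; / and \ are treated as equivalent.
        pattern = raw.strip().replace("\\", "/")
        if not pattern:
            continue
        parts = []
        for ch in pattern:
            if ch == "*":
                parts.append(".*")  # any number of characters
            elif ch == "?":
                parts.append(".")  # exactly one character
            else:
                parts.append(re.escape(ch))
        patterns.append(re.compile("".join(parts)))
    return patterns


def matches(filter_text, path):
    """Return True if any pattern (logical OR) matches the whole path."""
    normalized = path.replace("\\", "/")
    return any(p.fullmatch(normalized) for p in compile_filter(filter_text))


path = r"\\dlp.symantec.com\scan_dir\lincoln-LyceumAddress.txt"
print(matches(r"*\scan_dir\l*.txt", path))
print(matches("l*.txt", path))
```

Under the whole-path assumption, the first call matches and the second does not, which is consistent with the workaround described in the last row of Table 27-20.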

Filtering documents by file size


Filters let you specify documents to include or exclude from indexing. The types of filters include
File Name Include Filters, File Name Exclude Filters, and File Size Filters. You use file size
filters to exclude files from the matching process based on their size. Any files that match the
size filters are ignored.
See “Filtering documents by file name” on page 640.
In the Size Filters fields, specify any restrictions on the size of files the system should index.
In general you should use only one type of file size filter.
See “Filter documents from indexing to reduce false positives” on page 652.
Table 27-21 describes the file size filter options.

Table 27-21 File size filter configuration options

Filter Description

Ignore Files Smaller Than To exclude files smaller than a particular size:

■ Enter a number in the field for Ignore Files Smaller Than.
■ Select the appropriate unit of measure (Bytes, KB, or MB) from the drop-down list.

For example, to prevent indexing of files smaller than one kilobyte (1 KB), enter 1 in
the field and select KB from the corresponding drop-down list.

Ignore Files Larger Than To exclude files larger than a particular size:
■ Enter a number in the field for Ignore Files Larger Than.
■ Select the appropriate unit of measure (Bytes, KB, or MB) from the drop-down list.

For example, to prevent indexing of files larger than two megabytes (2 MB), enter 2
in the field and select MB from the corresponding drop-down list.
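
The combined effect of the two size filters in Table 27-21 can be pictured as follows. This Python sketch is illustrative only: the function name is hypothetical, KB and MB are assumed to be binary units (1,024 and 1,048,576 bytes), and a file whose size equals a threshold is assumed to be indexed.

```python
UNIT_BYTES = {"Bytes": 1, "KB": 1024, "MB": 1024 * 1024}  # assumption: binary units


def is_indexed(size_bytes, smaller_than=None, larger_than=None):
    """Apply the size filters; each filter is a (number, unit) tuple or None."""
    if smaller_than is not None:
        number, unit = smaller_than
        if size_bytes < number * UNIT_BYTES[unit]:
            return False  # ignored: smaller than the lower bound
    if larger_than is not None:
        number, unit = larger_than
        if size_bytes > number * UNIT_BYTES[unit]:
            return False  # ignored: larger than the upper bound
    return True


print(is_indexed(512, smaller_than=(1, "KB")))
print(is_indexed(3 * 1024 * 1024, larger_than=(2, "MB")))
```

Both calls report that the file is ignored, matching the two examples in the table.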

Scheduling document profile indexing


When you configure a document profile, select Submit Indexing Job on Save to index the
document profile as soon as you save it. Alternatively, you can set up a schedule for indexing
the document source.
To schedule document indexing, select Submit Indexing Job on Schedule and select a
schedule from the drop-down list as described in Table 27-22.

Note: The Enforce Server can index only one document profile at a time. If one indexing
process is scheduled to start while another indexing process is running, the new process does
not begin until the first process completes.

Table 27-22 Options for scheduling Document Profile indexing

Parameter Description

Index Once On – Enter the date to index the document profile in the format MM/DD/YY. You can also click
the date widget and select a date.

At – Select the hour to start indexing.

Index Daily At – Select the hour to start indexing.

Until – Select this check box to specify a date in the format MM/DD/YY when the indexing
should stop. You can also click the date widget and select a date.

Index Weekly Day of the week – Select the day(s) to index the document.

At – Select the hour to start indexing.

Until – Select this check box to specify a date in the format MM/DD/YY when the indexing
should stop. You can also click the date widget and select a date.

Index Monthly Day – Enter the number of the day of each month you want the indexing to occur. The number
must be 1 through 28.

At – Select the hour to start indexing.

Until – Select this check box to specify a date in the format MM/DD/YY when the indexing
should stop. You can also click the date widget and select a date.

Changing the default indexer properties


The server index contains the MD5 fingerprint of each file that has been indexed, either raw
binary or exact extracted content if the contents of the file can be extracted, and hashes of
discrete passages of content.
See “Using IDM to detect exact and partial file contents” on page 621.
The size of the passages depends on the low_threshold_k setting in the indexer properties
file (\Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\indexer.properties).
Generally, there is no need to change the default settings. When you lower the default minimum,
the Enforce Server creates hashes out of smaller sections of the documents it indexes.
The default settings apply to the Whitelisted.txt file as well. If the amount of content you
need to whitelist is less than the minimum amount required for partial matching, you can adjust
the default minimum setting.
To change the default minimum for whitelisted text
1 On the Symantec Data Loss Prevention host, navigate to directory \Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config on
Windows, or
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config on Linux.

2 Use a text editor to open the file Indexer.properties.


3 Locate the parameter low_threshold_k:

low_threshold_k=50

4 Change the numerical portion of the parameter value to reflect the desired minimum
number of characters that are allowed in Whitelisted.txt.
For example, to change the minimum to 30 characters, modify the value to look like the
following:

low_threshold_k=30

The value for this parameter must match the min_normalized_size value. The default
for min_normalized_size is 50.
5 Save the file.
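
If you prefer to script the edit in steps 3 through 5, the change amounts to rewriting one key=value line. The following Python sketch operates on the file contents as text. The function names are hypothetical, and whether min_normalized_size is stored in the same properties file is an assumption; the sketch updates that key only if it finds it. Back up the file before changing it.

```python
import re


def _line_pattern(key):
    # Matches "key = value" at the start of a line, without consuming the line ending.
    return re.compile(rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*)([^\r\n]*)", re.MULTILINE)


def get_property(text, key):
    """Return the value of key, or None if the key is not present."""
    match = _line_pattern(key).search(text)
    return match.group(2).strip() if match else None


def set_property(text, key, value):
    """Replace the value of key; fail unless the key occurs exactly once."""
    new_text, count = _line_pattern(key).subn(lambda m: m.group(1) + str(value), text)
    if count != 1:
        raise ValueError(f"expected exactly one {key} entry, found {count}")
    return new_text


def set_whitelist_minimum(text, minimum):
    """Set low_threshold_k and keep min_normalized_size in step if it is present."""
    text = set_property(text, "low_threshold_k", minimum)
    if get_property(text, "min_normalized_size") is not None:
        text = set_property(text, "min_normalized_size", minimum)
    return text


print(set_whitelist_minimum("low_threshold_k=50\n", 30))
```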
For more information on IDM configuration and customization, see the article "Understanding
IDM configuration and customization" at http://www.support.symantec.com/doc/TECH234899
at the Symantec Support Center.

Enabling Agent IDM


You enable exact and partial match IDM on the Windows endpoint by setting the advanced
agent configuration parameter Detection.TWO_TIER_IDM_ENABLED.str to OFF. Once two-tier
detection is OFF, the DLP Agent performs exact file matching and exact and partial file
contents matching, assuming you have generated the endpoint index.

Note: Two-tier deployment is not supported on the Mac Agent.

See “Creating and modifying Indexed Document Profiles” on page 629.


For new installations, exact and partial match IDM on the endpoint is the default setting for
the default endpoint agent configuration (TWO_TIER_IDM_ENABLED = OFF); you do not
need to enable it.
For upgraded systems, exact and partial match IDM on the endpoint is disabled
(TWO_TIER_IDM_ENABLED = ON) so that there is no change in functionality for existing IDM
policies deployed to the endpoint. If you want to use exact match IDM on the endpoint after
upgrade, you need to turn off two-tier detection and reindex each document data source.
See “To turn two-tier detection on or off” on page 645.
To turn two-tier detection on or off
1 Log on to the Enforce Server administration console.
2 Navigate to System > Agents > Agent Configuration.
3 Select the applicable agent configuration.
4 Select the Advanced Agent Settings tab.
5 Locate the Detection.TWO_TIER_IDM_ENABLED.str parameter.

6 Change the value to either "ON" or "OFF" (case insensitive) depending on your
requirements.
See Table 27-23 on page 646.
7 Click Save at the top of the page to save the changes.
8 Apply the agent configuration to the agent group or groups.
See “Applying agent configurations to an agent group” on page 2412.

Table 27-23 Advanced agent settings for exact match IDM on the endpoint

Advanced Agent Setting parameter: Detection.TWO_TIER_IDM_ENABLED.str

Value: OFF
Default: New installation or system upgrade from 12.5 or later.
Detection engine: DLP Agent
Matching type: Exact file; Partial file contents

Value: ON
Default: System upgrade from 12.0.x
Detection engine: Endpoint Server
Matching type: Exact file; Exact file contents; Partial file contents

Estimating endpoint memory use for agent IDM


For partial matching, DLP requires about 2 KB of RAM per file, or about 60 MB for 30,000 files
for the agent. For exact matching only, DLP requires about 40 bytes per file.
See “About the server index files and the agent index files” on page 618.
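
The estimate above is simple arithmetic. The following Python sketch reproduces it using the per-file figures from this section. The function name is hypothetical, and decimal units (1 KB = 1,000 bytes, 1 MB = 1,000,000 bytes) are assumed so that the result lines up with the 60 MB figure.

```python
PARTIAL_BYTES_PER_FILE = 2 * 1000  # "about 2 KB" per file, decimal units assumed
EXACT_BYTES_PER_FILE = 40  # "about 40 bytes" per file


def estimate_agent_memory_mb(file_count, partial_matching=True):
    """Approximate agent RAM in MB for an endpoint index of file_count files."""
    per_file = PARTIAL_BYTES_PER_FILE if partial_matching else EXACT_BYTES_PER_FILE
    return file_count * per_file / 1_000_000


print(estimate_agent_memory_mb(30_000))  # partial matching
print(estimate_agent_memory_mb(30_000, partial_matching=False))  # exact matching only
```

For 30,000 files this gives about 60 MB with partial matching and about 1.2 MB for exact matching only.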

Configuring the Content Matches Document Signature policy condition


The Content Matches Document Signature From condition matches unstructured document
content based on the Indexed Document Profile. The Content Matches Document Signature From
condition is available for detection rules and exceptions.
See “About using the Content Matches Document Signature policy condition” on page 623.

To configure the Content Matches Document Signature condition


1 Add an IDM condition to a policy rule or exception, or modify an existing one.
See “Configuring policies” on page 413.
See “Configuring policy rules” on page 417.
See “Configuring policy exceptions” on page 426.
2 Configure the IDM condition parameters.
See Table 27-24 on page 647.
3 Save the policy configuration.

Table 27-24 Content Matches Document Signature condition parameters

Action Description

Set the Minimum Document Exposure.
Select an option from the drop-down list.
Choose Exact to match document contents exactly.
Choose a percentage between 10% and 90% to match document contents partially.

Configure Match Counting.
Select how you want to count matches:
■ Check for existence
Reports a match count of 1 if there are one or more condition matches.
■ Count all matches
Reports a match count of the exact number of matches.

See “Configuring match counting” on page 421.

Select the components to Match On.
Select one of the available message components to match on:
■ Body – The content of the message.
■ Attachments – Any files that are attached to or transferred by the message.

See “Selecting components to match on” on page 423.

Configure additional conditions to Also Match.
Select this option to create a compound condition. All conditions must be met to trigger or
except a match.

You can Add any available condition from the drop-down menu.

Test and tune the policy. See “Test and tune policies to improve match accuracy” on page 453.

See “Use parallel IDM rules to tune match thresholds” on page 654.

See “Troubleshooting policies” on page 445.
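
The two Match Counting options in Table 27-24 differ only in the count that is reported. A minimal Python illustration follows; the function name and the mode strings are hypothetical and do not correspond to product settings.

```python
def reported_match_count(condition_matches, mode):
    """Return the match count reported for a given number of condition matches."""
    if mode == "existence":  # Check for existence
        return 1 if condition_matches >= 1 else 0
    if mode == "all":  # Count all matches
        return condition_matches
    raise ValueError(f"unknown mode: {mode}")


print(reported_match_count(7, "existence"))
print(reported_match_count(7, "all"))
```

With seven condition matches, the first option reports 1 and the second reports 7.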



Best practices for using IDM


Indexed Document Matching (IDM) is designed to protect document content and images. IDM
relies on an index of fingerprinted documents to perform partial and derivative text-based
content matching. You can also use IDM to match indexed documents exactly
based on their binary stamp, including not only text-based documents but also graphics and
media files.
Because of the broad range of matching supported by IDM, you should consider the best
practices in this section to implement IDM policies that accurately match the data you want to
protect.
Table 27-25 summarizes the IDM considerations discussed in this section, with links to individual
topics for each.

Table 27-25 IDM policy best practices

Consideration Description

Reindex IDM profiles after upgrade. See “Reindex IDM profiles after upgrade” on page 649.

Do not compress documents whose content you want to See “Do not compress files in the document source”
fingerprint. on page 649.

Prefer partial matching over exact matching on the DLP See “Prefer partial matching over exact matching on the
Agent. DLP Agent” on page 650.

Do not index text-based documents without content. See “Do not index empty documents” on page 649.

Be aware of the limitations of exact matching. See “Understand limitations of exact matching” on page 650.

Use white listing to exclude partial file contents from See “Use white listing to exclude non-sensitive content
matching and reduce false positives. from partial matching” on page 651.

Filter non-critical documents from indexing to reduce false See “Filter documents from indexing to reduce false
positives. positives” on page 652.

Change the index max size to index more than 1,000,000 See “Create separate profiles to index large document
documents. sources” on page 653.

Use remote indexing for large document sets. See “Remote IDM indexing” on page 655.

Use scheduled indexing to automate profile updates. See “Use scheduled indexing to keep profiles up to date”
on page 653.

Use multiple IDM rules in parallel to establish and tune See “Use parallel IDM rules to tune match thresholds”
match thresholds. on page 654.

Reindex IDM profiles after upgrade


You must update each Indexed Document Matching profile by reindexing each associated
data source after performing an upgrade of Symantec Data Loss Prevention.
If you have upgraded Symantec Data Loss Prevention and you want to use partial-match IDM
on the endpoint for existing IDM policies, you must reindex the data source for each Indexed
Document Profile so that each endpoint index is generated and deployed to DLP Agents.
See “Enabling Agent IDM” on page 645.

Do not compress files in the document source


For file formats whose content can be extracted, the server indexing process opens the
document, extracts the text-based content, and fingerprints the data in full and in part (sections).
However, the indexing process cannot recursively inspect document archives that are contained
in the document set. If a document whose file contents you want to index is compressed in an
archive file (such as ZIP, RAR, or TAR) within the document data source, the system cannot
extract the contents from the file and index its content. In this case, the system only takes a
cryptographic hash of the binary file signature. The embedded file is considered for exact file
matches only, like image files and other unsupported file formats.
This behavior is specific to the design-time indexing process only. At run-time the detection
server does recursively inspect document archives and extract the text of files contained in
those archives. But, to be able to evaluate such content, the IDM index must have been able
to index all content files.
The best practice is not to include any files whose content you want to index in a document
archive. The lone exception is the document archive ZIP file that you upload or copy to the
Enforce Server that contains the entire document set. All files in that container file must be
uncompressed. If the Document Archive uploaded to the Enforce Server for indexing contains
one or more embedded archive files (such as a ZIP), the system performs an exact binary
match on any file contained in the embedded archive file.
See “Creating and modifying Indexed Document Profiles” on page 629.
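
Before you build the document archive, it can help to scan the document source for embedded archives that would only be fingerprinted as binary files. The following is a minimal sketch of such a check, not part of the product. The function name and the extension list are assumptions; the list covers only the ZIP, RAR, and TAR examples named above.

```python
import os

# Extensions treated as archives in this sketch (assumption: extend as needed).
ARCHIVE_EXTENSIONS = {".zip", ".rar", ".tar"}


def find_embedded_archives(source_dir):
    """Return sorted paths, relative to source_dir, of archive files found under it."""
    hits = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            extension = os.path.splitext(name)[1].lower()
            if extension in ARCHIVE_EXTENSIONS:
                full_path = os.path.join(root, name)
                hits.append(os.path.relpath(full_path, source_dir))
    return sorted(hits)
```

Any path the function returns is a candidate for extraction into plain files before you index the document source.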

Do not index empty documents


You should be careful about the documents you index. In particular, avoid indexing blank or
empty documents.
For example, if you index a PPTX file that contains only photographs or other graphical content
but no textual content, the index matches other blank PPTX files exactly and produces false
positives. In this case, even though the PPTX file contains no user-entered text, the file does
contain header and footer placeholder text that the system extracts as file contents. Because
the amount of text extracted and normalized is more than 50 non-whitespace characters, the
system treats the file as not binary and creates a cryptographic hash of all of the file contents.
As a result, all other blank PPTX files produce exact file contents matches because the resulting
MD5 of the extracted content is the same.

Note: This behavior has not been observed with XLSX files; that is, false positives do not get
created if the blank files are different.

See “Using IDM to detect exact and partial file contents” on page 621.
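
The effect described above can be illustrated with a short sketch that groups documents whose extracted text produces the same hash. This is an illustration only, assuming you have already extracted the text with a tool of your choice. The normalization shown (lowercasing and removing whitespace) is a hypothetical stand-in for the product's own normalization, and the function names are not product APIs.

```python
import hashlib
from collections import defaultdict

MIN_TEXT_CHARS = 50  # threshold of non-whitespace characters cited above


def normalize(text):
    """Hypothetical normalization: lowercase the text and drop all whitespace."""
    return "".join(text.lower().split())


def content_fingerprint(text):
    """Return the MD5 hex digest of the normalized text.

    Returns None when the text has 50 or fewer non-whitespace characters,
    because such a file would not be treated as text-based.
    """
    normalized = normalize(text)
    if len(normalized) <= MIN_TEXT_CHARS:
        return None
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def find_identical_content(extracted):
    """Given a dict of name -> extracted text, return groups of names that share a fingerprint."""
    groups = defaultdict(list)
    for name, text in extracted.items():
        fingerprint = content_fingerprint(text)
        if fingerprint is not None:
            groups[fingerprint].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

Documents that appear in the same group contain only the same placeholder or boilerplate text and are candidates for exclusion from the document source.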

Prefer partial matching over exact matching on the DLP Agent


If you are deploying IDM policies to the endpoint, partial match IDM is recommended. The main
advantage of partial match IDM on the endpoint is that matching is fast because it is done
locally by the agent instead of remotely by the server. In addition, partial match IDM lets you
use response rules directly on the endpoint.
See “Types of IDM detection” on page 614.

Understand limitations of exact matching


Exact match means just that: inbound data must match the MD5 fingerprint of either a binary
file signature or an exact match of extracted and normalized file contents.
See “Supported forms of matching for IDM” on page 613.
Consider the following when implementing server exact match IDM:
■ White listing only applies to partial file contents matching.
■ For binary files and text-based files coming into the detection engine for exact file matching,
as an optimization the system checks the byte size of the file before computing the run-time
MD5 for comparison against the index. If the file byte sizes do not match there is no
comparison of the cryptographic hashes.
■ File type is never checked for exact file or exact file contents matching.
■ Some file formats change the byte size of a file if the file is opened by the native application
and then saved without changes, resulting in the file not matching exactly. For example, if
you open a file such as a JPEG image with Windows Picture and Fax Viewer and save the
file without making changes, the binary size of the file is nonetheless changed, resulting
in no exact match.
■ For some applications the Windows Print operation may alter the file data so that the
extracted file contents do not match exactly. Known file types that are affected by this
include Microsoft Office documents.
Table 1 lists some known limitations with exact content matching. This list is not exhaustive
and there may be other file formats that change on resave.

Table 1 Limitations of exact file content matching

File type   Application                       Result on resave

dwg         AutoCAD 2012                      Does not match
jpeg        Windows Picture and Fax Viewer    Does not match
doc         Microsoft Office Word 2007        Does not match
xls         Microsoft Excel 2007              Does not match
ppt         Microsoft PowerPoint 2007         Does not match
pdf         Adobe Acrobat 9 Pro               Does not match
docx        Microsoft Office Word 2007        Match
xlsx        Microsoft Excel 2007              Match
pptx        Microsoft PowerPoint 2007         Match
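
The byte-size optimization described in the considerations above can be sketched as follows. This is an illustration of the idea only, not the detection engine's implementation; the function names and the index layout (byte size mapped to a set of MD5 digests) are assumptions.

```python
import hashlib


def build_exact_index(files):
    """Map byte size -> set of MD5 hex digests for the indexed files (bytes objects)."""
    index = {}
    for data in files:
        digest = hashlib.md5(data).hexdigest()
        index.setdefault(len(data), set()).add(digest)
    return index


def exact_match(index, data):
    """Return True if data matches an indexed file exactly."""
    candidates = index.get(len(data))
    if not candidates:
        # No indexed file has this byte size, so the hash is never computed.
        return False
    return hashlib.md5(data).hexdigest() in candidates
```

The sketch also shows why a resave that changes even one byte defeats exact matching: either the size check fails or the digest differs.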

Use white listing to exclude non-sensitive content from partial matching

White listing is designed to let you exclude partial file contents from matching. You use white
listing to exclude headers, footers, and boilerplate content from partial matching and reduce
false positives. Information contained in document headers and footers is likely to cause false
positives. Likewise, boilerplate text, such as standard language and non-proprietary corporate
content that is often repeated across confidential documents, can cause false positives.
Ideally, you should remove headers and footers from documents before you index them.
However, this may not be feasible, especially if you have a large document set. As a best
practice, you should whitelist header, footer, and boilerplate content so that this text is excluded
when the server index is generated. If you use white listing, generally you can lower the
Minimum Document Exposure setting in the policy without increasing false positives because
more of the content indexed is confidential data, instead of common, repeated content.

Note: White listing does not apply to exact file or exact file contents matching.

See “About white listing partial file contents” on page 624.


See “White listing file contents to exclude from partial matching” on page 627.

Filter documents from indexing to reduce false positives


When you configure an Indexed Document Profile, you have the option of using filters to include
or exclude documents in your data source for indexing. There are two types of filters: file name
and file size.
See “Creating and modifying Indexed Document Profiles” on page 629.
You use filtering to filter non-critical documents from indexing and ensure that your index is
protecting only confidential files and file contents. Filtering helps reduce false positives and
decrease the size of the IDM index.
See “Do not index empty documents” on page 649.
The best practice is to use either an exclusion filter or an inclusion filter for each filter type, but
not both. For example, you may not need to index all of the files you include in a document
archive or expose to the system by file share. In this case, you can enumerate the files you
want to include (inclusion filter) or list the file types you want to exclude from indexing (exclusion
filter), but you should not use both. You can also use file size filters to set a threshold for the
file size to include or exclude in the index.
See “Filtering documents by file name” on page 640.
See “Filtering documents by file size” on page 642.
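
To preview which files a set of filters would leave in the index, you can model the filters outside the product. The following sketch assumes wildcard-style file name patterns and a maximum file size in bytes. The function and parameter names are hypothetical, and the product's own filter syntax may differ.

```python
from fnmatch import fnmatch


def select_for_indexing(files, include=None, exclude=None, max_bytes=None):
    """Return the names of files that pass the filters.

    files is an iterable of (name, size_in_bytes) pairs.
    Supply include patterns or exclude patterns, but not both.
    """
    if include and exclude:
        raise ValueError("use an include filter or an exclude filter, not both")
    selected = []
    for name, size in files:
        if include and not any(fnmatch(name, pattern) for pattern in include):
            continue
        if exclude and any(fnmatch(name, pattern) for pattern in exclude):
            continue
        if max_bytes is not None and size > max_bytes:
            continue
        selected.append(name)
    return selected
```

Raising an error when both filter types are supplied mirrors the best practice above of using one filter type or the other.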

Distinguish IDM exceptions from white listing and filtering


White listing lets you exclude partial file contents from matching. Filtering lets you exclude
specific documents from the indexing process. IDM exceptions, on the other hand, let you
except indexed files from exact matching at run-time.
You use the IDM condition as a policy exception to exclude files from detection. To be excepted
from matching, an inbound file must be an exact match with a file in the IDM index. You cannot
use IDM exceptions to exclude content from matching. To exclude content, you must whitelist
it.

Note: White listing is not available for exact file or file contents matching; it is only available
for partial content matching.

Table 27-27 White listing, filters, and exceptions distinguished

IDM Configuration   Use

Exception           Except exact file from matching.
                    As an example, the CAN-SPAM Act policy template uses an IDM exception.

White listing       Except file contents from matching.
                    See “Use white listing to exclude non-sensitive content from partial matching” on page 651.

Filtering           Include or exclude files from being indexed.
                    See “Filter documents from indexing to reduce false positives” on page 652.

Create separate profiles to index large document sources


IDM detection is based on an Indexed Document Profile. The maximum single IDM profile size
in RAM is 2 GB. This maximum size limit is based on the overall number of the documents
being indexed. Depending on the size of the actual source files and their extracted text size,
this translates into approximately 1,000,000 files. You can change the 2 GB maximum size of
a single IDM profile index in the indexer.properties file using
com.vontu.profiles.documents.maxIndexSize.

See “About the document data source” on page 616.


If you need to index more than 1,000,000 files, the best practice is to organize the documents
into separate ZIP files or share directories. You should create a separate Indexed Document
Profile for each individual document set. Then, you can define separate rules that reference
each index and add the rules to one or more policies.
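
One way to plan separate document sets is to split the list of source files into batches that stay under the approximate per-profile limit. The following is a minimal sketch; the function name is hypothetical, and the default of 1,000,000 files is only the approximate figure given above, because the real limit depends on the extracted text size.

```python
def split_into_profiles(paths, max_files=1_000_000):
    """Yield successive lists of at most max_files paths, one list per profile."""
    if max_files <= 0:
        raise ValueError("max_files must be positive")
    batch = []
    for path in paths:
        batch.append(path)
        if len(batch) == max_files:
            yield batch
            batch = []
    if batch:
        yield batch
```

Each yielded batch corresponds to one ZIP file or share directory, and therefore to one Indexed Document Profile.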

Use WebDAV or CIFS to index remote document data sources


For smaller document sets (50 MB or less), you can upload the files to the Enforce Server.
For larger document sets, consider using FTP Secure to upload the files to the Enforce Server.
Alternatively, you can remotely index documents that are stored on a file share that supports
the CIFS protocol, or on a web server that supports the WebDAV protocol, such as Microsoft
SharePoint or OpenText Livelink.
See “About indexing remote documents” on page 617.

Use scheduled indexing to keep profiles up to date


You can use index scheduling to keep your IDM profiles up to date. The initial index scans all
the documents to be indexed. Any subsequent index only scans the differences between the
two. You should schedule indexing outside of normal business hours to reduce any potential
effect on the system.
See “Scheduling document profile indexing” on page 643.

Before you set up an indexing schedule, consider the following recommendations:


■ If you update your document sources occasionally (for example, less than once a month),
there is no need to create a schedule. Index the document each time you update it.
■ Schedule indexing for times of minimal system use. Indexing affects performance throughout
the Symantec Data Loss Prevention system, and large documents can take time to index.
■ Index a document as soon as you add or modify the corresponding document profile, and
re-index the document whenever you update it. For example, consider a situation where
every Wednesday at 2:00 A.M. you update a document. In this case scheduling the index
process to run every Wednesday at 3:00 A.M. is optimal. Scheduling document indexing
daily is not recommended because that is too frequent and can degrade server performance.
■ Monitor results and modify your indexing schedule accordingly. If performance is good and
you want more timely updates, schedule more frequent document updates and indexing.
■ Symantec Data Loss Prevention performs incremental indexing. When a previously indexed
share or directory is indexed again, only the files that have changed or been added are
indexed. Any files that are no longer in the archive are deleted during this indexing. So a
reindexing operation can run significantly faster than the initial indexing operation.

Use parallel IDM rules to tune match thresholds


The primary use case for IDM policies is to detect unstructured document content based on
a percentage match requirement called the Minimum Document Exposure. This value is a
configurable parameter that specifies the minimum percentage of content in the message that
must match the IDM index to produce a match. The IDM policy default is “Exact,” which means
that, for text-based documents, all of the content of the message must match the fingerprint
to create an incident. A Minimum Document Exposure setting of 10% means that, on average,
one page of a 10 page document must match the IDM index to create an incident.
A document might contain much more content, but Symantec Data Loss Prevention protects
only the content that is indexed as part of a document profile. For example, consider a situation
where you index a one-page document, and that one-page document is included as part of a
100-page document. The 100-page document is considered an exact match because its
content matches the one-page document exactly. In addition, the matched document does
not have to be of the same file type or format as the indexed document. For example, if you
index a Word document as part of a document profile, and its contents are pasted into the
body of an email message or used to create a PDF, the engine considers it a match.
A rule-of-thumb for setting the Minimum Document Exposure setting is 60%. Minimum Document
Exposures set to less than 50% typically create many false positives. Starting with a rate of 60%
should give you enough information to determine whether you should go to a higher or lower
match percentage without creating excessive false positives.
As an alternative, consider taking a tiered approach to establishing Minimum Document
Exposure settings. For example, you can create multiple IDM rules, each with a different
threshold percentage, such as 80% for documents with a high match percentage, 50% for
documents with a medium match percentage, and 10% for documents with a low match
percentage. Using this approach helps you filter out false positives and establish an accurate
Minimum Document Exposure setting for each IDM index you deploy as part of your policies.
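
The tiered approach can be modeled with a short sketch that maps a match percentage to the highest-threshold rule it satisfies. The tier labels, the function names, and the use of pages as the unit of measure are assumptions for illustration; the thresholds are the example values above.

```python
# Example thresholds from the text, ordered from highest to lowest.
TIERS = [(80, "high"), (50, "medium"), (10, "low")]


def exposure_percent(matched_units, total_units):
    """Return the percentage of content that matched, for example pages matched of pages total."""
    return 100.0 * matched_units / total_units


def matching_tier(percent):
    """Return the label of the highest tier whose threshold the percentage meets, or None."""
    for threshold, label in TIERS:
        if percent >= threshold:
            return label
    return None
```

For instance, one matching page in a 10 page document is a 10% exposure, which meets only the lowest tier. Reviewing which tier produces true incidents helps you settle on a single threshold for the index.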

Remote IDM indexing


This section provides instructions and reference content for using the Remote IDM Indexer.

About the Remote IDM Indexer


The Remote IDM Indexer is a standalone tool. With it you can index your confidential documents
and files locally on the systems where these files are stored. The Remote IDM Indexer frees
you from having to collect and copy all the files to the Enforce Server host for indexing.
The Remote IDM Indexer generates a preindex file (*.prdx) that is encrypted and password
protected. You upload the preindex file to the Enforce Server host for final index generation
and deployment.
The Remote IDM Indexer is supported on Windows and Linux platforms. The tool is configured
using a command line interface (CLI) or a properties file. On Windows, you can use the graphical
user interface (GUI) edition of the tool to configure it.
You can integrate the tool with external systems to schedule indexing. In addition, you can
incrementally index a data source by specifying an existing *.prdx file when you run the tool.

Table 27-28 Remote IDM Indexer features

Feature                         Description

Familiar installation           DLP installers for Windows and Linux

Various configuration options   Properties file (default)
                                Command-line interface (CLI)
                                Graphical user interface (GUI) (Windows)

Secure preindex file            Password protected
                                Encrypted data contents

Incremental indexing            Ability to load an existing preindex and scan only new or updated files.

Scheduled indexing              Windows Task Scheduler
                                Linux cron job

Secure upload to Enforce        UI for uploading the preindex to the Enforce Server
                                User must provide password to complete the indexing process.

Installing the Remote IDM Indexer


You install the Remote IDM Indexer on one or more systems where the confidential files you
want to index are stored. The process for installing a remote indexer is the same for EMDI,
EDM, and IDM.
See “About installing remote indexers” on page 589.
You can install the Remote IDM Indexer on all supported Windows and Linux platforms. See
the Symantec Data Loss Prevention System Requirements Guide for platform details.

Indexing the document data source using the GUI edition (Windows only)
To configure the GUI edition of the Remote IDM Indexer, you enter the parameters into the
required fields. Optionally you can provide additional parameters, such as a whitelist file for
filters.
On successful completion of indexing, the preindex file (*.prdx) is generated. You move this
file to the Enforce Server to complete the indexing process.
Figure 27-1 shows the GUI edition of the Remote IDM Indexer.
Table 27-29 provides instructions for configuring the GUI edition of the Remote IDM Indexer.

Figure 27-1 Remote IDM Indexer GUI edition



Table 27-29 Configuring the Remote IDM Indexer using the GUI edition

Step   Parameters   Description

1   Enter the Source URI path.
    The source URI is the local file path (directory folder) where the files to be indexed are
    stored. It can also be a shared file system path accessible by the host.
    The files to be indexed should not be encapsulated.
    If the document data source requires credentials, you provide them in the URI Credentials
    section.

2   Enter the Output File name.
    Specify the file path and name for the preindex file that the tool generates.
    Include the *.prdx file extension when you specify the output file name.

3   Optionally, enter the Whitelist File path.
    Specify the file path to the whitelist.txt file.
    Text in the whitelist file is ignored during detection for server-based partial matching.

4   Optionally, enter one or more File Name Filters.
    Enter one or more file names to include for indexing or to exclude for indexing.
    The File Name Include Filter includes the named files for indexing.
    The File Name Exclude Filter excludes the named files from indexing.
    The format for the include and exclude filters accepts both comma-separated and
    newline-separated values.
    If you use a filter, use one type but not both. For example, if you choose to use a file
    name include filter, do not also provide a file name exclude filter.

5   Optionally, enter a File Size Filter.

6   Optionally, click Always keep files.
    Click Always keep files in the following cases:
    ■ When you want to incrementally add multiple data sources to the same pre-index file.
    ■ If you have a folder with content that gets moved and want to keep the old content in
    the pre-index file.

7   Click Run to index the data source immediately.
    Click Run to start the indexing process.
    Alternatively, you can click Schedule to schedule indexing. The tool opens the Windows
    Task Scheduler utility.
    See “Scheduling remote indexing with the Remote IDM Indexer app for Windows” on page 659.

8   Enter the Password for the pre-index file.
    For security purposes you must provide a password for the pre-index file.
    The password must meet one of the following requirements:
    ■ ASCII password: a minimum of 10 characters, with at least one upper case letter, one
    lower case letter, and one number.
    ■ Non-ASCII password: a minimum of 10 characters, including at least one number.
    The preindex file is encrypted with the password you provide.
    The password you enter here is required to load the preindex into the Enforce Server for
    indexing.

9   Verify indexing progress.
    When you click Run, the status bar shows the scanning completion percentage.
    In addition, the Progress section of the interface provides the following information:
    Current Stage: States are Running, Completed, or Error.
    Progress: The total number of files indexed.
    Current File: The name of the file that is indexed.
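
If you generate pre-index passwords from a script, you can check them against the requirements in step 8 before you run the tool. The following is a minimal sketch of those two rules as stated in the table. The function name is hypothetical, and the tool itself may apply additional checks.

```python
def is_valid_preindex_password(password):
    """Check a password against the ASCII and non-ASCII rules listed in step 8."""
    if len(password) < 10:
        return False
    has_number = any(ch.isdigit() for ch in password)
    if password.isascii():
        # ASCII rule: at least one upper case letter, one lower case letter, and one number.
        has_upper = any(ch.isupper() for ch in password)
        has_lower = any(ch.islower() for ch in password)
        return has_number and has_upper and has_lower
    # Non-ASCII rule: at least one number.
    return has_number
```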

Scheduling remote indexing with the Remote IDM Indexer app for Windows
If you use the Windows GUI version of the Remote IDM Indexer, you can schedule or edit a
task directly from the tool. The following screen shots illustrate the process.
See “To schedule indexing using the Windows GUI version” on page 659.
See “To edit an existing scheduled task using the Windows GUI” on page 661.
To schedule indexing using the Windows GUI version
1 Click Schedule to open the dialog. See “Scheduling remote indexing with the Remote
IDM Indexer app for Windows” on page 659.
2 Click Create to create a new scheduled task. Or, if you already have a task created, click
Edit.
You are prompted to provide a UTF8-encoded password file in cleartext for the scheduled
job. Access to this file should be limited to the appropriate user, such as your Protect user.
Click Create and provide the credentials to the Windows host.

3 Enter the user name and password for the Windows host where the Task Scheduler is
installed.
When you enter the appropriate credentials (generally administrator privileges are required),
the Remote IDM Indexer creates a new task in the Windows Task Scheduler. The tool
displays a dialog indicating that the task was successfully created and provides you with
the name of the task. See Figure 27-3 on page 660.
4 Click OK to close the dialog.
After you complete this operation, the Windows Task Scheduler interface appears.
5 Select the SymantecDLP folder in the Task Scheduler Library.
Notice to the right that there is a task created named "Remote IDM Indexer <time-stamp>".
See Figure 27-4 on page 661.
6 Double-click the created task.
This action brings up the Windows Task Scheduler properties dialog for this task. Using
this dialog you can schedule when the Remote IDM Indexer should run. Refer to the Task
Scheduler help for details on using the Windows Task Scheduler.

Figure 27-2 Scheduling indexing dialog

Figure 27-3 Successfully scheduled task dialog



Figure 27-4 Symantec DLP scheduled task

To edit an existing scheduled task using the Windows GUI


1 Click Schedule to open the dialog. See Figure 27-2 on page 660.
2 Click Edit/Delete Existing Tasks to open the Windows Task Scheduler utility. Here
you can edit or delete an existing scheduled task.

Figure 27-5 Windows Task Scheduler properties configuration

See “Incremental indexing” on page 661.

Incremental indexing
You can incrementally index a remote data source by specifying an existing preindex file
(*.prdx) in the command line argument when you run the tool.

In the GUI version of the tool you can browse to and select an existing *.prdx file for the
Output File path.
The indexing process appends newly indexed files and file contents to the existing preindex
entries.
The tool compares the last modified date of the file. If the file has been modified after the file
that was preindexed, the tool updates the preindex with the changes that were made to the
file. If the date the file was modified is the same, the pre-index is not updated. If you change
any include, exclude, or size filters in your existing preindex file, those filters are applied to
any previously indexed files. For example, for a remote data source with ten .docx files and
ten .pptx files, if your first remote indexing job has no filters, all files are indexed. If you add
an exclude filter for .docx files (-exclude_filter=*.docx) and run the indexing job again,
the .docx files are removed from the index and only the .pptx files remain.
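
The incremental behavior described above can be modeled with a short sketch. It is an illustration only, assuming the preindex can be represented as a mapping of file name to last-modified timestamp. The function name is hypothetical and this is not the tool's actual data format or logic.

```python
from fnmatch import fnmatch


def incremental_update(preindex, source, exclude=()):
    """Return (new_preindex, reindexed_names) for one incremental run.

    preindex and source map file name -> last-modified timestamp.
    """
    new_preindex = {}
    reindexed = []
    for name, modified in source.items():
        if any(fnmatch(name, pattern) for pattern in exclude):
            # Filters also apply to files that were indexed by an earlier run.
            continue
        if name not in preindex or modified > preindex[name]:
            reindexed.append(name)
        new_preindex[name] = modified
    return new_preindex, sorted(reindexed)
```

In this model, a second run with an exclude pattern of *.docx drops the previously indexed .docx entries and reindexes only the files whose timestamp is newer than the stored one.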

Always keep files


You can select Always keep files in the Remote IDM Indexer GUI version for Windows or
use keep_all_files=true at the command line for Windows and Linux when you want to
incrementally add multiple data sources to the same preindex file. This option keeps files
that are in the previous preindex but not in the current data source. You can also use
keep_all_files if you have a folder containing content that is moved and you want to keep
the old content in the preindex file.
The previous IDM incremental indexer and the indexer available through the Enforce Server
administration console replace the entire old index with a new one. For example, when
document set A is indexed and then document set B is incrementally indexed for the same
profile, the index of set A is dropped and replaced with the index of set B.
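
The difference between the two behaviors can be shown with a small model. The sketch below assumes a preindex can be represented as a mapping of file name to fingerprint; the function name is hypothetical and the mapping is not the real *.prdx format.

```python
def merge_preindex(previous, current, keep_all_files=False):
    """Return the preindex that results from indexing `current` after `previous`.

    With keep_all_files=False, entries missing from the current data source are dropped.
    With keep_all_files=True, entries from the previous preindex are retained.
    """
    merged = dict(previous) if keep_all_files else {}
    merged.update(current)
    return merged
```

With document set A indexed first and set B indexed second, the default call returns only the entries for set B, while the call with keep_all_files=True returns the entries for both sets.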

Logging and troubleshooting


Remote IDM indexing status messages are logged to the Indexer.log file.
The log file path is C:\ProgramData\Symantec\DataLossPrevention\Indexer\15.5\logs
(Windows) or /var/log/Symantec/DataLossPrevention/Indexer/15.5/ (Linux).
The log presents error messages indicating whether file access was denied or file indexing
failed.
See “Copying the preindex file to the Enforce Server host” on page 662.
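
When the log is large, a simple keyword scan can surface the entries that report access or indexing problems. The following sketch assumes that such entries contain the words "denied" or "failed". That wording is an assumption, not a documented log format, so adjust the keywords to match what you see in your Indexer.log file.

```python
def find_problem_lines(log_lines, keywords=("denied", "failed")):
    """Return (line_number, text) pairs for lines that contain any keyword, ignoring case."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append((number, line.rstrip("\n")))
    return hits
```

You can pass an open file object for Indexer.log as log_lines, because the function only iterates over lines.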

Copying the preindex file to the Enforce Server host


After you have generated the preindex file you must copy it to the Enforce Server host so it
can be loaded for profiling and deployment.

You copy the *.prdx file to the following directory on the Enforce Server host on Windows:
C:\Program Files\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\documentprofiles
or on Linux:
/var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/documentprofiles

You can use FTP or FTP/S to copy the *.prdx file to the Enforce Server host file system.

Note: Make sure that the Enforce user who is reading and loading the .prdx file has permission
to enable copying and loading of the file.

See “Loading the remote index file into the Enforce Server” on page 663.

Loading the remote index file into the Enforce Server


The Enforce Server administration console provides a user interface for uploading remote IDM
preindexes to the Enforce Server.
The Data Loss Prevention administrator or policy author must specify the preindex password
that was entered when the preindex file was initially created.
The system uses the preindex to generate the final index that is deployed to detection servers
and agents (if Agent IDM is enabled).

Note: If you have not copied the preindex file to the proper directory on the Enforce Server
host, the file does not appear in the drop-down field for selection. The proper directory on
Windows is:
C:\Program Files\Symantec\DataLossPrevention\ServerPlatformCommon\15.5\documentprofiles
and on Linux is:
/var/Symantec/DataLossPrevention/ServerPlatformCommon/15.5/documentprofiles

Figure 27-6 Loading the remote index into Enforce


Chapter 28
Detecting content using Vector Machine Learning (VML)
This chapter includes the following topics:

■ Introducing Vector Machine Learning (VML)

■ Configuring VML profiles and policy conditions

■ Best practices for using VML

Introducing Vector Machine Learning (VML)


Vector Machine Learning (VML) performs statistical analysis to protect unstructured data. The
analysis determines if content is similar to example content you train against.
With VML you do not have to locate and fingerprint all of the data you want to protect. You
also do not have to describe it and risk potential inaccuracies. Instead, you train the system
to learn the type of content you want to protect based on example documents you provide.
VML detection is based on a VML profile. You create a VML profile by uploading a
representative amount of content from a specific category of data. The system scans the
content, extracts the features, and creates a statistical model based on the frequency of
keywords in the example documents. At run-time the system applies the model to analyze and
detect the content that has the features that are statistically similar to the profile.
VML simplifies the detection of unstructured, text-based content and offers the potential for
high accuracy. The key to implementing VML is the example content you train the system
against. You must be careful to select the documents that are representative of the type of
content you want to protect. And, you must select good examples of content you want to ignore
that are closely related to the content you want to protect.

See “Configuring VML profiles and policy conditions” on page 668.

About the Vector Machine Learning Profile


The Vector Machine Learning Profile is the data profile that you define for implementing VML
policies.
For example, you might create a VML profile to protect your source code. You train the system
using positive example documents (proprietary code that you want to protect). You also train
the system using negative example documents (open source code that you do not care to
protect). A VML policy references the VML profile to analyze message data and recognize the
content that is similar to the positive features. The VML profile can be tuned, and it can be
easily updated by adding or removing documents to or from the training sets.
See “Data Profiles” on page 375.
See “Creating new VML profiles” on page 669.

About the content you train


Collecting the documents for training is the most important step in the Vector Machine Learning
process. Vector Machine Learning is only as accurate as the example content you train against.
See “Configuring VML profiles and policy conditions” on page 668.
A VML profile is based on a category of content representing a specific business use case. A
category of content comprises two training sets: positive and negative.
The positive training set is content you want to protect. More specific categorization results in
better accuracy. For example, “Customer Purchase Orders” is better than “Financial Documents”
because it is more specific.
The negative training set is content you want to ignore, yet related to the positive training set.
For example, if the positive training set is “Weekly Sales Reports," the negative training set
might contain "Sales Press Releases."
You should collect an equal amount of positive and negative content that is primarily text-based.
You do not have to collect all the content you want to protect. However, you do need to
assemble training sets large enough to produce reliable statistics.
The recommended number of documents is 250 per training set. The minimum number of
documents per training set is 50.
Table 28-1 summarizes the baseline requirements for the content you collect for VML profile
training.

Table 28-1 VML training set requirements

Category of content: Single, specific business use case
Type of data: Text-based (primarily)
Size: 30 MB per upload. No size limit per category.

Training set   Quantity                            Content

Positive       Recommended: 250 documents          Content you want to protect.
               Minimum: 50 documents

Negative       Approximately the same amount       Content you do not want to protect yet
               as the positive category.           thematically related to the positive category.
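
Before you upload training content, you can check the document counts against the figures in Table 28-1. The following is a minimal sketch. The function name is hypothetical, and the 20% tolerance used to decide whether the two sets are "approximately the same" size is an assumption, not a product setting.

```python
MIN_DOCS = 50           # minimum documents per training set
RECOMMENDED_DOCS = 250  # recommended documents per training set


def check_training_sets(positive_count, negative_count, balance_tolerance=0.2):
    """Return a list of warnings about the size and balance of the training sets."""
    problems = []
    for label, count in (("positive", positive_count), ("negative", negative_count)):
        if count < MIN_DOCS:
            problems.append(f"{label} set has {count} documents; minimum is {MIN_DOCS}")
        elif count < RECOMMENDED_DOCS:
            problems.append(f"{label} set has {count} documents; {RECOMMENDED_DOCS} recommended")
    larger = max(positive_count, negative_count)
    if larger and abs(positive_count - negative_count) / larger > balance_tolerance:
        problems.append("training sets are not approximately equal in size")
    return problems
```

An empty list means both sets meet the recommended size and are roughly balanced.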

About the base accuracy from training percentage rates


During the VML profile training process, the system extracts example document content and
converts it to raw text. The system selects features (or keywords) using a proprietary algorithm
and generates the VML profile. As part of the training process, the system calculates and
reports base accuracy rates for false positives and false negatives. The base accuracies from
training percentage rates indicate the quality of your positive and negative training sets.
The goal is to achieve 100% accuracy (0% base false rates), but obtaining this level of quality
for both training sets is usually not possible. You should reject a training profile if either the
base false positive rate or the base false negative rate is more than 5%. A relatively high base
false percentage rate indicates that the training set is not well categorized. In this case you
need to add documents to an under-represented training set or remove documents from an
over-represented training set, or both.
See “Managing training set documents” on page 676.
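The 5% rejection guideline above amounts to a simple check. The following Python sketch is illustrative only; the function name is an assumption and the rates are the percentages that the system reports after training.

```python
MAX_BASE_FALSE_RATE = 5.0  # percent; reject when either rate is higher (from this guide)


def should_reject(base_false_positive_pct, base_false_negative_pct):
    """Return True if the trained profile should be rejected."""
    return (base_false_positive_pct > MAX_BASE_FALSE_RATE
            or base_false_negative_pct > MAX_BASE_FALSE_RATE)
```

A profile with rates of 1.2% and 0.8% passes the check, whereas a profile with a base false positive rate of 6% does not, even if its base false negative rate is low.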
Table 28-2 describes what the base accuracy percentage rates from training mean in relation
to the positive and negative training sets for a given VML profile.

Table 28-2 Base accuracy rates from training

Accuracy rate Description

Base false positive rate (%) The percentage of the content in the negative training set that is statistically similar to the
positive content.

Base false negative rate (%) The percentage of the content in the positive training set that is statistically similar to
negative content.

About the Similarity Threshold and Similarity Score


Each VML profile has a Similarity Threshold that can be set from 0 to 10. This setting is used
to make an adjustment for imperfect information within a training set to achieve the best
accuracy possible. During detection, a message must have a Similarity Score greater than the
Similarity Threshold for an incident to be generated. The Similarity Threshold is set at the
profile level—not within a policy. It is set this way because there is an ideal Similarity Threshold
setting that is unique to your training set where the best accuracy rates can be achieved (both
in terms of false positives and false negatives).
When a VML policy detects an incident, the system displays the Similarity Score in the match
highlighting section of the Incident Snapshot in the Enforce Server administration console.
The Similarity Score indicates how similar the detected content is to the VML profile. The
higher the score the more statistically similar the message is to the positive example documents
in your VML profile.
Consider an example where a Similarity Threshold is set to 4 and a message with a Similarity
Score of 5 is detected. In this case the system reports the match as an incident and displays
the Similarity Score during match highlighting. However, if a message is detected with a
Similarity Score of 3, the system does not report a match (and no incident) because the
Similarity Score is below the Similarity Threshold.
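The comparison in this example can be expressed as a short Python sketch. The function name is hypothetical; the strict "greater than" comparison and the 0 to 10 range are taken from this section.

```python
def reports_incident(similarity_score, similarity_threshold):
    """A match is reported only if the score is strictly greater than the threshold."""
    for value in (similarity_score, similarity_threshold):
        if not 0 <= value <= 10:
            raise ValueError("scores and thresholds range from 0 to 10")
    return similarity_score > similarity_threshold
```

With a Similarity Threshold of 4, a score of 5 produces an incident and a score of 3 does not. A score equal to the threshold also does not produce an incident under this reading.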
Table 28-3 describes the Similarity Threshold and Similarity Score numbers.

Table 28-3 Similarity Threshold and Similarity Score details

Similarity Description

Similarity Threshold The Similarity Threshold is a configurable parameter between 0 and 10 that is unique to each
VML profile. The default setting is 10, which requires the most similar match between the VML
profile features and the detected message content. As such, this setting is likely to produce
fewer incidents. A setting of 0 produces the greatest number of matches, many of which are likely
to be false positives.

See “Adjusting the Similarity Threshold” on page 681.

Similarity Score The Similarity Score is a read-only run-time statistic between 0 and 10 reported by the system
based on the detection results of a VML policy. To report an incident, the Similarity Score must
be higher than the Similarity Threshold, otherwise the VML policy does not report a match.

About using unaccepted VML profiles in policies


The system lets you create a policy that is based on a VML profile that has never been accepted.
However, the VML profile is not active and is not deployed to a referenced policy until the
profile is initially accepted.
See “Training VML profiles” on page 672.

Where you have a VML policy that references a never-accepted VML profile, the result of this
configuration depends on the type of detection server. Table 28-4 describes the behavior:

Table 28-4 References to never-accepted VML profiles

Detection server Description

Discover Server Discover scanning does not begin until all policy dependencies are loaded.
A Discover scan based on a VML policy does not start until the referenced
VML profile is accepted. In this case the system displays a message in the
Discover scanning interface that indicates that the scan waits on the
dependency to load.

Network and Endpoint Servers For a simple rule, or a compound rule where the conditions are ANDed, the
entire rule fails because the VML condition cannot match. If this is the only
rule in the policy, the policy does not work.

For a policy where there are multiple rules that are ORed, only the VML rule
fails; the other rules in the policy are evaluated.

See “Policy detection execution” on page 394.
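The Network and Endpoint Server behavior in Table 28-4 follows from how conditions and rules combine. The following Python sketch models that logic under simplifying assumptions: each condition is reduced to a boolean result, and the function names are hypothetical rather than part of the product.

```python
def rule_matches(condition_results):
    """A compound rule ANDs its conditions: every condition must match."""
    return all(condition_results)


def policy_matches(rules):
    """Rules in a policy are ORed: any matching rule triggers the policy."""
    return any(rule_matches(conditions) for conditions in rules)


# A VML condition that references a never-accepted profile cannot match.
VML_NEVER_ACCEPTED = False
```

For example, policy_matches([[VML_NEVER_ACCEPTED, True]]) is False because the single compound rule fails, while policy_matches([[VML_NEVER_ACCEPTED], [True]]) is True because the second rule is still evaluated.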

Configuring VML profiles and policy conditions


Vector Machine Learning (VML) performs statistical analysis to protect unstructured data. It
also determines if content is similar to an example set of documents you train against.
See “Introducing Vector Machine Learning (VML)” on page 664.
The following table describes the process for implementing VML.

Table 28-5 Implementing VML

Step Action Description

Step 1 Collect the example documents for training the system. Collect a representative number of example documents that contain
the positive content that you want to protect and the negative
content you want to ignore.

See “About the content you train” on page 665.

Step 2 Create a new VML profile. Define a new VML profile based on the specific business category
of data from which you have derived your positive and negative
training sets.

See “Creating new VML profiles” on page 669.



Step 3 Upload the example documents. Upload the example positive and negative training sets separately
to the Enforce Server.

See “Uploading example documents for training” on page 671.

Step 4 Train the VML profile. Train the system to learn the type of content you want to protect
and generate the VML profile.

See “Training VML profiles” on page 672.

Step 5 Accept or reject the trained profile. Accept the trained profile to deploy it. Or, reject the profile, update
one or both of the training sets (by adding or removing example
documents), and restart the training process.

See “About the base accuracy from training percentage rates” on page 666.

See “Managing VML profiles” on page 677.

Step 6 Create a VML policy and test detection. Create a VML policy that references the VML profile.

See “Configuring the Detect using Vector Machine Learning Profile
condition” on page 679.

Test and review incidents based on the Similarity Score.

See “About the Similarity Threshold and Similarity Score” on page 667.

Step 7 Tune the VML profile. Adjust the Similarity Threshold setting as necessary to optimize
detection results.

See “Adjusting the Similarity Threshold” on page 681.

Step 8 Follow VML best practices. See “Best practices for using VML” on page 687.

Creating new VML profiles


A VML profile contains the model that is generated from the training set contents. Once you
define a VML profile, you use it to create one or more VML policies.
See “Configuring VML profiles and policy conditions” on page 668.

Note: You must have Enforce Server administrator privileges to create VML profiles.

To create a new VML profile


1 Click New Profile from the Manage > Data Profiles > Vector Machine Learning screen
(if you have not already done so).
2 Enter a Name for the VML profile in the Create New Profile dialog.
Use a logical name for the VML profile that corresponds to the category of data you want
to protect.
See “About the content you train” on page 665.
3 Optionally, enter a Description for the VML profile.
You may want to include a description that identifies the purpose of the VML profile.
4 Click Create to create the new VML profile.
Or, click Cancel to cancel the operation.
5 Click Manage Profile to upload example documents.
See “Uploading example documents for training” on page 671.

Working with the Current Profile and Temporary Workspace tabs


For any single VML profile there are two possible versions: Current and Temporary. The
Current Profile is the run-time version; the Temporary Profile is the design-time version. As
you develop a VML profile, you create a Current Profile that you have trained, accepted, and
perhaps deployed to one or more policies. You also create a Temporary Profile that you actively
edit and tune.
The Enforce Server administration console displays each version of the VML profile in separate
tabs:
■ Current Profile
This version is the active instance of the VML profile. This version has been successfully
trained and accepted; it is available for deployment to one or more policies.
■ Temporary Workspace
This version is an editable version of the VML profile. This version has not been trained,
or accepted, or both; it cannot be deployed to a policy.
Initially, when you create a new VML profile, the system displays only the Current Profile tab
with an empty training set. After you initially train and accept the VML profile, the Trained Set
table in the Current Profile tab is populated with details about the training set. The information
that is displayed in this table and tab is read-only.

To edit a VML profile


◆ Click Manage Profile to the far right of the Current Profile tab.
The system displays the editable version of the profile in the Temporary Workspace tab.
You can now proceed with training and managing the profile.
See “Training VML profiles” on page 672.
The Temporary Workspace tab remains present in the user interface until you train and
accept a new version of the VML profile. In other words, there is no way to close the Temporary
Workspace tab without training and accepting, even if you made no changes to the profile.
Once you accept a new version of the VML profile, the system overwrites the previous Current
Profile with the newly accepted version. You cannot revert to a previously accepted Current
Profile. However, you can revert to previous versions of the training set for a Temporary Profile.
See “Managing training set documents” on page 676.
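The relationship between the two versions can be pictured as a small state model. The following Python sketch is purely illustrative: the class and method names are assumptions, and the model covers only the behavior described above (editing happens in the Temporary Workspace, and accepting overwrites the Current Profile).

```python
class VmlProfileVersions:
    """Hypothetical model of the Current Profile and Temporary Workspace versions."""

    def __init__(self):
        self.current = None    # accepted, run-time version (None until first acceptance)
        self.temporary = None  # editable, design-time version
        self.trained = False   # whether the temporary version has been trained

    def manage(self, training_set):
        """Open the Temporary Workspace with an editable training set."""
        self.temporary = list(training_set)
        self.trained = False

    def train(self):
        """Train the temporary version; it must exist first."""
        if self.temporary is None:
            raise RuntimeError("open the Temporary Workspace before training")
        self.trained = True

    def accept(self):
        """Overwrite the Current Profile; the previous version cannot be restored."""
        if not self.trained:
            raise RuntimeError("only a trained profile can be accepted")
        self.current, self.temporary, self.trained = self.temporary, None, False
```

After accept() runs, the model holds no copy of the earlier Current Profile, which mirrors the statement that you cannot revert to a previously accepted version.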

Uploading example documents for training


The training set comprises the example positive and negative documents you want to train
the system against. You upload the positive and the negative documents separately.

Note: You can upload individual documents. However, we recommend that you upload a
document archive (such as ZIP, RAR, or TAR) that contains the recommended (250) or
minimum (50) number of example documents. The maximum upload size is 30 MB. You can
partition the documents across archives if you have more than 30 MB of data to upload. See
“About the content you train” on page 665.
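If your example documents exceed the 30 MB upload limit, you can plan the split before you build the archives. The following Python sketch groups files into batches by size. It is an illustration under stated assumptions: the function name is hypothetical, 1 MB is taken to be 1024 * 1024 bytes, and uncompressed file sizes are used as a rough estimate of the resulting archive size.

```python
MAX_UPLOAD_BYTES = 30 * 1024 * 1024  # 30 MB upload limit (assumes 1 MB = 1024 * 1024 bytes)


def partition_for_upload(file_sizes, limit=MAX_UPLOAD_BYTES):
    """Group (name, size_in_bytes) pairs into batches whose total size fits the limit.

    Uncompressed sizes are used as a rough estimate of archive size.
    """
    batches, current, current_size = [], [], 0
    for name, size in file_sizes:
        if size > limit:
            raise ValueError(f"{name} alone exceeds the upload limit")
        if current and current_size + size > limit:
            # The next file would overflow this batch, so start a new one.
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Each returned batch is a list of file names that you could place in one archive. Keep positive and negative documents in separate archives, because they are uploaded separately.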

To upload the training set


1 Click Manage Profile from the Current Profile tab (if you have not already done so).
This action enables the VML profile for editing in the Temporary Workspace tab.
See “Working with the Current Profile and Temporary Workspace tabs” on page 670.
2 Click Upload Contents (if you have not already done so).
This action opens the Upload Contents dialog.
3 Select the category of content:
■ Choose Positive: match contents similar to these to upload a positive document
archive.
■ Choose Negative: ignore contents similar to these to upload a negative document
archive.

4 Click Browse to select the document archive to upload.



5 Navigate the file system to where you have stored the example documents.
6 Choose the file to upload and click Open.
7 Verify that you have chosen the correct category of content: Positive or Negative.
If you mismatch the upload (select Negative but upload a Positive document archive), the
resulting profile is inaccurate.
8 Click Submit to upload the document archive to the Enforce Server.
The system displays a message indicating if the file successfully uploaded. If the upload
was successful, the document archive appears in the New Documents table. This table
displays the document type, name, size, date uploaded, and the user who uploaded it. If
the upload was not successful, check the error message and retry the upload. Click the
X icon in the Remove column to delete an uploaded document or document archive from
the training set.
9 Click Upload Contents to repeat the process for the other training set.
The profile is not complete and cannot be trained until you have uploaded the minimum
number of positive and negative example documents.
See Table 28-1 on page 666.
10 Once you have successfully uploaded both training sets you are ready to train the VML
profile.
See “Training VML profiles” on page 672.

Training VML profiles


During the profile training process, the system scans the training content, extracts key features,
and generates a statistical model. When the training process completes successfully, the
system prompts you to accept or reject the training profile. If you accept the training results,
that version of the VML profile becomes the Current Profile. The Current Profile is active and
available for use in one or more policies.
See “Configuring VML profiles and policy conditions” on page 668.

Table 28-6 Training the VML profile

Step Action Description

Step 1 Enable training mode. Select the VML profile you want to train from the Manage > Data Profiles >
Vector Machine Learning screen. Or, create a new VML profile.

See “Creating new VML profiles” on page 669.

Click Manage Profile to the far right of the Current Profile tab. The system
displays the profile for training in the Temporary Workspace tab.

See “Working with the Current Profile and Temporary Workspace tabs”
on page 670.

Step 2 Upload the training content. Familiarize yourself with the training set requirements and recommendations.
See “About the content you train” on page 665.

Upload the positive and the negative training sets in separate document archives
to the Enforce Server.

See “Uploading example documents for training” on page 671.

Step 3 Adjust the memory allocation (only if necessary). The default value is "High," which generally results in the best training set accuracy
rates. Typically you do not need to change this setting. In some situations you
may want to choose a "Medium" or "Low" memory setting (for example, when deploying
the profile to the endpoint).

See “Adjusting the memory allocation” on page 675.


Note: If you change the memory setting, you must do so before you train the
profile to ensure accurate training results. If you have already trained the profile,
you must retrain it after you adjust the memory allocation.

Step 4 Start the training process. Click Start Training to begin the profile training process.
During the training process, the system:

■ Extracts the key features from the content;
■ Creates the model;
■ Calculates the predicted accuracy based on the averaged false positive and
false negative rates for the entire training set;
■ Generates the VML profile.

Step 5 Verify training completion. When the training process completes, the system indicates if the training profile
was successfully created.

If the training process failed, the system displays an error. Check the debug log
files and restart the training process.

See “Debug log files” on page 337.


On successful completion of the training process, the system displays the following
information for the New Profile:

■ Trained Example Documents
The number of example documents in each training set that the system has
trained against and profiled.
■ Accuracy Rate From Training
The quality of the training set expressed as base false positive and base false
negative percentage rates.
See “About the base accuracy from training percentage rates” on page 666.
■ Memory
The minimum amount of memory that is required to load the profile at run-time
for detection.

Note: If you previously accepted the profile, the system also displays the Current
Profile statistics for side-by-side comparison.

Step 6 Accept or reject the training profile. If the training process is successful, the system prompts you to accept or reject
the training profile. Your decision is based on the Accuracy Rate From Training
percentages.

See “About the base accuracy from training percentage rates” on page 666.
To accept or reject the training profile:

■ Click Accept to save the training results as the active Current Profile.
Once you accept the training profile, it appears in the Current Profile tab
and the Temporary Workspace tab is removed.
■ Click Reject to discard the training results.
The profile remains in the Temporary Workspace tab for editing. You can
adjust one or both of the training sets by adding or removing documents and
retraining the profile.
See “Managing training set documents” on page 676.

Note: A trained VML profile is not active until you accept it. The system lets you
create a policy based on a VML profile that has not been trained or accepted.
However, the VML profile is not deployed to that policy until the profile is accepted.
See “About using unaccepted VML profiles in policies” on page 667.

Step 7 Test and tune the profile. Once you have successfully trained and accepted the VML profile, you can
use it to define policy rules and tune the VML profile.

See “Configuring the Detect using Vector Machine Learning Profile condition”
on page 679.

See “About the Similarity Threshold and Similarity Score” on page 667.
Note: For more information, refer to the Symantec Data Loss Prevention Vector
Machine Learning Best Practices Guide, available at the Symantec Support
Center at http://www.symantec.com/docs/DOC8733.

Adjusting the memory allocation


The Memory Allocation setting determines the amount of memory that is required to load
the VML profile at run-time for policy detection. The more memory you allocate to training,
the larger the VML profile becomes, because more features are modeled. By default
this value is set to "High." You should not normally adjust this value. However, resources are
limited on the endpoint. If you intend to deploy the VML profile to the endpoint, use a lower memory
setting to reduce the size of the profile.
To adjust memory allocation
1 Click Adjust beside the Memory Allocation setting.
This setting is available in the Temporary Workspace tab. If it is not available, click Manage
Profile from the Current Profile tab.
See “Working with the Current Profile and Temporary Workspace tabs” on page 670.
2 Select the desired memory allocation level.
The following options are available:
■ High
Requires a higher amount of run-time memory; generally yields higher detection
accuracy (default setting).
■ Medium
■ Low
Requires less run-time memory; may result in lower detection accuracy.

3 Click Save to save the setting.


The Memory Setting display should reflect the adjustment you made.

4 Click Start Training to start the training process.


You must adjust the memory allocation before you train the VML profile. If you have already
trained the profile, retrain after adjusting this setting.
See “Training VML profiles” on page 672.
5 Verify the amount of memory that is required to run the VML profile.
After you train the VML profile, the system displays the Memory Required (KB) value.
This value represents the minimum amount of memory that is required to load the profile
at run-time.
See “Managing VML profiles” on page 677.
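The requirement to train after changing the memory allocation can be stated as a simple rule. The following Python sketch is a hypothetical helper, not a product API: it compares the memory level a profile was last trained with against the level currently selected.

```python
MEMORY_LEVELS = ("Low", "Medium", "High")  # options listed in this guide; "High" is the default


def needs_retraining(trained_with, selected):
    """Return True if the profile must be trained (or retrained) for the selected level.

    trained_with is None for a profile that has not been trained yet.
    """
    if selected not in MEMORY_LEVELS:
        raise ValueError(f"unknown memory level: {selected}")
    return trained_with != selected
```

For example, a profile trained at "High" and then switched to "Low" for endpoint deployment must be retrained before the new setting takes effect in the profile.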

Managing training set documents


As you train and tune a VML profile, you may need to adjust one or both of the training sets.
For example, if you reject a training profile, you must add or remove example documents to
improve the training accuracy rates.
See “About the base accuracy from training percentage rates” on page 666.
To add documents to a training set
1 Click Manage Profile for the profile you want to edit.
The editable profile appears in the Temporary Workspace tab.
2 Click Upload Contents.
See “Uploading example documents for training” on page 671.
To remove documents from a training set
1 Click Manage Profile for the profile you want to edit.
The editable profile appears in the Temporary Workspace tab.
2 Click the red X in the Mark Removed column for the trained document you want to remove.
The removed document appears in the Removed Documents table. Repeat this process
as necessary to remove all unwanted documents from the training set.
3 Click Start Training to retrain the profile.
You must retrain and accept the updated profile to complete the document removal
process. If you do not accept the new profile the document you attempted to remove
remains part of the profile.
See “Training VML profiles” on page 672.

To revert removed documents


1 Click the revert icon in the Revert column for a document you have removed.
The document is added back to the training set.
2 Click Start Training to retrain the profile.
You must retrain the profile and reaccept it even though you reverted to the original
configuration.

Managing VML profiles


The Manage > Data Profiles > Vector Machine Learning screen is the home page for
managing existing VML profiles and the starting point for creating new VML profiles.
See “Configuring VML profiles and policy conditions” on page 668.

Note: You must have Enforce Server administrator privileges to manage and create VML
profiles.

Table 28-7 Creating and managing VML profiles

Action Description

Create new profiles. Click New Profile to create a new VML profile.

See “Creating new VML profiles” on page 669.

View and sort profiles. The system lists all existing VML profiles and their state on the Vector Machine
Learning screen.

Click the column header to sort the VML profiles by name or status.

Manage and train profiles. Select a VML profile from the list to display and manage it.
The Current Profile tab displays the active profile.

See “Working with the Current Profile and Temporary Workspace tabs” on page 670.

Click Manage Profile to edit the profile.


The editable profile appears in the Temporary Workspace tab. From this tab you
can:

■ Upload training set documents.


See “Uploading example documents for training” on page 671.
■ Train the profile.
See “Training VML profiles” on page 672.
■ Add and remove documents from the training sets.
See “Managing training set documents” on page 676.

Monitor profiles. The system lists and describes the status of all VML profiles.
■ Memory Required (KB)
The minimum amount of memory that is required to load the profile in memory
for detection.
See “Adjusting the memory allocation” on page 675.
■ Status
The present status of the profile.
See Table 28-8 on page 678.
■ Deployment Status
The historical status of the profile.
See Table 28-9 on page 679.

Remove profiles. Click the X icon at the far right to delete an existing profile.

If you delete an existing profile, the system removes the profile metadata and the
Training Set from the Enforce Server.

The Status field displays the current state of each VML profile.

Table 28-8 Status values for VML profiles

Status value Description

Accepted on <date> The date the training profile was accepted.

Managing The current profile is enabled for editing.

Empty The profile is created, but no content is uploaded.

Awaiting Acceptance The profile is ready to be accepted.

Canceling Training The system is in the process of canceling the training.

Training Canceled The training process is canceled.

Failed The training process failed.

Training <time> The training is in progress (for the time indicated).

The Deployment Status field indicates if the VML profile has ever been accepted or not.

Table 28-9 Deployment Status values for VML profiles

Status value Description

Never Accepted The VML profile has never been accepted.


See “About using unaccepted VML profiles in policies”
on page 667.

Accepted on <date> The VML profile was accepted on the date indicated.

Changing names and descriptions for VML profiles


If necessary you can change the name of a VML profile or edit its description. When you are
ready to deploy a VML profile to one or more policies, give the profile a self-describing name
so policy authors can easily recognize it.

Note: You do not have to retrain a profile if you change the name or description.

To change the VML profile name or description


1 Select the VML profile from the Manage > Data Profiles > Vector Machine Learning
screen.
See “Managing VML profiles” on page 677.
2 Click the Edit link beside the name of the VML profile.
3 Edit the name and description of the profile in the Change Name and Description dialog
that appears.
4 Click OK to save the changes to the VML profile name or description.
5 Verify the changes at the home screen for the VML profile.

Configuring the Detect using Vector Machine Learning Profile condition


Once you have trained and accepted the VML profile, you configure a VML policy using the
Detect using Vector Machine Learning Profile condition. This condition references the VML
profile to detect the content that is similar to the example content you have trained against.
See “Configuring VML profiles and policy conditions” on page 668.

Table 28-10 Configuring a VML policy rule

Step Action Description

Step 1 Create and train the VML profile. See “Creating new VML profiles” on page 669.
See “Training VML profiles” on page 672.

See “About using unaccepted VML profiles in policies” on page 667.

Step 2 Configure a new or an existing policy. See “Configuring policies” on page 413.

Step 3 Add the VML rule to the policy. From the Configure Policy screen:

■ Select Add Rule.


■ Select the Detect using Vector Machine Learning profile rule from
the list of content rules.
■ Select the VML profile you want to use from the drop-down menu.
■ Click Next.

Step 4 Configure the VML detection rule. Name the rule and configure the rule severity.
See “Configuring policy rules” on page 417.

Step 5 Select components to match on. Select one or both message components to Match On:
■ Body, which is the content of the message
■ Attachments, which are any files transported by the message

Note: On the endpoint, the Symantec DLP Agent matches on the entire
message, not individual message components.

See “Selecting components to match on” on page 423.

Step 6 Configure additional conditions (optional). Optionally, you can create a compound detection rule by adding more
conditions to the rule.

To add conditions, select the desired condition from the
drop-down menu and click Add.

Note: All conditions must match for the rule to trigger an incident.

See “Configuring compound match conditions” on page 429.

Step 7 Save the policy configuration. Click OK then click Save to save the policy.

Configuring VML policy exceptions


In some situations, you may want to implement a VML policy exception to ignore certain
content.
See “Configuring VML profiles and policy conditions” on page 668.

Table 28-11 Configuring a VML policy exception

Step Action Description

Step 1 Create and train the VML profile. See “Creating new VML profiles” on page 669.
See “Training VML profiles” on page 672.

Step 2 Configure a new or an existing policy. See “Configuring policies” on page 413.

Step 3 Add a VML exception to the policy. From the Configure Policy screen:
■ Select Add Exception.
■ Select the Detect using Vector Machine Learning profile exception
from the list of content exceptions.
■ Select the VML profile you want to use from the drop-down menu.
■ Click Next.

Step 4 Configure the policy exception. Name the exception.


Select the components you want to apply the exception to:

■ Entire Message
Select this option to compare the exception against the entire
message. If an exception is found anywhere in the message, the
exception is triggered and no matching occurs.
■ Matched Components Only
Select this option to match the exception against the same
component as the rule. For example, if the rule matches on the Body
and the exception occurs in an attachment, the exception is not
triggered.

Step 5 Configure the condition. Generally you can accept the default condition settings for policy
exceptions.

See “Configuring policy exceptions” on page 426.

Step 6 Save the policy configuration. Click OK then click Save to save the policy.
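The difference between the two exception scopes in Step 4 can be sketched in Python. The function name and the scope labels are hypothetical, and message components are represented as plain strings for illustration.

```python
def exception_triggered(scope, rule_components, exception_hit_components):
    """Decide whether a policy exception applies.

    scope: "entire_message" or "matched_components_only" (hypothetical labels).
    rule_components: components the rule matches on, for example {"body"}.
    exception_hit_components: components where the exception content was found.
    """
    if scope == "entire_message":
        # Found anywhere in the message: the exception is triggered.
        return bool(exception_hit_components)
    if scope == "matched_components_only":
        # Triggered only if found in a component the rule also matches on.
        return bool(set(rule_components) & set(exception_hit_components))
    raise ValueError(f"unknown scope: {scope}")
```

With a rule that matches on the Body and exception content found only in an attachment, the "matched_components_only" scope does not trigger the exception, while the "entire_message" scope does.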

Adjusting the Similarity Threshold


You adjust the Similarity Threshold setting to tune the VML profile. The Similarity Threshold
determines how similar detected content must be to a VML profile to produce an incident.
See “About the Similarity Threshold and Similarity Score” on page 667.

Note: You do not have to retrain the VML profile after you adjust the Similarity Threshold,
unless you modify a training set based on testing results.

To adjust the Current Value of the Similarity Threshold


1 Click Edit beside the Similarity Threshold label for the VML profile you want to tune.
This action opens the Similarity Threshold dialog.
2 Drag the meter to the desired Current Value setting.
You set the Similarity Threshold to a decimal value between 0 and 10. The default value
is 10, which produces fewer incidents; a setting of 0 produces more incidents.
3 Click Save to save the Similarity Threshold setting.
4 Test the VML profile using a VML policy.
Compare the Similarity Scores across matches. A detected message must have a Similarity
Score higher than the Similarity Threshold to produce an incident. Make further adjustments
to the Similarity Threshold setting as necessary to optimize and fine-tune the VML profile.
See “Configuring the Detect using Vector Machine Learning Profile condition” on page 679.

Testing and tuning VML profiles


You tune a VML profile by testing it with the Similarity Threshold set to 0. After you determine
the possible range of Similarity Scores for false positives, adjust the Similarity Threshold to
be greater than the highest Similarity Score reported for a false positive. This process is known
as negative testing.
A good training set has a well-defined range where the Similarity Threshold is set to achieve
the best accuracy rates. A poor training set yields a poor accuracy result regardless of the
Similarity Threshold. A Similarity Threshold that is set too high or too low can result in a large
number of false positives or false negatives.
To determine the proper Similarity Threshold setting, the recommendation is to perform negative
testing as described in the following steps.

Table 28-12 Steps for tuning VML profiles

Step Action Description

Step 1 Train the VML profile. Follow the recommendations in this guide for defining the category and uploading
the training set documents. Adjust the memory allocation before you train the
profile. Refer to the Symantec Data Loss Prevention Administration Guide for help
performing the tasks involved.

Step 2 Set the Similarity The default Similarity Threshold is 10. At this value the system does not generate
Threshold to 0. any incidents. A setting of 0 produces the most incidents, many of which are likely
to be false positives. The purpose of setting the value to 0 is to see the entire
range of potential matches. It also serves to tune the profile so that the threshold is
greater than the highest false positive score.

Step 3 Create a VML policy. Create a policy that references the VML profile you want to tune. The profile must
be accepted to be deployable to a policy.

Step 4 Test the policy. Test the VML policy using a corpus of test data. For example, you can use the
DLP_Wikipedia_sample.zip file to test your VML policies against. Create a
mechanism to detect incidents. The mechanism can be a Discover scan target of
a local file folder where you place the test data. Or it can be a DLP Agent scan of
a copy/paste operation.

Step 5 Review any incidents. Review any matches at the Incident Snapshot screen. Verify a relatively low
Similarity Score for each match. A relatively low Similarity Score indicates a false
positive. If one or more test documents produce a match with a relatively high
Similarity Score, you have a training set quality issue. In this case you need to
review the content and if appropriate add the document(s) to the positive training
set. You then need to retrain and retune the profile.

See “Log files for troubleshooting VML training and policy detection” on page 686.

Step 6 Adjust the Similarity Review the incidents to determine the highest Similarity Score among the detected
Threshold. false positives that you have tested the profile against. Then, you can adjust the
Similarity Threshold for the profile to be greater than the highest Similarity Score
for the false positives.

For example, if the highest detected false positive has a Similarity Score of 4.5,
set the Similarity Threshold to 4.6. This setting filters the known false positives
from being reported as incidents.
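
The arithmetic in Step 6 can be sketched in a few lines of Python. This is a minimal illustration only, not part of the product: the function names are hypothetical, and the 0.1 increment is an assumption taken from the 4.5 to 4.6 example above.

```python
from decimal import Decimal

MAX_THRESHOLD = Decimal("10")  # upper bound of the Similarity Threshold range


def suggest_threshold(false_positive_scores, step="0.1"):
    """Return a threshold one step above the highest false positive score.

    `step` is an assumed increment (0.1, as in the 4.5 -> 4.6 example).
    """
    if not false_positive_scores:
        raise ValueError("negative testing produced no false positives to tune against")
    # Decimal avoids binary floating-point noise such as 4.5 + 0.1 != 4.6.
    highest = max(Decimal(str(score)) for score in false_positive_scores)
    return min(highest + Decimal(step), MAX_THRESHOLD)


def produces_incident(similarity_score, threshold):
    """A match is reported only if its score is higher than the threshold."""
    return Decimal(str(similarity_score)) > Decimal(str(threshold))
```

For the scores collected during negative testing, suggest_threshold returns the value to enter in the Similarity Threshold dialog, and produces_incident shows why the known false positives are then filtered out.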

Properties for configuring training


VML includes several property files for configuring VML training and logging. The following
table lists and describes relevant VML configuration properties.

Table 28-13 Property files for VML

Property file at \Protect\config\ Description

MLDTraining.properties Main property file for configuring VML training settings.


See Table 28-14 on page 684.

Manager.properties Property file for the Enforce Server; contains 1 VML setting.

See Table 28-15 on page 685.



MLDTrainingLogging.properties Properties file for configuring VML logging.


See “Log files for troubleshooting VML training and policy
detection” on page 686.

The following table lists and describes the VML training parameters available for configuration
in properties file MLDTraining.properties.

Table 28-14 Relevant configuration parameters for VML training

Parameter Description

minimum_documents_per_category Specifies the minimum number of documents that are
required for each training set (positive and negative). The
default setting is 50. Reducing this number below 50 is
not recommended or supported.

See “Recommendations for training set definition”
on page 689.

mld_num_folds Specifies the number of folds to use for the k-fold
evaluation process. The default is 10.

Reducing this value shortens the time the system takes
to train against the content because fewer folds are
evaluated. This speedup potentially comes at the cost
of visibility into profile quality. You do not need to change
this value unless you have a large number of example
documents (and thus the training sets are very large), or
unless you know for certain that you have a
well-categorized overall training set.

See “Recommendations for accepting or rejecting a profile”
on page 692.

minimum_features_to_keep Specifies the minimum number of features to keep for the
profile. The default setting is 1000.

Lowering this value can help reduce the size of the profile.
However, adjusting this setting is not recommended.
Instead, use the memory allocation setting to tune the size
of the profile.

See “Guidelines for profile sizing” on page 691.



significance_threshold Specifies the minimum number of times a word must occur
before it is considered a feature. The default is 2.

Increasing this value (to 3 or 4, for example) may help
reduce the size of the profile because fewer words qualify
as features. You should not adjust this setting unless
setting the memory allocation to "Low" does not produce
a small enough profile for your deployment requirements.

See “Guidelines for profile sizing” on page 691.

stopword_file Specifies the default stopword file
\config\machinelearningconfig\stopwords.txt.

Stopwords are common words, such as articles and
prepositions. During training the system ignores (does not
consider for feature extraction) any word that is contained
in the stopwords file.

If you add words to be ignored, you must use all lower
case because VML feature extraction normalizes the
content to lower case for evaluation.

logging_config_file Specifies the configuration file for standard VML logging.

See “Log files for troubleshooting VML training and policy
detection” on page 686.

native_logging_config_file Specifies the configuration file for native VML logging.

See “Log files for troubleshooting VML training and policy
detection” on page 686.
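
To make the interaction between stopword_file and significance_threshold concrete, the following Python sketch filters candidate words the way the table describes: content is lowercased, stopwords are skipped, and a word qualifies only if it occurs at least significance_threshold times. It is an illustration under stated assumptions, not the product's feature extraction: the function name, the tokenizing regular expression, and counting occurrences across the whole training set are all hypothetical choices.

```python
import re
from collections import Counter


def candidate_features(documents, stopwords, significance_threshold=2):
    """Return the words that would qualify as features in this simplified model."""
    counts = Counter()
    for text in documents:
        # Content is normalized to lower case before evaluation.
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            # Stopwords are compared as written, so they must be lower case.
            if word not in stopwords:
                counts[word] += 1
    return {word for word, n in counts.items() if n >= significance_threshold}
```

Because the comparison happens after lowercasing, a stopword entered as "The" would never match the normalized word "the", which is why entries in the stopwords file must be lower case.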

The following parameter is available for configuration in properties file
Manager.properties.

Table 28-15 Configuration parameter for VML profiles

Parameter Description

DEFAULT_SIMILARITY_THRESHOLD Establishes the default value for the Similarity Threshold,
which is 10. Changing this value affects the default value
only. You can adjust the value using the Enforce Server
administration console.

See “Testing and tuning VML profiles” on page 682.



Log files for troubleshooting VML training and policy detection


The system provides debug log files for troubleshooting the VML training process and policy
detection. The following table lists and describes the debug log files.
See “Troubleshooting policies” on page 445.

Table 28-16 Debug log files for VML

Log file Description

machinelearning_training.log Records the accuracy percentage rates from training for
each fold of the evaluation process for each VML profile
training run.

Use this log to examine the quality of each training set at a
granular, per-fold level.

See “Recommendations for accepting or rejecting a
profile” on page 692.

machinelearning_native_filereader.log Records the "distance," which is expressed as a positive
or negative number, and the "confidence," which is a
similarity percentage, for each message evaluated by a
VML policy.

Use this log to examine all messages or documents evaluated
by VML policies, including positive matches with similarity
percentages beneath the Similarity Threshold, and
messages the system has categorized as negative
(expressed as a negative "distance" number).

See “Testing and tuning VML profiles” on page 682.

machinelearning_training_native_manager.log Records the total number of features modeled and the
number of features kept to generate the profile for each
training run.
The total number of features modeled versus the number
of features kept for the profile depends on the memory
allocation setting:

■ If "high" the system keeps 80% of the features.
■ If "medium" the system keeps 50% of the features.
■ If "low" the system keeps 30% of the features.

See “Guidelines for profile sizing” on page 691.
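
As a quick way to estimate the figures you should expect to see in machinelearning_training_native_manager.log, the percentages above can be applied to the number of features modeled. The following Python sketch is illustrative only; the function name is hypothetical, and rounding down to a whole feature is an assumption.

```python
# Percentage of modeled features kept for each memory allocation setting.
KEEP_PERCENT = {"high": 80, "medium": 50, "low": 30}


def features_kept(total_features_modeled, memory_allocation):
    """Estimate how many features remain in the profile for a given setting."""
    percent = KEEP_PERCENT[memory_allocation.lower()]
    # Integer arithmetic keeps the result exact; rounding down is an assumption.
    return total_features_modeled * percent // 100
```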


Best practices for using VML


This section provides best practices for implementing VML policies, including best practices
for testing and tuning your VML policies.
In addition, you can download example VML training set documents from the Symantec Support
Center at http://www.symantec.com/docs/DOC8733. These documents are provided under
the Creative Commons license (http://creativecommons.org/licenses/by-sa/3.0/).
Table 28-17 provides a summary of the VML best practices that are discussed in this section.
It includes links to individual topics for more in-depth recommendations.

Table 28-17 Summary of VML best practices

Functional area Best practice

Recommended Use VML to protect unstructured, text-based content. Do not use VML to protect graphics, binary
uses for VML data, or personally identifiable information (PII).

See “When to use VML” on page 688.

Category of content Define the VML profile based on a single category of content that you want to protect. The
category of content should be derived from a specific business use case. Narrowly defined
categories are better than broadly defined ones.

See “Recommendations for training set definition” on page 689.

Positive training set Archive and upload the recommended (250) number of example documents for the positive
training set, or at least the minimum (50).

See “Guidelines for training set sizing” on page 690.

Negative training Archive and upload the example documents for the negative training set. Ideally the negative
set training set contains a similar number of well-categorized documents as the positive training set.
In addition, add some documents containing generic or neutral content to your negative training
set.

See “Guidelines for training set sizing” on page 690.

Profile sizing Consider adjusting the memory allocation to low. Internal testing has shown that setting the
memory allocation to low may improve accuracy in certain cases.

See “Guidelines for profile sizing” on page 691.

Training set quality Reject the training result and adjust the example documents if either the base false positive
rate or the base false negative rate from training is more than 5%.

See “Recommendations for accepting or rejecting a profile” on page 692.

Profile tuning Perform negative testing to tune the VML profile by using a corpus of testable data.

See “Testing and tuning VML profiles” on page 682.



Profile deployment Remove accepted profiles not in use by policies to reduce detection server load. Tune the
Similarity Threshold before deploying a profile into production across all endpoints to avoid
network overhead.

See “Recommendations for deploying profiles” on page 694.

When to use VML


VML is designed to protect unstructured content that is primarily text-based. VML is well-suited
for protecting sensitive content that is highly distributed such that gathering all of it for
fingerprinting is not possible or practical. VML is also well-suited for protecting sensitive content
that you cannot describe adequately enough to achieve high matching accuracy.
The following table summarizes the recommended use cases for VML.

Table 28-18 Recommended uses for VML

Use VML when Explanation

It is not possible or practical Often collecting all of the content you want to protect for fingerprinting is an impossible
to fingerprint all the data you task. This situation arises for many forms of unstructured data: marketing materials,
want to protect. financial documents, patient records, product formulas, source code, and so forth.

VML works well for this situation because you do not have to collect all of the content
you want to protect. You collect a smaller set of example documents.

You cannot adequately Often describing the data you want to protect is difficult without sacrificing some
describe the data you want to accuracy. This situation may arise when you have long keyword lists that are hard to
protect. generate, tune, and maintain.

VML works well in these situations because it automatically models the features
(keywords) you want to protect. It enables you to easily manage and update the source
content.

A policy reports frequent false Sometimes a certain category of information is a constant source of false positives.
positives. For example, a weekly sales report may consistently produce false positives for a Data
Identifier policy looking for social security numbers.

VML may work well here because you can train against the content that causes the
false positives and create a policy exception to ignore those features.
Note: The false positive contents must belong to a well-defined category for VML to
be an effective solution for this use case. See “Recommendations for training set
definition” on page 689.

When not to use VML


VML is not designed to protect structured data, such as Personally Identifiable Information
(PII), or binary content, such as documents that contain mostly graphics or image files.
The following table summarizes the non-recommended uses of VML.

Table 28-19 Non-recommended uses for VML

Do not use VML to Explanation

Protect personally identifiable Exact Data Matching (EDM) and Data Identifiers are the best option for protecting the
information (PII). common types of PII.

Protect binary files and Indexed Document Matching (IDM) is the best option to protect the content that is
images. largely binary, such as image files or CAD files.

Recommendations for training set definition


A VML category is the specific business use case from which you derive your example
documents for training the VML profile. The more specific the category the better the detection
results. For example, the category "Financial Documents" is not recommended because it is
too broad. A better category classification is "Sales Forecasts" or "Quarterly Earnings" because
each is particular to a specific business use case.
A VML category contains two sets of training content: positive and negative. The positive
training set contains content you want to protect; the negative training set contains content
you want to ignore. You should derive both the positive and negative training sets from the
same category of content such that all documents are thematically related.
Using entirely generic content for the negative training set, while possible, is not
recommended. While generic content produces good design-time training accuracy rates, you
cannot detect the content you want to protect at run-time with sufficient accuracy.

Note: While a completely generic negative training set is not recommended, seeding the
negative training set with some neutral-content documents does have value. See “Guidelines
for training set sizing” on page 690.

The following table provides some example categories and possible positive and negative
training sets comprising those categories.

Table 28-20 Some example categories and training sets

Category Positive training set Negative training set

Product source code Proprietary product source code Source code from open source
projects

Product formulas Proprietary product formulas Non-proprietary product information

Quarterly earnings Pre-release earnings; sales estimates; Details of published annual accounts
accounting documents

Marketing plans Marketing plans Published marketing collateral and
advertising copy

Medical records Patient medical records Healthcare documents

Customer sales Customer purchasing patterns Publicly available consumer data

Mergers and acquisitions Confidential legal documents; M&A Publicly available materials; press
documents releases

Manufacturing methods Proprietary manufacturing methods Industry standards
and research

Guidelines for training set sizing


VML is only as accurate as the example content you train it against. To use VML you do not have to
locate all the data you want to protect, nor do you have to describe it. Instead, your sample
documents must accurately represent the type of content you want to protect. They must also
represent content that you want to ignore. This content must be thematically related to the
positive content.
Higher numbers of example documents collected for training yield more accurate VML profiles.
A well-defined category of content contains 500 example documents: 250 positive and 250
negative. The minimum number of documents per training set is 50.
Ideally, you collect a similar number of negative and positive documents for training. You
should seed the negative training set with generic or neutral-content documents. The archive
file DLP_Wikipedia_sample.zip that is attached to this guide at the Symantec Support Center
is provided for this purpose.
As an example, your positive training set contains 250 example documents and your negative
training set contains 150 documents. You can add 100 to 200 generic documents to your
negative training set from the DLP_Wikipedia_sample.zip archive file. Internal testing has
shown that adding generic content to complement a well-defined negative training set can
improve accuracy for VML.
If you cannot collect enough positive documents to meet the minimum requirement, you can
upload the under-sized training set multiple times. For example, consider a case where you
have the category of content "Sales Forecasts." For this category you have collected 25 positive
spreadsheets and 50 negative documents. In this case, you can upload the positive training
set twice to reach the minimum document threshold and equal the number of negative
documents. Note that you should use this technique for development and testing purposes
only. Production profiles should be trained against at least the minimum number of documents
for both training sets.
Table 28-21 lists the minimum and recommended number of documents to include
in each training set.

Note: These training set guidelines assume an average document size of 3 KB. If you have
larger-sized documents, fewer in number may be sufficient.

Table 28-21 Training set size guidelines

Training set Minimum Recommended

Positive example documents 50 250

Negative example documents 50 250

Total number of documents for the category 100 500
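
The sizing guidelines in Table 28-21 lend themselves to a simple pre-upload check. The following Python sketch is a hypothetical helper, not a product feature: it compares document counts against the documented minimum (50) and recommendation (250), and computes how many generic documents would bring the negative training set level with the positive one.

```python
MINIMUM_PER_SET = 50       # documented minimum per training set
RECOMMENDED_PER_SET = 250  # documented recommendation per training set


def assess_training_sets(positive_count, negative_count):
    """Return notes about each set; an empty list means both meet the recommendation."""
    notes = []
    for name, count in (("positive", positive_count), ("negative", negative_count)):
        if count < MINIMUM_PER_SET:
            notes.append(f"{name}: {count} documents is below the minimum of {MINIMUM_PER_SET}")
        elif count < RECOMMENDED_PER_SET:
            notes.append(f"{name}: {count} documents is below the recommended {RECOMMENDED_PER_SET}")
    return notes


def generic_documents_to_balance(positive_count, negative_count):
    """Generic documents needed to bring the negative set level with the positive set."""
    return max(0, positive_count - negative_count)
```

For the example given earlier (250 positive and 150 negative documents), generic_documents_to_balance returns 100, the lower end of the 100 to 200 generic documents suggested above.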

Recommendations for uploading documents for training


While you can upload individual documents to the Enforce Server for training, it is recommended
that you upload a document archive (ZIP, RAR, TAR) that contains the example documents
for each training set. The maximum upload size is 30 MB. There is no training set size limit.
To gather the documents for training, it is recommended that you create a staging area. For
example, consider a category called "Sales Reports." In this case you would create a folder
called \VML\training_stage\sales_reports that represents the category. Within this folder
you would create two subfolders, one for the positive training set and the other for the negative
training set (for example: \VML\training_stage\sales_reports\positive). When you are
ready to train the profile, you compress the positive subfolder and the negative subfolder into
separate document archives. You can partition the training set across archives if you have
more than 30 MB of data to upload for a training set. Do not embed an archive within an archive.
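
The staging workflow above can be scripted. The following Python sketch uses only the standard library to compress one training set subfolder into one or more ZIP archives. It is a sketch under stated assumptions: the function name and archive naming scheme are hypothetical, and the uncompressed file size is used as a conservative stand-in for the 30 MB upload limit (a single file larger than the limit still ends up alone in an oversized archive). Existing archives in the folder are skipped so that no archive is embedded within an archive.

```python
import zipfile
from pathlib import Path

MAX_UPLOAD_BYTES = 30 * 1024 * 1024          # documented maximum upload size
ARCHIVE_SUFFIXES = {".zip", ".rar", ".tar"}  # never embed an archive in an archive


def archive_training_set(folder, out_dir, name, max_bytes=MAX_UPLOAD_BYTES):
    """Zip the documents under `folder` into `out_dir`, split into parts by size."""
    folder, out_dir = Path(folder), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    files = sorted(
        p for p in folder.rglob("*")
        if p.is_file() and p.suffix.lower() not in ARCHIVE_SUFFIXES
    )

    # Group files into batches whose combined uncompressed size fits the limit.
    batches, current, size = [], [], 0
    for path in files:
        file_size = path.stat().st_size
        if current and size + file_size > max_bytes:
            batches.append(current)
            current, size = [], 0
        current.append(path)
        size += file_size
    if current:
        batches.append(current)

    # Write one ZIP archive per batch.
    archives = []
    for index, batch in enumerate(batches, start=1):
        target = out_dir / f"{name}_{index}.zip"
        with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
            for path in batch:
                archive.write(path, path.relative_to(folder).as_posix())
        archives.append(target)
    return archives
```

You would call the function once for the positive subfolder and once for the negative subfolder, writing the archives to a location outside the staging folders, and then upload the resulting files to the corresponding training set.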

Guidelines for profile sizing


Before you train a VML profile, you can adjust the amount of memory allocated to the profile.
The amount of memory you allocate determines how many features the system models, which
in turn affects the size of the profile. The higher the memory allocation setting, the more in-depth
the feature extraction and the plotting of the model, and the larger the profile. In general, for
server-based policy detection, the recommended memory allocation setting is high, which is
the default setting.
On the endpoint, the VML profile is deployed to the host computer and loaded into memory
by the DLP Agent. (Unlike EDM and IDM, VML does not rely on two-tier detection for endpoint
policies.) Because memory on the endpoint is limited, the recommendation is to allocate low
or medium memory for endpoint policies. Internal testing has shown that reducing the memory
allocation does not reduce the accuracy of the profile and may improve accuracy in certain
situations.

Table 28-22 Memory allocation recommendations

Memory allocation Description

High Default setting generally appropriate for server-based detection.

Medium Use this setting to reduce the size of the profile.

Low Use this setting for endpoint detection.

Recommendations for accepting or rejecting a profile


When you train a VML profile against the category content, the system selects features, creates
the model, and calculates the base accuracy rates for false positives and negatives. Base
accuracy rates are calculated using a standard and generally accepted process called k-folds
evaluation. The base accuracy rates provide you with an early indicator of the quality of your
category training sets.
To illustrate how the k-folds evaluation process works, assume that you have a category with
500 total example documents: 250 positive and 250 negative. During the training run, the
system divides the training set into 10 folds. Each fold is a distinct subset of the overall training
set and contains both positive and negative example documents. The system uses nine folds
to generate a VML profile, and one fold to test the profile. Any of the folds can become the
test fold for the first round of evaluation. For the next round, the next fold in the queue becomes
the test fold. This process repeats for all 10 folds. The system performs a final training run
called the cross-fold, averages the results of all folds, and generates the final model.
On successful completion of the training process, the system displays the averaged accuracy
rates and prompts you to accept or reject the training profile. The false positive rate
is the percentage of negative test documents that are misclassified as positive. The false
negative rate is the percentage of positive test documents that are misclassified as negative.
As a general guideline, you should reject the training profile if either rate is more than 5%.

Note: You can use the log file machinelearning_training.log to evaluate per-fold training
accuracy rates.
See “Log files for troubleshooting VML training and policy detection” on page 686.
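
The k-folds evaluation described in this section can be sketched generically in Python. This is an illustration of the general technique, not the product's training code: the train and classify callables are hypothetical placeholders for whatever model is being evaluated, and dealing documents round-robin into folds is an assumed way of making sure every fold holds both positive and negative examples. The sketch covers only the averaged rates, not the final cross-fold training run.

```python
def k_fold_rates(positive_docs, negative_docs, train, classify, num_folds=10):
    """Return (avg false positive %, avg false negative %) over `num_folds` folds.

    `train(labeled_docs)` builds a model from (document, is_positive) pairs;
    `classify(model, document)` returns True for a positive classification.
    Both are hypothetical callables supplied by the caller.
    """
    if len(positive_docs) < num_folds or len(negative_docs) < num_folds:
        raise ValueError("each training set needs at least one document per fold")

    # Deal documents round-robin so every fold holds both kinds of example.
    folds = [[] for _ in range(num_folds)]
    for i, doc in enumerate(positive_docs):
        folds[i % num_folds].append((doc, True))
    for i, doc in enumerate(negative_docs):
        folds[i % num_folds].append((doc, False))

    fp_rates, fn_rates = [], []
    for test_index, test_fold in enumerate(folds):
        # Train on the other folds, then test on the held-out fold.
        training = [item for i, fold in enumerate(folds) if i != test_index for item in fold]
        model = train(training)
        positives = [doc for doc, is_positive in test_fold if is_positive]
        negatives = [doc for doc, is_positive in test_fold if not is_positive]
        false_positives = sum(1 for doc in negatives if classify(model, doc))
        false_negatives = sum(1 for doc in positives if not classify(model, doc))
        fp_rates.append(100.0 * false_positives / len(negatives))
        fn_rates.append(100.0 * false_negatives / len(positives))

    return sum(fp_rates) / num_folds, sum(fn_rates) / num_folds
```

Following the general guideline above, you would reject the profile if either returned rate is more than 5%.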

Guidelines for accepting or rejecting training results


You decide to accept or reject a training profile based on the false positive and false negative
percentages that the system displays to you at the end of the training process.
See “About the Similarity Threshold and Similarity Score” on page 667.
To better understand how the system calculates the Machine Learning Profile training set
accuracy rates, consider the following example.
You have a training set that includes 1000 documents, 500 positive and 500 negative. When
you train the profile, the system takes 90% of the documents, extracts the features, and creates
a model. It takes the remaining 10% of the documents and evaluates their features against
the model for similarity. It then produces false positive and false negative accuracy rates. This
process is known as the "fold." For each training set, the system evaluates ten folds, each
time comparing a different 10% of the documents against the 90%. At the end of the cycle,
the system performs a cross-fold evaluation of all ten folds. It then produces an average
accuracy percentage rate for both the positive and negative categories.
Assume that the result of the training process yields a base false positive rate of approximately
1.2% and a base false negative rate of approximately 1%. On average, 1.2% of the negative
documents in the training set are mis-categorized as positive, and 1% of the positive documents
in the training set are mis-categorized as negative. While the goal is 0% for both rates, in general a
percentage rate under 5% for each category is acceptable.
The percentages that are produced at the end of the training process are averages across the
10 folds. Rather than relying on the general 5% rule of thumb, the better practice is to review
the percentage rate results for each fold. To review the percentage rates, examine the log file
\ProgramData\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\logs\debug\mld0.log
(Windows) or
/var/log/Symantec/DataLossPrevention/DetectionServer/15.5/debug/mld0.log (Linux).
As shown below, the individual fold rates give a reading for each of the ten folds on which you
can base your decision to accept or reject the profile.

Table 28-23 Training set accuracy evaluation process

Fold evaluation Per fold category accuracy rates and cross-fold averages

Fold 0 false positive rate 2.013422727584839 false negative rate 0.0

Fold 1 false positive rate 1.3513513803482056 false negative rate 1.7857142686843872



Fold 2 false positive rate 1.3513513803482056 false negative rate 0.8928571343421936

Fold 3 false positive rate 1.3513513803482056 false negative rate 1.7857142686843872

Fold 4 false positive rate 1.3513513803482056 false negative rate 0.8928571343421936

Fold 5 false positive rate 1.3513513803482056 false negative rate 2.6785714626312256

Fold 6 false positive rate 0.0 false negative rate 0.0

Fold 7 false positive rate 0.6756756901741028 false negative rate 0.0

Fold 8 false positive rate 1.3513513803482056 false negative rate 0.8928571343421936

Fold 9 false positive rate 1.3513513803482056 false negative rate 1.8018018007278442

Cross-fold Avg False Positive Rate 1.214855808019638 Avg False Negative Rate
1.0730373203754424
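
The per-fold review can be reproduced with a short Python sketch. The rates below are copied from Table 28-23; the function name and the idea of flagging any fold above the 5% guideline are illustrative assumptions, not product behavior.

```python
# (false positive rate, false negative rate) for folds 0 through 9, from Table 28-23.
FOLD_RATES = [
    (2.013422727584839, 0.0),
    (1.3513513803482056, 1.7857142686843872),
    (1.3513513803482056, 0.8928571343421936),
    (1.3513513803482056, 1.7857142686843872),
    (1.3513513803482056, 0.8928571343421936),
    (1.3513513803482056, 2.6785714626312256),
    (0.0, 0.0),
    (0.6756756901741028, 0.0),
    (1.3513513803482056, 0.8928571343421936),
    (1.3513513803482056, 1.8018018007278442),
]


def summarize_folds(fold_rates, limit=5.0):
    """Return (avg FP rate, avg FN rate, indexes of folds above `limit` percent)."""
    count = len(fold_rates)
    avg_fp = sum(fp for fp, _ in fold_rates) / count
    avg_fn = sum(fn for _, fn in fold_rates) / count
    flagged = [i for i, (fp, fn) in enumerate(fold_rates) if fp > limit or fn > limit]
    return avg_fp, avg_fn, flagged
```

Calling summarize_folds(FOLD_RATES) reproduces the cross-fold averages in the table (about 1.21% and 1.07%) and flags no folds, since every individual rate is below 5%.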

Recommendations for deploying profiles


Accepted VML profiles are transferred to every detection server and Symantec DLP Agent
even if those profiles are not required by the active policies on that server or endpoint. Detection
servers load all VML profiles into memory regardless of whether or not any associated VML
policies are deployed to those servers. DLP Agents only load the VML profiles that are required
by an active policy. To optimize server performance, it is recommended that you do not deploy
(accept) unnecessary VML profiles and that you remove any accepted (deployed) VML profiles
that are not required by active policies.
In addition, when you change the Similarity Threshold, the system re-syncs the entire profile
with the detection servers and DLP Agents. If you have a large VML profile and possible
bandwidth limitations (for example, deployment to many endpoints), this may cause network
congestion. In this case you should test and tune the profile at a select few endpoints before
deploying the profile into production at every endpoint on your network.
Chapter 29
Detecting content using
Form Recognition -
Sensitive Image Recognition
This chapter includes the following topics:

■ About Form Recognition detection

■ Configuring Form Recognition detection

■ Managing Form Recognition profiles

■ Advanced server settings for Form Recognition

■ Viewing a Form Recognition incident

About Form Recognition detection


Form Recognition provides the ability to detect forms that contain sensitive information, such
as tax forms, medical forms, insurance forms, and so on.
Form Recognition detects form images in a variety of file formats, including the following:
■ Microsoft Office documents
■ PDF (version 1.2 and later only)
■ PDFs that use the AcroForms format
■ XFA (Only the hard-copy image, or the image that you would see if you printed the form,
is supported. Soft copies, such as fillable forms, are not supported. Text extraction from
XFA is also not supported.)
■ JPEG (.jpg, .jpeg)
■ PNG
■ TIFF (single page or multi-page, .tif or .tiff)
■ Bitmap (.bmp, .dib)
Form Recognition is available for Network Monitor, Network Prevent for Email, Network Prevent
for Web, and Network Discover. Form Recognition is not available for Endpoint Discover,
Endpoint Prevent, or any cloud detectors.
See “Configuring Form Recognition detection” on page 696.
See “About extracting images from Microsoft Office documents for OCR and Form Recognition”
on page 706.

How Form Recognition works


Symantec Data Loss Prevention analyzes the features of your blank forms and stores the
results as key points in the Form Recognition profile. This process is called indexing. Then
the detection server compares images in network traffic or stored in data repositories to the
forms you have indexed. The extent that the detected form matches key points in indexed
blank form is called the alignment. By default, 85% of the key points must match or align for
the form to be considered a match.
The comparison between the detected image and the indexed blank form also allows Symantec
Data Loss Prevention to determine how much of the form has been filled in. The fill threshold
is represented as a range from 1-10, where 1 is a minimally filled-in form, and 10 is an entirely
filled-in form. You use the fill threshold to specify when Symantec Data Loss Prevention creates
an incident. A low fill threshold creates more incidents by detecting partially filled-in,
electronically fillable forms with at least one check-box filled, or incomplete forms. A high fill
threshold creates fewer incidents, but may not catch all possible data loss. A fill threshold of
0 detects all matching forms, including blank forms. By default, the fill threshold for a Form
Recognition profile is 1. You can specify another value when you create a profile. You can
also adjust this value for an existing profile to fine-tune your detection results.
See “Configuring Form Recognition detection” on page 696.
See “Managing Form Recognition profiles” on page 700.
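
The way alignment and the fill threshold interact can be sketched as a small decision function. This is an illustrative model only, not the product's implementation: the function name, the 0.0-1.0 alignment scale, and the use of a simple "greater than or equal" comparison for the fill level are assumptions made for the example.

```python
DEFAULT_MIN_ALIGNMENT = 0.85  # default: 85% of the key points must align
DEFAULT_FILL_THRESHOLD = 1    # default fill threshold for a new profile


def creates_incident(alignment, fill_level,
                     fill_threshold=DEFAULT_FILL_THRESHOLD,
                     min_alignment=DEFAULT_MIN_ALIGNMENT):
    """Return True if a detected image would be reported under this toy model.

    alignment  -- fraction (0.0-1.0) of indexed key points that align (assumed scale)
    fill_level -- estimated fill on the 0-10 scale, where 0 means a blank form
    """
    if not 0 <= fill_threshold <= 10:
        raise ValueError("fill threshold must be between 0 and 10")
    if alignment < min_alignment:
        return False  # the image is not considered a match for the form
    return fill_level >= fill_threshold


# A blank form that aligns well is reported only when the threshold is 0.
print(creates_incident(0.92, 0, fill_threshold=0))  # True
print(creates_incident(0.92, 0))                    # False
```

In this model, raising the fill threshold can only reduce the number of reported forms, which matches the trade-off described above.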

Configuring Form Recognition detection


To configure Form Recognition, you collect blank copies of the forms that you want to protect
and add them to a ZIP archive of single-page PDF files. This ZIP archive is called a Gallery Archive.
You then upload your gallery archive to a Form Recognition profile on the Enforce Server for
indexing. The Enforce Server indexes your forms and pushes the index out to your detection
servers. You also specify the fill threshold for the profile: the fill threshold specifies how much
of the form must be filled to trigger an incident.

Table 29-1 provides a high-level workflow for configuring Form Recognition detection:

Table 29-1 Form Recognition workflow

Step  Action                                                      More information

1     Collect and prepare blank copies of the forms you want      See “Preparing a Form Recognition
      to protect.                                                 Gallery Archive” on page 697.

2     Configure a Form Recognition profile. Specify the Gallery   See “Configuring a Form Recognition
      Archive with the forms you want to detect and a Fill        profile” on page 698.
      Threshold for creating incidents.

3     Configure a policy with a Form Recognition detection or     See “Configuring the Form Recognition
      exception rule using your Form Recognition profile.         detection rule” on page 699.
                                                                  See “Configuring the Form Recognition
                                                                  exception rule” on page 700.

Preparing a Form Recognition Gallery Archive


The Form Recognition gallery archive is a ZIP archive containing single-page PDF copies of
the blank forms you want to protect. You use the gallery archive to create a Form Recognition
profile.
Symantec recommends that you index no more than 500 total images across all Form
Recognition profiles. To improve performance, Symantec recommends creating fewer profiles
that contain more forms, rather than more profiles that contain fewer forms.
For best results, ensure that the form images in your gallery archive meet the following
guidelines:
■ The PDF files containing the form images should be at least 200 DPI.
■ Forms with electronically fillable fields must be in AcroForm format. Other interactive form
formats are not supported for detection.
■ Each form should have a sufficient amount of text and graphical content. Sparse forms
may cause more false matches.
■ Each form should contain unique content. Forms that share very similar content are harder
to match and may cause more false matches. For example, tax forms from 2014 and 2015
would share many similar features, and would be difficult to detect if they were in the same
profile.
■ Each form should have content evenly distributed across the page. Forms with clustered
content and sparse areas are more difficult to match.
■ Each form should have either white or light-colored backgrounds. Black or dark backgrounds
are not supported.

To prepare a Form Recognition Gallery Archive


1 Collect blank copies of the forms you want to detect.
2 Save all blank copies of forms as PDF files. Consider the following guidelines as you
prepare PDF files:
■ The gallery must only contain PDF files. Symantec Data Loss Prevention ignores any
other folders and files in the ZIP archive.
■ If a form has two or more pages, separate them into single-page files, then convert to
PDF format.
For example, if your form is a single three-page Microsoft Word file titled
YourForm.docx, separate the file into three separate single-page files, then convert
them to PDF:
■ YourForm_1of3.PDF

■ YourForm_2of3.PDF

■ YourForm_3of3.PDF

■ If your form contains electronically fillable fields, use a PDF editing tool for the
conversion process that retains AcroForms formatting, for example Adobe Acrobat.
■ If your form includes several pages of un-fillable boilerplate, only add the fillable pages
to your gallery archive.

3 Add all single-page PDF files to a ZIP archive.
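
Step 3 can be scripted. The following is a minimal sketch using only the Python standard library; the function name and the folder and archive paths are hypothetical. It copies only PDF files into a flat ZIP archive. It does not check that each PDF is a single page or that it meets the 200 DPI guideline, so those checks remain manual.

```python
import zipfile
from pathlib import Path


def build_gallery_archive(pdf_dir, archive_path):
    """Zip every .pdf file found directly in pdf_dir; return the names added."""
    pdf_dir = Path(pdf_dir)
    added = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for pdf in sorted(pdf_dir.iterdir()):
            # Skip folders and non-PDF files so the gallery contains PDFs only.
            if pdf.is_file() and pdf.suffix.lower() == ".pdf":
                archive.write(pdf, arcname=pdf.name)  # flat layout, no folders
                added.append(pdf.name)
    return added
```

For example, `build_gallery_archive("blank_forms", "gallery.zip")` would collect the PDF files from a folder named `blank_forms` (an illustrative name) into `gallery.zip`, ready to upload to a profile.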

Configuring a Form Recognition profile


Configure a Form Recognition profile by uploading a Gallery Archive and specifying a Fill
Threshold.
See “Preparing a Form Recognition Gallery Archive” on page 697.
To configure and index a Form Recognition profile
1 Navigate to Manage > Data Profiles > Form Recognition to display the Form
Recognition Profiles screen.
2 Click Add Profile to display the Configure Form Recognition Profile page.
3 Enter a name for the profile in the Name field.

Note: The name you enter is used when you configure policies and appears in the incident
snapshot for Form Recognition incidents.

4 (Optional) Enter a description for the profile in the Description field.


5 Enter a value in the Fill Threshold field.


The fill threshold is a range from 1-10, where 1 represents a form that has been filled in
minimally, and 10 a form that has been filled in completely. You can also enter 0 to detect
blank forms.

Note: For electronically filled forms, entering 1 for the fill threshold detects any electronically
filled item on a form. For example, setting the threshold to 1 detects a single selected
check-box. In contrast, setting the threshold to 1 may not detect a similar check-box that
has been filled in using a pen.

6 Upload the gallery archive by clicking Browse and selecting the gallery archive ZIP file.
7 Click Save to begin indexing the profile.
When the gallery completes indexing, you can use it to configure a Form Recognition rule
in a policy.
See “Configuring the Form Recognition detection rule” on page 699.

Configuring the Form Recognition detection rule


You configure the detection rule by specifying a Form Recognition profile.
See “Configuring a Form Recognition profile” on page 698.
The indexed forms in the profile are compared against detected forms to determine if the forms
match. The Form Recognition rule matches on attachments only.
To configure the Form Recognition detection rule
1 Go to Manage > Policies > Policy List, click New, and create a new blank policy or policy
from a template.
See “Adding a new policy or policy template” on page 412.
2 Click Add Rule on the Detection tab to display the Configure Policy - Add Rule page.
3 Select Detect using Form Recognition Profile in the Form Recognition section
and select the Form Recognition profile that contains the forms you want to protect.
4 Click Next to display the Configure Policy - Edit Rule page.
5 Enter a name for the rule in the Rule Name field.
6 Choose the rule severity.
See “Policy severity” on page 374.
7 Select the conditions for the Form Recognition detection rule.
You can use the Also Match field to configure compound match conditions. See
“Compound conditions” on page 394.
8 Click OK to add the detection rule.
9 Click Save to apply the detection rule to the policy.
The new policy displays in the Policy List.

Configuring the Form Recognition exception rule


You configure the exception rule by specifying a Form Recognition profile.
See “Configuring a Form Recognition profile” on page 698.
To configure the Form Recognition exception rule
1 Go to Manage > Policies > Policy List, click New, and create a new blank policy or policy
from a template.
See “Adding a new policy or policy template” on page 412.
2 Click Add Exception on the Detection tab to display the Configure Policy - Add
Exception page.
3 Select Detect using Form Recognition Profile in the Form Recognition section and
select the Form Recognition profile that contains the forms you want to protect.
4 Click Next to display the Configure Policy - Edit Exception page.
5 Enter a name for the exception in the Exception Name field.
6 Select the conditions for the Form Recognition exception rule.
You can use the Also Match field to configure compound match conditions. See
“Compound conditions” on page 394.
7 Click OK to add the exception rule.
8 Click Save to apply the exception rule to the policy.
The new policy displays in the Policy List.

Managing Form Recognition profiles


The Form Recognition Profiles screen (Manage > Data Profiles > Form Recognition)
provides a summarized view of all Form Recognition profiles. You can use this screen to
confirm that a profile was indexed successfully, view the indexing status, and so on.

Table 29-2 Form Recognition Profiles details

Element Description

Add Profile Click Add Profile to configure a new Form Recognition profile.
See “Configuring a Form Recognition profile” on page 698.

Show Entries Select a value from Show Entries to specify the number of profiles
you can view on this page.

Page navigation You can use the following buttons to change the view of profiles:

■ Click Last to view profiles with the most recent dates in ascending
order.
■ Click a number to navigate to that specific page number.
■ Click Next to view the next page.
■ Click Previous to view the previous page.

Profile Name Click the Profile Name to view or edit the profile.
Note: You can sort column data in ascending order (A-Z/1-3) by
clicking the up arrow or descending order (Z-A/3-1) by clicking the
down arrow.

Description The profile description. You can edit the description by clicking the
profile name or the pencil icon in the Actions column.

State Each profile displays one of the following states:

■ Gallery missing or invalid displays when indexing for the profile
failed. The gallery did not upload because the ZIP archive is invalid.

■ Indexing not started displays when indexing for the profile did not
start. The uploaded gallery did not process.
■ Indexing in progress displays when the uploaded gallery is
indexing.
■ Profile indexed displays when indexing for this profile is complete
and the index successfully created.
■ Invalid gallery displays when indexing for the profile failed. The
uploaded gallery did not start indexing because it is invalid.
■ Index contains no images displays when indexing for the profile
failed. The uploaded gallery did not index because it contains no
compatible files.
■ Indexing failed displays when indexing for this profile failed. The
uploaded gallery was not indexed.
■ Indexing found some unusable files displays when indexing for
the profile completes with errors. Some of the files in the uploaded
gallery cannot be indexed.

Gallery The gallery archive name.
You cannot edit the gallery name. You can upload a new gallery or an
existing gallery that has been renamed by clicking the profile name or
the pencil icon in the Actions column.

Usable Forms Count The total number of form images in the gallery that have been indexed
without errors and can be used in a policy.

Date Indexed The date when the profile was last indexed.

Index Version The version number of the index.

Fill Threshold The fill threshold value you provided when you configured the Form
Recognition profile. You can edit this value by clicking the profile name
or the pencil icon in the Actions column.

Actions Click the Pencil to edit profile details.

Click the red X to delete a profile. If you delete a profile, the system
removes the profile metadata and gallery from the Enforce Server.
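
When you review many profiles, it can help to group the State values from Table 29-2 by the follow-up they call for. The sketch below does this in Python. The state strings come from the table; the grouping and the suggested follow-ups are one reading of the descriptions, and the helper name is hypothetical.

```python
# State strings as listed in Table 29-2. The grouping is an interpretation.
READY = {"Profile indexed"}
PARTIAL = {"Indexing found some unusable files"}
NOT_FINISHED = {"Indexing not started", "Indexing in progress"}
FAILED = {
    "Gallery missing or invalid",
    "Invalid gallery",
    "Index contains no images",
    "Indexing failed",
}


def triage(state):
    """Map a profile state to a suggested follow-up (hypothetical helper)."""
    if state in READY:
        return "ready to use in a policy"
    if state in PARTIAL:
        return "usable, but check the Usable Forms Count"
    if state in NOT_FINISHED:
        return "check the state again later"
    if state in FAILED:
        return "fix the gallery archive and upload it again"
    raise ValueError(f"unknown state: {state!r}")
```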

Advanced server settings for Form Recognition


Some of the default Form Recognition server settings might require testing and fine-tuning to
determine what works best for your needs. You can modify these settings on the System >
Servers and Detectors > Overview > Server/Detector Detail - Advanced Settings page.
Symantec recommends that you contact Symantec Technical Support before modifying any
advanced server settings.
There are nine advanced settings related to Form Recognition:
■ ContentExtraction.ImageExtractorEnabled
■ ContentExtraction.MaxNumImagesToExtract
■ FormRecognition.ALIGNMENT_COEFFICIENT
■ FormRecognition.CANONICAL_FORM_WIDTH
■ FormRecognition.MAXIMUM_FORM_WIDTH
■ FormRecognition.MINIMUM_FORM_ASPECT_RATIO
■ FormRecognition.MINIMUM_FORM_WIDTH
■ FormRecognition.OPENCV_THREADPOOL_SIZE
■ FormRecognition.PRECLASSIFIER_ACTION
You can see details about these settings here:
See “Advanced server settings” on page 285.

Viewing a Form Recognition incident


You view and remediate Form Recognition incidents as you would any Symantec Data Loss
Prevention incident. See “About incident remediation” on page 1841.
In addition to the usual incident snapshot information, Form Recognition incidents include:
■ Yellow highlighted areas on the form, which indicate form elements that align and electronic
fields that have been filled.
■ Orange highlighted areas on the form, indicating questionable areas.
■ A Similarity Score which indicates how similar the form elements are. The higher the
score, the more statistically similar the field contents are to the form fields.
Chapter 30
Detecting Content using
OCR - Sensitive Image
Recognition
This chapter includes the following topics:

■ About content detection with OCR Sensitive Image Recognition

■ OCR Server system requirements

■ Using diagnostics for sizing OCR Server deployments

■ Creating a null policy to assist in OCR diagnostics for Discover Servers

■ Using the OCR Server Sizing Estimator spreadsheet

■ Setting up OCR Servers

■ Installing an OCR Sensitive Image Recognition license

■ Creating an OCR configuration

■ Using the OCR engine

■ More about languages and Dictionaries

■ Viewing OCR incidents in reports

■ Advanced Server settings and Troubleshooting for Sensitive Image Recognition content
extraction
About content detection with OCR Sensitive Image Recognition

OCR (optical character recognition) Sensitive Image Recognition provides the capability to
extract text from images (scanned documents, screenshots, pictures, Microsoft Office
documents, and so on) and from PDFs, enabling you to use new or preexisting text-based
detection rules on this content.
The extracted text then enters the detection chain and is processed identically to conventionally
extracted text. Incident snapshots for OCR text are similar to those for conventionally extracted
text: the text excerpt is displayed, with the detected words highlighted. OCR incidents have
visual indicators denoting that the text came from OCR, and a thumbnail of the original image.
You can set up OCR to use various languages. To improve recognition results, you can also
choose a specialized dictionary (such as legal, financial, or medical) to enable supplemental
spell checking. You can also set up a customized dictionary to deal with proper nouns or other
terms specific to your business.
While OCR content extraction can integrate with both Windows and Linux detection servers,
Symantec supports installing the OCR Server on Windows servers only. OCR content extraction
is not supported on the Windows Agents, macOS Agents, the Data Loss Prevention cloud
services, or the Data Loss Prevention appliances (both virtual and physical). For information
on supported versions of Windows servers, see the Symantec Data Loss Prevention System
Requirements Guide at
http://www.symantec.com/docs/DOC10602
See “Installing an OCR Sensitive Image Recognition license” on page 711.

Detection types supported for OCR extraction


The following detection types are supported for OCR extraction:
■ Network Monitor
■ Network Prevent for Email
■ Network Prevent for Web
■ Network Discover
■ Cloud Prevent for Office 365 on Azure

File types supported for OCR extraction


Images of the following file types are extracted and sent to OCR:
■ JPEG (.jpg, .jpeg)
■ PNG
■ TIFF (single page or multi-page, .tif or .tiff)
■ Bitmap (.bmp)
■ Images extracted from PDF files, such as pages from a scanned document.
■ Images extracted from Microsoft Office documents.

About extracting images from Microsoft Office documents for OCR and Form Recognition

You can extract images from Microsoft Office documents for OCR and Form Recognition
detection in Symantec Data Loss Prevention 15.5. Data Loss Prevention can extract image file
formats including BMP, PNG, and JPG from Word, Excel, and PowerPoint documents. This
capability is dynamically enabled by default and can be disabled or statically enabled by
changing the ContentExtraction.ImageExtractorEnabled advanced setting.

See “Advanced Server settings and Troubleshooting for Sensitive Image Recognition content
extraction” on page 715.
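
To see which embedded images a test document actually contains, you can inspect it outside the product. The sketch below relies on the fact that Office Open XML files (.docx, .xlsx, .pptx) are ZIP packages that store embedded media under word/media/, xl/media/, and ppt/media/. It is a standalone aid for preparing test data, not a description of how Data Loss Prevention extracts images, and it does not handle the older binary .doc, .xls, or .ppt formats. The function name is hypothetical.

```python
import zipfile

IMAGE_SUFFIXES = (".bmp", ".png", ".jpg", ".jpeg")
MEDIA_FOLDERS = ("word/media/", "xl/media/", "ppt/media/")


def list_embedded_images(office_file):
    """Return the package paths of BMP, PNG, and JPG images in an OOXML file."""
    with zipfile.ZipFile(office_file) as package:
        return [
            name for name in package.namelist()
            if name.startswith(MEDIA_FOLDERS)
            and name.lower().endswith(IMAGE_SUFFIXES)
        ]
```

A document that returns an empty list here has no embedded images in the formats named above, so it is a poor candidate for testing OCR or Form Recognition on Office content.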

OCR Server system requirements


The OCR (optical character recognition) Server has specific hardware, operating system, and
server settings requirements, different from the Data Loss Prevention Enforce Server and
detection servers. You can find the latest information on these requirements in the article
"Symantec Data Loss Prevention OCR Server System Requirements and OCR Server Sizing
Estimator" at the Symantec Support Center at http://www.symantec.com/docs/doc10612.html
See “Using diagnostics for sizing OCR Server deployments” on page 706.

Using diagnostics for sizing OCR Server deployments


When you enable OCR.RECORD_REQUEST_STATISTICS on a given detection server, the detection
server starts logging. It collects metrics on the images that it encounters that are suitable for
OCR submission. Not all images that the detection server encounters are suitable for OCR
submission. For example, images that have the wrong dimensions or are unlikely to contain
text that can be transcribed are not submitted to OCR for processing.
You can measure the proportion of files and messages that Data Loss Prevention inspects
and that contain images that can be submitted to OCR. The resulting metrics can be used to
help you properly size and scale your OCR Server deployment. First, you need to set the
OCR.RECORD_REQUEST_STATISTICS Advanced Server setting to true. Then, Symantec
recommends that you allow the detection server to operate normally for one calendar week.
The system collects metrics on the images that are encountered and logs the results in the
OcrRequestsRecord0.log for the last 24 hours. If you let the server run for one calendar week,
you can plot the “trailing 24 hour” data over this longer interval. This longer run enables you
to see the peaks and valleys of your potential OCR image load. During this process, no incidents
are created and only the images that are suitable for submission to OCR are counted.

Note: You do not need the Symantec Data Loss Prevention Sensitive Image Recognition
add-on license to use this feature. You can estimate sizing requirements for an OCR Server
deployment in advance of purchasing the DLP Sensitive Image Recognition add-on license
that includes the OCR feature.

Figure 30-1 is a sample of an OcrRequestsRecord0.log showing a snapshot of the results.
In it you can see samples of the values that you can enter in the OCR Server Sizing Estimator
spreadsheet to help you to size your OCR Server deployment.

Figure 30-1 Sample OcrRequestsRecord0.log results

After you run the OCR diagnostics, set OCR.RECORD_REQUEST_STATISTICS to false to
disable logging to the OcrRequestsRecord0.log file.
To run diagnostics for OCR sizing for the Network Prevent for Email, Network Prevent for Web,
and Network Monitor data-in-motion channels
1 Go to System > Servers and Detectors > Overview and select a detection server.
2 Click Server Settings.
3 Set OCR.RECORD_REQUEST_STATISTICS to true.
4 Click Save.
5 Restart the detection server.
6 Let the detection server run for a week and collect metrics. This process works best for
the data in motion channels, such as Network Prevent for Email, Network Prevent for
Web, and Network Monitor.
7 Consult the OcrRequestsRecord0.log to get the values to enter in the OCR Server Sizing
Estimator spreadsheet.
8 Go to the OCR Server Sizing Estimator spreadsheet at
https://www.symantec.com/docs/DOC10612.
9 Enter data in the green cells from the log for the following values:
Percentage of messages containing images requiring OCR (OCR messages)
Estimated average number of images per OCR message
10 The spreadsheet calculates the number of OCR Servers that you need to deploy for the
image traffic of each detection server in your Symantec Data Loss Prevention deployment.
11 Set OCR.RECORD_REQUEST_STATISTICS to false to disable logging.
You use a different technique for estimating OCR Server sizing requirements for Network
Discover. See “Creating a null policy to assist in OCR diagnostics for Discover Servers”
on page 708.
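
The two spreadsheet inputs named in step 9 are simple ratios. If you need to derive them from raw counts, the arithmetic can be sketched as follows. The counter names are hypothetical and are not field names from OcrRequestsRecord0.log, whose layout is not reproduced here. If your log already reports the percentage and the average directly, use those values instead.

```python
def sizing_inputs(total_messages, ocr_messages, ocr_images):
    """Compute the two values requested by the sizing spreadsheet.

    total_messages -- messages (or files) inspected during the sample period
    ocr_messages   -- messages that contained at least one OCR-suitable image
    ocr_images     -- total OCR-suitable images across those messages
    """
    if total_messages <= 0 or ocr_messages <= 0:
        raise ValueError("counts must be positive")
    pct_ocr_messages = 100.0 * ocr_messages / total_messages
    avg_images_per_ocr_message = ocr_images / ocr_messages
    return pct_ocr_messages, avg_images_per_ocr_message


# Illustrative numbers only: 5,000 of 200,000 messages held 12,500 images.
print(sizing_inputs(200_000, 5_000, 12_500))  # (2.5, 2.5)
```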

Creating a null policy to assist in OCR diagnostics for Discover Servers

When you enable OCR.RECORD_REQUEST_STATISTICS on a given detection server, the detection
server starts logging. It collects metrics on the images that it encounters that are suitable for
OCR submission. Not all images that the detection server encounters are suitable for OCR
submission. For example, images that have the wrong dimensions or are unlikely to contain
text that can be transcribed are not submitted to OCR for processing.
For Network Discover, you can directly measure the proportion of images suitable for submission
to OCR for each Discover scan target by enabling the OCR.RECORD_REQUEST_STATISTICS
advanced setting before you run a scan against that target. To expedite the scan process,
Symantec recommends binding a null policy to the Discover scan target.
Creating a null policy group
1 Go to System > Servers and Detectors > Policy Groups.
2 Click Add.
3 In the Name field, enter a name, such as Null Group, and then enter a description.
4 Set the Policy Group to Null.
5 Check all boxes under Servers and Detectors to assign the policy group to all servers.
6 Click Save.

Creating a null policy that is suspended


1 Go to Manage > Policies > Policy List.
2 Click New.
3 Click Add a blank policy.
4 Enter a name for the policy, such as Null Policy.
5 Set the Policy Group to Null.
6 Set the Status to Suspended.
7 On the Detection tab, click Add Rule then add an existing rule such as Message
Attachment or File Name Match or Message Attachment or File Type Match with no
exceptions to the policy.
Bind the suspended policy to the Null policy group and run the scans
1 Go to Manage > Discover Scanning > Discover Targets.
2 Click New Target, and select File System from the pull-down menu. On the General
tab, type the Name of the Discover target.
3 In the General section, create a new scan named, for example, "Fileshare Scan."
4 Select the Null Policy policy group.
5 Under Scan Execution select Always scan all items.
6 Indicate the targets that you want to scan.
7 Under Scan Schedule, schedule the scans.
When the scans are invoked, Discover crawls all of the scan targets. Files that are detected
in the pipeline are analyzed and metrics for images are collected. Make sure that
OCR.RECORD_REQUEST_STATISTICS is enabled. However, incidents are not generated
since there’s no active policy that is associated with the scans.
The scans take time, since crawling remote repositories is a time-consuming operation,
but the crawling goes faster than normal since no policies are executed.
8 After the scan operation is complete, unbind the null policy group from the scans and
re-bind the appropriate policy groups.
9 Consult the OcrRequestsRecord0.log to get the values to enter in the OCR Server Sizing
Estimator spreadsheet. See Figure 30-1 on page 707.
10 Go to the OCR Server Sizing Estimator spreadsheet at
https://www.symantec.com/docs/DOC10612.
11 Enter the data from the log into the green cells in the spreadsheet for the following values:
Percentage of messages containing images requiring OCR (OCR messages)
Estimated average number of images per OCR message
12 The spreadsheet calculates the number of OCR Servers that you need to deploy for the
image traffic of each detection server in your Symantec Data Loss Prevention deployment.
13 Set OCR.RECORD_REQUEST_STATISTICS to false to disable logging.
See “Using the OCR Server Sizing Estimator spreadsheet” on page 710.

Using the OCR Server Sizing Estimator spreadsheet


The OCR Server Sizing Estimator spreadsheet can help you to estimate how many OCR
Servers you need for each detection server in your deployment. The spreadsheet and directions
on how to use it are available at the Symantec Support Center at
https://www.symantec.com/docs/doc10612.html
See “Setting up OCR Servers” on page 710.

Setting up OCR Servers


OCR content extraction also requires installation of an OCR Server. You configure the OCR
Server (micro service) from the Enforce Server administration console. Symantec recommends
that you install the OCR Server on dedicated hardware, or on VMs with dedicated resources,
because of its high processing requirements. A certificate for communication between the
OCR client on the Enforce Server and the OCR Server is also required.
The OCR Server is an independent server, separate from any Data Loss Prevention detection
server. You can configure the detection server to talk to an OCR address (IP address or host
name). That address can either be a single OCR Server, or a single load balancer in front of
several OCR Servers. You can use an external load balancer or another technology, such as
Windows Network Load Balancing.
Note: A detection server can be configured with only a single OCR Server address. That
address can be the IP address or host name of a single OCR Server, or the virtual IP address
of a load balancer (or pair of load balancers) that front-ends multiple OCR Servers. To have
a detection server communicate with a pool of OCR Servers, you must place the pool behind
a load balancer that provides that single address. In addition, Symantec supports only load
balancers without persistence enabled.
In the single OCR Server case, it can be installed on a separate computer, or on the same
computer as the detection server (not recommended). Configuration information is included
with the request, so OCR Servers can service requests from different detection servers that
are configured differently.
For example, you can configure one detection server to detect English with the highest possible
OCR accuracy. Then, you can configure another detection server to detect Japanese, with
the highest possible speed. In this case, the same OCR Server is able to handle both types
of requests. Symantec recommends that you install the OCR Server on a computer separate
from the detection server. However, Symantec supports co-locating of the OCR Server with
a detection server.
You install an OCR Server using the Symantec DLP OCR Server Installer setup wizard.
To install an OCR Server
1 Open the OCR Server Installer.
2 Double click OCRServerInstaller64.
3 Click Next.
4 Select desired Destination directory. Click Next. The installer runs.
5 Click Finish when the installation is complete.
Now the OCR service is running and is ready to receive OCR requests.
See “Creating an OCR configuration” on page 711.
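
After installation, you may want to confirm from a detection server that the OCR address is reachable before you create an OCR configuration. The sketch below attempts a plain TCP connection. It assumes the default port of 8555 that is described in the OCR engine configuration steps, and the function name is hypothetical. It confirms only that something is listening at the address. It does not validate the certificate or send an OCR request.

```python
import socket


def ocr_port_reachable(host, port=8555, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `ocr_port_reachable("ocr-lb.example.com")` checks a load balancer address (an illustrative host name) on the default port.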

Installing an OCR Sensitive Image Recognition license


When you first purchase Symantec Data Loss Prevention, upgrade to a later version, or
purchase additional product modules, you must install one or more Symantec Data Loss
Prevention license files. To use OCR (optical character recognition), you must install the
Symantec Data Loss Prevention OCR Sensitive Image Recognition license. License files have
names in the format name.slf.
For more information on adding a license to Symantec Data Loss Prevention, see “Installing
a new license file” on page 234.
See “OCR Server system requirements” on page 706.

Creating an OCR configuration


Adding an OCR profile
1 Go to System > Settings > OCR Engine Configuration.
2 Click Add OCR Engine Configuration.

Configuring the OCR Engine


1 Enter the Name of the profile.
2 Enter an optional Description of the profile.
3 Enter the OCR server hostname of the server where the OCR requests should be sent.
It can be a single load balancer or an individual OCR Server.
4 Enter the Port number of the port where requests should be sent. The default port is 8555.
5 Enter the OCR Engine timeout (seconds) value. This setting defines how long before
an OCR request should be timed out. The default timeout is 30.
The timeout is how much time the request is allowed to spend inside the OCR (optical
character recognition) Server, and does not include transit time or other delays.
The timeout needs to be set with the other content timeout settings in the Advanced
Settings. As with other content extraction operations, if the timeout is reached, the OCR
component is skipped and the previously extracted content moves on to detection.

6 Enter a value for Accuracy vs speed. By default, the OCR Server sets the value
dynamically for each document. The Sensitive Image Recognition pre-classifier on the
detection server inspects each image and determines if it is suitable for OCR content
extraction (and Form Recognition). It then determines which preset is most appropriate. If
you uncheck this box, you can select a preset to use for all images. You can choose from
Accurate, Balanced, or Fast. This strategy can be appropriate for Discover scans, where
accuracy is prioritized over time.
7 In the Supported Languages section, select the candidate languages for OCR.
You can select one or more languages, and then the OCR Server selects a language
from that pool to use for the image. Symantec assumes that documents are primarily one
language (for example, all French, or all English, as opposed to mixed English and French).
The number of languages should be as small as possible. The more languages you select,
the slower the processing speed.
Even if a language is not selected, you may still get accurate text from that language. For
example, you can select English and German and submit a mixed English-French image
to the OCR Server. It may choose English and still return some French text. The language
selection affects which spell-check dictionary to use. It also affects the pool of characters
to choose from if a character in the image is unclear.
8 In the Languages and Dictionaries Specialized Dictionaries section, you enable
supplemental spell checking for different businesses (legal, financial, medical) across
different languages.
9 In the Languages and Dictionaries Custom Dictionary section, specify the name of
your custom dictionary file to aid recognition accuracy. For example, if certain proper
nouns give the OCR Server difficulty, you can place them in this custom dictionary.
Using Dictionaries and spell checking improves recognition results for low-quality scans
and images (such as faxes). If the characters are crisp and clean they are easier for the
engine to read, and the Dictionaries are less useful.
10 The custom dictionary is a text file, with one entry per line. This text file must be placed
in the dictionary directory of each server at c:\Symantec\DLPOCR\Protect\bin.
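As an illustration of the file format described above (a text file with one entry per line), the following minimal Python sketch builds such a file from a list of terms. The function name, the sample file name, and the UTF-8 encoding are assumptions made for this example; they are not taken from the product documentation.

```python
from pathlib import Path


def write_custom_dictionary(terms, path):
    """Write terms to a plain-text file, one entry per line.

    Blank entries and duplicates are skipped. The UTF-8 encoding is an
    assumption for this sketch, not a documented requirement.
    """
    seen = set()
    entries = []
    for term in terms:
        entry = term.strip()
        if entry and entry not in seen:
            seen.add(entry)
            entries.append(entry)
    # One entry per line, each line terminated by a newline.
    Path(path).write_text("".join(e + "\n" for e in entries), encoding="utf-8")
    return entries


if __name__ == "__main__":
    # "custom_terms.txt" is a hypothetical file name.
    print(write_custom_dictionary(["Vontu", " Symantec ", "", "Vontu"], "custom_terms.txt"))
```

The resulting file would then be copied to the dictionary directory on each server, as described in the step above.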
Assign a profile to a detection server
1 Go to System > Servers and Detectors > Overview.
2 Select a monitor.
3 On the Server/Detector Detail page, click Configure.
4 On the Configure Server page, click OCR Engine. In OCR Engine Configuration select
the configuration that you want to use for the server.
5 Click Save.
See “Using the OCR engine” on page 713.

Using the OCR engine


You can see all of your OCR configurations and add an OCR Engine configuration on the OCR
Engine Configuration page. On this page you can:
■ Click Add OCR Engine Configuration to add a new configuration.
■ Click the name of the configuration or the pencil icon to edit an existing configuration.
■ Click the red X to delete a configuration.
See “Server configuration—basic” on page 705.
See “Viewing OCR incidents in reports” on page 715.

More about languages and Dictionaries


Instead of choosing from a pool of languages, the OCR Server assumes that all selected
languages may be in the image. This is a good strategy for the mixed language document use
case, but selecting more than four languages is not recommended, as it can adversely affect
both speed and accuracy.

Specialized Dictionaries available for OCR content extraction


The following specialized Dictionaries are available for OCR content extraction:
■ Dutch Legal Dictionary
■ Dutch Medical Dictionary
■ English Financial Dictionary
■ English Legal Dictionary
■ English Medical Dictionary
■ French Legal Dictionary
■ French Medical Dictionary
■ German Legal Dictionary
■ German Medical Dictionary

Languages supported for OCR extraction


The following languages are supported for OCR extraction:
■ Arabic
■ Chinese (Simplified)
■ Chinese (Traditional)
■ Czech
■ Danish
■ Dutch
■ English
■ Finnish
■ French
■ German
■ Greek
■ Hungarian
■ Italian
■ Japanese
■ Korean
■ Norwegian
■ Polish
■ Portuguese
■ Portuguese (Brazilian)
■ Romany
■ Russian
■ Spanish
■ Swedish
■ Turkish
Other languages can be detected if they use supported character sets.

Viewing OCR incidents in reports


OCR incidents are flagged and detected text is highlighted in yellow in incident reports.
Thumbnails of the page are included in the incident. Clicking on the thumbnail enables you to
view a larger version of the image. This image contains the extracted text that violates the
Symantec Data Loss Prevention policy.

Advanced Server settings and Troubleshooting for Sensitive Image Recognition content extraction


The following tables detail Advanced settings and troubleshooting tips for Sensitive Image
Recognition content extraction.

Table 30-1 Advanced settings for OCR and FR image extraction

Advanced setting: ContentExtraction.ImageExtractorEnabled = 1
State: Default value. Dynamically enables and disables extraction of images, with no restarts required.
Behavior:
■ Enabled when a Form Recognition rule is present and enabled, or when an OCR configuration is assigned to the Monitor.
■ Disabled when no Form Recognition rule is present and enabled and no OCR configuration is assigned to the Monitor.

Advanced setting: ContentExtraction.ImageExtractorEnabled = 2
State: Always enabled
Behavior: Images are extracted.

Advanced setting: ContentExtraction.ImageExtractorEnabled = 0
State: Always disabled
Behavior: No images are extracted.

Advanced setting: ContentExtraction.MaxNumImages_to_Extract = 10
State: Set to 10 by default.
Behavior: The first 10 images are extracted. You can change this setting to any value.

Note: You must restart the server when you change Advanced settings.
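The behavior of the settings in Table 30-1 can be summarized as a small decision function. The following Python sketch is only a model of the documented behavior for reasoning about configurations; the function and parameter names are hypothetical and do not correspond to any product API.

```python
def images_are_extracted(setting, fr_rule_enabled, ocr_config_assigned):
    """Model ContentExtraction.ImageExtractorEnabled as described in Table 30-1.

    setting: 0 (always disabled), 1 (dynamic, the default), or 2 (always enabled).
    fr_rule_enabled: a Form Recognition rule is present and enabled.
    ocr_config_assigned: an OCR configuration is assigned to the monitor.
    """
    if setting == 0:
        return False
    if setting == 2:
        return True
    if setting == 1:
        # Dynamic mode: extraction follows the presence of FR rules or OCR configs.
        return fr_rule_enabled or ocr_config_assigned
    raise ValueError("setting must be 0, 1, or 2")


def number_of_images_extracted(images_in_document, max_images=10):
    """Model the image cap: only the first max_images images are extracted."""
    return min(images_in_document, max_images)


if __name__ == "__main__":
    print(images_are_extracted(1, fr_rule_enabled=False, ocr_config_assigned=True))
    print(number_of_images_extracted(25))
```

For example, with the default value of 1, a monitor that has neither an enabled Form Recognition rule nor an assigned OCR configuration extracts no images in this model.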

Consult the following table for troubleshooting tips when using image extraction for OCR and
FR.

Table 30-2 Troubleshooting image extraction for OCR and FR

Issue/Log info: No images are extracted even though you have a Form Recognition rule present or OCR configuration assigned to the monitor.
Solution: Check if the ContentExtraction.ImageExtractorEnabled setting is equal to 0. Change it to 1 or 2. Make sure that a policy with the Form Recognition rule is not suspended.

Issue/Log info: Log messages in ContentExtractionHost_FileReader.log
Solution:
■ INFO | cehost |Verity [10320] | [9580] | Extract Images Enabled | src\VerityImplinternal.c (246)
■ INFO | cehost |Verity [3544] | [2064] | Update Plugin Configuration: Extract Images Enabled | src\VerityImplinternal.c (969)
■ INFO | cehost |OfficeOpenXMLPlugin [3544] |[2064] | Updated Plugin Configuration: Extract Images - True | OfficeOpenXMLExtractor.cs (104)

Issue/Log info: Only 10 images are extracted out of many more images that are present in your document.
Solution: Change the value of ContentExtraction.MaxNumImages_to_Extract = 10 in the Advanced settings to a greater value.

Issue/Log info: Log settings
Solution:
■ Set com.vontu.detection.level = FINEST in FileReaderlogging.properties.
■ Set "cehost" category to TRACE in log4cxx_config_filereader.xml
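
When checking the log for the messages listed in Table 30-2, a short script can filter the relevant lines. The following Python sketch assumes that the marker strings appear exactly as in the sample messages above; the function name is hypothetical, and the actual log layout may differ between versions.

```python
from typing import List

# Marker substrings copied from the sample log messages in Table 30-2.
IMAGE_EXTRACTION_MARKERS = ("Extract Images Enabled", "Extract Images - True")


def image_extraction_log_lines(log_text: str) -> List[str]:
    """Return the log lines that indicate image extraction is enabled."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(marker in line for marker in IMAGE_EXTRACTION_MARKERS)
    ]


if __name__ == "__main__":
    sample = (
        "INFO | cehost |Verity [10320] | [9580] | Extract Images Enabled | src\\VerityImplinternal.c (246)\n"
        "INFO | cehost |Verity [10320] | [9580] | Some unrelated message\n"
    )
    for line in image_extraction_log_lines(sample):
        print(line)
```

If the function returns no lines for a log captured while documents were processed, image extraction is probably not enabled on that server.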

Chapter 31
Detecting content using data identifiers
This chapter includes the following topics:

■ Introducing data identifiers

■ Configuring data identifier policy conditions

■ Modifying system data identifiers

■ Creating custom data identifiers

■ Best practices for using data identifiers

Introducing data identifiers


Symantec Data Loss Prevention provides data identifiers to detect specific instances of
described content. Data identifiers let you quickly implement precise, short-form data matching
with minimal effort.
Data identifiers are algorithms that combine pattern matching with data validators to detect
content. Patterns are similar to regular expressions but more efficient because they are tuned
to match the data precisely. Validators are accuracy checks that focus the scope of detection
and ensure compliance.
For example, the "Credit Card Number" system data identifier detects numbers that match a
specific pattern. The matched pattern is then validated by a "Luhn check," a checksum
algorithm. For a 16-digit number, the check digit computed from the first 15 digits must
equal the 16th digit.
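To make the validation step concrete, the following Python sketch implements the standard Luhn algorithm. It illustrates the general technique only and is not the product's internal implementation; the function names are chosen for this example.

```python
def luhn_check_digit(payload: str) -> int:
    """Compute the Luhn check digit for a string of digits (the number without its last digit)."""
    total = 0
    # Walk the payload from right to left, doubling every second digit,
    # starting with the rightmost payload digit.
    for index, char in enumerate(reversed(payload)):
        digit = int(char)
        if index % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return (10 - total % 10) % 10


def passes_luhn(number: str) -> bool:
    """Return True if the last digit equals the check digit of the preceding digits."""
    digits = [c for c in number if c.isdigit()]  # ignore spaces and dashes
    if len(digits) < 2:
        return False
    payload = "".join(digits[:-1])
    return luhn_check_digit(payload) == int(digits[-1])


if __name__ == "__main__":
    # 4111 1111 1111 1111 is a widely used test number, not a real account.
    print(passes_luhn("4111 1111 1111 1111"))
```

A pattern match alone would accept any 16-digit string; adding the check digit comparison rejects most random or mistyped digit sequences, which is the kind of accuracy check a validator provides.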
Symantec Data Loss Prevention provides pre-configured data identifiers that you can use to
detect commonly used sensitive data, such as credit card, social security, and driver's license
numbers. Most data identifiers come in three breadths—wide, medium, and narrow—so you
can fine-tune your detection results. Data identifiers offer broad support for detecting
international content.
If a system-defined data identifier does not meet your needs, you can modify it. You can also
define your own custom data identifiers to detect any content that you can describe.
See “System-defined data identifiers” on page 718.
See “Selecting a data identifier breadth” on page 739.

System-defined data identifiers


Symantec Data Loss Prevention provides several system-defined data identifiers to help you
detect and validate pattern-based sensitive data.

Table 31-1 System data identifiers

Category Description

Personal Identity Detect various types of identification numbers for the regions of Africa, Asia Pacific, Europe,
North America, and South America.

See “Personal identity data identifiers” on page 718.

Financial Detect financial identification numbers, such as credit card numbers and ABA routing numbers.

See “Financial data identifiers” on page 729.

Healthcare Detect U.S. and international drug codes, and other healthcare-related pattern-based sensitive
data.

See “Healthcare data identifiers” on page 729.

Information Technology Detect IP addresses.

See “Information technology data identifiers” on page 730.

International keywords International keywords for PII data identifiers.

See “International keywords for PII data identifiers” on page 730.

Personal identity data identifiers


Symantec Data Loss Prevention provides various data identifiers for detecting personally
identifiable information (PII) for the regions of Africa, Asia Pacific, Europe, North America, and
South America.
Table 31-2 lists system-defined data identifiers for the Middle East and Africa region.

Table 31-2 African personal identity

Data identifier Description

South African Personal Identification Number See “South African Personal Identification Number”
on page 1469.

Table 31-3 lists system-defined data identifiers for the Asia Pacific region.

Table 31-3 Asia Pacific personal identity

Data identifier Description

Australia Driver's License Number See “Australia Driver's License Number” on page 1018.

Australian Business Number See “Australian Business Number wide breadth” on page 1020.

Australian Company Number See “Australian Company Number” on page 1022.

Australian Passport Number See “Australian Passport Number” on page 1027.

Australian Tax File Number See “Australian Tax File Number” on page 1029.

China Passport Number See “China Passport Number” on page 1079.

Hong Kong ID See “Hong Kong ID” on page 1215.

India RuPay Card Number See “India RuPay Card Number” on page 1252.

Indian Aadhaar Card Number See “Indian Aadhaar Card Number” on page 1249.

Indian Permanent Account Number See “Indian Permanent Account Number” on page 1251.

Indonesian Identity Card Number See “Indonesian Identity Card Number” on page 1255.

Israel Personal Identification Number See “Israel Personal Identification Number” on page 1276.

Japan Driver's License Number See “Japan Driver's License Number” on page 1285.

Japan Passport Number See “Japan Passport Number” on page 1287.

Japanese Juki-Net Identification Number See “Japanese Juki-Net Identification Number” on page 1289.

Japanese My Number - Corporate See “Japanese My Number - Corporate” on page 1291.

Japanese My Number - Personal See “Japanese My Number - Personal” on page 1292.

Kazakhstan Passport Number See “Kazakhstan Passport Number” on page 1295.

Korea Passport Number See “Korea Passport Number” on page 1296.


Korean Residence Registration Number for Foreigners See “Korea Residence Registration Number for Foreigners”
on page 1298.

Korean Residence Registration Number for Korean See “Korea Residence Registration Number for Korean”
on page 1300.

Macau Individual Identification Number See “Macau National Identification Number” on page 1331.

Malaysia Passport Number See “Malaysia Passport Number” on page 1333.

Malaysian MyKad Number See “Malaysian MyKad Number (MyKad)” on page 1335.

New Zealand Driver's License Number See “New Zealand Driver's Licence Number” on page 1370.

New Zealand National Health Index Number See “New Zealand National Health Index Number”
on page 1371.

New Zealand Passport Number See “New Zealand Passport Number” on page 1373.

People's Republic of China ID See “People's Republic of China ID” on page 1384.

Singapore NRIC See “Singapore NRIC data identifier” on page 1451.

Sri Lanka National Identity Number See “Sri Lanka National Identity Number” on page 1490.

Taiwan ID See “Taiwan ROC ID” on page 1515.

Thailand Passport Number See “Thailand Passport Number” on page 1517.

Thailand Personal Identification Number See “Thailand Personal Identification Number” on page 1519.

United Arab Emirates Personal Number See “United Arab Emirates Personal Number” on page 1544.

Table 31-4 lists system-defined data identifiers for the European region.

Table 31-4 European personal identity

Data identifier Description

Austria Passport Number See “Austria Passport Number” on page 1030.

Austria Tax Identification Number See “Austria Tax Identification Number” on page 1031.

Austria Value Added Tax (VAT) Number See “Austria Value Added Tax (VAT) Number” on page 1033.

Austrian Social Security Number See “Austrian Social Security Number” on page 1036.

Belgian National Number See “Belgian National Number” on page 1039.


Belgium Driver's License Number See “Belgium Driver's Licence Number” on page 1042.

Belgium Passport Number See “Belgium Passport Number” on page 1044.

Belgium Tax Identification Number See “Belgium Tax Identification Number” on page 1045.

Belgium Value Added Tax (VAT) Number See “Belgium Value Added Tax (VAT) Number”
on page 1047.

Bulgaria Value Added Tax (VAT) Number See “Bulgaria Value Added Tax (VAT) Number”
on page 1060.

Bulgarian Uniform Civil Number - EGN See “Bulgarian Uniform Civil Number - EGN” on page 1063.

Burgerservicenummer See “Burgerservicenummer” on page 1066.

Codice Fiscale See “Codice Fiscale” on page 1081.

Croatia National Identification Number See “Croatia National Identification Number” on page 1104.

Cyprus Tax Identification Number See “Cyprus Tax Identification Number” on page 1109.

Cyprus Value Added Tax (VAT) Number See “Cyprus Value Added Tax (VAT) Number” on page 1111.

Czech Republic Driver's License Number See “Czech Republic Driver's Licence Number”
on page 1112.

Czech Republic Personal Identification Number See “Czech Republic Personal Identification Number”
on page 1114.

Czech Republic Tax Identification Number See “Czech Republic Tax Identification Number”
on page 1117.

Czech Republic Value Added Tax (VAT) Number See “Czech Republic Value Added Tax (VAT) Number”
on page 1121.

Denmark Personal Identification Number See “Denmark Personal Identification Number” on page 1126.

Denmark Tax Identification Number See “Denmark Tax Identification Number” on page 1128.

Denmark Value Added Tax (VAT) Number See “Denmark Value Added Tax (VAT) Number”
on page 1130.

Estonia Driver's Licence Number See “Estonia Driver's Licence Number” on page 1147.

Estonia Personal Identification Code See “Estonia Personal Identification Code” on page 1151.

Estonia Passport Number See “Estonia Passport Number” on page 1149.


Estonia Value Added Tax (VAT) Number See “Estonia Value Added Tax (VAT) Number” on page 1153.

European Health Insurance Card Number See “European Health Insurance Card Number”
on page 1156.

Finland Driver's Licence Number See “Finland Driver's Licence Number” on page 1165.

Finland European Health Insurance Number See “Finland European Health Insurance Number”
on page 1167.

Finland Passport Number See “Finland Passport Number” on page 1169.

Finland Tax Identification Number See “Finland Tax Identification Number” on page 1171.

Finland Value Added Tax (VAT) Number See “Finland Value Added Tax (VAT) Number” on page 1173.

Finnish Personal Identification Number See “Finnish Personal Identification Number” on page 1175.

France Driver's License Number See “France Driver's License Number” on page 1177.

France Health Insurance Number See “France Health Insurance Number” on page 1179.

France Tax Identification Number See “France Tax Identification Number” on page 1181.

France Value Added Tax (VAT) Number See “France Value Added Tax (VAT) Number” on page 1182.

French INSEE Code See “French INSEE Code” on page 1185.

French Passport Number See “French Passport Number” on page 1187.

French Social Security Number See “French Social Security Number” on page 1188.

German Passport Number See “German Passport Number” on page 1190.

German Personal ID Number See “German Personal ID Number” on page 1192.

Germany Driver's License Number See “Germany Driver's License Number” on page 1194.

Germany Tax Identification Number See “Germany Tax Identification Number” on page 1198.

Germany Value Added Tax (VAT) Number See “Germany Value Added Tax (VAT) Number”
on page 1196.

Greece Passport Number See “Greece Passport Number” on page 1200.

Greece Social Security Number (AMKA) See “Greece Social Security Number (AMKA)” on page 1202.

Greece Value Added Tax (VAT) Number See “Greece Value Added Tax (VAT) Number” on page 1206.

Greek Tax Identification Number See “Greek Tax Identification Number” on page 1204.

Hungarian Social Security Number See “Hungarian Social Security Number” on page 1221.

Hungarian Tax Identification Number See “Hungarian Tax Identification Number” on page 1223.

Hungarian VAT Number See “Hungarian VAT Number” on page 1225.

Hungary Driver's Licence Number See “Hungary Driver's Licence Number” on page 1217.

Hungary Passport Number See “Hungary Passport Number” on page 1219.

Iceland National Identification Number See “Iceland National Identification Number” on page 1241.

Iceland Passport Number See “Iceland Passport Number” on page 1245.

Iceland Value Added Tax (VAT) Number See “Iceland Value Added Tax (VAT) Number” on page 1247.

Ireland Passport Number See “Ireland Passport Number” on page 1266.

Ireland Tax Identification Number See “Ireland Tax Identification Number” on page 1268.

Ireland Value Added Tax (VAT) Number See “Ireland Value Added Tax (VAT) Number” on page 1271.

Irish Personal Public Service Number See “Irish Personal Public Service Number” on page 1274.

Italy Driver's License Number See “Italy Driver's Licence Number” on page 1278.

Italy Health Insurance Number See “Italy Health Insurance Number” on page 1280.

Italy Passport Number See “Italy Passport Number” on page 1282.

Italy Value Added Tax (VAT) Number See “Italy Value Added Tax (VAT) Number” on page 1283.

Latvia Driver's Licence Number See “Latvia Driver's Licence Number” on page 1303.

Latvia Passport Number See “Latvia Passport Number” on page 1305.

Latvia Personal Identification Number See “Latvia Personal Identification Number” on page 1306.

Latvia Value Added Tax (VAT) Number See “Latvia Value Added Tax (VAT) Number” on page 1308.

Liechtenstein Passport Number See “Liechtenstein Passport Number” on page 1311.

Lithuania Personal Identification Number See “Lithuania Personal Identification Number” on page 1312.

Lithuania Tax Identification Number See “Lithuania Tax Identification Number” on page 1315.

Lithuania Value Added Tax Number See “Lithuania Value Added Tax (VAT) Number”
on page 1317.

Luxembourg National Register of Individuals Number See “Luxembourg National Register of Individuals Number”
on page 1320.

Luxembourg Passport Number See “Luxembourg Passport Number” on page 1322.

Luxembourg Tax Identification Number See “Luxembourg Tax Identification Number” on page 1324.

Luxembourg Value Added Tax (VAT) Number See “Luxembourg Value Added Tax (VAT) Number”
on page 1327.

Malta National Identification Number See “Malta National Identification Number” on page 1337.

Malta Tax Identification Number See “Malta Tax Identification Number” on page 1339.

Malta Value Added Tax (VAT) Number See “Malta Value Added Tax (VAT) Number” on page 1342.

Netherlands Bank Account Number See “Netherlands Bank Account Number” on page 1359.

Netherlands Driver's License Number See “Netherlands Driver's License Number” on page 1362.

Netherlands Passport Number See “Netherlands Passport Number” on page 1363.

Netherlands Tax Identification Number See “Netherlands Tax Identification Number” on page 1364.

Netherlands Value Added Tax (VAT) Number See “Netherlands Value Added Tax (VAT) Number”
on page 1367.

Norway Driver's Licence Number See “Norway Driver's Licence Number” on page 1375.

Norway National Identification Number See “Norway National Identification Number” on page 1377.

Norway Value Added Tax Number See “Norway Value Added Tax Number” on page 1379.

Norwegian Birth Number See “Norwegian Birth Number” on page 1382.

Poland Driver's Licence Number See “Poland Driver's Licence Number” on page 1386.

Poland European Health Insurance Number See “Poland European Health Insurance Number”
on page 1387.

Poland Passport Number See “Poland Passport Number” on page 1389.

Poland Value Added Tax (VAT) Number See “Poland Value Added Tax (VAT) Number” on page 1391.

Polish Identification Number See “Polish Identification Number” on page 1394.


Polish REGON Number See “Polish REGON Number” on page 1396.

Polish Social Security Number (PESEL) See “Polish Social Security Number (PESEL)” on page 1398.

Polish Tax Identification Number (NIP) See “Polish Tax Identification Number” on page 1400.

Portugal Driver's Licence Number See “Portugal Driver's Licence Number” on page 1402.

Portugal National Identification Number See “Portugal National Identification Number” on page 1404.

Portugal Passport Number See “Portugal Passport Number” on page 1407.

Portugal Tax Identification Number See “Portugal Tax Identification Number” on page 1408.

Portugal Value Added Tax (VAT) Number See “Portugal Value Added Tax (VAT) Number”
on page 1411.

Romania Driver's Licence Number See “Romania Driver's Licence Number” on page 1416.

Romania National Identification Number See “Romania National Identification Number” on page 1419.

Romania Value Added Tax (VAT) Number See “Romania Value Added Tax (VAT) Number”
on page 1420.

Romanian Numerical Personal Code (CNP) See “Romanian Numerical Personal Code” on page 1425.

Russian Passport Identification Number See “Russian Passport Identification Number” on page 1427.

Russian Taxpayer Identification Number See “Russian Taxpayer Identification Number” on page 1428.

SEPA Creditor Identifier Number North See “SEPA Creditor Identifier Number North” on page 1430.

SEPA Creditor Identifier Number South See “SEPA Creditor Identifier Number South” on page 1437.

SEPA Creditor Identifier Number West See “SEPA Creditor Identifier Number West” on page 1441.

Serbia Unique Master Citizen Number See “Serbia Unique Master Citizen Number” on page 1445.

Serbia Value Added Tax (VAT) Number See “Serbia Value Added Tax (VAT) Number” on page 1448.

Slovakia Driver's Licence Number See “Slovakia Driver's Licence Number” on page 1451.

Slovakia National Identification Number See “Slovakia National Identification Number” on page 1453.

Slovakia Passport Number See “Slovakia Passport Number” on page 1457.

Slovakia Value Added Tax (VAT) Number See “Slovakia Value Added Tax (VAT) Number”
on page 1459.

Slovenia Passport Number See “Slovenia Passport Number” on page 1461.

Slovenia Tax Identification Number See “Slovenia Tax Identification Number” on page 1463.

Slovenia Unique Master Citizen Number See “Slovenia Unique Master Citizen Number” on page 1465.

Slovenia Value Added Tax (VAT) Number See “Slovenia Value Added Tax (VAT) Number”
on page 1467.

Spain Driver's License Number See “Spain Driver's Licence Number” on page 1477.

Spain Value Added Tax (VAT) Number See “Spain Value Added Tax (VAT) Number” on page 1474.

Spanish Customer Account Number See “Spanish Customer Account Number” on page 1479.

Spanish DNI Identification Number See “Spanish DNI ID” on page 1481.

Spanish Passport Number See “Spanish Passport Number” on page 1483.

Spanish Social Security Number See “Spanish Social Security Number” on page 1485.

Spanish Tax Identification (CIF) See “Spanish Tax Identification (CIF)” on page 1487.

Sweden Driver's Licence Number See “Sweden Driver's Licence Number” on page 1492.

Sweden Personal Identification Number See “Sweden Personal Identification Number” on page 1501.

Sweden Tax Identification Number See “Sweden Tax Identification Number” on page 1494.

Sweden Value Added Tax (VAT) Number See “Sweden Value Added Tax (VAT) Number”
on page 1496.

Swedish Passport Number See “Swedish Passport Number” on page 1499.

Swiss AHV Number See “Swiss AHV Number” on page 1505.

Swiss Social Security Number (AHV) See “Swiss Social Security Number (AHV)” on page 1507.

Switzerland Health Insurance Card Number See “Switzerland Health Insurance Card Number”
on page 1509.

Switzerland Passport Number See “Switzerland Passport Number” on page 1511.

Switzerland Value Added Tax (VAT) Number See “Switzerland Value Added Tax (VAT) Number”
on page 1513.

Turkish Identification Number See “Turkish Identification Number” on page 1521.

UK Bank Account Number Sort Code See “UK Bank Account Number Sort Code” on page 1523.

UK Driver's Licence Number See “UK Drivers Licence Number” on page 1525.

UK Electoral Roll Number See “UK Electoral Roll Number” on page 1527.

UK Passport Number See “UK Passport Number” on page 1532.

UK National Health Service (NHS) Number See “UK National Health Service (NHS) Number”
on page 1528.

UK National Insurance Number See “UK National Insurance Number” on page 1530.

UK Tax ID Number See “UK Tax ID Number” on page 1534.

UK Value Added Tax (VAT) Number See “UK Value Added Tax (VAT) Number” on page 1536.

Ukraine Identity Card See “Ukraine Identity Card” on page 1539.

Ukraine Passport (Domestic) See “Ukraine Passport (Domestic)” on page 1541.

Ukraine Passport (International) See “Ukraine Passport (International)” on page 1543.

Table 31-5 lists system-defined data identifiers for the North American region.

Table 31-5 North American personal identity

Data identifier Description

Canada Driver's License Number See “Canada Driver's License Number” on page 1067.

Canada Passport Number See “Canada Passport Number” on page 1070.

Canada Permanent Residence (PR) Number See “Canada Permanent Residence (PR) Number”
on page 1072.

Canadian Social Insurance Number See “Canadian Social Insurance Number” on page 1074.

Driver's License Number – CA State See “Driver's License Number – CA State” on page 1133.

Driver's License Number – FL, MI, MN States See “Driver's License Number - FL, MI, MN States”
on page 1134.

Driver's License Number – IL State See “Driver's License Number - IL State” on page 1136.

Driver's License Number – NJ State See “Driver's License Number - NJ State” on page 1138.

Driver's License Number – NY State See “Driver's License Number - NY State” on page 1139.

Driver's License Number - WA State See “Driver's License Number - WA State” on page 1140.

Driver's License Number - WI State See “Driver's License Number - WI State” on page 1142.

Mexican Personal Registration and Identification Number See “Mexican Personal Registration and Identification Number” on page 1346.

Mexican Tax Identification Number See “Mexican Tax Identification Number” on page 1349.

Mexican Unique Population Registry Code (CURP) See “Mexican Unique Population Registry Code”
on page 1351.

Mexico CLABE Number See “Mexico CLABE Number” on page 1353.

Randomized US Social Security Number (SSN) See “Randomized US Social Security Number (SSN)”
on page 1414.

US Individual Tax ID Number (ITIN) See “US Individual Tax Identification Number (ITIN)”
on page 1546.

US Passport Number See “US Passport Number” on page 1548.

US Social Security Number (SSN) See “US Social Security Number (SSN)” on page 1550.
Note: This data identifier is replaced by the Randomized
US SSN data identifier.

US ZIP+4 Postal Codes See “US ZIP+4 Postal Codes” on page 1553.

Table 31-6 lists system-defined data identifiers for the South American region.

Table 31-6 South American personal identity

Data identifier Description

Argentina Tax Identification Number See “Argentina Tax Identification Number” on page 1015.

Brazilian Election Identification Number See “Brazilian Election Identification Number” on page 1049.

Brazilian National Registry of Legal Entities Number See “Brazilian National Registry of Legal Entities Number”
on page 1053.

Brazilian Natural Person Registry Number See “Brazilian Natural Person Registry Number (CPF)”
on page 1055.

Chilean National Identification Number See “Chilean National Identification Number” on page 1077.

Colombian Addresses See “Colombian Addresses” on page 1082.


Colombian Cell Phone Number See “Colombian Cell Phone Number” on page 1085.

Colombian Personal Identification Number See “Colombian Personal Identification Number” on page 1088.

Colombian Tax Identification Number See “Colombian Tax Identification Number” on page 1090.

Venezuela National Identification Number See “Venezuela National Identification Number” on page 1555.

Financial data identifiers


Table 31-7 lists system-defined data identifiers for detecting financial identification numbers,
such as credit card numbers and ABA routing numbers.

Table 31-7 Financial data identifiers

Data identifier Description

ABA Routing Number See “ABA Routing Number” on page 1013.

Credit Card Number See “Credit Card Number” on page 1095.

Credit Card Magnetic Stripe Data See “Credit Card Magnetic Stripe Data” on page 1092.

CUSIP Number See “CUSIP Number” on page 1106.

IBAN Central See “IBAN Central” on page 1227.

IBAN East See “IBAN East” on page 1231.

IBAN West See “IBAN West” on page 1237.

International Securities Identification Number See “International Securities Identification Number” on page 1259.

SWIFT Code See “SWIFT Code” on page 1503.

Healthcare data identifiers


Table 31-8 lists system-defined data identifiers for detecting U.S. and international drug codes,
and healthcare provider and consumer information.

Table 31-8 Healthcare

Data identifier Description

Australian Medicare Number See “Australian Medicare Number” on page 1024.

British Columbia Personal Healthcare Number See “British Columbia Personal Healthcare Number”
on page 1058.

Drug Enforcement Agency (DEA) Number See “Drug Enforcement Agency (DEA) Number”
on page 1145.

Healthcare Common Procedure Coding System (HCPCS CPT Code) See “Healthcare Common Procedure Coding System (HCPCS CPT Code)” on page 1208.

Health Insurance Claim Number See “Health Insurance Claim Number” on page 1212.

Medicare Beneficiary Identifier See “Medicare Beneficiary Identifier” on page 1344.

National Drug Code See “National Drug Code (NDC)” on page 1355.

National Provider Identifier Number See “National Provider Identifier Number” on page 1357.

Information technology data identifiers


Table 31-9 lists system-defined data identifiers for detecting information
technology related patterns, such as IPv4 and IPv6 addresses, and mobile device identification
numbers.

Table 31-9 Information technology

Data identifier Description

International Mobile Equipment Identity Number See “International Mobile Equipment Identity Number” on page 1257.

IP Address See “IP Address” on page 1261.

IPv6 Address See “IPv6 Address” on page 1263.

International keywords for PII data identifiers


Symantec Data Loss Prevention lets you modify system data identifiers and customize the
input keywords to detect a broad range of international content.
See “Extending and customizing data identifiers” on page 731.
See “Use custom keywords for system data identifiers” on page 869.

Extending and customizing data identifiers


You can customize data identifiers to suit your requirements. You can extend system-defined
data identifiers by modifying them. You can also create new data identifiers for custom data
matching.
The most common use case for modifying a system-defined data identifier is to edit the data
input for a validator that accepts data input. For example, if the data identifier implements the
"Find keywords" validator, you may want to add or remove values from the list of keywords.
Another use case may involve adding or removing validators to or from the data identifier, or
changing one or more of the patterns defined by the data identifier.
See “Cloning a system data identifier before modifying it” on page 777.
To create a custom data identifier, you implement one or more detection pattern(s), select one
or more data validators, provide the data input if the validator requires it, and choose a data
normalizer.
See “Custom data identifier configuration” on page 814.
Policy authors can reuse modified and custom data identifiers in one or more policies.

About data identifier configuration


You can configure three types of data identifiers:
■ Instance – defined at the policy level
See “Configuring data identifier policy conditions” on page 734.
■ Modified – configured at the system level
See “Modifying system data identifiers” on page 776.
■ Custom – created at the system level
See “Creating custom data identifiers” on page 811.
The type of data identifier you implement depends on your business requirements. For most
use cases, configuring a policy instance using a non-modified, system-defined data identifier
is sufficient to accurately detect data loss. Should you need to, you can extend a system-defined
data identifier by modifying it, or you can implement one or more custom data identifiers to
detect unique data.
Data identifier configuration done at the policy instance level is specific to that policy.
Modifications you make to data identifiers at the system level apply to all data identifiers derived
from the modified data identifier.

About data identifier breadths


System data identifiers are implemented by breadth. The breadth defines the scope of detection
for that data identifier. Each data identifier implements at least one breadth of detection. The
widest option available for the data identifier is likely to produce the most false positive matches;
the narrowest option produces the fewest. Generally, the validators, and often the patterns, differ
among breadths.
See “Using data identifier breadths” on page 738.
For example, the Driver's License Number – CA State data identifier provides wide and medium
breadths, with the medium breadth using a keyword validator.

Note: Not all system data identifiers provide each breadth of detection. Refer to the complete
list of data identifiers and breadths to determine what is available.
See “Selecting a data identifier breadth” on page 739.

About optional validators for data identifiers


Optional validators help you refine the scope of detection for a data identifier. When you
configure a data identifier instance, you can select among five optional validators.
See “Using optional validators” on page 762.
The type of characters accepted by each optional validator depends on the data identifier.
See “Acceptable characters for optional validators” on page 764.

Note: Optional validators only apply to the policy instance you are actively configuring; they
do not apply system-wide.

About data identifier patterns


Data identifiers implement patterns to match data. The data identifier pattern syntax is similar
to the regular expression language, but more limited. For example, the data identifier pattern
syntax does not support some regular expression features, including grouping, lookahead and
lookbehind expressions, and many special characters (notably the dot "." character). In addition,
the system only allows the use of ASCII characters for data identifier patterns.
See “Using the data identifier pattern language” on page 814.
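
The restrictions named above can be pictured as a simple pre-check on a candidate pattern. The following Python sketch is an illustration only: the function name find_unsupported_features is hypothetical, it is not part of Symantec Data Loss Prevention, and it looks only for the features this section names (grouping, lookahead, lookbehind, the dot character, and non-ASCII characters). It is a naive textual scan that ignores escaping. The pattern language specification remains the authoritative reference.

```python
import re
from typing import List


def find_unsupported_features(pattern: str) -> List[str]:
    """Return the restrictions named in this section that a pattern violates.

    Hypothetical helper for illustration; not a Symantec API and not a full
    implementation of the data identifier pattern language.
    """
    problems = []
    if not pattern.isascii():
        problems.append("non-ASCII character")
    if "(?=" in pattern or "(?!" in pattern:
        problems.append("lookahead")
    if "(?<=" in pattern or "(?<!" in pattern:
        problems.append("lookbehind")
    # An opening parenthesis that does not start a lookaround is a group.
    if re.search(r"\((?!\?<?[=!])", pattern):
        problems.append("grouping")
    if "." in pattern:
        problems.append("dot character")
    return problems


# This pattern uses none of the features listed above, so the check is empty.
print(find_unsupported_features(r"\d{3}-\d{2}-\d{4}"))
# This pattern uses a group and the dot character.
print(find_unsupported_features(r"(\d{3}).\d{4}"))
```

An empty result means only that the pattern avoids the features listed in this section; it does not confirm that the system accepts the pattern.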
When you edit a system data identifier, the system exposes the pattern for viewing and editing.
The system-defined data identifier patterns have been tuned and optimized for precise content
matching.
See “Selecting a data identifier breadth” on page 739.
In addition, you can create a custom data identifier in which case you are required to implement
at least one pattern. The best way to understand how to write patterns is to examine the
system-defined data identifier patterns.
See “Writing data identifier patterns to match data” on page 817.


The data identifier pattern language is a subset of the regular expression language.
See “Data identifier pattern language specification” on page 815.

About pattern validators


Pattern validators are validation checks applied to data matched by a data identifier pattern.
Validators help refine the scope of detection and reduce false positives. Many validators allow
for data input. For example, the Keyword validator lets you enter a list of keywords.
See “Using pattern validators” on page 818.
When you modify a data identifier, you can edit the input values for any validator that accepts
data.
See “Editing pattern validator input” on page 778.
When you modify a data identifier, you can add and remove pattern validators. When you
create custom data identifiers, you can configure one or more validators. The system also
provides you with the ability to author a custom script validator to define your own validation
check.
See “Selecting pattern validators” on page 829.

About data normalizers


A data normalizer reconciles the data detected by the data identifier pattern with the format
expected by the normalizer. You cannot modify the normalizer of a system-defined data
identifier. When you create a custom data identifier, you select a data normalizer.
See “Acceptable characters for optional validators” on page 764.
See “Selecting a data normalizer” on page 830.
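
As a rough illustration of what a normalizer does to matched data, the following Python sketch uses the four normalizer names that appear in the system data identifier breadth table (Digits, Digits and Letters, Lowercase, and Do nothing). The behavior attached to each name is an assumption inferred from the name alone, and the normalize function is hypothetical; the product's actual normalization rules may differ.

```python
import string

# Assumed character sets; inferred from the normalizer names, not from the product.
DIGITS = set(string.digits)
DIGITS_AND_LETTERS = set(string.digits + string.ascii_letters)


def normalize(matched: str, normalizer: str) -> str:
    """Hypothetical sketch of the four normalizers named in this guide."""
    if normalizer == "Digits":
        # Assumption: keep digits only, dropping separators such as "-".
        return "".join(ch for ch in matched if ch in DIGITS)
    if normalizer == "Digits and Letters":
        # Assumption: keep ASCII digits and letters only.
        return "".join(ch for ch in matched if ch in DIGITS_AND_LETTERS)
    if normalizer == "Lowercase":
        # Assumption: convert the matched text to lowercase.
        return matched.lower()
    if normalizer == "Do nothing":
        return matched
    raise ValueError(f"unknown normalizer: {normalizer}")


print(normalize("123-45-6789", "Digits"))
```

Under these assumptions, a validator that runs after the Digits normalizer would see the matched number without its separators.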

About cross-component matching


Data identifiers support component matching. This means that you can configure data identifiers
to match on one or more message components. However, if the data identifier implements a
validator (optional or required), such as Find keywords, the validated data and the matched
data must exist in the same component to trigger or except an incident.
See “Detection messages and message components” on page 391.
For example, consider a scenario where you implement the Randomized US Social Security
Number (SSN) data identifier. This data identifier detects various 9-digit patterns and uses
a keyword validator to narrow the scope of detection. (The keywords and phrases in the list are
"social security number, ssn, ss#".) If the detection engine receives a message with the number
pattern 123-45-6789 and the keyword "social security number", and both data items are
contained in the message attachment component, the detection engine reports a match.
However, if the attachment contains the number but the body contains the keyword,
the detection engine does not consider this to be a match.
See “Configuring the Content Matches data identifier condition” on page 737.
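
The same-component rule in the scenario above can be sketched in Python. This is a simplified model, not the detection engine: the regular expression is a stand-in for the data identifier's patterns, the keyword test is a naive substring check, and the function name and the dictionary representation of a message are hypothetical.

```python
import re
from typing import Dict, List

# Simplified stand-in for the data identifier's 9-digit patterns.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Keyword list quoted in the scenario above.
KEYWORDS = ["social security number", "ssn", "ss#"]


def components_with_validated_match(message: Dict[str, str]) -> List[str]:
    """Return the components that hold both the number and a keyword."""
    matching = []
    for name, text in message.items():
        has_number = SSN_PATTERN.search(text) is not None
        has_keyword = any(keyword in text.lower() for keyword in KEYWORDS)
        # The matched data and the validating keyword must share a component.
        if has_number and has_keyword:
            matching.append(name)
    return matching


same_component = {
    "body": "Please review.",
    "attachment": "Social Security Number: 123-45-6789",
}
split_components = {
    "body": "Her social security number is in the file.",
    "attachment": "123-45-6789",
}
print(components_with_validated_match(same_component))
print(components_with_validated_match(split_components))
```

In the first message the attachment holds both items, so it is returned. In the second message the keyword and the number sit in different components, so no component qualifies.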

About unique match counting


Data identifiers, keywords, and regular expressions support unique match counting. This
feature lets you count only those pattern matches that are unique.
Unique match counting is useful when you are only concerned with detecting the presence of
unique patterns and not with detecting every matched pattern. For example, you could use
unique match counting to trigger an incident if a document contains 10 or more unique social
security numbers. In this case, if a document contained 10 instances of the same social security
number, the policy would not trigger an incident.
See “Using unique match counting” on page 775.
See “Configuring unique match counting” on page 775.
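
The social security number example above can be sketched as follows. The regular expression is a simplified stand-in for the data identifier's patterns, and the function name and threshold parameter are hypothetical; the sketch only shows how counting distinct values differs from counting every occurrence.

```python
import re

# Simplified stand-in for a social security number pattern.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def triggers_incident(text: str, minimum_unique: int = 10) -> bool:
    """Return True when the text holds at least minimum_unique distinct matches."""
    unique_matches = set(SSN_PATTERN.findall(text))
    return len(unique_matches) >= minimum_unique


# Ten copies of one number count as a single unique match.
repeated = " ".join(["123-45-6789"] * 10)
# Ten different numbers count as ten unique matches.
distinct = " ".join(f"123-45-{n:04d}" for n in range(10))
print(triggers_incident(repeated))
print(triggers_incident(distinct))
```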

Configuring data identifier policy conditions


Table 31-10 lists and describes the configuration options for data identifier conditions.
See “Introducing data identifiers” on page 717.
See “Configuring the Content Matches data identifier condition” on page 737.

Table 31-10 Policy instance data identifier configuration

Selectable at the policy level
■ Breadth
You can implement any breadth the data identifier supports at the instance level.
■ Optional Validators
You can select one or more optional validators at the instance level.

Not configurable
■ Patterns
You cannot modify the match patterns at the instance level.
■ Mandatory Validators
You cannot modify, add, or remove required validators at the instance level.

Workflow for configuring data identifier policies


Table 31-11 describes the workflow for implementing system-defined data identifiers.

Table 31-11 Workflow for implementing data identifiers

Step Action Description

1 Decide the type of data identifier you want to implement. See “Introducing data identifiers” on page 717.

2 Decide the data identifier breadth. See “About data identifier breadths” on page 731.

3 Configure the data identifier. See “Configuring the Content Matches data identifier condition” on page 737.

4 Test and tune the data identifier policy. See “Best practices for using data identifiers” on page 833.

Managing and adding data identifiers


The Manage > Policies > Data Identifiers screen lists all data identifiers, both system-defined
and custom. From this screen you can manage and modify existing data identifiers and
add new ones.
See “Introducing data identifiers” on page 717.

Table 31-12 Manage data identifiers

Action Description

Edit a data identifier. Select the data identifier from the list to modify it.
See “Selecting a data identifier breadth” on page 739.
See “Extending and customizing data identifiers” on page 731.
See “Editing data identifiers” on page 736.

Define a custom data identifier. Click Add data identifier to create a custom data identifier.
See “Custom data identifier configuration” on page 814.
See “Workflow for creating custom data identifiers” on page 812.

Sort and view data identifiers. The list is sorted alphabetically by Name.
You can also sort by the Category.
A pencil icon to the left means that the data identifier is modified from its original state, or is custom.

Remove a data identifier. Click the X icon on the right side to delete a data identifier.
The system does not let you delete system data identifiers. You can only delete custom data identifiers.

Editing data identifiers


You can modify system-defined data identifiers, including the patterns, validators, and validator
input. Modifications are propagated to any policy that declares the data identifier. You cannot
rename a system data identifier. Consider manually creating a cloned copy before you modify
a system data identifier.
See “Extending and customizing data identifiers” on page 731.

Note: The system does not export data identifiers in a policy template. The system exports a
reference to the system data identifier. The target system where the policy template is imported
provides the actual data identifier. If you modify a system-defined data identifier, the
modifications do not export to the template.

Table 31-13 Workflow for editing data identifiers

Step Action Description

1 Clone the system data identifier you want to modify. Clone the system data identifier before you modify it.
See “Cloning a system data identifier before modifying it” on page 777.
See “Clone system-defined data identifiers before modifying to preserve original state” on page 835.

2 Edit the cloned data identifier. If you modify a system data identifier, click the plus sign to display the breadth and edit the data identifier.
See “Selecting a data identifier breadth” on page 739.

3 Edit one or more Patterns. You can modify any pattern that the data identifier provides.
See “Writing data identifier patterns to match data” on page 817.

4 Edit the data input for any validator that accepts input. See “Editing pattern validator input” on page 778.
See “List of pattern validators that accept input data” on page 778.

5 Optionally, add or remove Validators, as necessary. See “Selecting pattern validators” on page 829.

6 Save the data identifier. Click Save to save the modifications.
Once the data identifier is saved, the icon on the Data Identifiers screen indicates that it is modified from its original state, or is custom.
See “Managing and adding data identifiers” on page 735.
Note: Click Cancel to discard the changes without saving the data identifier.

7 Implement the data identifier in a policy rule or exception. See “Configuring the Content Matches data identifier condition” on page 737.

Configuring the Content Matches data identifier condition


You can configure the Content Matches data identifier condition in policy detection rules and
exceptions.
See “Introducing data identifiers” on page 717.

Table 31-14 Configuring the Content Matches data identifier condition

Step Action Description

1 Add a data identifier rule or exception to a policy, or configure an existing one. Select the Content Matches data identifier condition at the Add Detection Rule or Add Exception screen.
See “Adding a rule to a policy” on page 415.
See “Adding an exception to a policy” on page 424.

2 Choose a data identifier. Choose a data identifier from the list and click Next.
See “System-defined data identifiers” on page 718.

3 Select a Breadth of detection. Use the breadth option to narrow the scope of detection.
See “About data identifier breadths” on page 731.
Wide is the default setting and detects the broadest set of matches. Medium and narrow breadths, if available, check additional criteria and detect fewer matches.
See “Selecting a data identifier breadth” on page 739.

4 Select and configure one or more Optional Validators. Optional validators restrict the match criteria and reduce false positives.
See “About optional validators for data identifiers” on page 732.

5 Configure Match Counting. Select how you want to count matches:
■ Check for existence
Do not count multiple matches; report a match count of 1 for one or more matches.
■ Count all matches
Count each match; specify the minimum number of matches to report an incident.
See “Configuring match counting” on page 421.
■ Count all unique matches
This is the default setting.
See “About unique match counting” on page 734.
See “Configuring unique match counting” on page 775.

6 Configure the message components to Match On. Select one or more message components on which to match.
On the endpoint, the detection engine matches the entire message, not individual components.
See “Selecting components to match on” on page 423.
If the data identifier uses optional or required keyword validators, the keyword must be present in the same component as the matched data identifier content.
See “About cross-component matching” on page 733.

7 Configure additional conditions to Also Match. Optionally, you can add one or more additional conditions from those available in the Also Match condition list.
All conditions in a compound rule or exception must match to trigger or except an incident.
See “Configuring compound match conditions” on page 429.
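
The three Match Counting options in step 5 differ only in how a list of matches is reduced to a reported count. The following Python sketch models that difference; the function name is hypothetical, the mode strings are copied from the option labels above, and the code is an illustration rather than the detection engine's implementation.

```python
from typing import List


def reported_match_count(matches: List[str], mode: str) -> int:
    """Reduce a list of matched values to a count, per the selected mode."""
    if mode == "Check for existence":
        # Report 1 for one or more matches, regardless of how many there are.
        return 1 if matches else 0
    if mode == "Count all matches":
        return len(matches)
    if mode == "Count all unique matches":
        # Repeated values are counted once.
        return len(set(matches))
    raise ValueError(f"unknown match counting mode: {mode}")


matches = ["123-45-6789", "123-45-6789", "987-65-4321"]
for mode in ("Check for existence", "Count all matches", "Count all unique matches"):
    print(mode, reported_match_count(matches, mode))
```

For the sample list, which holds one repeated value and one distinct value, the three modes report 1, 3, and 2 respectively.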

Using data identifier breadths


Each system data identifier provides one or more breadths of detection. When you configure
a system data identifier instance, or when you modify a system data identifier, you select which
breadth to implement. Not all breadth options are available for each data identifier.
See “About data identifier breadths” on page 731.

Table 31-15 Available rule breadths for system data identifiers

Breadth Description

Wide The wide breadth defines one or more patterns to create the greatest number of matches.
In general this breadth produces a higher rate of false positives than the medium and narrow
breadths.

Medium The medium breadth may refine the detection pattern(s) and/or add one or more data validators
to limit the number of matches.

Narrow The narrow breadth offers the tightest patterns and strictest validation to provide the most accurate
positive matches. In general this option requires the presence of a keyword or other validating
restriction to trigger a match.
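
One way to picture the three breadths is as the same pattern with progressively more validation layered on top. The Python sketch below is a hypothetical model, not an actual system data identifier: the 9-digit pattern, the not_all_same_digit validator, and the keyword list are all invented for illustration, and real data identifiers may also change the pattern itself between breadths.

```python
import re
from typing import List

PATTERN = re.compile(r"\b\d{9}\b")   # hypothetical 9-digit identifier pattern
KEYWORDS = ["account number"]        # hypothetical keyword list


def not_all_same_digit(candidate: str) -> bool:
    """Hypothetical validator: reject values such as 111111111."""
    return len(set(candidate)) > 1


def detect(text: str, breadth: str) -> List[str]:
    """Return matches for the given breadth: wide, medium, or narrow."""
    if breadth not in ("wide", "medium", "narrow"):
        raise ValueError(f"unknown breadth: {breadth}")
    candidates = PATTERN.findall(text)
    if breadth == "wide":
        return candidates                      # pattern only
    validated = [c for c in candidates if not_all_same_digit(c)]
    if breadth == "medium":
        return validated                       # pattern plus a validator
    # Narrow: additionally require a keyword somewhere in the same text.
    has_keyword = any(keyword in text.lower() for keyword in KEYWORDS)
    return validated if has_keyword else []


sample = "ref 111111111 and 123456789"
for breadth in ("wide", "medium", "narrow"):
    print(breadth, detect(sample, breadth))
```

In this model each narrower breadth returns a subset of the previous one, which mirrors the falling number of matches, and of false positives, that the table describes.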

Selecting a data identifier breadth


Table 31-16 lists the breadths and the normalizer that each system data identifier implements.
You cannot change the normalizer that a system data identifier implements. Knowing the
normalizer is useful when you implement one or more optional validators.
See “Acceptable characters for optional validators” on page 764.

Table 31-16 System data identifier breadths and normalizers

Data identifier Breadth(s) Normalizer

ABA Routing Number Wide Digits

See “ABA Routing Number” on page 1013. Medium

Narrow

Argentina Tax Identification Number Wide Digits

See “Argentina Tax Identification Number” on page 1015. Medium

Narrow

Australia Driver's License Number Wide Digits and Letters

See “Australia Driver's License Number” on page 1018. Narrow

Australian Business Number Wide Digits

See “Australian Business Number wide breadth” on page 1020. Medium

Narrow

Australian Company Number Wide Digits

See “Australian Company Number” on page 1022. Medium

Narrow

Australian Medicare Number Wide Digits


See “Australian Medicare Number” on page 1024. Medium

Narrow

Australian Passport Number Wide Lowercase

See “Australian Passport Number” on page 1027. Narrow

Australian Tax File Number Wide Digits

See “Australian Tax File Number” on page 1029. Medium

Narrow

Austria Passport Number Wide Digits and Letters

See “Austria Passport Number” on page 1030. Narrow

Austria Tax Identification Number Wide Digits

See “Austria Tax Identification Number” on page 1031. Narrow

Austria Value Added Tax (VAT) Number Wide Digits and Letters

See “Austria Value Added Tax (VAT) Number” on page 1033. Medium

Narrow

Austrian Social Security Number Wide Digits

See “Austrian Social Security Number” on page 1036. Medium

Narrow

Belgian National Number Wide Digits

See “Belgian National Number” on page 1039. Medium

Narrow

Belgium Driver's License Number Wide Digits

See “Belgium Driver's Licence Number” on page 1042. Narrow

Belgium Passport Number Wide Digits and Letters

See “Belgium Passport Number” on page 1044. Narrow

Belgium Tax Identification Number Wide Digits

See “Belgium Tax Identification Number” on page 1045. Narrow


Belgium Value Added Tax (VAT) Number Wide Digits and Letters
See “Belgium Value Added Tax (VAT) Number” on page 1047. Medium

Narrow

Brazilian Election Identification Number Wide Digits

See “Brazilian Election Identification Number” on page 1049. Medium

Narrow

Brazilian National Registry of Legal Entities Number Wide Digits

See “Brazilian National Registry of Legal Entities Number” on page 1053. Medium

Narrow

Brazilian Natural Person Registry Number Wide Digits

See “Brazilian Natural Person Registry Number (CPF)” on page 1055. Medium

Narrow

British Columbia Personal Healthcare Number Wide Digits

See “British Columbia Personal Healthcare Number” on page 1058. Medium

Narrow

Bulgaria Value Added Tax (VAT) Number Wide Digits and Letters

See “Bulgaria Value Added Tax (VAT) Number” on page 1060. Medium

Narrow

Bulgarian Uniform Civil Number - EGN Wide Digits

See “Bulgarian Uniform Civil Number - EGN” on page 1063. Medium

Narrow

Burgerservicenummer Wide Digits

See “Burgerservicenummer” on page 1066. Narrow

Canada Driver's License Number Wide Digits and Letters

See “Canada Driver's License Number” on page 1067. Medium

Narrow

Canada Passport Number Wide Digits and Letters

See “Canada Passport Number” on page 1070. Narrow


Canada Permanent Residence (PR) Number Wide Digits and Letters

See “Canada Permanent Residence (PR) Number” on page 1072. Narrow

Canadian Social Insurance Number Wide Digits

See “Canadian Social Insurance Number” on page 1074. Medium

Narrow

Chilean National Identification Number Wide Digits and Letters

See “Chilean National Identification Number” on page 1077. Medium

Narrow

China Passport Number Wide Digits and Letters

See “China Passport Number” on page 1079. Narrow

Codice Fiscale Wide Digits and Letters

See “Codice Fiscale” on page 1081. Narrow

Colombian Addresses Wide Lowercase

See “Colombian Addresses” on page 1082. Narrow

Colombian Cell Phone Number Wide Digits

See “Colombian Cell Phone Number” on page 1085. Narrow

Colombian Personal Identification Number Wide Digits

See “Colombian Personal Identification Number” on page 1088. Narrow

Colombian Tax Identification Number Wide Digits

See “Colombian Tax Identification Number” on page 1090. Narrow

Credit Card Magnetic Stripe Data Medium Digits

See “Credit Card Magnetic Stripe Data” on page 1092.

Credit Card Number Wide Digits

See “Credit Card Number” on page 1095. Medium

Narrow

Croatia National Identification Number Wide Digits and Letters


See “Croatia National Identification Number” on page 1104. Medium

Narrow

CUSIP Number Wide Lowercase

See “CUSIP Number” on page 1106. Medium

Narrow

Cyprus Tax Identification Number Wide Digits and Letters

See “Cyprus Tax Identification Number” on page 1109. Medium

Narrow

Cyprus Value Added Tax (VAT) Number Wide Digits and Letters

See “Cyprus Value Added Tax (VAT) Number” on page 1111. Medium

Narrow

Czech Republic Driver's Licence Number Wide Digits and Letters

See “Czech Republic Driver's Licence Number” on page 1112. Narrow

Czech Republic Personal Identification Number Wide Digits

See “Czech Republic Personal Identification Number” on page 1114. Medium

Narrow

Czech Republic Tax Identification Number Wide Digits

See “Czech Republic Tax Identification Number” on page 1117. Medium

Narrow

Czech Republic Value Added Tax (VAT) Number Wide Digits and Letters

See “Czech Republic Value Added Tax (VAT) Number” on page 1121. Medium

Narrow

Denmark Personal Identification Number Wide Digits and Letters

See “Denmark Personal Identification Number” on page 1126. Medium

Narrow

Denmark Tax Identification Number Wide Digits


See “Denmark Tax Identification Number” on page 1128. Medium

Narrow

Denmark Value Added Tax (VAT) Number Wide Digits and Letters

See “Denmark Value Added Tax (VAT) Number” on page 1130. Medium

Narrow

Driver's License Number – CA State Wide Lowercase

See “Driver's License Number – CA State” on page 1133. Medium

Driver's License Number – FL, MI, MN States Wide Lowercase

See “Driver's License Number - FL, MI, MN States” on page 1134. Medium

Driver's License Number – IL State Wide Lowercase

See “Driver's License Number - IL State” on page 1136. Medium

Driver's License Number – NJ State Wide Lowercase

See “Driver's License Number - NJ State” on page 1138. Medium

Driver's License Number – NY State Wide Lowercase

See “Driver's License Number - NY State” on page 1139. Medium

Driver's License Number – WA State Wide Lowercase

See “Driver's License Number - WA State” on page 1140. Medium

Narrow

Driver's License Number – WI State Wide Digits and Letters

See “Driver's License Number - WI State” on page 1142. Medium

Narrow

Drug Enforcement Agency (DEA) Number Wide Lowercase

See “Drug Enforcement Agency (DEA) Number” on page 1145. Medium

Narrow

Estonia Driver's Licence Number Wide Digits and Letters

See “Estonia Driver's Licence Number” on page 1147. Narrow


Estonia Passport Number Wide Digits and Letters


See “Estonia Passport Number” on page 1149. Narrow

Estonia Personal Identification Code Wide Digits

See “Estonia Personal Identification Code” on page 1151. Medium

Narrow

Estonia Value Added Tax (VAT) Number Wide Digits and Letters

See “Estonia Value Added Tax (VAT) Number” on page 1153. Medium

Narrow

European Health Insurance Card Number Wide Digits

See “European Health Insurance Card Number” on page 1156. Narrow

Finland Driver's Licence Number Wide Digits and Letters

See “Finland Driver's Licence Number” on page 1165. Medium

Narrow

Finland European Health Insurance Number Wide Digits

See “Finland European Health Insurance Number” on page 1167. Narrow

Finland Passport Number Wide Digits and Letters

See “Finland Passport Number” on page 1169. Narrow

Finland Tax Identification Number Wide Do nothing

See “Finland Tax Identification Number” on page 1171. Medium

Narrow

Finland Value Added Tax (VAT) Number Wide Digits and Letters

See “Finland Value Added Tax (VAT) Number” on page 1173. Medium

Narrow

Finnish Personal Identification Number Wide Lowercase

See “Finnish Personal Identification Number” on page 1175. Medium

Narrow

France Driver's License Number Wide Digits

See “France Driver's License Number” on page 1177. Narrow


France Health Insurance Number Wide Digits


See “France Health Insurance Number” on page 1179. Narrow

France Tax Identification Number Wide Digits

See “France Tax Identification Number” on page 1181. Narrow

France Value Added Tax (VAT) Number Wide Digits and Letters

See “France Value Added Tax (VAT) Number” on page 1182. Medium

Narrow

French INSEE Code Wide Digits

See “French INSEE Code” on page 1185. Narrow

French Passport Number Wide Digits and Letters

See “French Passport Number” on page 1187. Narrow

French Social Security Number Wide Digits and Letters

See “French Social Security Number” on page 1188. Medium

Narrow

German Passport Number Wide Lowercase

See “German Passport Number” on page 1190. Medium

Narrow

German Personal ID Number Wide Lowercase

See “German Personal ID Number” on page 1192. Medium

Narrow

Germany Driver's License Number Wide Digits and Letters

See “Germany Driver's License Number” on page 1194. Narrow

Germany Tax Identification Number Wide Digits

See “Germany Tax Identification Number” on page 1198. Medium

Narrow

Germany Value Added Tax (VAT) Number Wide Digits and Letters

See “Germany Value Added Tax (VAT) Number” on page 1196. Medium

Narrow

Greece Passport Number Wide Digits and Letters


See “Greece Passport Number” on page 1200. Narrow

Greece Social Security Number (AMKA) Wide Digits

See “Greece Social Security Number (AMKA)” on page 1202. Medium

Narrow

Greece Value Added Tax (VAT) Number Wide Digits and Letters

See “Greece Value Added Tax (VAT) Number” on page 1206. Medium

Narrow

Greek Tax Identification Number Wide Digits

See “Greek Tax Identification Number” on page 1204. Medium

Narrow

Healthcare Common Procedure Coding System (HCPCS CPT Code) Medium Digits and Letters

See “Healthcare Common Procedure Coding System (HCPCS CPT Code)” on page 1208. Narrow

Health Insurance Claim Number Wide Digits and Letters

See “Health Insurance Claim Number” on page 1212. Medium

Narrow

Hong Kong ID Wide Lowercase

See “Hong Kong ID” on page 1215. Narrow

Hungarian Social Security Number Wide Digits

See “Hungarian Social Security Number” on page 1221. Medium

Narrow

Hungarian Tax Identification Number Wide Digits

See “Hungarian Tax Identification Number” on page 1223. Medium

Narrow

Hungarian VAT Number Wide Lowercase

See “Hungarian VAT Number” on page 1225. Medium

Narrow

Hungary Driver's Licence Number Wide Digits and Letters


See “Hungary Driver's Licence Number” on page 1217. Narrow

Hungary Passport Number Wide Digits and Letters

See “Hungary Passport Number” on page 1219. Medium

Narrow

IBAN Central Wide Do nothing

See “IBAN Central” on page 1227. Narrow

IBAN East Wide Do nothing

See “IBAN East” on page 1231. Narrow

IBAN West Wide Do nothing

See “IBAN West” on page 1237. Narrow

Iceland National Identification Number Wide Digits

See “Iceland National Identification Number” on page 1241. Medium

Narrow

Iceland Passport Number Wide Digits and Letters

See “Iceland Passport Number” on page 1245. Narrow

Iceland Value Added Tax (VAT) Number Wide Digits and Letters

See “Iceland Value Added Tax (VAT) Number” on page 1247. Narrow

India RuPay Card Number Wide Digits

See “India RuPay Card Number” on page 1252. Medium

Narrow

Indian Aadhaar Card Number Wide Digits

See “Indian Aadhaar Card Number” on page 1249. Medium

Narrow

Indian Permanent Account Number Wide Digits and Letters

See “Indian Permanent Account Number” on page 1251. Narrow


Indonesian Identity Card Number Wide Digits


See “Indonesian Identity Card Number” on page 1255. Medium

Narrow

International Mobile Equipment Identity Number Wide Digits

See “International Mobile Equipment Identity Number” on page 1257. Medium

Narrow

International Securities Identification Number Wide Lowercase

See “International Securities Identification Number” on page 1259. Medium

Narrow

IP Address Wide Do nothing

See “IP Address” on page 1261. Medium

Narrow

IPv6 Address Wide Do nothing

See “IPv6 Address” on page 1263. Medium

Narrow

Ireland Passport Number Wide Digits and Letters

See “Ireland Passport Number” on page 1266. Narrow

Ireland Tax Identification Number Wide Digits and Letters

See “Ireland Tax Identification Number” on page 1268. Medium

Narrow

Ireland Value Added Tax (VAT) Number Wide Digits and Letters

See “Ireland Value Added Tax (VAT) Number” on page 1271. Medium

Narrow

Irish Personal Public Service Number Wide Lowercase

See “Irish Personal Public Service Number” on page 1274. Medium

Narrow

Israel Personal Identification Number Wide Digits


See “Israel Personal Identification Number” on page 1276. Medium

Narrow

Italy Driver's Licence Number Wide Digits and Letters

See “Italy Driver's Licence Number” on page 1278. Narrow

Italy Health Insurance Number Wide Digits and Letters

See “Italy Health Insurance Number” on page 1280. Narrow

Italy Passport Number Wide Digits and Letters

See “Italy Passport Number” on page 1282. Narrow

Italy Value Added Tax (VAT) Number Wide Digits and Letters

See “Italy Value Added Tax (VAT) Number” on page 1283. Medium

Narrow

Japan Driver's License Number Wide Digits

See “Japan Driver's License Number” on page 1285. Medium

Narrow

Japan Passport Number Wide Digits and Letters

See “Japan Passport Number” on page 1287. Narrow

Japanese Juki-Net Identification Number Wide Digits

See “Japanese Juki-Net Identification Number” on page 1289. Medium

Narrow

Japanese My Number - Corporate Wide Digits

See “Japanese My Number - Corporate” on page 1291. Narrow

Japanese My Number - Personal Wide Digits

See “Japanese My Number - Personal” on page 1292. Medium

Narrow

Kazakhstan Passport Number Wide Digits and Letters

See “Kazakhstan Passport Number” on page 1295. Narrow



Korea Passport Number Wide Digits and Letters


See “Korea Passport Number” on page 1296. Narrow

Korea Residence Registration Number for Foreigners Wide Digits

See “Korea Residence Registration Number for Foreigners” on page 1298. Medium

Narrow

Korea Residence Registration Number for Korean Wide Digits

See “Korea Residence Registration Number for Korean” on page 1300. Medium

Narrow

Latvia Driver's Licence Number Wide Digits and Letters

See “Latvia Driver's Licence Number” on page 1303. Narrow

Latvia Passport Number Wide Digits and Letters

See “Latvia Passport Number” on page 1305. Narrow

Latvia Personal Identification Number Wide Digits

See “Latvia Personal Identification Number” on page 1306. Medium

Narrow

Latvia Value Added Tax (VAT) Number Wide Digits and Letters

See “Latvia Value Added Tax (VAT) Number” on page 1308. Medium

Narrow

Liechtenstein Passport Number Wide Digits and Letters

See “Liechtenstein Passport Number” on page 1311. Narrow

Lithuania Personal Identification Number Wide Digits

See “Lithuania Personal Identification Number” on page 1312. Medium

Narrow

Lithuania Tax Identification Number Wide Digits

See “Lithuania Tax Identification Number” on page 1315. Medium

Narrow

Lithuania Value Added Tax (VAT) Number Wide Digits and Letters


See “Lithuania Value Added Tax (VAT) Number” on page 1317. Medium

Narrow

Luxembourg National Register of Individuals Number Wide Digits

See “Luxembourg National Register of Individuals Number” on page 1320. Medium

Narrow

Luxembourg Passport Number Wide Digits and Letters

See “Luxembourg Passport Number” on page 1322. Narrow

Luxembourg Tax Identification Number Wide Digits

See “Luxembourg Tax Identification Number” on page 1324. Medium

Narrow

Luxembourg Value Added Tax (VAT) Number Wide Digits and Letters

See “Luxembourg Value Added Tax (VAT) Number” on page 1327. Medium

Narrow

Macau National Identification Number Wide Digits

See “Macau National Identification Number” on page 1331. Narrow

Malaysia Passport Number Wide Digits and Letters

See “Malaysia Passport Number” on page 1333. Narrow

Malaysian MyKad Number Wide Digits

See “Malaysian MyKad Number (MyKad)” on page 1335. Medium

Narrow

Malta National Identification Number Wide Digits and Letters

See “Malta National Identification Number” on page 1337. Narrow

Malta Tax Identification Number Wide Digits and Letters

See “Malta Tax Identification Number” on page 1339. Narrow

Malta Value Added Tax (VAT) Number Wide Digits and Letters

See “Malta Value Added Tax (VAT) Number” on page 1342. Medium

Narrow

Medicare Beneficiary Identifier Wide Digits and Letters


See “Medicare Beneficiary Identifier” on page 1344. Medium

Narrow

Mexican Personal Registration and Identification Number Wide Digits and Letters

See “Mexican Personal Registration and Identification Number” on page 1346. Medium

Narrow

Mexican Tax Identification Number Wide Digits and Letters

See “Mexican Tax Identification Number” on page 1349. Medium

Narrow

Mexican Unique Population Registry Code (CURP) Wide Lowercase

See “Mexican Unique Population Registry Code” on page 1351. Medium

Narrow

Mexico CLABE Number Wide Digits

See “Mexico CLABE Number” on page 1353. Medium

Narrow

National Drug Code Wide Do nothing

See “National Drug Code (NDC)” on page 1355. Medium

Narrow

National Provider Identifier Number Wide Digits

See “National Provider Identifier Number” on page 1357. Medium

Narrow

Netherlands Bank Account Number Wide Digits and Letters

See “Netherlands Bank Account Number” on page 1359. Medium

Narrow

Netherlands Driver's License Number Wide Digits

See “Netherlands Driver's License Number” on page 1362. Narrow

Netherlands Passport Number Wide Digits and Letters

See “Netherlands Passport Number” on page 1363. Narrow



Netherlands Tax Identification Number Wide Digits


See “Netherlands Tax Identification Number” on page 1364. Medium

Narrow

Netherlands Value Added Tax (VAT) Number Wide Digits and Letters

See “Netherlands Value Added Tax (VAT) Number” on page 1367. Medium

Narrow

New Zealand Driver's Licence Number Wide Digits and Letters

See “New Zealand Driver's Licence Number” on page 1370. Narrow

New Zealand National Health Index Number Wide Lowercase

See “New Zealand National Health Index Number” on page 1371. Medium

Narrow

New Zealand Passport Number Wide Digits and Letters

See “New Zealand Passport Number” on page 1373. Narrow

Norway Driver's Licence Number Wide Digits

See “Norway Driver's Licence Number” on page 1375. Narrow

Norway National Identification Number Wide Digits

See “Norway National Identification Number” on page 1377. Medium

Narrow

Norway Value Added Tax Number Wide Digits and Letters

See “Norway Value Added Tax Number” on page 1379. Medium

Narrow

Norwegian Birth Number Wide Digits

See “Norwegian Birth Number” on page 1382. Medium

Narrow

People's Republic of China ID Wide Lowercase

See “People's Republic of China ID” on page 1384. Narrow

Poland Driver's Licence Number Wide Digits

See “Poland Driver's Licence Number” on page 1386. Narrow



Poland European Health Insurance Number Wide Digits


See “Poland European Health Insurance Number” on page 1387. Narrow

Poland Passport Number Wide Digits and Letters

See “Poland Passport Number” on page 1389. Narrow

Poland Value Added Tax (VAT) Number Wide Digits and Letters

See “Poland Value Added Tax (VAT) Number” on page 1391. Medium

Narrow

Polish Identification Number Wide Digits and Letters

See “Polish Identification Number” on page 1394. Medium

Narrow

Polish REGON Number Wide Digits

See “Polish REGON Number” on page 1396. Medium

Narrow

Polish Social Security Number (PESEL) Wide Digits

See “Polish Social Security Number (PESEL)” on page 1398. Medium

Narrow

Polish Tax Identification Number Wide Digits

See “Polish Tax Identification Number” on page 1400. Medium

Narrow

Portugal Driver's Licence Number Wide Digits and Letters

See “Portugal Driver's Licence Number” on page 1402. Narrow

Portugal National Identification Number Wide Digits and Letters

See “Portugal National Identification Number” on page 1404. Medium

Narrow

Portugal Passport Number Wide Digits and Letters

See “Portugal Passport Number” on page 1407. Narrow



Portugal Tax Identification Number Wide Digits


See “Portugal Tax Identification Number” on page 1408. Medium

Narrow

Portugal Value Added Tax (VAT) Number Wide Digits and Letters

See “Portugal Value Added Tax (VAT) Number” on page 1411. Medium

Narrow

Randomized US Social Security Number (SSN) Medium Digits

See “Randomized US Social Security Number (SSN)” on page 1414. Narrow

Romania Driver's Licence Number Wide Lowercase

See “Romania Driver's Licence Number” on page 1416. Narrow

Romania National Identification Number Wide Digits

See “Romania National Identification Number” on page 1419. Medium

Narrow

Romania Value Added Tax (VAT) Number Wide Digits and Letters

See “Romania Value Added Tax (VAT) Number” on page 1420. Medium

Narrow

Romanian Numerical Personal Code Wide Digits

See “Romanian Numerical Personal Code” on page 1425. Medium

Narrow

Russian Passport Identification Number Wide Digits

See “Russian Passport Identification Number” on page 1427. Narrow

Russian Taxpayer Identification Number Wide Digits

See “Russian Taxpayer Identification Number” on page 1428. Medium

Narrow

SEPA Creditor Identifier Number North Wide Digits and Letters

See “SEPA Creditor Identifier Number North” on page 1430. Medium

Narrow

SEPA Creditor Identifier Number South Wide Digits and Letters


See “SEPA Creditor Identifier Number South” on page 1437. Medium

Narrow

SEPA Creditor Identifier Number West Wide Digits and Letters

See “SEPA Creditor Identifier Number West” on page 1441. Medium

Narrow

Serbia Unique Master Citizen Number Wide Digits

See “Serbia Unique Master Citizen Number” on page 1445. Medium

Narrow

Serbia Value Added Tax (VAT) Number Wide Digits and Letters

See “Serbia Value Added Tax (VAT) Number” on page 1448. Medium

Narrow

Singapore NRIC Wide Lowercase

See “Singapore NRIC data identifier” on page 1451.

Slovakia Driver's Licence Number Wide Digits and Letters

See “Slovakia Driver's Licence Number” on page 1451. Narrow

Slovakia National Identification Number Wide Digits and Letters

See “Slovakia National Identification Number” on page 1453. Medium

Narrow

Slovakia Passport Number Wide Digits and Letters

See “Slovakia Passport Number” on page 1457. Narrow

Slovakia Value Added Tax (VAT) Number Wide Digits and Letters

See “Slovakia Value Added Tax (VAT) Number” on page 1459. Medium

Narrow

Slovenia Passport Number Wide Digits and Letters

See “Slovenia Passport Number” on page 1461. Narrow



Slovenia Tax Identification Number Wide Digits


See “Slovenia Tax Identification Number” on page 1463. Medium

Narrow

Slovenia Unique Master Citizen Number Wide Digits

See “Slovenia Unique Master Citizen Number” on page 1465. Medium

Narrow

Slovenia Value Added Tax (VAT) Number Wide Digits and Letters

See “Slovenia Value Added Tax (VAT) Number” on page 1467. Medium

Narrow

South African Personal Identification Number Wide Digits

See “South African Personal Identification Number” on page 1469. Medium

Narrow

Spain Driver's Licence Number Wide Digits and Letters

See “Spain Driver's Licence Number” on page 1477. Narrow

Spain Value Added Tax (VAT) Number Wide Digits and Letters

See “Spain Value Added Tax (VAT) Number” on page 1474. Medium

Narrow

Spanish Customer Account Number Wide Digits

See “Spanish Customer Account Number” on page 1479. Medium

Narrow

Spanish DNI ID Wide Digits and Letters

See “Spanish DNI ID” on page 1481. Narrow

Spanish Social Security Number Wide Digits

See “Spanish Social Security Number” on page 1485. Medium

Narrow

Spanish Tax Identification (CIF) Wide Digits and Letters

See “Spanish Tax Identification (CIF)” on page 1487. Medium

Narrow

Sri Lanka National Identity Number Wide Digits and Letters


See “Sri Lanka National Identity Number” on page 1490. Medium

Narrow

Sweden Driver's Licence Number Wide Digits

See “Sweden Driver's Licence Number” on page 1492. Medium

Narrow

Sweden Tax Identification Number Wide Digits

See “Sweden Tax Identification Number” on page 1494. Medium

Narrow

Sweden Value Added Tax (VAT) Number Wide Digits and Letters

See “Sweden Value Added Tax (VAT) Number” on page 1496. Medium

Narrow

Swedish Passport Number Wide Digits and Letters

See “Swedish Passport Number” on page 1499. Narrow

Swedish Personal Identification Number Wide Digits

See “Sweden Personal Identification Number” on page 1501. Medium

Narrow

SWIFT Code Wide Swift

See “SWIFT Code” on page 1503. Narrow

Swiss AHV Number Wide Digits

See “Swiss AHV Number” on page 1505. Narrow

Swiss Social Security Number (AHV) Wide Digits

See “Swiss Social Security Number (AHV)” on page 1507. Medium

Narrow

Switzerland Health Insurance Card Number Wide Digits

See “Switzerland Health Insurance Card Number” on page 1509. Narrow

Switzerland Passport Number Wide Digits and Letters

See “Switzerland Passport Number” on page 1511. Narrow



Switzerland Value Added Tax (VAT) Number Wide Lowercase


See “Switzerland Value Added Tax (VAT) Number” on page 1513. Medium

Narrow

Taiwan ROC ID Wide Do nothing

See “Taiwan ROC ID” on page 1515. Narrow

Thailand Passport Number Wide Digits and Letters

See “Thailand Passport Number” on page 1517. Narrow

Thailand Personal Identification Number Wide Digits

See “Thailand Personal Identification Number” on page 1519. Medium

Narrow

Turkish Identification Number Wide Digits

See “Turkish Identification Number” on page 1521. Medium

Narrow

UK Bank Account Number Sort Code Wide Digits

See “UK Bank Account Number Sort Code” on page 1523. Medium

Narrow

UK Driver's Licence Number Wide Digits and Letters

See “UK Driver's Licence Number” on page 1525. Medium

Narrow

UK Electoral Roll Number Narrow Lowercase

See “UK Electoral Roll Number” on page 1527.

UK National Health Service (NHS) Number Medium Digits

See “UK National Health Service (NHS) Number” on page 1528. Narrow

UK National Insurance Number Wide Lowercase

See “UK National Insurance Number” on page 1530. Medium

Narrow

UK Passport Number Wide Do nothing


See “UK Passport Number” on page 1532. Medium

Narrow

UK Tax ID Number Wide Do nothing

See “UK Tax ID Number” on page 1534. Medium

Narrow

UK Value Added Tax (VAT) Number Wide Digits and Letters

See “UK Value Added Tax (VAT) Number” on page 1536. Medium

Narrow

Ukraine Identity Card Wide Digits

See “Ukraine Identity Card” on page 1539. Medium

Narrow

Ukraine Passport (Domestic) Wide Digits

See “Ukraine Passport (Domestic)” on page 1541. Narrow

Ukraine Passport (International) Wide Digits and Letters

See “Ukraine Passport (International)” on page 1543. Narrow

United Arab Emirates Personal Number Wide Digits

See “United Arab Emirates Personal Number” on page 1544. Medium

Narrow

US Individual Tax ID Number (ITIN) Wide Digits

See “US Individual Tax Identification Number (ITIN)” on page 1546. Medium

Narrow

US Passport Number Wide Digits

See “US Passport Number” on page 1548. Narrow

US Social Security Number (SSN) Wide Digits

See “US Social Security Number (SSN)” on page 1550. Medium

Narrow

US ZIP+4 Postal Codes Wide Digits and Letters


See “US ZIP+4 Postal Codes” on page 1553. Medium

Narrow

Venezuela National ID Number Wide Digits and Letters

See “Venezuela National Identification Number” on page 1555. Medium

Narrow
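
Table 31-16 names a normalizer for each data identifier but does not define what each one does. The following Python sketch shows one plausible reading of four of the names, purely as an illustration. The function name normalize and the behavior of every branch are assumptions, not the product's implementation. The Swift normalizer is left out because nothing in this section describes it.

```python
import re


def normalize(value, normalizer):
    """Illustrative only: one assumed interpretation of the normalizer names."""
    if normalizer == "Digits":
        # Assumption: keep only the ASCII digits of the candidate value.
        return re.sub(r"[^0-9]", "", value)
    if normalizer == "Digits and Letters":
        # Assumption: keep only ASCII digits and letters.
        return re.sub(r"[^0-9A-Za-z]", "", value)
    if normalizer == "Lowercase":
        # Assumption: fold letters to lowercase and keep everything else.
        return value.lower()
    if normalizer == "Do nothing":
        return value
    # "Swift" and any other name are deliberately not modeled here.
    raise NotImplementedError(normalizer)
```

Under these assumptions, a value such as 123-45-6789 would be compared as 123456789 by an identifier that uses the Digits normalizer.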

Using optional validators


Table 31-17 lists the optional validators policy authors can configure for system data identifiers.
See “About optional validators for data identifiers” on page 732.

Table 31-17 Available optional validators for policy instances

Optional validator Description

Require beginning characters Match the characters that begin (lead) the matched data item.
For example, for the Driver's License Number – CA State data identifier, you could require the
beginning character to be the letter "C". In this case, the engine matches the license number C6457291.

See “Acceptable characters for optional validators” on page 764.

Require ending characters Match the characters that end (trail) the matched data item.

See “Acceptable characters for optional validators” on page 764.

Exclude beginning characters Exclude from matching the characters that begin (lead) the matched data item.

See “Acceptable characters for optional validators” on page 764.

Exclude ending characters Exclude from matching the characters that end (trail) the matched data item.

See “Acceptable characters for optional validators” on page 764.

Find keywords Match one or more keywords or key phrases in addition to the matched data item. This
validator can also check the proximity of the matched data to a list of keywords.

Keyword matching can optionally be case-sensitive. A check is then performed for the
proximity of the matched data identifier patterns to the list of keywords. An incident is
generated when all of the data identifier patterns in the rule match. Captured keywords
are highlighted in incidents. Proximity, case sensitivity, and validator highlighting are
disabled by default and must be enabled to work.

The keyword must be detected in the same message component as the data identifier
content to report a match.

See “About cross-component matching” on page 733.

This optional validator accepts any characters (numbers, letters, others).

See “Acceptable characters for optional validators” on page 764.

See “List of pattern validators that accept input data” on page 778.

Exact Match Data Identifier Check Look up tokens around a pattern for an Exact Match Data Identifier
index and validate the pattern.

See “Adding an EMDI check to a built-in or custom data identifier condition in a policy”
on page 487.
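
The four beginning and ending character validators in Table 31-17 can be pictured as prefix and suffix tests on a matched data item. The following Python sketch is an illustration only. The function name passes_validators and its parameters are hypothetical, the comparison is assumed to be case-sensitive, and the detection engine may apply normalization or other rules that are not modeled here.

```python
def passes_validators(match, require_begin=(), require_end=(),
                      exclude_begin=(), exclude_end=()):
    """Return True if a matched data item survives the configured validators.

    Each argument is a sequence of strings, one per configured value.
    An empty sequence means that validator is not configured.
    """
    if require_begin and not match.startswith(tuple(require_begin)):
        return False
    if require_end and not match.endswith(tuple(require_end)):
        return False
    if exclude_begin and match.startswith(tuple(exclude_begin)):
        return False
    if exclude_end and match.endswith(tuple(exclude_end)):
        return False
    return True
```

With require_begin=["C"], the license number C6457291 from the table passes the check, while a number that begins with any other character does not.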

Configuring optional validators


You implement optional validators to refine the scope of a data identifier defined in a policy
instance. System and custom data identifiers support the configuration of optional validators.
See “About optional validators for data identifiers” on page 732.
The type of input allowed by an optional validator (numbers, letters, characters) depends on
the data identifier. If you enter unacceptable input characters and attempt to save the
configuration, the system reports an error.
For example, the US Social Security Number (SSN) data identifier accepts numbers only. If
you configure the "Require ending characters" optional validator and provide letters as input,
you receive the following error when you attempt to save the configuration: Input to "Require
ending characters" Validator is incorrect: List contains non-number character.
See “Acceptable characters for optional validators” on page 764.
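
The save-time check described above can be sketched as a type test on each comma-separated value. The following Python example is a minimal illustration and not the product's code. The names ALLOWED_PATTERNS and check_validator_input are hypothetical, the patterns assume ASCII input, and trimming spaces around each value is an assumption.

```python
import re

# Hypothetical mapping from the data types named in Table 31-18 to patterns.
ALLOWED_PATTERNS = {
    "Numbers only": r"[0-9]+",
    "Letters only": r"[A-Za-z]+",
    "Alphanumeric": r"[0-9A-Za-z]+",
    "Any characters": r".+",
}


def check_validator_input(raw, allowed_type):
    """Split comma-separated input and reject values of the wrong type."""
    pattern = ALLOWED_PATTERNS[allowed_type]
    # Assumption: surrounding spaces are ignored and empty entries are dropped.
    values = [v.strip() for v in raw.split(",") if v.strip()]
    for value in values:
        if re.fullmatch(pattern, value) is None:
            raise ValueError(f"{value!r} is not valid for {allowed_type!r}")
    return values
```

In this sketch, the input "12, ab" is rejected for a "Numbers only" pairing such as the US Social Security Number (SSN) example, and "12, 345" is accepted.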

To configure an optional validator


1 Click the plus sign beside the Optional Validators label for the data identifier instance
you are configuring.
See “Configuring the Content Matches data identifier condition” on page 737.
2 Select one or more optional validators.
See “About optional validators for data identifiers” on page 732.
3 Provide the expected input for each optional validator you select.
Each value can be of any length. Use commas to separate multiple values.
4 Click Save to save the configuration.
If the system displays an error message, make sure you have entered the correct type of
expected character input.
See “Acceptable characters for optional validators” on page 764.

Acceptable characters for optional validators


Each optional validator requires you to enter one or more data values. You must enter the
type of data that is appropriate for that data identifier. Table 31-18 lists the acceptable data
type for each data identifier/optional validator pairing.
See “About optional validators for data identifiers” on page 732.

Note: The Find keywords optional validator accepts any characters as values for all data
identifiers.

The type of data expected by the optional validator depends on the data identifier. Most data
identifier/optional validator pairings accept numbers only; some accept alphanumeric values,
and a few accept any characters. If you enter unacceptable input and attempt to save the
policy, the system reports an error.
See “Configuring optional validators” on page 763.

Table 31-18 Acceptable characters for optional validators

Data Identifier Exclude/require beginning characters Exclude/require ending characters

ABA Routing Number Numbers only Numbers only

Argentina Tax Identification Number Numbers only Numbers only

Australia Driver's License Number Alphanumeric Alphanumeric



Australian Business Number Numbers only Numbers only

Australian Company Number Numbers only Numbers only

Australian Medicare Number Numbers only Numbers only

Australian Passport Number Letters only (normalized to lowercase) Numbers only

Australian Tax File Number Numbers only Numbers only

Austria Passport Number Alphanumeric Alphanumeric

Austria Tax Identification Number Numbers only Numbers only

Austria Value Added Tax (VAT) Number Letters only Numbers only

Austrian Social Security Number Numbers only Numbers only

Belgian National Number Numbers only Numbers only

Belgium Driver's Licence Number Numbers only Numbers only

Belgium Passport Number Alphanumeric Alphanumeric

Belgium Tax Identification Number Numbers only Numbers only

Belgium Value Added Tax (VAT) Number Letters only Numbers only

Brazilian Election Identification Number Numbers only Numbers only

Brazilian National Registry of Legal Entities Number Numbers only Numbers only

Brazilian Natural Person Registry Number Numbers only Numbers only

British Columbia Personal Number Numbers only Numbers only

Bulgaria Value Added Tax (VAT) Number Letters only Numbers only

Bulgarian Uniform Civil Number - EGN Numbers only Numbers only

Burgerservicenummer Numbers only Numbers only

Canada Driver's License Number Alphanumeric Alphanumeric

Canada Passport Number Letters only Numbers only

Canada Permanent Resident (PR) Number Letters only Numbers only



Canadian Social Insurance Number Numbers only Numbers only

Chilean National Identification Number Alphanumeric Alphanumeric

China Passport Number Alphanumeric Alphanumeric

Codice Fiscale Letters only Letters only

Colombian Addresses Numbers only Numbers only

Colombian Cell Phone Number Numbers only Numbers only

Colombian Personal Identification Number Numbers only Numbers only

Colombian Tax Identification Number Numbers only Numbers only

Common Procedure Coding System (HCPCS CPT Code) Alphanumeric Alphanumeric

Credit Card Magnetic Stripe Data Numbers only Numbers only

Credit Card Number Numbers only Numbers only

Croatia National Identification Number Alphanumeric Alphanumeric

CUSIP Number Alphanumeric (normalized to lowercase) Alphanumeric (normalized to lowercase)

Cyprus Tax Identification Number Letters only Numbers only

Cyprus Value Added Tax (VAT) Number Alphanumeric Alphanumeric

Czech Republic Driver's Licence Number Letters only Numbers only

Czech Republic Personal Identification Number Numbers only Numbers only

Czech Republic Tax Identification Number Numbers only Numbers only

Czech Republic Value Added Tax (VAT) Number Letters only Numbers only

Denmark Personal Identification Number Alphanumeric Alphanumeric

Denmark Tax Identification Number Numbers only Numbers only

Denmark Value Added Tax (VAT) Number Letters only Numbers only

Driver's License Number – CA State Letters only (normalized to lowercase) Numbers only

Driver's License Number – FL, MI, MN States Letters only (normalized to lowercase) Numbers only

Driver's License Number – IL State Letters only (normalized to lowercase) Numbers only

Driver's License Number – NJ State Letters only (normalized to lowercase) Numbers only

Driver's License Number – NY State Numbers only Numbers only

Driver's License Number - WA State Alphanumeric (normalized to lowercase) Alphanumeric (normalized to lowercase)

Driver's License Number - WI State Letters only Numbers only

Drug Enforcement Agency (DEA) Number Letters only (normalized to lowercase) Numbers only

Estonia Driver's Licence Number Letters only Numbers only

Estonia Passport Number Letters only Numbers only

Estonia Personal Identification Number Numbers only Numbers only

Estonia Value Added Tax (VAT) Number Letters only Numbers only

European Health Insurance Card Number Numbers only Numbers only

Finland Driver's Licence Number Alphanumeric Alphanumeric

Finland European Health Insurance Number Numbers only Numbers only

Finland Passport Number Letters only Numbers only

Finland Tax Identification Number Alphanumeric Alphanumeric

Finland Value Added Tax (VAT) Number Letters only Numbers only

Finnish Personal Identification Number Alphanumeric (normalized to lowercase) Alphanumeric (normalized to lowercase)

France Driver's Licence Number Numbers only Numbers only

France Health Insurance Number Numbers only Numbers only

France Tax Identification Number Numbers only Numbers only



France Value Added Tax (VAT) Number Letters only Numbers only

French INSEE Code Numbers only Numbers only

French Passport Number Alphanumeric Alphanumeric

French Social Security Number Alphanumeric Alphanumeric

German Passport Number Alphanumeric (normalized to lowercase) Alphanumeric (normalized to lowercase)

German Personal Identification Number Alphanumeric (normalized to lowercase) Alphanumeric (normalized to lowercase)

German Driver's Licence Number Alphanumeric Alphanumeric

German Tax Identification Number Numbers only Numbers only

German Value Added Tax (VAT) Number Letters only Numbers only

Greece Passport Number Letters only Numbers only

Greece Social Security Number (AMKA) Numbers only Numbers only

Greece Value Added Tax (VAT) Number Letters only Numbers only

Greek Tax Identification Number Numbers only Numbers only

Health Insurance Claim Number Alphanumeric Alphanumeric

Hong Kong ID Alphanumeric Alphanumeric

Hungarian Social Security Number Numbers only Numbers only

Hungarian Tax Identification Number Numbers only Numbers only

Hungarian VAT Number Letters only (normalized to lowercase) Numbers only

Hungary Driver's Licence Number Letters only Numbers only

Hungary Passport Number Letters only Numbers only

IBAN Central Alphanumeric Alphanumeric

IBAN East Alphanumeric Alphanumeric

IBAN West Alphanumeric Alphanumeric



Iceland National Identification Number Numbers only Numbers only

Iceland Passport Number Letters only Numbers only

Iceland Value Added Tax (VAT) Number Letters only Numbers only

India RuPay Card Number Numbers only Numbers only

Indian Aadhaar Card Number Numbers only Numbers only

Indonesian Identity Card Number Letters only Letters only

International Mobile Equipment Identity Number Numbers only Numbers only

International Securities Identification Number Letters only (normalized to lowercase) Numbers only

IP Address Any characters Any characters

IPv6 Address Alphanumeric Alphanumeric

Ireland Passport Number Letters only Numbers only

Ireland Tax Identification Number Alphanumeric Alphanumeric

Ireland Value Added Tax (VAT) Number Letters only Numbers only

Irish Personal Public Service Number Numbers only Letters only (normalized to lowercase)

Israel Personal Identification Number Numbers only Numbers only

Italy Driver's Licence Number Letters only Letters only

Italy Health Insurance Number Letters only Letters only

Italy Passport Number Alphanumeric Alphanumeric

Italy Value Added Tax (VAT) Number Letters only Numbers only

Japan Driver's License Number Numbers only Numbers only

Japan Passport Number Letters only Numbers only

Japanese Juki-Net ID Number Numbers only Numbers only

Japanese My Number - Corporate Numbers only Numbers only



Japanese My Number - Personal Numbers only Numbers only

Kazakhstan Passport Number Letters only Numbers only

Korea Passport Number Alphanumeric Alphanumeric

Korea Residence Registration Number for Foreigners Numbers only Numbers only

Korea Residence Registration Number for Korean Numbers only Numbers only

Latvia Driver's Licence Number Letters only Numbers only

Latvia Passport Number Letters only Numbers only

Latvia Personal Identification Number Numbers only Numbers only

Latvia Value Added Tax (VAT) Number Letters only Numbers only

Liechtenstein Passport Number Letters only Numbers only

Lithuania Personal Identification Number Numbers only Numbers only

Lithuania Tax Identification Number Numbers only Numbers only

Lithuania Value Added Tax (VAT) Number Letters only Numbers only

Luxembourg National Register of Individuals Number Numbers only Numbers only

Luxembourg Passport Number Alphanumeric Alphanumeric

Luxembourg Tax Identification Number Numbers only Numbers only

Luxembourg Value Added Tax (VAT) Number Letters only Numbers only

Macau National Identification Number Numbers only Numbers only

Malaysia Passport Number Letters only Numbers only

Malaysian MyKad Number (MyKad) Numbers only Numbers only

Malta National Identification Number Numbers only Letters only

Malta Tax Identification Number Alphanumeric Alphanumeric

Malta Value Added Tax (VAT) Number Alphanumeric Alphanumeric

Medicare Beneficiary Number Alphanumeric Alphanumeric



Mexican Personal Registration and Identification Number Alphanumeric Alphanumeric

Mexican Tax Identification Number Alphanumeric Alphanumeric

Mexican Unique Population Registry Code Alphanumeric (normalized to lowercase) Alphanumeric (normalized to lowercase)

Mexico CLABE Number Numbers only Numbers only

National Drug Code (NDC) Numbers only Numbers only

National Provider Identifier Number Numbers only Numbers only

Netherlands Bank Account Number Alphanumeric Alphanumeric

Netherlands Driver's Licence Number Numbers only Numbers only

Netherlands Passport Number Alphanumeric Alphanumeric

Netherlands Tax Identification Number Numbers only Numbers only

Netherlands Value Added Tax (VAT) Number Letters only Numbers only

New Zealand Driver's License Number Letters only Numbers only

New Zealand National Health Index Number Letters only (normalized to lowercase) Numbers only

New Zealand Passport Number Letters only Numbers only

Norway Driver's Licence Number Numbers only Numbers only

Norway National Identification Number Numbers only Numbers only

Norway Value Added Tax Number Alphanumeric Alphanumeric

Norwegian Birth Number Numbers only Numbers only

People's Republic of China ID Alphanumeric (normalized to lowercase) Alphanumeric (normalized to lowercase)

Poland Driver's Licence Number Numbers only Numbers only

Poland European Health Insurance Number Numbers only Numbers only

Poland Passport Number Letters only Numbers only

Poland Value Added Tax (VAT) Number Letters only Numbers only

Polish Identification Number Letters only Numbers only

Polish REGON Number Numbers only Numbers only

Polish Social Security Number (PESEL) Numbers only Numbers only

Polish Tax Identification Number Numbers only Numbers only

Portugal Driver's Licence Number Letters only Numbers only

Portugal National Identification Number Alphanumeric Alphanumeric

Portugal Passport Number Letters only Numbers only

Portugal Tax Identification Number Numbers only Numbers only

Portugal Value Added Tax (VAT) Number Letters only Numbers only

Randomized US Social Security Number (SSN) Numbers only Numbers only

Romania Driver's Licence Number Alphanumeric (normalized to lowercase) Alphanumeric (normalized to lowercase)

Romania National Identification Number Numbers only Numbers only

Romania Numerical Personal Code Numbers only Numbers only

Romania Value Added Tax (VAT) Number Letters only Numbers only

Romanian Numerical Personal Code Numbers only Numbers only

Russian Passport Identification Number Numbers only Numbers only

Russian Taxpayer Identification Number Numbers only Numbers only

SEPA Creditor Identifier Number North Alphanumeric Alphanumeric

SEPA Creditor Identifier Number South Alphanumeric Alphanumeric

SEPA Creditor Identifier Number West Alphanumeric Alphanumeric

Serbia Unique Master Citizen Number Numbers only Numbers only

Serbia Value Added Tax (VAT) Number Alphanumeric Alphanumeric

Singapore NRIC Alphanumeric (normalized to lowercase) Alphanumeric (normalized to lowercase)

Slovakia Driver's Licence Number Letters only Numbers only

Slovakia National Identification Number Alphanumeric Alphanumeric

Slovakia Passport Number Letters only Numbers only

Slovakia Value Added Tax (VAT) Number Letters only Numbers only

Slovenia Passport Number Letters only Numbers only

Slovenia Tax Identification Number Numbers only Numbers only

Slovenia Unique Master Citizen Number Numbers only Numbers only

Slovenia Value Added Tax (VAT) Number Letters only Numbers only

South African Personal Identification Number Numbers only Numbers only

Spain Driver's Licence Number Alphanumeric Alphanumeric

Spain Value Added Tax (VAT) Number Alphanumeric Alphanumeric

Spanish Customer Account Number Numbers only Numbers only

Spanish DNI ID Alphanumeric Alphanumeric

Spanish Passport Number Alphanumeric Alphanumeric

Spanish Social Security Number Numbers only Numbers only

Spanish Tax ID (CIF) Alphanumeric Alphanumeric

Sri Lanka National Identification Number Alphanumeric Alphanumeric

Sweden Driver's Licence Number Numbers only Numbers only

Sweden Personal Identification Number Numbers only Numbers only

Sweden Tax Identification Number Numbers only Numbers only

Sweden Value Added Tax (VAT) Number Letters only Numbers only

Swedish Passport Number Alphanumeric Alphanumeric

SWIFT Code Alphanumeric Alphanumeric

Swiss AHV Number Numbers only Numbers only



Swiss Social Security Number (AHV) Alphanumeric Alphanumeric

Switzerland Health Insurance Card Number Numbers only Numbers only

Switzerland Passport Number Letters only Numbers only

Switzerland Value Added Tax (VAT) Number Alphanumeric (normalized to lowercase) Alphanumeric (normalized to lowercase)

Taiwan ROC ID Alphanumeric Alphanumeric

Thailand Passport Number Letters only Numbers only

Thailand Personal ID Number Numbers only Numbers only

Turkish Identification Number Numbers only Numbers only

UK Bank Account Number Sort Code Numbers only Numbers only

UK Driver's Licence Number Alphanumeric (normalized to lowercase) Alphanumeric (normalized to lowercase)

UK Electoral Roll Number Letters only (normalized to lowercase) Numbers only

UK National Health Service (NHS) Number Numbers only Numbers only

UK National Insurance Number Letters only (normalized to lowercase) Letters only (normalized to lowercase)

UK Passport Number Numbers only Numbers only

UK Tax ID Number Numbers only Numbers only

UK Value Added Tax (VAT) Number Letters only Numbers only

Ukraine Identity Card Numbers only Numbers only

Ukraine Passport (Domestic) Numbers only Numbers only

Ukraine Passport (International) Alphanumeric Alphanumeric

United Arab Emirates Personal Number Numbers only Numbers only

US Individual Tax Identification Number (ITIN) Numbers only Numbers only

US Passport Number Numbers only Numbers only



US Social Security Number (SSN) Numbers only Numbers only

US ZIP+4 Postal Codes Letters only Numbers only

Venezuela National ID Number Letters only Numbers only
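
As a rough illustration of how the acceptable-character rules in Table 31-18 constrain validator input, the following Python sketch checks a comma-separated input string against the character class listed for a data identifier. The `CHECKS` and `ACCEPTABLE` mappings and the `invalid_values` function are hypothetical names, only a few rows of the table are reproduced, and the sketch is not a description of how the Enforce Server validates input.

```python
# Hypothetical sketch: check optional-validator input against the
# acceptable-character classes listed in Table 31-18.
# Note: Python's str methods also accept non-ASCII letters and digits,
# which may differ from the product's behavior.

CHECKS = {
    "Numbers only": str.isdigit,
    "Letters only": str.isalpha,
    "Alphanumeric": str.isalnum,
    "Any characters": lambda value: len(value) > 0,
}

# (beginning characters class, ending characters class) per data identifier.
# Only a few rows of the table are copied here.
ACCEPTABLE = {
    "US Social Security Number (SSN)": ("Numbers only", "Numbers only"),
    "US ZIP+4 Postal Codes": ("Letters only", "Numbers only"),
    "SWIFT Code": ("Alphanumeric", "Alphanumeric"),
    "IP Address": ("Any characters", "Any characters"),
}


def invalid_values(identifier, position, raw_input):
    """Return the comma-separated values that break the table's rule.

    position is "beginning" or "ending".
    """
    beginning, ending = ACCEPTABLE[identifier]
    rule = beginning if position == "beginning" else ending
    check = CHECKS[rule]
    values = [v.strip() for v in raw_input.split(",") if v.strip()]
    return [v for v in values if not check(v)]
```

For example, `invalid_values("US Social Security Number (SSN)", "beginning", "123, 45a")` flags `45a`, because that row accepts numbers only.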

Using unique match counting


When you define a new data identifier rule, a new keyword rule, or a new regular expression
rule, Count all unique matches is the default method for counting matches.
The following table describes unique match counting characteristics.

Table 31-19 Unique match counting characteristics

Unique match counting characteristic    Description

First match is unique    A unique match is the first match found in a message component.
See “Detection messages and message components” on page 391.

Match count updated for each unique match    The match count is incremented by 1 for each unique pattern match.

Only unique matches are highlighted    Duplicate matches are neither counted nor highlighted at the Incident Snapshot screen.
See “Remediating incidents” on page 1844.

Uniqueness does not span message components    For example, if the same SSN appears in both the message body and an attachment, two unique matches are generated, not one. This is because each instance is detected in a separate message component.

Compound rule with data identifier and keyword proximity conditions    In a compound rule that combines a data identifier condition with a keyword condition that specifies keyword proximity logic, the reported match is the first match found.
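
The counting behavior described in Table 31-19 can be sketched in a few lines of Python. This is a minimal model, not the product's detection engine: the message is represented as a hypothetical dictionary that maps a component name to the list of values matched in that component, and `count_unique_matches` is an illustrative function name.

```python
# Hypothetical sketch of unique match counting as described in Table 31-19.
# A message is modeled as a dict: component name -> list of matched values.

def count_unique_matches(components):
    """Count matches so that duplicates within one component count once.

    Uniqueness does not span components: the same value found in the
    body and in an attachment yields two unique matches.
    """
    total = 0
    for matches in components.values():
        seen = set()  # reset for every message component
        for value in matches:
            if value not in seen:  # first occurrence is the unique match
                seen.add(value)
                total += 1  # incremented by 1 per unique match
    return total


message = {
    "body": ["123-45-6789", "123-45-6789", "987-65-4321"],
    "attachment": ["123-45-6789"],
}
print(count_unique_matches(message))  # 3
```

The body contributes two unique matches and the attachment contributes one, even though the attachment repeats a value already seen in the body.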

Configuring unique match counting


Count all unique matches is the default selection for new data identifiers you create. After
upgrading Data Loss Prevention, you may need to manually configure pre-existing data identifier
rules to use unique match counting, if you have not done so prior to the upgrade.
See “About unique match counting” on page 734.

To configure unique match counting


1 Select the policy containing the data identifier rule or rules you want to update at the
Manage > Policies > Policy List screen.
2 Select the data identifier rule at the Configure Policy screen.
3 Select the match counting option Count all unique matches.
4 Click OK to apply the unique match counting configuration change.
5 Click Save to save the policy change.
6 Test unique match counting.
Create an incident with multiple instances of a data identifier pattern, such as several
instances of the same social security number in the same message component (for
example, in an email attachment).
At the Incident Snapshot screen, verify that only unique matches are highlighted and counted.

Modifying system data identifiers


The system lets you modify system-defined data identifiers, but you cannot delete them. You
can also modify any custom data identifiers that you have created. Any modification you make
to a data identifier takes effect system-wide: it applies to every policy that currently declares
the data identifier and to any policy that declares it later.
There is no way to automatically revert a data identifier to its original configuration once it is
modified. Before you modify a system data identifier, consider cloning it.
The system does not include modified data identifiers in policies exported as templates. Before
modifying a system data identifier, export any policies that declare it.
See “Editing data identifiers” on page 736.
See “Editing pattern validator input” on page 778.

Note: The system does not export modified and custom data identifiers in a policy template.
The system exports a reference to the system data identifier. The target system where the
policy template is imported provides the actual data identifier. See “Clone system-defined data
identifiers before modifying to preserve original state” on page 835.




Table 31-20 System data identifier modification options

Modifiable at the system level

■ Patterns
You can edit one or more data identifier patterns at the system level.
■ Active Validators
You can add or remove required validators at the system level.
■ Data Entry
You can edit the input of an active validator for a system data identifier.

Not configurable

■ Name, Description, and Category
You cannot modify the name, description, or category of a system data identifier.
■ Breadth
You cannot define a new detection breadth for a system data identifier; you can only modify an existing breadth.
■ Optional Validators
You cannot define optional validators at the system level. You can only configure optional validators at the policy level.
■ Data Normalizer
You cannot modify the type of data normalizer implemented by a system data identifier.
■ Delete
You cannot delete a system data identifier.

Cloning a system data identifier before modifying it


The Enforce Server does not provide an automated mechanism for cloning a system data
identifier.
See “Extending and customizing data identifiers” on page 731.
Before you modify a system data identifier, consider manually cloning it so you can revert to
the original configuration, if necessary. At the least, you should export a policy as a template
before you modify any system data identifier declared by that policy.
To manually clone a system data identifier
1 Review the original configuration of the data identifier you want to modify.
2 Create a custom data identifier.
See “Workflow for creating custom data identifiers” on page 812.
3 Copy the configuration of the original data identifier to the custom data identifier.
Add the pattern(s), validator(s), any data input, and the normalizer.
See “Selecting a data identifier breadth” on page 739.
4 Save the custom data identifier.
5 Modify the custom data identifier to suit your needs.

Editing pattern validator input


At the system level, you can edit the data input that a required validator accepts. Not all
validators accept data input.
See “About pattern validators” on page 733.
To edit required validator input
1 Edit the data identifier by selecting it from the Manage > Policies > Data Identifiers
screen.
2 Select the Rule Breadth you want to modify.
Generally, the medium and narrow breadth options include validators that accept data
input.
3 From the Active Validators list, select the editable validator whose input you want to edit.
For example, select Find keywords.
See “List of pattern validators that accept input data” on page 778.
4 Edit the input for the validator in the Description and Data Entry field.
5 Select the qualities you want for the keyword:
■ Proximity - To find a keyword only within the set proximity of the matched patterns,
check this box and also indicate the Word Distance or proximity.
■ Case sensitive - Check this box if you want to search for a case-sensitive match.
■ Highlight keywords in incident - Check this box if you want to highlight the matched
keywords in incidents.

6 Click Update Validator to save the changes you have made to the validator input.
Click Discard Changes to not save the changes.
7 Click Save to save the data identifier.
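
To make the Proximity and Case sensitive options in step 5 more concrete, the following Python sketch tests whether a keyword appears within a given word distance of a matched pattern. It is an illustration under simplifying assumptions, not the product's matching logic: words are split on whitespace, keywords are single words, and `keyword_near_match` is a hypothetical function name.

```python
# Hypothetical sketch of the Find keywords options: proximity (word
# distance) and case sensitivity. Tokenization by whitespace and
# single-word keywords are simplifying assumptions.

def keyword_near_match(text, match, keywords, word_distance, case_sensitive=False):
    """Return True if any keyword lies within word_distance words of match."""
    words = text.split()
    if not case_sensitive:
        words = [w.lower() for w in words]
        match = match.lower()
        keywords = [k.lower() for k in keywords]
    # Strip surrounding punctuation so "SSN:" still matches "SSN".
    words = [w.strip(".,;:()") for w in words]
    match_positions = [i for i, w in enumerate(words) if w == match]
    for pos in match_positions:
        start = max(0, pos - word_distance)
        # Words before and after the match, excluding the match itself.
        window = words[start:pos] + words[pos + 1:pos + 1 + word_distance]
        if any(k in window for k in keywords):
            return True
    return False


text = "Employee SSN: 123-45-6789 was sent to payroll yesterday"
print(keyword_near_match(text, "123-45-6789", ["ssn"], word_distance=1))  # True
```

With a word distance of 1, the keyword `ssn` is found next to the match; a keyword such as `payroll`, four words away, is only found when the distance is 4 or more.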

List of pattern validators that accept input data


The following table lists all available pattern validators that require data input. The input data
is editable at the system-level definition of the data identifier.

Note: Input you use for beginning and ending validators concerns the text of the match itself.
Input you use for prefix and suffix validators concerns characters before and after matched text.

Table 31-21 Pattern validators that accept input data

Validator Description

Exact Match Enter a comma-separated list of values. If the values are numeric, do NOT enter
any dashes or other separators. Each value can be of any length.

Exclude beginning characters Enter a comma-separated list of values. If the values are numeric, do NOT enter
any dashes or other separators. Each value can be of any length.

Exclude ending characters Enter a comma-separated list of values. If the values are numeric, do NOT enter
any dashes or other separators. Each value can be of any length.

Exclude exact match Enter a comma-separated list of values. Each value can be of any length.

Exclude prefix Enter a comma-separated list of values. Each value can be of any length.

Exclude suffix Enter a comma-separated list of values. Each value can be of any length.

Find keywords Enter a comma-separated list of values. Each value can be of any length.

Require beginning characters Enter a comma-separated list of values. If the values are numeric, do NOT enter
any dashes or other separators. Each value can be of any length.

Require ending characters Enter a comma-separated list of values. If the values are numeric, do NOT enter
any dashes or other separators. Each value can be of any length.
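
The difference between beginning-character input and prefix input can be sketched as follows. The Python functions below are hypothetical helpers written for illustration only; they parse a comma-separated input string and then test either the matched text itself (beginning characters) or the text immediately before it (prefix). They do not reproduce how the product normalizes or evaluates matches.

```python
# Hypothetical sketch: parse comma-separated validator input and show the
# difference between "beginning characters" (inside the match) and
# "prefix" (text immediately before the match).

def parse_values(raw_input):
    """Split comma-separated input into a list of trimmed, non-empty values."""
    return [v.strip() for v in raw_input.split(",") if v.strip()]


def excluded_by_beginning(match, raw_input):
    """True if the matched text itself starts with an excluded value."""
    return any(match.startswith(v) for v in parse_values(raw_input))


def excluded_by_prefix(text, match_start, raw_input):
    """True if the characters just before the match equal an excluded value."""
    before = text[:match_start]
    return any(before.endswith(v) for v in parse_values(raw_input))


text = "ID#123456789"
match = "123456789"
match_start = text.index(match)

print(excluded_by_beginning(match, "123, 999"))      # True: match starts with 123
print(excluded_by_prefix(text, match_start, "ID#"))  # True: ID# precedes the match
```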

Editing keywords for international PII data identifiers


Data identifiers offer broad support for detecting international content.
See “Introducing data identifiers” on page 717.
Some international data identifiers offer a wide breadth of detection only. In this case you can
implement the Find Keywords optional validator to narrow the scope of detection. Implementing
this optional validator may help you eliminate any false positives that your policy matches.
See “Selecting a data identifier breadth” on page 739.

To use keywords for international data identifiers


1 Create a policy using one of the system-provided international data identifiers that is listed
in the table.
See “List of keywords for international system data identifiers” on page 780.
2 Select the Find Keywords optional validator.
See “Configuring the Content Matches data identifier condition” on page 737.
3 Copy and paste the appropriate comma-separated keywords from the list to the Find
Keywords optional validator field.
See “Configuring optional validators” on page 763.
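
Keywords copied from a printed table can pick up line breaks and repeated entries. As an optional convenience, a short Python sketch such as the following could tidy a copied keyword list before you paste it into the Find Keywords field. The function name `build_keyword_input` is hypothetical, and the sketch only assumes that the field accepts a comma-separated list, as described in step 3.

```python
# Hypothetical helper: tidy a keyword list copied from the table before
# pasting it into the Find Keywords optional validator field.

def build_keyword_input(raw_keywords):
    """Return a comma-separated keyword string without repeated entries."""
    seen = set()
    unique = []
    for keyword in raw_keywords.split(","):
        keyword = " ".join(keyword.split())  # collapse line-wrap whitespace
        if keyword and keyword not in seen:  # comparison is case-sensitive
            seen.add(keyword)
            unique.append(keyword)
    return ", ".join(unique)


copied = "permis de\nconduire, rijbewijs,Rijbewijsnummer, rijbewijs"
print(build_keyword_input(copied))
# permis de conduire, rijbewijs, Rijbewijsnummer
```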

List of keywords for international system data identifiers


Table 31-22 provides keywords for several system-defined international data identifiers. You
can modify the specified data identifier using the corresponding keyword(s).
See “Extending and customizing data identifiers” on page 731.
See “Introducing data identifiers” on page 717.
See “Selecting a data identifier breadth” on page 739.

Table 31-22 Keyword list for international PII data identifiers

Data identifier Language Keywords English translation

Argentina Tax Spanish Número de Identificación Fiscal, Tax identification number,


Identification Number número de contribuyente, taxpayer number, Argentina tax
Número de identificación fiscal identification number, Argentina
Argentina, Argentina número de taxpayer number
contribuyente

Austria Passport German REISEPASS, ÖSTERREICHISCH Passport, Austrian passport


Number REISEPASS, reisepass

Austria Tax German Österreich, Steuernummer Austria, tax number


Identification Number

Austria Value Added German MwSt, Umsatzsteuernummer, VAT, sales tax number, VAT
Tax (VAT) Number MwSt Nummer, number, VAT identification
Ust.-Identifikationsnummer, number, sales tax, UID number
umsatzsteuer, Umsatzsteuer-
Identifikationsnummer

Austrian Social German sozialversicherungsnummer, Social insurance number, social


Security Number soziale sicherheit security number, insurance
kein,Versicherungsnummer, number, Austrian SSN, Austrian
Österreichischen SSN, social insurance
Österreichischen
Sozialversicherungs

Belgian National French Numéro national, numéro de National number, security number,
Number sécurité, numéro d'assuré, number of insured, national
identifiant national, identification, national
identifiantnational#, identification #, national number
Numéronational# #

Belgium Driver's German, French, Führerschein, Fuhrerschein, Driver's license, driver's license
License Number Frisian Fuehrerschein, number, driving permit, driving
Führerscheinnummer, permit number
Fuhrerscheinnummer,
Fuehrerscheinnummer,
Führerscheinnummer,
Fuhrerscheinnummer,
Fuehrerscheinnummer,
Führerschein- Nr, Fuhrerschein-
Nr, Fuehrerschein- Nr, permis de
conduire,
rijbewijs,Rijbewijsnummer,
Numéro permis conduire

Belgium Passport Dutch, German, Paspoort, paspoort, Passport, passport number,


Number French paspoortnummer, Reisepass passport book, passport card
kein, Reisepass, Passnummer,
Passeport, Passeport livre,
Passeport carte, numéro
passeport

Belgium Tax Dutch, German, Numéro de registre national, National registry number, tax
Identification Number French numéro d'identification fiscale, identification number, tax number
belasting aantal,Steuernummer

Belgium Value Added German, French Numéro T.V.A, VAT number, tax identification
Tax (VAT) Number Umsatzsteuer-Identifikationsnummer, number
Umsatzsteuernummer

Brazilian Election Brazilian número identificação, Identification number, voter


Identification Number Portuguese identificação do eleitor, ID eleitor identification, electoral
eleição, número identificação identification number, Brazilian
eleitoral, Número identificação electoral identification number,
eleitoral brasileira,
IDeleitoreleição#

Brazilian National Brazilian Brasileira ID Legal, entidades Brazilian legal identification, legal
Registry of Legal Portuguese jurídicas ID,Registro Nacional de entities ID, National Registry of
Entities Number Pessoas Jurídicas n º, Legal Entities No
BrasileiraIDLegal#

Brazilian Natural Brazilian Cadastro de Pessoas Físicas, Registration of individuals,


Person Registry Portuguese Brasileiro Pessoa Natural Número Brazilian Natural Person Registry
Number de Registro, pessoa natural Number, natural person registry
número de registro, pessoas number, individual registration
singulares registro NO number

British Columbia French MSP nombre, soins de santé no, MSP Number, MSP no, personal
Personal Healthcare soins de santé personnels healthcare number, Healthcare
Number nombre, MSPNombre#, No, PHN
soinsdesanténo#

Bulgaria Value Added Bulgarian номер на таксата, ДДС, ДДС#, Fee number, VAT, VAT number,
Tax (VAT) Number ДДС номер., ДДС номер.#, value added tax
номер на данъка върху
добавената стойност, данък
върху добавената стойност,
ДДС номер

Bulgarian Uniform Civil Bulgarian Униформ граждански номер, Uniform civil number, Uniform ID,
Number - EGN Униформ ID, Униформ Uniform civil ID, Bulgarian uniform
граждански ID, Униформ civil number
граждански не., български
Униформ граждански номер,
УниформгражданскиID#,
Униформгражданскине.#

Burgerservicenummer Dutch Persoonsnummer, sofinummer, person number, social-fiscal


sociaal-fiscaal nummer, number (abbreviation),
persoonsgebonden social-fiscal number,
person-related number

Canada Driver's French permis de conduire Driver's license


License Number

Canada Passport French numéro passeport, No passeport, Passport number, passport no.,
Number passeport# passport#

Canada Permanent French numéro résident permanent, permanent resident number,


Resident (PR) Number résident permanent non, résident permanent resident no, permanent
permanent no., carte résident resident number, permanent
permanent, numéro carte résident resident card, permanent resident
permanent, pr non card number, pr no

Chilean National Spanish Chilena número identificación, Chilean identification number,


Identification Number nacional identidad, número national identity, identification
identificación, número number, national identification
identificación nacional, identidad number, identity number, Unique
número, National Role
NúmerodeIdentificación#,
Identidadchilenano#, Rol Único
Nacional, RolÚnicoNacional#,
nacionalidentidad#

China Passport Number Chinese 中国护照, 护照, 护照本 Chinese passport, passport,
passport book

Codice Fiscale Italian codice fiscal, dati anagrafici, tax code, personal data, VAT
partita I.V.A., p. iva number, VAT number

Columbian Addresses Spanish Calle, Cll, Carrera, Cra, Cr, Street, St, Career, Avenue,
Avenida, Av, Dg, Diagonal, Diag, Diagonal, Transversal, sidewalk
Tv, Trans, Transversal, vereda

Columbian Cell Phone Spanish numero celular, número de Cellular number, telephone
Number teléfono, teléfono celular no., number, cellular telephone
numero celular# number

Columbian Personal Spanish cedula, cédula, c.c., c.c,C.C., C.C, Identification card, citizenship
Identification Number cc, CC, NIE., NIE, nie., nie, cedula card, identification document
de ciudadania, cédula de
ciudadanía, cc#, CC #, documento
de identificacion, documento de
identificación, Nit.

Columbian Tax Spanish NIT., NIT, nit., nit, Nit. TIN (tax identification number)
Identification Number

Croatia National Croatian Osobna iskaznica, Nacionalni Personal ID, national identification
Identification Number identifikacijski broj, osobni ID, number, personal ID, personal
osobni identifikacijski broj, porez identification number, tax
iskaznica, porezni broj, porezni identification card, tax number, tax
identifikacijski broj, porez kod, identification number, tax code,
šifra poreznog obveznika taxpayer code

Cyprus Tax Turkish, Greek αριθμός φορολογικού μητρώου, Tax identification number, tax
Identification Number Vergi Kimlik Numarası, vergi number, TIN number, Cyprus TIN
numarası, Kıbrıs TIN numarası number

Cyprus Value Added Turkish, Greek KDV, kdv#, KDV numarası, Katma VAT, VAT number, value added
Tax (VAT) Number değer Vergisi, Φόρος tax,
Προστιθέμενης Αξίας

Czech Republic Driver's Czech řidičský průkaz, řidičský prúkaz, Driving license, driver's license
Licence Number číslo řidičského průkazu, řidičské number, driving license number,
číslo řidičů, ovladače lic., Číslo driver's lic., driver license number,
licence řidiče, Řidičský průkaz, driver's permit
povolení řidiče, řidiči povolení,
povolení k jízdě, číslo licence

Czech Republic Czech Česká Osobní identifikační číslo, Czech Personal Identification
Personal Identification Osobní identifikační číslo., Number, personal identification
Number identifikační číslo, čeština number, Czech identification
identifikační číslo number

Czech Republic Tax Czech osobní kód, Národní identifikační Personal code, national
Identification Number číslo, osobní identifikační číslo, identification number, personal
cínové číslo, daňové identifikačné identification number, TIN number,
číslo, daňový poplatník id tax identification number, taxpayer
ID

Czech Republic Value Czech číslo DPH, Daň z přidané VAT number, value added tax,
Added Tax (VAT) hodnoty, Dan z pridané hodnoty, VAT
Number Daň přidané hodnoty, Dan
pridané hodnoty, DPH, DIC, DIČ

Denmark Personal Danish Nationalt identifikationsnummer, National identification number,


Identification Number personnummer, unikt personal number, unique
identifikationsnummer, identification number, identification
identifikationsnummer, centrale number, central registry of
personregister, persons, CPR number
cpr,cpr-nummer,cpr#,
cpr-nummer#,
identifikationsnummer#,
personnummer#

Denmark Value Added Danish moms, momsnummer, moms VAT number, vat, value added tax
Tax (VAT) Number identifikationsnummer, number, vat identification number
merværdiafgift

Estonia Driver's Estonian juhiluba, JUHILUBA, juhiluba Driving license, driving license
Licence Number number, juhiloa number, number, driver's license number,
Juhiluba, juhi litsentsi number license number

Estonia Passport Estonian Pass, pass, passi number, pass Passport, passport number,
Number nr, pass#, Pass nr, Eesti passi Estonian passport number
number

Estonia Personal Estonian isikukood, isikukood#, IK, IK#, Personal identification code, tax
Identification Code maksu ID, maksukohustuslase ID, taxpayer identification number,
identifitseerimisnumber, tax identification number, tax
maksukood, maksukood#, code, taxpayer code
maksuID#, maksumaksja kood,
maksumaksja
identifitseerimisnumber

Estonia Value Added Estonian käibemaksu VAT registration number, VAT,


Tax (VAT) Number registreerimisnumber, VAT number
käibemaksu, Käibemaksu
number, käibemaks, käibemaks#,
käibemaksu#

European Health Croatian, Danish, numero conto medico, tessera Medical account number, health
Insurance Card Number Estonian, Finnish, sanitaria assicurazione numero, insurance card number, insurance
French, German, carta assicurazione numero, card number, health insurance
Irish, Italian, Krankenversicherungsnummer, number, medical account number,
Luxembourgish, assicurazione sanitaria numero, health card number, health card,
Polish, Slovenian, medisch rekeningnummer, insurance number, EHIC number,
Spanish ziekteverzekeringskaartnummer,
verzekerings kaart nummer,
gezondheidskaart nummer,
gezondheidskaart, medizinische
Kontonummer,
Krankenversicherungskarte
Nummer, Versicherungsnummer,
Gesundheitskarte Nummer,
Gesundheitskarte, arstliku konto
number, ravikindlustuse kaardi
number, tervisekaart,
tervisekaardi number, Uimhir
ehic, tarjeta salud, broj kartice
zdravstvenog osiguranja, kartice
osiguranja broj, zdravstvenu
karticu, zdravstvene kartice broj,
ehic broj, numero tessera
sanitaria, numero carta di
assicurazione, tessera sanitaria,
numero ehic, Gesondheetskaart,
ehic nummer, numer rachunku
medycznego, numer karty
ubezpieczenia zdrowotne, numer
karty ubezpieczenia, karta
zdrowia, numer karty zdrowia,
numer ehic,
sairausvakuutuskortin numero,
vakuutuskortin numero,
terveyskortti, terveyskortin
numero, medicinsk
kontonummer, ehic numeris,
medizinescher Konto Nummer,
zdravstvena izkaznica

Finland Driver's Finnish, Swedish permis de conduire, ajokortti, Driver's license, driver's license
License Number ajokortin numero, kuljettaja lic., number, driver's lic.
körkort, körkort nummer, förare
lic.

Finland European Finnish Suomi EHIC-numero, Finland EHIC number, sickness
Health Insurance Sairausvakuutuskortti, insurance card, health insurance
Number sairaanhoitokortin, card, EHIC, Finnish health
Sjukförsäkringskort, ehic, insurance card, Health Card,
sairaanhoitokortin, Suomen Survival Card, health insurance
sairausvakuutuskortti, Finska number
sjukförsäkringskort,
Terveyskortti, Hälsokort, ehic#,
sairausvakuutusnumero,
sjukförsäkring nummer

Finland Passport Finnish Suomen passin numero, Finnish passport number, Finnish
Number suomalainen passi, passin passport, passport number,
numero, passin numero.#, passin passport number, passport #
numero#, passin numero, passin
numero., passin numero#, passi#

Finland Tax Finnish verotunniste, verokortti, Tax identification number, tax
Identification Number verotunnus, veronumero card, tax ID, tax number

Finland Value Added Finnish arvonlisäveronumero, ALV, VAT number, VAT, VAT
Tax (VAT) Number arvonlisäverotunniste, ALV nro, identification number
ALV numero, alv

Finnish Personal Finnish tunnistenumero, henkilötunnus, Identification number, personal
Identification Number yksilöllinen henkilökohtainen identification number, unique
tunnistenumero, Ainutlaatuinen personal identification number,
henkilökohtainen tunnus, identity number, Finnish personal
identiteetti numero, Suomen identification number, national
kansallinen henkilötunnus, identification number
henkilötunnusnumero#,
kansallisen tunnistenumero,
tunnusnumero, kansallinen
tunnus numero

France Driver's License French permis de conduire Driver's license
Number

France Health French carte vitale, carte d'assuré social Health card, social insurance card
Insurance Number

France Tax French numéro d'identification fiscale Tax identification number
Identification Number

France Value Added French Numéro d'identification taxe sur Value added tax identification
Tax (VAT) Number valeur ajoutée, Numéro taxe number, value added tax number,
valeur ajoutée, taxe valeur value added tax, VAT number,
ajoutée, Taxe sur la valeur French VAT number, SIREN
ajoutée, Numéro de TVA identification number
intracommunautaire, n° TVA,
numéro de TVA, Numéro de TVA
en France, français numéro de
TVA, Numéro d'identification
SIREN

French INSEE Code French INSEE, numéro de sécu, code INSEE, social security number,
sécu social security code

French Passport French Passeport français, Passeport, French passport, passport,
Number Passeport livre, Passeport carte, passport book, passport card,
numéro passeport passport number

French Social Security French sécurité sociale non., sécurité Social security number, social
Number sociale numéro, code sécurité security code, insurance number
sociale, numéro d'assurance,
sécuritésocialenon.#,
sécuritésocialeNuméro#

German Passport German Reisepass kein, Reisepass, Passport number, passport,
Number Deutsch Passnummer, German passport number,
Passnummer, Reisepasskein#, passport number
Passnummer#

German Personal ID German persönliche Personal identification number, ID
Number identifikationsnummer, number, German personal ID
ID-Nummer, Deutsch number, personal ID number,
persönliche-ID-Nummer, clear ID number, personal
persönliche ID Nummer, number, identity number,
eindeutige ID-Nummer, insurance number
persönliche Nummer, identität
nummer, Versicherungsnummer,
persönlicheNummer#,
IDNummer#

Germany Driver's German Führerschein, Fuhrerschein, Driver's license, driver's license
License Number Fuehrerschein, number
Führerscheinnummer,
Fuhrerscheinnummer,
Fuehrerscheinnummer,
Führerscheinnummer,
Fuhrerscheinnummer,
Fuehrerscheinnummer,
Führerschein- Nr, Fuhrerschein-
Nr, Fuehrerschein- Nr

Germany Value Added German Mehrwertsteuer, MwSt, Value added tax, value added tax
Tax (VAT) Number Mehrwertsteuer identification number, value added
Identifikationsnummer, tax number
Mehrwertsteuer nummer

Greece Passport Greek Ελλάδα pasport αριθμός, Ελλάδα Greece passport number, Greece
Number pasport όχι., Ελλάδα Αριθμός passport no., passport, Greece
Διαβατηρίου, διαβατήριο, passport, passport book
Διαβατήριο, ΕΛΛΑΔΑ
ΔΙΑΒΑΤΗΡΙΟ, Ελλάδα
Διαβατήριο, ελλάδα διαβατήριο,
Διαβατήριο Βιβλίο, βιβλίο
διαβατηρίου

Greece Social Security Greek Αριθμού Μητρώου Κοινωνικής Social security number
Number (AMKA) Ασφάλισης

Greece Value Added Greek FPA, fpa, Foros Prostithemenis VAT, value added tax, tax
Tax (VAT) Number Axias, arithmós dexamenís, Fóros identification number
Prostithémenis Axías, μέγας
κάδος, ΦΠΑ, Φ Π Α, Φόρος
Προστιθέμενης Αξίας, ΦΟΡΟΣ
ΠΡΟΣΤΙΘΕΜΕΝΗΣ ΑΞΙΑΣ, φόρος
προστιθέμενης αξίας, Arithmos
Forologikou Mitroou, Α.Φ.Μ, ΑΦΜ

Greek Tax Identification Greek Αριθμός Φορολογικού Μητρώου, Tax identification number, TIN, tax
Number AΦΜ, Φορολογικού Μητρώου registry number
Νο., τον αριθμό φορολογικού
μητρώου

Hong Kong ID Chinese 身份證, 三顆星 Identity card, Hong Kong
(Traditional) permanent resident ID Card

Hungary Driver's Hungarian jogosítvány, Illesztőprogramok License, driver's lic, driver's
Licence Number Lic, jogsi, licencszám, vezetői license, number of licenses,
engedély, VEZETŐI ENGEDÉLY, driving license
vezető engedély, VEZETŐ
ENGEDÉLY

Hungary Passport French, útlevél, Magyar útlevélszám, Passport, Hungarian passport
Number Hungarian útlevél könyv, nombre, numéro number, passport book, number,
de passeport, hongrois, numéro passport number
de passeport hongrois

Hungarian Social Hungarian Magyar társadalombiztosítási Hungarian social security number,
Security Number szám, Társadalombiztosítási social security number, social
szám, társadalombiztosítási ID, security ID, social security code
szociális biztonsági kódot,
szociális biztonság nincs.,
társadalombiztosításiID#

Hungarian Tax Hungarian Magyar adóazonosító jel no, Hungarian tax identification
Identification Number adóazonosító szám, magyar number, tax identification number,
adószám, Magyar adóhatóság Hungarian tax number, Hungarian
no., azonosító szám, tax authority number, tax number,
adóazonosító no., adóhatóság no tax authority number

Hungarian VAT Number Hungarian Közösségi adószám, Általános Value added tax identification
forgalmi adó szám, number, sales tax number, value
hozzáadottérték adó, magyar added tax, Hungarian value added
Közösségi adószám tax number

Iceland National Icelandic kennitala, persónuleg kennitala, Social security number, personal
Identification Number galdur númer, skattanúmer, identification number, magic
skattgreiðenda kóða, kennitala number, tax code, taxpayer code,
skattgreiðenda taxpayer ID number

Iceland Passport Icelandic vegabréf, vegabréfs númer, Passport, passport number,
Number Vegabréf Nei, vegabréf# passport no.

Iceland Value Added Icelandic virðisaukaskattsnúmer, vsk VAT number
Tax (VAT) Number númer

Indonesian Identity Indonesian, Kartu Tanda Penduduk nomor, Identity card number, card
Card Number Portuguese número do cartão, Kartu identitas number, Indonesian identity card
Indonesia no, kartu no., Kartu number, card no., Indonesian
identitas Indonesia nomor, Nomor identity card number, ID number
Induk Kependudukan,
númerodocartão, kartuno.,
KartuidentitasIndonesiano

International Bank French Code IBAN, numéro IBAN IBAN Code, IBAN number
Account Number (IBAN)
Central

International Bank French Code IBAN, numéro IBAN IBAN Code, IBAN number
Account Number (IBAN)
East

International Bank French Code IBAN, numéro IBAN IBAN Code, IBAN number
Account Number (IBAN)
West

Ireland Passport Irish irelande passeport, Éire pas, no Ireland passport, passport
Number de passeport, pas uimh, uimhir number, passport
pas, numéro de passeport

Ireland Tax Irish uimhir carthanachta, Uimhir Charity number, charity
Identification Number chláraithe charthanais, uimhir registration number, CHY number,
CHY, CHY uimh., uimhir thagartha tax reference number, Ireland tax
cánach, uimhir aitheantais identification number, Irish tax
cánach ireland, aitheantais identification, tax identification
cánach irish, uimhir aitheantais number, tax id, TIN, Ireland tin
cánach, id cánach, uimhir
chánach, cáin #, STÁIN, cáin id
uimh.

Ireland Value Added Irish cáin bhreisluacha, CBL, CBL aon, Ireland VAT number, VAT
Tax (VAT) Number Uimhir CBL, Uimhir CBL number, VAT no, VAT#, value
hÉireann, bhreisluacha uimhir added tax number, value added
chánach tax, irish VAT

Irish Personal Public Gaelic Gaeilge Uimhir Phearsanta Irish personal public service
Service Number Seirbhíse Poiblí, PPS Uimh., number, PPS no., personal public
uimhir phearsanta seirbhíse service number, service no., PPS
poiblí, seirbhíse Uimh, PPS Uimh, no., PPS service one
PPS seirbhís aon

Israel Personal Hebrew, Arabic ‫זהות‬,‫מספר זיהוי ישראלי‬,‫מספר זיהוי‬ Israeli identity number, identity
Identification Number ‫هوية‬,‫هويةاسرائيلية عدد‬,‫ישראלית‬ number, unique identity number,
‫عدد هوية فريدة من نوعها‬,‫رقم الهوية‬,‫ إسرائيلية‬personal ID, unique personal ID,
unique ID

Italy Driver's License Italian patente guida numero, patente di Driver's license number, driver's
Number guida numero, patente di guida, license
patente guida

Italy Health Insurance Italian TESSERA SANITARIA, tessera Health insurance card, Italian
Number sanitaria, tessera sanitaria health insurance card
italiana

Italian Passport Italian Repubblica Italiana Passaporto, Italian Republic passport,
Number Passaporto, Passaporto Italiana, passport, Italian passport, Italian
passport number, Italiana passport number, passport
Passaporto numero, Passaporto number
numero, Numéro passeport
italien, numéro passeport

Italy Value Added Tax Italian IVA, numero partita IVA, IVA#, VAT, VAT number, VAT#, VAT
(VAT) Number numero IVA number

Japan Driver's License Japanese 公安委員会, 番号, 免許, 交付, 運転 Public Security Committee,
Number 免許, 運転免許証, ドライバライセ driver's license, driving license,
ンス, ドライバーズライセンス, ラ driver license, driver's license
イセンス, 運転免許証番号 number, driving license number,
driver license number, license

Japanese Juki-Net ID Japanese 住基ネット識別番号, 住基ネット番 Juki-Net identification number,
Number 号, 識別番号, 個人識別番号 Juki-Net number, identification
number, personal identification
number

Japanese My Number - Japanese マイナンバー, 共通番号 My number, common number
Corporate

Japanese My Number - Japanese マイナンバー, 個人番号, 共通番号 My number, personal number,
Personal common number

Japan Passport Japanese 日本国旅券, パスポート, パスポー Japanese passport, passport,
Number ト数 passport number

Kazakhstan Passport Kazakh төлқұжат, төлқұжат нөмірі, Passport, passport number,
Number номер паспорта, заграничный passport ID, international
пасспорт, национальный passport, national passport
паспорт

Korea Passport Number Korean 한국어 여권, 여권, 여권 번호, 대한 Korean passport, passport,
민국 passport number, Republic of
Korea

Korea Residence Korean 외국인 등록 번호, 주민번호 Foreigner registration number,
Registration Number social security number
for Foreigners

Korean Residence Korean 주민등록번호, 주민번호 Resident registration number,
Registration Number social security number
for Korean

Latvia Driver's Licence Latvian licences numurs, vadītāja License number, driver's license,
Number apliecība, autovadītāja apliecība, driver's license number, driver's
vadītāja apliecības numurs, lic.
Vadītāja licences numurs, vadītāji
lic., vadītāja atļauja

Latvia Passport Latvian LATVIJA, LETTONIE, Pases Nr., Latvia, passport no., passport
Number Pases Nr, Pase, pase, pases number, passport book, passport
numurs, Pases Nr, pases #, passport card
grāmata, pase#, pases karte

Latvia Personal Latvian Personas kods, personas kods, Latvia personal code, personal
Identification Number latvijas personas kods, Valsts code, national identification
identifikācijas numurs, valsts number, identification number,
identifikācijas numurs, national ID, latvia TIN, TIN, tax
identifikācijas numurs, identification number, tax ID, TIN
nacionālais id, latvija alva, alva, number, tax number
nodokļu identifikācijas numurs,
nodokļu id, alvas nē, nodokļa
numurs

Latvia Value Added Tax Latvian PVN Nr, PVN maksātāja numurs, VAT no., VAT payer number, VAT
(VAT) Number PVN numurs, PVN#, pievienotās number, VAT#, value added tax,
vērtības nodoklis, pievienotās value added tax number
vērtības nodokļa numurs

Liechtenstein Passport German Reisepass, Pass Nr, Pass Nr., Passport, passport no.
Number Reisepass#, Pass Nr#

Lithuania Personal Lithuanian Nacionalinis ID, Nacionalinis National ID, national identification
Identification Number identifikavimo numeris, asmens number, personal ID
kodas

Lithuania Tax Lithuanian mokesčių identifikavimo Nr., tax identification number, tax ID,
Identification Number mokesčių identifikavimo numeris, tax ID number, tax ID number, tax
mokesčių ID, mokesčių id nr, ID #, tax number, tax no., fee #
mokesčių id nr., mokesčių ID#,
mokesčių numeris, mokestis Nr,
mokestis #, Mokesčių
identifikavimo numeris

Lithuania Value Added Lithuanian pridėtinės vertės mokesčio VAT number, VAT, VAT #, Value
Tax (VAT) Number numeris, PVM, PVM#, pridėtinės added tax, VAT registration
vertės mokestis, PVM numeris, number
PVM registracijos numeris

Luxembourg National German, French Eindeutige ID-Nummer, Unique ID number, unique ID,
Register of Individuals Eindeutige ID, ID personnelle, personal ID, personal identification
Number Numéro d'identification number
personnel, IDpersonnelle#,
Persönliche
Identifikationsnummer,
EindeutigeID#

Luxembourg Passport French and passnummer, ausweisnummer, Passport number, passport,
Number German passeport, reisepass, pass, pass Luxembourg pass, Luxembourg
net, pass nr, no de passeport, passport
passeport nombre, numéro de
passeport

Luxembourg Tax French, German Zinn, Zinn Nummer, Luxembourg TIN, TIN number, Luxembourg tax
Identification Number Tax Identifikatiounsnummer, identification number, tax number,
Steier Nummer, Steier ID, tax ID, social security ID,
Sozialversicherungsausweis, Luxembourg tax identification
Zinnzahl, Zinn nein, Zinn#, number, Social Security, Social
luxemburgische Security Card, tax identification
steueridentifikationsnummer, number
Steuernummer, Steuer ID, sécurité
sociale, carte de sécurité sociale,
étain, numéro d'étain, étain non,
étain#, Numéro d'identification
fiscal luxembourgeois, numéro
d'identification fiscale

Luxembourg Value German, TVA kee, TVA#, TVA Aschreiwung Luxembourg VAT number, VAT
Added Tax (VAT) Luxembourgish kee, T.V.A, stammnummer, number, VAT, value added tax
Number bleiwen, geheescht, gitt id, number, VAT ID, VAT registration
mehrwertsteuer, vat number, value added tax
registrierungsnummer,
umsatzsteuer-id, wat,
umsatzsteuernummer,
umsatzsteuer-identifikationsnummer,
id de la batterie, lëtzebuerg vat
nee, registréierung nummer,
numéro de TVA, numéro de
enregistrement vat

Macau National Chinese, 身份证号码, 唯一的识别号码 ID number, unique identification
Identification Number Portuguese number

número de identificação, número Identification number, identity card
cartão identidade, número cartão number, national identity card
identidade nacional, número number, personal identification
identificação pessoal, número number, unique identification
identificação único, id único não, number, unique non-ID, unique ID
ID único# #

Malaysia Passport Malay pasport, nombor pasport, Passport, passport number,
Number pasport# passport #

Malaysian MyKad Malay nombor kad pengenalan, kad Identification card number,
Number (MyKad) pengenalan no, kad pengenalan identification card no., Malaysian
Malaysia, bilangan identiti unik, identification card, unique identity
nombor peribadi, number, personal number
nomborperibadi#,
kadpengenalanno#

Malta National Maltese numru identifikazzjoni nazzjonali, national identification number,
Identification Number ID nazzjonali, numru national ID, personal identification
identifikazzjoni personali, ID number, personal ID
personali, IDnazzjonali#,
IDpersonali#

Malta Tax Identification Maltese kodiċi tat-taxxa, numru tat-taxxa, Tax code, tax number, tax
Number numru identifikazzjoni tat-taxxa, identification number, taxid#,
taxxaid#, numru identifikazzjoni taxpayer identification number,
kontribwent, kodiċi kontribwent, taxpayer code, tin, tin no
landa, landa nru

Malta Value Added Tax Maltese Numru tal-VAT, numru tal-VAT, VAT number, VAT, value added
(VAT) Number bettija, valur miżjud taxxa tax number, vat identification
in-numru, bettija identifikazzjoni number
in-numru

Mexican Personal Spanish Clave de Registro de Identidad Personal identity registration key,
Registration and Personal, Código de Mexican personal identification
Identification Number Identificación Personal mexicana, code, Mexican personal
número de identificación identification number
personal mexicana

Mexican Tax Spanish Registro Federal de Federal taxpayer registry, tax
Identification Number Contribuyentes, número de identification number, federal
identificación de impuestos, taxpayer registry number, RFC
Código del Registro Federal de number, RFC key
Contribuyentes, Número RFC,
Clave del RFC

Mexican Unique Spanish Única de registro de Población, Unique population registry, unique
Population Registry clave única, clave única de key, unique identity key, unique
Code identidad, clave personal personal identity, personal identity
Identidad, personal Identidad key
Clave, ClaveÚnica#,
clavepersonalIdentidad#

Mexico CLABE Number Spanish Clave Bancaria Estandarizada, Standardized banking code,
Estandarizado Banco número de standardized bank code number,
clave, número de clave, clave code number
número, clave#

Netherlands Bank Dutch, bancu aklarashon number, Bank account number, account
Account Number Papiamento aklarashon number, number
bankrekeningnummer,
rekeningnummer

Netherlands Driver's Dutch RIJMEWIJS, permis de conduire, Driver's license, driving permit,
License Number rijbewijs, Rijbewijsnummer, driver's license number
RIJBEWIJSNUMMER

Netherlands Passport Dutch Nederlanden paspoort nummer, Dutch passport number, passport,
Number Paspoort, paspoort, Nederlanden passport number
paspoortnummer,
paspoortnummer

Netherlands Tax Dutch, Nederlands belasting Dutch tax identification number,
Identification Number Papiamento, identificatienummer, tax identification number, Dutch
Norwegian identificatienummer van tax identification, Dutch tax
belasting, identificatienummer number, tax number
belasting, Nederlands belasting
identificatie, Nederlands belasting
id nummer, Nederlands
belastingnummer, btw nummer,
Nederlandse belasting
identificatie, Nederlands
belastingnummer, netherlands
tax identification tal, netherland's
tax identification tal, tax
identification tal, tax tal,
Nederlânske tax identification tal,
Hollânske tax identification,
Nederlânsk tax tal, Hollânske tax
id tal, netherlands impuesto
identification number,
netherland's impuesto
identification number, impuesto
identification number, impuesto
number, hulandes impuesto
identification number, hulandes
impuesto identification, hulandes
impuesto number, hulandes
impuesto id number

Netherlands Value Dutch, Frisian wearde tafoege tax getal, BTW Value added tax number, VAT
Added Tax (VAT) nûmer, BTW-nummer number
Number

New Zealand Driver's Maori raihana taraiwa Driving license
Licence Number

New Zealand Passport Maori uruwhenua, tau uruwhenua, Passport, passport no.
Number uruwhenua no, uruwhenua no.

Norway Driver's Norwegian førerkort, førerkortnummer Driver's license, driver's license
Licence Number number

Norway National Norwegian Nasjonalt ID, personlig ID, National ID, personal ID, national
Identification Number Nasjonalt ID#, personlig ID#, skatt ID #, personal ID #, tax ID, tax
id, skattenummer, skattekode, code, taxpayer ID, taxpayer
skattebetalers id, skattebetalers identification number
identifikasjonsnummer

Norway Value Added Norwegian mva, MVA, momsnummer, VAT, VAT number, VAT
Tax Number Momsnummer, registration number
momsregistreringsnummer

Norwegian Birth Norwegian fødsel nummer, Fødsel nr, fødsel Birth number
Number nei, fødselnei#, fødselnummer#

People's Republic of Chinese 身份证, 居民信息, 居民身份信息 Identity Card, Information of
China ID (Simplified) resident, Information of resident
identification

Poland Driver's Licence Polish Kierowcy Lic., prawo jazdy, Drivers license number, driving
Number numer licencyjny, zezwolenie na license, license number
prowadzenie, PRAWO JAZDY

Poland European Polish Numer EHIC, Karta Ubezpieczenia EHIC number, Health Insurance
Health Insurance Zdrowotnego, Europejska Karta Card, European Health Insurance
Number Ubezpieczenia Zdrowotnego, Card, health insurance number,
numer ubezpieczenia medical account number
zdrowotnego, numer rachunku
medycznego

Poland Passport French, Polish paszport#, numer paszportu, Nr Passport #, passport number,
Number paszportu, paszport, książka passport number, passport,
paszportowa passport book

passeport, nombre, numéro de Passport, number, passport
passeport, passeport#, No de number, passport #, passport
passeport number

Poland Value Added Polish Numer Identyfikacji Podatkowej, Tax identification number, tax ID
Tax (VAT) Number NIP, nip, Liczba VAT, podatek od number, VAT number, value
wartosci dodanej, faktura VAT, added tax, VAT invoice, VAT
faktura VAT# invoice #

Polish Identification Polish owód osobisty, Tożsamości Identification card, national
Number narodowej, osobisty numer identity, identification card
identyfikacyjny, niepowtarzalny number, unique number, number
numer, numer

Polish REGON Number Polish numer statystyczny, REGON, Statistical number, REGON
numeru REGON, number
numerstatystyczny#,
numeruREGON#

Polish Social Security Polish PESEL Liczba, społeczny PESEL number, social security
Number (PESEL) bezpieczeństwo liczba, społeczny number, social security ID, social
bezpieczeństwo ID, społeczny security code
bezpieczeństwo kod,
PESELliczba#,
społecznybezpieczeństwoliczba#

Polish Tax Polish Numer Identyfikacji Podatkowej, Tax identification number, Polish
Identification Number Polski numer identyfikacji tax identification number
podatkowej,
NumerIdentyfikacjiPodatkowej#

Portugal Driver's Portuguese carteira de motorista, carteira driver's license, license number,
License Number motorista, carteira de habilitação, driving license, driving license
carteira habilitação, número de Portugal
licença, número licença,
permissão de condução,
permissão condução, Licença
condução Portugal, carta de
condução

Portugal National Portuguese bilhete de identidade, número de identity card, civil identification
Identification Number identificação civil, número de number, citizen's card number,
cartão de cidadão, documento de identification document, citizen's
identificação, cartão de cidadão, card, bi number of Portugal,
número bi de portugal, número document number
do documento

Portugal Passport French and passaporte, passeport, Passport number, passport,
Number Portuguese portuguese passport, portuguese Portuguese passport
passeport, portuguese
passaporte, passaporte nº,
passeport nº

Portugal Tax Portuguese número identificação fiscal Tax identification number
Identification Number

Portugal Value Added Portuguese imposto sobre valor Value added tax, VAT, VAT
Tax (VAT) Number acrescentado, VAT nº, número number, VAT code
iva, vat não, código iva

Romania Driver's Romanian permis de conducere, PERMIS DE Driving license, driving license
Licence Number CONDUCERE, Permis de number
conducere, numărul permisului
de conducere, Numărul
permisului de conducere

Romania National Romanian numărul de identificare fiscală, fiscal identification number, tax
Identification Number identificarea fiscală nr #, codul identification number, fiscal code
fiscal nr. number

Romania Value Added Romanian CIF, cif, CUI, cui, TVA, tva, TVA#, VAT, VAT #, value added tax,
Tax (VAT) Number tva#, taxa pe valoare adaugata, fiscal code, fiscal identification
cod fiscal, cod fiscal de code, unique registration code,
identificare, cod fiscal unique identification code, code
identificare, Cod Unic de unique registration
Înregistrare, cod unic de
identificare, cod unic identificare,
cod unic de înregistrare, cod unic
înregistrare

Romanian Numerical Romanian Cod Numeric Personal, cod Personal numeric code, personal
Personal Code identificare personal, cod unic identification code, unique
identificare, număr personal unic, identification code, identity
număr identitate, număr number, personal identification
identificare personal, number
număridentitate#,
CodNumericPersonal#,
numărpersonalunic#

Russian Passport Russian паспорт нет, паспорт, номер Passport no., passport, passport
Identification Number паспорта, паспорт ID, number, passport ID, Russian
Российской паспорт, Русский passport, Russian passport
номер паспорта, паспорт#, number
паспортID#, номерпаспорта#

Russian Taxpayer Russian НДС, номер TIN (tax identification number),
Identification Number налогоплательщика, taxpayer number, taxpayer ID, tax
Налогоплательщика ИД, налог number
число, налогчисло#, ИНН#,
НДС#

SEPA Creditor Identifier Bulgarian, SEPA-Gläubiger-Identifikator, SEPA creditor identifier, creditor
Number North Finnish, French, Gläubiger-ID, SEPA-ID, ID, SEPA ID, creditor ID
German, Irish, Gläubiger-Kennung
Italian, Luxembourgish,
Portuguese, Spanish

ID créancier, ID SEPA, Identifiant Creditor ID, SEPA ID
du créancie

SEPA Krediter Identifizéierer, SEPA creditor identifier, crediting,
Kreditergeld, Krediter creditor identification
Identifizéierer

SEPA kreditoridentifikator, SEPA creditor identifier, Creditor
Kreditoridentifikator Identifier

Velkojan tunnus, SEPA-tunnus, Creditor ID, SEPA ID, Creditor
Velkojan tunniste identifier

ID Creidiúnaí, Aithnitheoir Creditor ID, Creditor Identifier
Creidiúnaí

ID del creditore, Identificatore del Creditor ID, Creditor Identifier
creditore

Identificador de acreedor SEPA, Creditor Identifier SEPA, Creditor
ID del acreedor, ID de SEPA, ID, SEPA ID, Creditor Identifier
Identificador del acreedor

Identificador Credor SEPA, SEPA Creditor Identifier, Creditor
Identificador do Credor Identifier

SEPA Creditor Identifier Bulgarian, SEPA-Gläubiger-Identifikator, SEPA creditor identifier, creditor
Number South Finnish, French, Gläubiger-ID, SEPA-ID, ID, SEPA ID, creditor ID
German, Irish, Gläubiger-Kennung
Italian, Luxembourgish,
Portuguese, Spanish

ID créancier, ID SEPA, Identifiant Creditor ID, SEPA ID
du créancie

SEPA Krediter Identifizéierer, SEPA creditor identifier, crediting,
Kreditergeld, Krediter creditor identification
Identifizéierer

SEPA kreditoridentifikator, SEPA creditor identifier, Creditor
Kreditoridentifikator Identifier

Velkojan tunnus, SEPA-tunnus, Creditor ID, SEPA ID, Creditor
Velkojan tunniste identifier

ID Creidiúnaí, Aithnitheoir Creditor ID, Creditor Identifier
Creidiúnaí

ID del creditore, Identificatore del Creditor ID, Creditor Identifier
creditore

Identificador de acreedor SEPA, Creditor Identifier SEPA, Creditor
ID del acreedor, ID de SEPA, ID, SEPA ID, Creditor Identifier
Identificador del acreedor

Identificador Credor SEPA, SEPA Creditor Identifier, Creditor
Identificador do Credor Identifier

SEPA Creditor Identifier Bulgarian, SEPA-Gläubiger-Identifikator, SEPA creditor identifier, creditor
Number West Finnish, French, Gläubiger-ID, SEPA-ID, ID, SEPA ID, creditor ID
German, Irish, Gläubiger-Kennung
Italian, Luxembourgish,
Portuguese, Spanish

ID créancier, ID SEPA, Identifiant Creditor ID, SEPA ID
du créancie

SEPA Krediter Identifizéierer, SEPA creditor identifier, crediting,
Kreditergeld, Krediter creditor identification
Identifizéierer

SEPA kreditoridentifikator, SEPA creditor identifier, Creditor
Kreditoridentifikator Identifier

Velkojan tunnus, SEPA-tunnus, Creditor ID, SEPA ID, Creditor
Velkojan tunniste identifier

ID Creidiúnaí, Aithnitheoir Creditor ID, Creditor Identifier
Creidiúnaí

ID del creditore, Identificatore del Creditor ID, Creditor Identifier
creditore

Identificador de acreedor SEPA, Creditor Identifier SEPA, Creditor
ID del acreedor, ID de SEPA, ID, SEPA ID, Creditor Identifier
Identificador del acreedor

Identificador Credor SEPA, SEPA Creditor Identifier, Creditor
Identificador do Credor Identifier

Serbia Unique Master Serbian јединствен мајстор грађанин Unique master citizen number,
Citizen Number Број, Јединствен матични број, unique identification number,
јединствен број ид, Национални unique id number, National
идентификациони број identification number

Serbia Value Added Tax Serbian poreski identifikacioni broj, Tax identification number VAT
(VAT) Number PORESKI IDENTIFIKACIONI number, value added tax, VAT,
BROJ, Poreski br., ПДВ број, identification number, tax number
Порез на додату вредност, PDV
broj, Porez na dodatu vrednost,
porez na dodatu vrednost, PDV,
pdv, ПДВ, порески
идентификациони број, PIB, pib,
пиб, poreski broj, порески број

Slovakia Driver's Slovak vodičský preukaz, Vodičský Driving license, license number
Licence Number preukaz, VODIČSKÝ PREUKAZ,
číslo vodičského preukazu,
ovládače lic., povolenie vodiča,
povolenia vodičov, povolenie na
jazdu, povolenie jazdu, číslo
licencie

Slovakia National Hungarian, identifikačné číslo, személyi ID number, identity card number,
Identification Number Slovak igazolvány száma, national identity card number,
személyigazolvány szám, číslo national identification number,
občianského preukazu, identification number, ID card
identifikačná karta č, személyi number, identification card,
igazolvány szám, nemzeti national identity card
személyi igazolvány száma, číslo
národnej identifikačnej karty,
národná identifikačná karta č,
nemzeti személyazonosító
igazolvány, nemzeti azonosító
szám, národné identifikačné číslo,
národná identifikačná značka č,
nemzeti azonosító szám,
azonosító szám, identifikačné
číslo

Slovakia Passport French, Slovak PASSEPORT, passeport, Passport, passport number,


Number cestovný pas, číslo pasu, pas č, passport no
Číslo pasu, PAS, CESTOVNÝ
PAS, Passeport n°

Slovakia Value Added Slovak číslo DPH, číslo dane z pridanej VAT number, value added tax
Tax (VAT) Number hodnoty, identifikačné číslo vat, number, VAT, value added tax,
dph, DPH, daň z pridanej VAT identification number
hodnoty, daň pridanej hodnoty,
číslo dane pridanej hodnoty,
identifikačné číslo DPH

Slovenia Passport French, Slovenian številka potnega lista, potni list, Passport number, passport,
Number knjiga potnega lista, potni list #, passport book, passport #
passeport, Passeport

Slovenia Tax Slovenian identifikacijska številka davka, Tax identification number,


Identification Number Slovenska davčna številka, Slovenian tax number, tax number
Davčna številka

Slovenia Unique Master Slovenian EMŠO, emšo, edinstvena številka Unique national number, unique
Citizen Number državljana, enotna identifikacijska identification number, uniform
številka, Enotna maticna številka registration number, unique
obcana, enotna maticna številka registration number, citizen's
obcana, številka državljana, number, unique identification
edinstvena identifikacijska number
številka

Slovenia Value Added Slovenian številka davka na dodano Value added tax number, VAT no,
Tax (VAT) Number vrednost, DDV št, slovenia vat št Slovenia vat no

South African Personal Afrikaans nasionale identifikasie nommer, National identification number,
Identification Number nasionale identiteitsnommer, national identity number,
versekering aantal, persoonlike insurance number, personal
identiteitsnommer, unieke identity number, unique identity
identiteitsnommer, number, identity number
identiteitsnommer,
identiteitsnommer#,
versekeringaantal#,
nasionaleidentiteitsnommer#

South Korea Resident Korean 주민등록번호, 주민번호 Resident Registration Number,


Registration Number Resident Number

Spain Driver's License Spanish permiso de conducción, permiso Driver's license, driver's license
Number conducción, Número licencia number, driving license, driving
conducir, Número de carnet de permit, driving permit number
conducir, Número carnet
conducir, licencia conducir,
Número de permiso de conducir,
Número de permiso conducir,
Número permiso conducir,
permiso conducir, licencia de
manejo, el carnet de conducir,
carnet conducir

Spain Value Added Tax Spanish Número IVA españa, Número de Spain VAT number, Spanish VAT
(VAT) Number IVA español, español Número number, VAT Number, VAT, value
IVA, Número de valor agregado, added tax number, value added
IVA, Número IVA, Número tax
impuesto sobre valor añadido,
Impuesto valor agregado,
Impuesto sobre valor añadido,
valor añadido el impuesto, valor
añadido el impuesto numero

Spanish Customer Spanish número cuenta cliente, código Customer account number,
Account Number cuenta, cuenta cliente ID, número account code, customer account
cuenta bancaria cliente, código ID, customer bank account
cuenta bancaria number, bank account code

Spanish DNI ID Spanish NIE número, Documento Nacional NIE number, national identity
de Identidad, Identidad único, document, unique identity,
Número nacional identidad, DNI national identity number, DNI
Número number

Spanish Passport Spanish libreta pasaporte, número passport book, passport number,
Number pasaporte, Número Pasaporte, Spanish passport, passport
España pasaporte, pasaporte

Spanish Social Security Spanish Número de la Seguridad Social, Social security number
Number número de la seguridad social

Spanish Tax ID (CIF) Spanish número de contribuyente, número taxpayer number, corporate tax
de impuesto corporativo, número number, tax identification number,
de Identificación fiscal, CIF CIF number
número, CIFnúmero#

Sri Lanka National Sinhala See user interface ID, national identity number,
Identity Number personal identification number,
National Identity Card number

Sweden Driver's Finnish, Romani, ajokortti, permis de Driver's license, driver's license
License Number Swedish, Yiddish conducere,ajokortin numero, number, driving license number
kuljettajat lic., drivere lic., körkort,
numărul permisului de
conducere, ‫שאָפער דערלויבעניש‬
‫נומער‬, körkort nummer, förare lic.,
‫דריווערס דערלויבעניש‬,
körkortsnummer

Sweden Personal Swedish personnummer ID, personligt ID number, personal ID number,


Identification Number id-nummer, unikt id-nummer, unique ID number, personal,
personnummer, identification number
identifikationsnumret,
personnummer#,
identifikationsnumret#

Sweden Tax Swedish skattebetalarens Tax identification number,


Identification Number identifikationsnummer, Sverige Swedish TIN, TIN number
TIN, TIN-nummer

Sweden Value Added Swedish moms#, sverige moms, sverige Swedish VAT, Swedish VAT
Tax (VAT) Number momsnummer, sverige moms nr, number, VAT registration number
sweden vat nummer, sweden
momsnummmer,
momsregistreringsnummer

Swedish Passport Swedish Passnummer, pass, sverige pass, Passport number, passport,
Number SVERIGE PASS, sverige Swedish passport, Swedish
Passnummer passport number

Switzerland Health German, Italian medizinische Kontonummer, Medical account number, health
Insurance Card Number Krankenversicherungskarte insurance card number, health
Nummer, numero conto medico, insurance number
tessera sanitaria assicurazione
numero, assicurazione sanitaria
numero

Switzerland Passport Number
Language: French, German, Italian

Keywords:
Passeport, passeport, numéro passeport, numéro de passeport, passeport#, No de passeport, No de passeport., Numéro de passeport, PASSEPORT, LIVRE DE PASSEPORT
Pass, Passnummer, Pass#, Pass Nr., Pass Nr, PASS
Passaporto, Numero di passaporto, passaporto, Passaporto n, Passaporto n., passaporto#, Passaport, numero passaporto, numero di passaporto, numero passaporto, passaporto n, PASSAPORTO
Reisepass, Reisepass#, REISEPASS

English translation:
Passport, passport number, passport #, passport book
Passport, passport Number, passport #
Passport, passport number, passport no., passport #
Passport, passport #

Switzerland Value Added Tax (VAT) Number
Language: French, German, Italian

Keywords:
T.V.A, numéro TVA, T.V.A#, numéro taxe valeur ajoutée, T.V.A., taxe sur la valeur ajoutée, T.V.A#, numéro enregistrement TVA, Numéro TVA
I.V.A, Partita IVA, I.V.A#, numero IVA
MwSt, Umsatzsteuer-Identifikationsnummer, MwSt#, Mehrwertsteuer-Nummer, Mehrwertsteuer, VAT Registrierungsnummer, Umsatzsteuer-Identifikationsnummer

English translation:
VAT, VAT number, VAT #, value added tax number, value added tax, VAT registration number
VAT, VAT number, VAT #
VAT, VAT registration number, VAT #, VAT number

Swiss AHV Number

French
Keywords: Numéro AVS, numéro d'assuré, identifiant national, numéro d'assurance vieillesse, numéro de sécurité soclale, Numéro AVH
English translation: AVS number, insurance number, national identifier, national insurance number, social security number, AVH number

German
Keywords: AHV-Nummer, Matrikelnumme, Personenidentifikationsnummer
English translation: AHV number, Swiss Registration number, PIN

Italian
Keywords: AVS, AVH
English translation: AVS, AVH



Swiss Social Security French, German, Identifikationsnummer, Identification number, social


Number (AHV) Italian sozialversicherungsnummer, security number, personal
identification personnelle ID, identification ID, tax identification
Steueridentifikationsnummer, number, tax ID, social security
Steuer ID, codice fiscale, number, tax number
Steuernummer

Taiwan ROC ID Chinese 中華民國國民身分證 Taiwan ID


(Traditional)

Thailand Passport Thai หนังสือเดิน ทาง Passport, passport number


Number ,หมายเลขหนังสือเดินทาง

Thailand Personal ID Thai ประกันภัยจำนวน, Insurance number, personal


Number หมายเลขประจำตัวส่วนบุคคล, identification, identification number
หมายเลขประจำตัวที่ไม่ซ้ำกัน,
ประกันภัยจำนวน#,
หมายเลขประจำตัวส่วนบุคคล#,
หมายเลขประจำตัวที่ไมซ้ำกัน#

Turkish Identification Turkish Kimlik Numarası, Türkiye Identification number, Turkish


Number Cumhuriyeti Kimlik Numarası, Republic identification number,
vatandaş kimliği, kişisel kimlik citizen identity, personal
no, kimlik Numarası#, vatandaş identification number, citizen
kimlik numarası, Kişisel kimlik identification number
Numarası

Ukraine Identity Card Ukrainian посвідчення особи України Ukraine identity card

Ukraine Passport Ukrainian паспорт, паспорт України, Passport, Ukraine passport,


Number (Domestic) номер паспорта, персональний passport number

Ukraine Passport Ukrainian паспорт, паспорт України, Passport, Ukraine passport,


Number (International) номер паспорта passport number

United Arab Emirates Arabic ‫فريدة‬,‫رقم التعريف الشخصي‬,‫ الهوية الشخصية رقم‬Personal ID Number, PIN, Unique
Personal Number ‫هوية‬,‫التأمينرقم‬,‫التأمين رقم‬,‫ من نوعها هوية رقم‬ID Number, Insurance Number,
‫فريدة‬# Unique Identity #

Venezuela National ID Spanish cédula de identidad número, National ID number, national


Number clave única de identidad, identification number, personal ID
personal de identidad clave, number, personal identification,
personal de identidad, número de unique identification number
identificación nacional, número
ID nacional

Updating policies to use the Randomized US SSN data identifier


The Randomized US Social Security Number (SSN) data identifier detects both traditional and
randomized SSNs.
See “Use the Randomized US SSN data identifier to detect SSNs” on page 836.
All policy templates that previously used the US Social Security Number (SSN) data identifier
to detect SSNs are updated to use the Randomized US Social Security Number (SSN) data
identifier.
See “Updating policies after upgrading to the latest version” on page 447.
If you have existing policies that use the US SSN data identifier to detect SSNs, you should
update each policy to use the Randomized US SSN data identifier. If you have created policies
using the version 12.5 Randomized US SSN data identifier, you should update each to use
the latest version of the Randomized US SSN data identifier.
The following procedure provides steps for updating your SSN policies.
To update a policy to use the Randomized US SSN data identifier
1 Edit the policy that implements the US SSN data identifier or the 12.5 Randomized US
SSN data identifier.
See “Configuring policies” on page 413.
2 Edit the rule that contains the US SSN data identifier.
See “Configuring policy rules” on page 417.
3 Remove the US SSN data identifier.
4 Add the Randomized US SSN data identifier.
See “Managing and adding data identifiers” on page 735.
5 Save the policy.
6 Test policy detection for both traditional and randomized US SSNs.
See “Test and tune policies to improve match accuracy” on page 453.
7 Deploy the updated SSN policy into production.
See “Policy deployment” on page 373.

Creating custom data identifiers


You can create and delete one or more custom data identifiers. A custom data identifier may
be a system data identifier that you have cloned and intend to modify, or one that you create
from scratch. A custom data identifier is reusable across policies. Changes made to a custom
data identifier at the system-level affect any policies that actively or subsequently declare the
custom data identifier.
Table 31-23 lists the components of custom data identifiers.
See “Workflow for creating custom data identifiers” on page 812.

Table 31-23 Custom data identifier components

Component Description

Patterns Define one or more data identifier pattern language patterns, separated by line breaks.

See “About data identifier patterns” on page 732.

See “Using the data identifier pattern language” on page 814.

Data Normalizer Select a data normalizer to standardize the data before matching against it.

See “Selecting a data normalizer” on page 830.

Validators Add or remove validators to perform validation checks on the data detected by the
pattern(s).

See “About pattern validators” on page 733.

Validation Checks Select system-provided validation checks to add them to your list of Active Validators.

See “About pattern validators” on page 733.

Description and Data Entry Provide comma-separated data values for any validators that require data input.

See “About pattern validators” on page 733.



Pre- and Post-Validators Pre- and post-validators define characters and character ranges that are valid before
or after a data identifier pattern.

See “Configuring pre- and post-validators” on page 831.

Workflow for creating custom data identifiers


You can implement custom data identifiers to detect unique content. To implement a custom
data identifier, you must define at least one pattern and select a data normalizer. Validators
are optional.
See “Custom data identifier configuration” on page 814.
When you define a custom data identifier, the system assigns it to the "Wide" breadth by
default. This is not a limitation, however, because the actual scope of detection is determined
by the pattern(s) and validator(s) that you define.

Table 31-24 Implementing custom data identifiers

Step Action Description

1 Select Manage > Policies > Data Identifiers.
The Data Identifiers screen lists all data identifiers available in the system.

2 Select Add data identifier.
Enter a Name for the custom data identifier. The name must be unique.
Enter a Description for the custom data identifier. The description field is limited to 255
characters per line.
A custom data identifier is assigned to the Custom category by default. You cannot change
the category.

3 Enter one or more Patterns to match data.
You must enter at least one pattern for the custom data identifier to be valid.
Separate multiple patterns by line breaks.
See “Writing data identifier patterns to match data” on page 817.
See “Using the data identifier pattern language” on page 814.

4 Select a Data Normalizer.
You must select a data normalizer.
See “Selecting a data normalizer” on page 830.
The following normalizers are available:
■ Digits
■ Digits and Letters
■ Lowercase
■ Swift codes
■ Do nothing
Select this option if you do not want to normalize the data.

5 Select zero or more Validation Checks.
Including a validator to check and verify pattern matching is optional.
See “Selecting pattern validators” on page 829.

6 Pre- and Post-Validators: Specify characters or character ranges that are valid or invalid
before or after a data identifier pattern.
Pre- and Post-Validators are required. You can accept the default values, or edit them as
necessary.
See “Configuring pre- and post-validators” on page 831.

7 Save the custom data identifier.
Click Save at the upper left of the screen.
Once you define and save a custom data identifier, it appears alphabetically in the list of
data identifiers at the Data Identifiers screen.
To edit a custom data identifier, select it from the list.
See “Editing data identifiers” on page 736.
Note: Click Cancel to discard the custom data identifier without saving it.

8 Implement the custom data identifier in one or more policies.
The system lists all custom data identifiers beneath the Custom category for the "Content
Matches data identifier" condition at the Configure Policy - Add Rule and the Configure
Policy - Add Exception screens.

See “Configuring the Content Matches data identifier condition” on page 737.

You can configure optional validators at the policy instance level for custom
data identifiers.

See “Configuring optional validators” on page 763.
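
The requirements in Table 31-24 can be summarized as a small pre-flight check. The following Python sketch is illustrative only: the CustomDataIdentifier class and the check_definition function are hypothetical names invented for this example and are not part of Symantec Data Loss Prevention. It assumes only the rules stated above (unique name, description lines of 255 characters or fewer, at least one pattern, and a selected normalizer).

```python
from dataclasses import dataclass, field
from typing import List, Set

# Normalizer names as listed in step 4 of the workflow table.
NORMALIZERS = {"Digits", "Digits and Letters", "Lowercase", "Swift codes", "Do nothing"}


@dataclass
class CustomDataIdentifier:
    """Hypothetical container for the fields entered on the Data Identifiers screen."""
    name: str
    patterns: List[str]
    normalizer: str
    description: str = ""
    validators: List[str] = field(default_factory=list)  # optional per step 5


def check_definition(candidate: CustomDataIdentifier, existing_names: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the definition looks complete."""
    problems = []
    if candidate.name in existing_names:
        problems.append("name must be unique")
    if any(len(line) > 255 for line in candidate.description.splitlines()):
        problems.append("description line exceeds 255 characters")
    if not any(p.strip() for p in candidate.patterns):
        problems.append("at least one pattern is required")
    if candidate.normalizer not in NORMALIZERS:
        problems.append("a data normalizer must be selected")
    return problems
```

Validators are deliberately not checked, because the workflow treats them as optional.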



Custom data identifier configuration


You can create and delete one or more custom data identifiers. A custom data identifier can
be used across policies. Changes made to a custom data identifier at the system-level affect
any policies that actively or subsequently declare the custom data identifier.
See “Workflow for creating custom data identifiers” on page 812.

Table 31-25 Custom data identifier configuration

Configurable at the custom level

■ Name and Description
You must give a custom data identifier a unique name.
It is good practice to provide a description for the custom data identifier.
You can change the name or description of a custom data identifier when you modify it.
■ Patterns
You must define at least one pattern for the custom data identifier to be valid.
■ Active Validators
You can add one or more required validators to a custom data identifier.
■ Description and Data Entry
You can edit the input of an active validator that accepts data input.
■ Data Normalizer
You must select a data normalizer when defining a custom data identifier.
■ Pre- and Post-Validators
You can edit the values for the valid and invalid pre- and post-validator characters.

Not configurable

■ Category
The system assigns a custom data identifier to the Custom category. You cannot change this
setting.
■ Breadth
The system assigns a custom data identifier to the Wide rule breadth. You cannot change this
setting.
■ Optional Validators
Custom data identifiers support all optional validators, but they are configured at the policy
instance level.

Using the data identifier pattern language


The data identifier pattern language is a limited subset of the regular expression lexicon. It
does not support all regular expression characters and constructs, so a regular expression
that you convert to a data identifier pattern requires some syntactical modifications.
Data identifier patterns are limited to 100 characters per line. A pattern can be longer than
100 characters in total, but no single line can exceed 100 characters. Split longer patterns
across lines of 100 characters or fewer.

See “Input character limits for policy configuration” on page 431.


Table 31-26 lists the known differences between regular expressions and the data identifier
pattern language. For more detailed information about the data identifier pattern language,
see Data identifier pattern language specification.

Table 31-26 Data identifier pattern language limitations

Character Description

*, |, . The asterisk (*), pipe (|), and dot (.) characters are not supported for data identifier
patterns.

\w The \w construct cannot be used to match the underscore character (_).

\s The \s construct cannot be used to match a whitespace character; instead, use an actual
whitespace.

\d For digits, use the construct \d.

Grouping Grouping only works at the beginning of the pattern, for example:

\d{4} – 2049 does not work; instead use 2049 – \d{4}

\d{2} /19 \d{2} does not work; instead use \d{2} /[1][9] \d{2}

Groupings are allowed at the beginning of the pattern, like in the credit card data identifier.
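
The limitations above lend themselves to a quick check before you paste a converted regular expression into the Patterns field. The following Python sketch is an assumption-laden illustration, not a product feature: lint_pattern is a hypothetical helper, and it flags every occurrence of an unsupported character conservatively, without trying to distinguish escaped or bracketed uses.

```python
MAX_LINE_LENGTH = 100  # per-line limit stated for data identifier patterns
UNSUPPORTED = {"*": "asterisk", "|": "pipe", ".": "dot"}  # from Table 31-26


def lint_pattern(pattern_text):
    """Return (line_number, message) tuples for likely problems in a pattern."""
    findings = []
    for number, line in enumerate(pattern_text.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            findings.append((number, "line is longer than 100 characters"))
        for char, label in UNSUPPORTED.items():
            if char in line:
                findings.append((number, f"{label} ({char}) is not supported"))
        # Table 31-26 says to use an actual whitespace instead of \s.
        if "\\s" in line:
            findings.append((number, "\\s is not supported; use an actual whitespace"))
    return findings
```

An empty result does not guarantee that the system accepts the pattern; it only means that none of the limitations listed in Table 31-26 were spotted.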

Data identifier pattern language specification


You can use three types of tokens when defining a data identifier pattern. Tokens are sequences
of non-whitespace characters at the beginning of the file, or preceded by one or more
whitespace characters, followed by whitespace characters or the end of the file. The three
token types that are used in data identifier patterns are:
■ Character literals
■ Bracket expressions
■ Special characters
You can follow each token by an optional quantifier.
See the section called “Quantifiers” on page 817.
Data identifier patterns only match a complete token or set of tokens.

Literal characters, metacharacters, and special characters


Most characters are literal matches in the data identifier pattern language. For example, the
character a in the data identifier pattern matches the character a in your content. The data
identifier pattern language includes four metacharacters. To match these metacharacters as
character literals, use the backslash to escape the characters in your data identifier pattern.
See Table 31-27 for descriptions of these metacharacters.

Table 31-27 Metacharacters

Character Description

[ This character is used to begin a bracket expression.

{ This character is used to quantify the preceding token.

? This character is used to quantify the preceding token.

\ This character is used to escape the following character.

The data identifier pattern language includes five predefined special characters. See Table 31-28
for descriptions of these special characters.

Table 31-28 Special characters

Character Description

\l This special character matches any ASCII letter.

\L This special character matches any non-ASCII letter character, including Unicode characters.

\d This special character matches any ASCII digit.

\D This special character matches any non-ASCII digit, including Unicode characters.

\w This special character matches any character not matched by \l or \d, including Unicode
characters.
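
One way to read Table 31-28 is as a set of per-character tests. The following Python sketch shows that reading; matches_special is a hypothetical function written for this example, and its interpretation of \L and \D (a letter or digit that lies outside the ASCII range) is an assumption. It also excludes the underscore from \w, because Table 31-26 states that \w cannot be used to match that character.

```python
import string


def matches_special(token, ch):
    """Return True if the single character ch is matched by the special character token."""
    is_ascii_letter = ch in string.ascii_letters
    is_ascii_digit = ch in string.digits
    if token == r"\l":
        return is_ascii_letter
    if token == r"\L":
        # Assumed reading: a letter that is not in the ASCII range.
        return ch.isalpha() and not ch.isascii()
    if token == r"\d":
        return is_ascii_digit
    if token == r"\D":
        # Assumed reading: a digit that is not in the ASCII range.
        return ch.isdigit() and not ch.isascii()
    if token == r"\w":
        # Anything not matched by \l or \d, except the underscore (Table 31-26).
        return ch != "_" and not (is_ascii_letter or is_ascii_digit)
    raise ValueError("unknown special character: " + token)
```

Actual detection behavior may differ in edge cases; test any pattern that depends on these classes against sample content.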

Bracket expressions
Bracket expressions begin with [ and end with ], and contain at least one character within
the body of the expression. For example, the bracket expression [abcd] matches any of the
letters "a," "b," "c," or "d."
You can include a character range within a bracket expression by separating two characters
with a hyphen: -. For example, the bracket expression [a-z] matches the lower-case letters
"a" through "z". Any two characters separated by - are interpreted as a range. The relative
ordering of the range does not matter: [a-z] and [z-a] match the same characters.
You can include the characters "]" and "-" in your bracket expression if you follow these rules:

■ The "]" character must appear as the first character in your bracket expression. For example:
[]a-z] matches the "]" character or any lower-case letter between "a" and "z."

■ The "-" character must appear as either the first or last character in your bracket expression.
If your bracket expression contains both the "]" and "-" characters, the "]" must be the first
character, and "-" the last character. For example: []-] matches either "]" or "-."

Order of interpretation
Data identifier patterns are interpreted from left to right. For example, the bracket expression
[a-d-z] is interpreted as the range a-d and then the literals - and z.
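
The bracket expression rules can be made concrete with a short Python sketch. The expand_bracket function below is a hypothetical illustration of the rules as described in this section (left-to-right interpretation, order-independent ranges, "]" allowed first, "-" allowed first or last). It does not handle escaped characters and is not the product's parser.

```python
def expand_bracket(expression):
    """Return the set of characters that a bracket expression such as [a-d-z] matches."""
    if len(expression) < 3 or expression[0] != "[" or expression[-1] != "]":
        raise ValueError("expected [ ... ] with at least one character inside")
    body = expression[1:-1]
    members = set()
    i = 0
    while i < len(body):
        # A range needs a character on both sides of the hyphen.
        if i + 2 < len(body) and body[i + 1] == "-":
            # The relative ordering of the endpoints does not matter.
            low, high = sorted((body[i], body[i + 2]))
            members.update(chr(code) for code in range(ord(low), ord(high) + 1))
            i += 3
        else:
            # Anything else, including a leading "]" or a leading or trailing "-", is a literal.
            members.add(body[i])
            i += 1
    return members
```

For example, expand_bracket("[a-d-z]") consumes the range a-d first and then treats the remaining - and z as literals, which mirrors the order of interpretation described above.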

Quantifiers
You can follow any token in your data identifier pattern with a quantifier. The quantifier specifies
how many occurrences of the pattern to match. See Table 31-29 for a description of the
quantifiers available in the data identifier pattern language.

Table 31-29 Quantifiers

Quantifier Description

? This quantifier specifies that the expression should match zero or one
occurrences of the preceding token.

{n} This quantifier specifies that the expression should match exactly n occurrences
of the preceding token.

{n, m} This quantifier specifies that the expression should match between n and m
occurrences of the preceding token (inclusive).
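
As a worked illustration of Table 31-29, the following Python sketch converts a quantifier into its minimum and maximum occurrence counts. The quantifier_bounds function is a hypothetical helper for this example; accepting optional whitespace after the comma is an assumption based on how {n, m} is printed in the table.

```python
import re

# Matches {n} or {n, m}; whitespace after the comma is tolerated (assumption).
_COUNT = re.compile(r"\{(\d+)(?:,\s*(\d+))?\}")


def quantifier_bounds(quantifier):
    """Return (minimum, maximum) occurrences of the preceding token."""
    if quantifier == "?":
        return (0, 1)
    match = _COUNT.fullmatch(quantifier)
    if match is None:
        raise ValueError("unsupported quantifier: " + quantifier)
    low = int(match.group(1))
    # {n} means exactly n, so the maximum equals the minimum.
    high = int(match.group(2)) if match.group(2) is not None else low
    if high < low:
        raise ValueError("maximum is smaller than minimum: " + quantifier)
    return (low, high)
```

Quantifiers that are common in regular expressions but absent from Table 31-29, such as * and +, raise an error in this sketch.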

Writing data identifier patterns to match data


If you modify an existing data identifier, you can edit its patterns. If you create a custom data
identifier, you must implement at least one pattern. Data identifier patterns are implemented
using a syntax that is similar to the regular expression language, with limitations. In addition,
the system only allows the use of ASCII characters for data identifier patterns.
See “About data identifier patterns” on page 732.
See “Data identifier pattern language specification” on page 815.
To edit or implement a pattern
1 Review the patterns for the data identifier you want to modify.
See “Selecting a data identifier breadth” on page 739.
2 Consider cloning the data identifier, if you are modifying a system data identifier.
See “Cloning a system data identifier before modifying it” on page 777.

3 Select Manage > Policies > Data Identifiers in the Enforce Server administration console.
4 Select the data identifier you want to modify.
5 Select the breadth for the data identifier you want to modify.
Generally, patterns vary among detection breadths.
6 In the Patterns field, modify an existing pattern, or enter one or more new patterns,
separated by line breaks.
Data identifier patterns are implemented as regular expressions. However, much of the
regular expression syntax is not supported.
See “Using the data identifier pattern language” on page 814.
7 Click Save to save the data identifier.

Using pattern validators


The following table lists all available pattern validators. Validators marked with an asterisk (*)
beside the name in the table below require data input.

Table 31-30 Available validators for system and custom data identifiers

Validator Description

ABA Checksum Every ABA routing number must start with the following two digits:
00-15,21-32,61-72,80 and pass an ABA specific, position-weighted check sum.

Advanced KRRN Validation Validates that 3rd and 4th digits are a valid month, that 5th and 6th digits are a valid
day, and the checksum matches the check digit.

Advanced SSN Validator checks whether SSN contains zeros in any group, the area number (first
group) is less than 773 and not 666, the delimiter between the groups is the same,
the number does not consist of all the same digits, and the number is not reserved
for advertising (123-45-6789, 987-65-432x).

Argentinian Tax Identity Computes the checksum and validates the pattern against it.
Number Validation Check

Australian Business Number Computes the checksum and validates the pattern against it.
Validation Check

Australian Company Number Computes the checksum and validates the pattern against it.
Validation Check

Australian Medicare Number Computes the checksum and validates the pattern against it.
Validation Check

Australian Tax File validation Computes the checksum and validates the pattern against it.
check

Austria VAT Number Computes the checksum and validates the pattern against it.
Validation Check

Austrian Social Security Computes the checksum and validates the pattern against it.
Number Validation Check

Basic SSN Performs minimal SSN validation.

Belgian National Number Computes the checksum and validates the pattern against it.
Validation Check

Belgian Tax Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Belgium VAT Number Computes the checksum and validates the pattern against it.
Validation Check

Brazil Election Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Brazilian National Registry of Computes the checksum and validates the pattern against it.
Legal Entities Number
Validation Check

Brazilian Natural Person Computes the checksum and validates the pattern against it.
Registry Number Validation
Check

British Columbia Personal Computes the checksum and validates the pattern against it.
Healthcare Number Validation
Check

Bulgaria Value Added Tax Computes the checksum and validates the pattern against it.
(VAT) Number Validation
Check

Bulgarian Uniform Civil Computes the checksum and validates the pattern against it.
Number Validation Check

Burgerservicenummer Check Performs a check for the Burgerservicenummer.

Canada Driver's License Computes the checksum and validates the pattern against it.
Number Check

Chilean National Identification Computes the checksum and validates the pattern against it.
Number Validation Check

China ID checksum validator Computes the checksum and validates the pattern against it.

Codice Fiscale Control Key Computes the control key and checks if it is valid.
Check

Croatia National Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Cusip Validation Validator checks for invalid CUSIP ranges and computes the CUSIP checksum
(Modulus 10 Double Add Double algorithm).

Custom Script* Enter a custom script to validate pattern matches for this data identifier breadth.

See “Creating custom script validators” on page 831.

Cyprus Tax Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Cyprus Value Added Tax Computes the checksum and validates the pattern against it.
(VAT) Number Validation
Check

Czech Personal Identity Computes the checksum and validates the pattern against it.
Number Validation Check

Czech Republic Tax Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

Czech Republic VAT Number Computes the checksum and validates the pattern against it.
Validation Check

Denmark Personal Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

Denmark Tax Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Denmark VAT Number Computes the checksum and validates the pattern against it.
Validation Check

DNI control key check Computes the control key and checks if it is valid.

Driver's License Number WA Computes the checksum and validates the pattern against it.
State Validation Check

Driver's License Number WI Computes the checksum and validates the pattern against it.
State Validation Check

Drug Enforcement Agency Computes the checksum and validates the pattern against it.
Number Validation Check

Duplicate digits Ensures that the digits in a string are not all the same.

Dutch Tax Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Estonia Personal Computes the checksum and validates the pattern against it.
Identification Number Check

Estonia Value Added Tax Computes the checksum and validates the pattern against it.
(VAT) Number Validation
Check

Exact Match* Enter a comma-separated list of values. If the values are numeric, do NOT enter
any dashes or other separators. Each value can be of any length.

Exact Match Data Identifier Looks up tokens around a pattern for the Exact Match Data Identifier index and
Check validates the pattern.

Exclude beginning Enter a comma-separated list of values. If the values are numeric, do NOT enter
characters* any dashes or other separators. Each value can be of any length.
Note: Beginning and ending validators concern the text of the match itself. Prefix
and suffix validators concern characters before and after matched text.

Exclude ending characters* Enter a comma-separated list of values. If the values are numeric, do NOT enter
any dashes or other separators. Each value can be of any length.

Exclude exact match* Enter a comma-separated list of values. Each value can be of any length.

Exclude prefix* Enter a comma-separated list of values. Each value can be of any length.
Note: Prefix and suffix validators concern characters before and after matched text.
Beginning and ending validators concern the text of the match itself.

Exclude suffix* Enter a comma-separated list of values. Each value can be of any length.

Find keywords* Enter a comma-separated list of values. Each value can be of any length.

Finland Driver's Licence Computes the checksum and validates the pattern against it.
Number Validation Check

Finland Tax Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Finland VAT Number Computes the checksum and validates the pattern against it.
Validation Check

Finnish Personal Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

France VAT Number Computes the checksum and validates the pattern against it.
Validation Check

French Social Security Computes the checksum and validates the pattern against it.
Number Validation Check

German ID Number Validation Computes the checksum and validates the pattern against it.
Check

German Passport Number Computes the checksum and validates the pattern against it.
Validation Check

Germany Tax Number Computes the checksum and validates the pattern against it.
Validation Check

Germany VAT Number Computes the checksum and validates the pattern against it.
Validation Check

Greece Social Security Computes the checksum and validates the pattern against it.
Number (AMKA)

Greece VAT Number Computes the checksum and validates the pattern against it.
Validation Check

Greek Tax Identification Computes the checksum and validates the pattern against it.
Number Validation Check

HCPCS CPT Code Validation Computes the checksum and validates the pattern against it.
Check

Health Care Insurance Computes the checksum and validates the pattern against it.
Number Check

Hong Kong ID Computes the checksum and validates the pattern against it.

Hungarian Social Security Computes the checksum and validates the pattern against it.
Validation Check

Hungarian Tax Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Hungarian VAT Number Computes the checksum and validates the pattern against it.
Validation Check

Hungary Passport Number Computes the checksum and validates the pattern against it.
Validation Check

Iceland National Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Indonesian Kartu Tanda Computes the checksum and validates the pattern against it.
Penduduk Validation Check

INSEE Control Key Validator computes the INSEE control key and compares it to the last 2 digits of the
pattern.

IP Basic Check Every IP address must match the format x.x.x.x and every number must be less than
256.

IP Octet Check Every IP address must match the format x.x.x.x, every number must be less than
256, and no IP address can contain only single-digit numbers (1.1.1.2).

IP Reserved Range Check Checks whether the IP address falls into any of the "Bogons" ranges. If so the match
is invalid.

IPv6 Basic Validation Check Every IPv6 address must match the format xxxx.xxxx.xxxx.xxxx.xxxx.xxxx.xxxx.xxxx
and every number must be lower than ffff.

Ipv6 Medium Validation Check Every IPv6 address must match the format xxxx.xxxx.xxxx.xxxx.xxxx.xxxx.xxxx.xxxx
and every number must be lower than ffff. No IPv6 address can start with 0.

Ipv6 Reserved Validation Every IPv6 address must match the format xxxx.xxxx.xxxx.xxxx.xxxx.xxxx.xxxx.xxxx
Check and every number must be lower than ffff. No IPv6 address can start with 0. Each
IPv6 address must be fully compressed.

Ireland Tax Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Ireland VAT Number Computes the checksum and validates the pattern against it.
Validation Check

Irish Personal Public Service Computes the checksum and validates the pattern against it.
Number Validation Check

Israel Personal Identity Computes the checksum and validates the pattern against it.
Number Validation Check

Italy VAT Number Validation Computes the checksum and validates the pattern against it.
Check

Japan Driver's License Computes the checksum and validates the pattern against it.
Number Validation Check

Japanese Juki-Net ID Computes the checksum and validates the pattern against it.
Validation Check

Japanese My Number Computes the checksum and validates the pattern against it.
Validation Check

KRRN Foreign Validation Validates that 3rd and 4th digits are a valid month, that 5th and 6th digits are a valid
Check day, and the checksum matches the check digit.

Latvia Personal Code Check Computes the checksum and validates the pattern against it.

Latvia Value Added Tax (VAT) Computes the checksum and validates the pattern against it.
Number Validation Check

Lithuania Tax Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Lithuania Value Added Tax Computes the checksum and validates the pattern against it.
(VAT) Number Validation
Check

Luhn Check Computes the Luhn checksum and validates the matched pattern against it.

Luxembourg National Computes the checksum and validates the pattern against it.
Register of Individuals
Number Validation Check

Luxembourg Tax Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

Luxembourg VAT Number Computes the checksum and validates the pattern against it.
Validation Check

Malaysian MyKad Number Computes the checksum and validates the pattern against it.
Validation Check

Malta Value Added Tax (VAT) Computes the checksum and validates the pattern against it.
Number Validation Check

Medicare Beneficiary Identifier Computes the checksum and validates the pattern against it.
Number Validation Check

Mexican CRIP Validation Computes the checksum and validates the pattern against it.
Check

Mexican Tax Identification Computes the checksum and validates the pattern against it.
Validation Check

Mexican Unique Population Computes the checksum and validates the pattern against it.
Registry Code Validation
Check

Mexico CLABE Number Computes the checksum and validates the pattern against it.
Validation Check

Mod 97 Validator Computes the ISO 7064 Mod 97-10 checksum of the complete match.

National Provider Identifier Computes the checksum and validates the pattern against it.
Number Validation Check

National Securities Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

Netherlands Bank Account Computes the checksum and validates the pattern against it.
Number Validation Check

Netherlands VAT Number Computes the checksum and validates the pattern against it.
Validation Check

New Zealand National Health Computes the checksum and validates the pattern against it.
Index Number Validation
Check

NIB Number Validation Check Computes the ISO 7064 Mod 97-10 checksum of the complete match of the NIB
Number.

No Validation Performs no validation.



Norway National Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Norway Value Added Tax Computes the checksum and validates the pattern against it.
(VAT) Number Check

Norwegian Birth Number Computes the checksum and validates the pattern against it.
Validation Check

Number Delimiter Validates a match by checking the surrounding digits.

Poland VAT Number Computes the checksum and validates the pattern against it.
Validation Check

Polish ID Number Validation Computes the checksum and validates the pattern against it.
Check

Polish REGON Number Computes the checksum and validates the pattern against it.
Validation Check

Polish Social Security Number Computes the checksum and validates the pattern against it.
Validation Check

Polish Tax ID Number Computes the checksum and validates the pattern against it.
Validation Check

Portugal National Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

Portugal Tax and VAT Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

Randomized US Social Computes the checksum and validates the pattern against it.
Security Number Validation
Check

Require beginning characters* Enter a comma-separated list of values. If the values are numeric, do NOT enter
any dashes or other separators. Each value can be of any length.

Require ending characters* Enter a comma-separated list of values. If the values are numeric, do NOT enter
any dashes or other separators. Each value can be of any length.

Romania Driver's Licence Computes the checksum and validates the pattern against it.
Number Validation Check

Romania National Computes the checksum and validates the pattern against it.
Identification Number Check

Romania VAT Number Computes the checksum and validates the pattern against it.
Validation Check

Romanian Numerical Personal Computes the checksum and validates the pattern against it.
Code Check

Russian Taxpayer Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

SEPA Creditor Number Computes the checksum and validates the pattern against it.
Validation Check

Serbia Value Added Tax (VAT) Computes the checksum and validates the pattern against it.
Number Validation Check

Singapore NRIC Computes the Singapore NRIC checksum and validates the pattern against it.

Slovakia National Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

Slovakia Value Added Tax Computes the checksum and validates the pattern against it.
(VAT) Number Validation
Check

Slovenia Tax Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Slovenia Unique Master Computes the checksum and validates the pattern against it.
Citizen Number Validation
Check

Slovenia Value Added Tax Computes the checksum and validates the pattern against it.
(VAT) Number Validation
Check

South African Personal Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

Spain VAT Number Validation Computes the checksum and validates the pattern against it.
Check

Spanish Customer Account Computes the checksum and validates the pattern against it.
Number Validation Check

Spanish SSN Number Computes the checksum and validates the pattern against it.
Validation Check

Spanish Tax ID Number Computes the checksum and validates the pattern against it.
Validation Check

Sri Lanka National Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

SSN Area-Group number For a given area number (first group), not all group numbers (second group) might
have been assigned by the SSA. Validator eliminates SSNs with invalid group
numbers.

Sweden TaxPayer Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

Sweden Value Added Tax Computes the checksum and validates the pattern against it.
Number Validation Check

Swedish Personal Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

Swiss AHV Swiss AHV Modulus 11 Checksum.

Swiss Social Security Number Computes the checksum and validates the pattern against it.
Validation Check

Switzerland Value Added Tax Computes the checksum and validates the pattern against it.
(VAT) Number Validation
Check

Taiwan ID Taiwan ID checksum.

Thailand Personal Computes the checksum and validates the pattern against it.
Identification Number
Validation Check

Turkish Identification Number Computes the checksum and validates the pattern against it.
Validation Check

UK Bank Sort Code Check Computes the checksum and validates the pattern against it.

UK Drivers License Every UK drivers license must be 16 characters and the number at the 8th and 9th
position must be larger than 00 and smaller than 32.

UK NHS UK NHS checksum.

UK VAT Number Validation Computes the checksum and validates the pattern against it.
Check

Ukraine Identity Card Check Validates that the first eight digits are a correctly formatted date.

Venezuela Identification Computes the checksum and validates the pattern against it.
Number Validation Check

Verhoeff Validation Check Computes the checksum and validates the pattern against it.

Ukraine Identity Card Check Computes the checksum and validates the pattern against it.

Zip+4 Postal Codes Validation Computes the checksum and validates the pattern against it.
Check
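The checksum and format validators in Table 31-30 are built into the product, and their implementations are not documented here. As a rough illustration of what such checks do, the following Python sketch implements the standard Luhn algorithm, a duplicate-digit test, and the two IPv4 checks as the table describes them. The function names are hypothetical and are not part of Symantec Data Loss Prevention.

```python
import re

def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the standard Luhn checksum."""
    if not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0

def all_digits_identical(digits: str) -> bool:
    """Return True if every character is the same (for example 1111111111)."""
    return len(digits) > 0 and len(set(digits)) == 1

# Assumption: "x.x.x.x" means four groups of one to three decimal digits.
IPV4_FORMAT = re.compile(r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})", re.ASCII)

def ip_basic_check(candidate: str) -> bool:
    """Format x.x.x.x with every number less than 256."""
    found = IPV4_FORMAT.fullmatch(candidate)
    return found is not None and all(int(octet) < 256 for octet in found.groups())

def ip_octet_check(candidate: str) -> bool:
    """Basic check, plus reject addresses made only of single-digit numbers."""
    if not ip_basic_check(candidate):
        return False
    return not all(len(octet) == 1 for octet in candidate.split("."))
```

A candidate such as 1.1.1.2 passes the basic check in this sketch but fails the octet check, which mirrors the difference between the two table entries.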

Selecting pattern validators


Symantec Data Loss Prevention provides a comprehensive set of validators that improve the
accuracy of pattern matching.
See “About pattern validators” on page 733.
When you modify a data identifier, the system exposes the active validators used by the data
identifier. When you modify or create a data identifier, the system displays all system-defined
data validators from which you can choose.

Note: The active validators that allow for and define input are not to be confused with the
"Optional validators" that can be configured for any runtime instance of a particular data
identifier. Optional validators are always configurable at the instance level. Active validators
are only configurable at the system level.

Select a validator from the "Validation Checks" list on the left, then click Add Validator to the
right. If the validator requires input, provide the required data using a comma-separated list
and then click Add Validator.
See “Selecting pattern validators” on page 829.

To select a pattern validator


1 Create a custom data identifier.
See “Workflow for creating custom data identifiers” on page 812.
2 In the Validators section, select the desired validator.
See “About pattern validators” on page 733.
3 If the validator does not require data input, click Add Validator.
The validator is added to the Active Validators list.
4 If the validator requires data input, enter the data values in the Description and Data
Entry field.
5 Edit the input for the validator in the Description and Data Entry field as needed. If you
are using the Find keywords validator, also select the qualities you want for the
keyword:
■ Proximity: Finds a keyword only within the set proximity of the matched patterns.
Check this box and also indicate the Word Distance.
■ Case sensitive: Check this box if you want to search for a case-sensitive match.
■ Highlight keywords in incident: Check this box if you want to highlight the matched
keywords in incidents.

6 Click Add Validator when you are done entering the values.
The validator is added to the Active Validators list.
7 To remove a validator, select it in the Active Validators list and click the red X icon.
8 Click Save to save the configuration of the data identifier.
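The Proximity and Case sensitive options for the Find keywords validator can be pictured with a short Python sketch. It is only an illustration: the function name, the whitespace-based word splitting, and the way word distance is counted are assumptions, and the product may measure proximity differently. The sketch handles single-word keywords only.

```python
def keyword_near_match(text, matched_text, keywords, word_distance,
                       case_sensitive=False):
    """Return True if any keyword lies within word_distance words of the match."""
    words = text.split()
    if not case_sensitive:
        words = [word.lower() for word in words]
        keywords = [keyword.lower() for keyword in keywords]
        matched_text = matched_text.lower()
    # Assumption: surrounding punctuation is not part of a word.
    cleaned = [word.strip(".,;:!?()\"'") for word in words]
    for position, word in enumerate(cleaned):
        if word != matched_text:
            continue
        start = max(0, position - word_distance)
        end = position + word_distance + 1
        # Words on either side of the match, excluding the match itself.
        window = cleaned[start:position] + cleaned[position + 1:end]
        if any(keyword in window for keyword in keywords):
            return True
    return False

sample = "Employee SSN on file: 123456789 for payroll"
print(keyword_near_match(sample, "123456789", ["ssn"], word_distance=3))
print(keyword_near_match(sample, "123456789", ["ssn"], word_distance=2))
```

In this sketch the keyword sits three words before the match, so a word distance of 3 accepts it and a word distance of 2 does not.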

Selecting a data normalizer


When you create a custom data identifier, you must select a normalizer to reconcile the data
detected by the pattern with the format expected by the validators.
See “Workflow for creating custom data identifiers” on page 812.
Table 31-31 lists and describes the normalizers you can implement for custom data identifiers.

Note: You cannot modify the normalizer of a system-defined data identifier.



Table 31-31 Available data normalizers

Normalizer Description

Digits Only numeric characters are allowed.

Digits and Letters Alphanumeric characters are allowed.

Lowercase Only letters are allowed, normalized to lowercase.

Swift codes Code must match SWIFT requirements.

Do nothing The data is not normalized; it is evaluated as entered by the user.
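To make the role of a normalizer concrete, the following Python sketch shows one plausible reading of four of the normalizers in Table 31-31. It assumes that a normalizer strips characters that are not allowed before the validators run; the product's actual behavior is not specified here, the Swift codes normalizer is omitted, and the normalize function is hypothetical.

```python
def normalize(matched_text: str, normalizer: str) -> str:
    """Return the matched text in the form the validators are assumed to expect."""
    if normalizer == "Digits":
        # Keep only numeric characters (assumption: separators are dropped).
        return "".join(ch for ch in matched_text if ch.isdigit())
    if normalizer == "Digits and Letters":
        # Keep only alphanumeric characters.
        return "".join(ch for ch in matched_text if ch.isalnum())
    if normalizer == "Lowercase":
        # Keep only letters, then convert them to lowercase.
        return "".join(ch for ch in matched_text if ch.isalpha()).lower()
    if normalizer == "Do nothing":
        # Leave the data exactly as it was matched.
        return matched_text
    raise ValueError(f"Unsupported normalizer in this sketch: {normalizer}")

print(normalize("123-45-6789", "Digits"))
print(normalize("AB-12 cd", "Digits and Letters"))
print(normalize("AB-12 cd", "Lowercase"))
```

Under this reading, a pattern that matches 123-45-6789 would hand 123456789 to a checksum validator when the Digits normalizer is selected.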

Creating custom script validators


The custom script validation check lets you enter a custom script to validate pattern matches.
To implement a custom validator, you use the Symantec Data Loss Prevention Scripting
Language.
You can implement a custom script validator in a system data identifier you modify or in a
custom data identifier.

Note: Refer to the Symantec Data Loss Prevention Detection Customization Guide for details
on using the Symantec Data Loss Prevention Scripting Language.

To implement a custom script validator


1 Modify an existing data identifier or create a custom data identifier.
See “Workflow for creating custom data identifiers” on page 812.
2 Select the Custom Script validator from the list of Validation Checks.
3 Enter your custom script in the Description and Data Entry field.
4 Click Add Validator to add the custom validator to the Active Validators list.
5 Click Save to save the configuration of the data identifier.

Configuring pre- and post-validators


Pre- and Post-Validators define characters and character ranges that are valid before or
after a data identifier pattern. They can be helpful for eliminating false-positive detection results.
Acceptable characters for pre- and post-validators include ASCII characters 32 through 126
(as literal characters), and the special characters \S (non-whitespace or Unicode characters)
and \w (any character not matched by a letter or digit). \S is acceptable as a valid or invalid
character for both pre- and post-validators. \w is acceptable as an invalid character for both
pre- and post-validators. Additionally, the \l (letter) and \d (digit) special characters are
acceptable as invalid pre- or post-validator characters.
Though they are not defined here, white spaces such as tabs and new lines are also treated
as valid characters for pre- and post-validators.
Pre- and Post-Validators are required in custom data identifiers. The fields are pre-populated
with default values, but you can edit them as necessary to tune your results.
The default values for the pre- and post-validators are:
Pre-validators:
■ Valid: ,=:#"'()>;@!`~$%^*\S
■ Invalid: \S\w
Post-validators:
■ Valid: ,."'()<;&=@`~\S
■ Invalid: \S\w
The pre- and post-validators only check the character immediately preceding or following the
matched data identifier. In cases where the same characters appear in both the valid and
invalid fields, the valid field takes precedence. For example, where \S (a Unicode character)
appears in both the valid and invalid field for pre-validator characters, Unicode characters will
be considered valid pre-validator characters.

Examples
These examples show some matching and non-matching pre- and post-validators for a 10
digit data identifier pattern \d{10}:

Table 31-32 Pre- and post-validator characters

Character position Valid Invalid

Pre-validator characters !(, \S\w

Post-validator characters ), \S\w

The following strings would match or not match the data identifier pattern based on the
preceding or following characters as described here:

Table 31-33 Pre- and post-validator pattern matching examples

String Pattern match condition Description

A1234567890 No match The character A preceding the \d{10} pattern is not a valid pre-validator character, so the pattern does not match.

!1234567890 Match The character ! preceding the \d{10} pattern is a valid pre-validator character, so the pattern matches.

1234567890} No match The character } following the \d{10} pattern is not a valid post-validator character, so the pattern does not match.

(1234567890) Match The character ( preceding the \d{10} pattern is a valid pre-validator character. The character ) following the pattern is a valid post-validator character. Because both characters are valid, the pattern matches.

@1234567890 No match The character @ preceding the \d{10} pattern is not a valid pre-validator character, so it does not override the invalid special characters \S\w. The pattern does not match.

,1234567890, Match The character , is a valid pre- and post-validator character, so the pattern matches.

1234567890 Match The \d{10} pattern has no preceding or following character, so the pattern matches.
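The decision logic behind Table 31-33 can be sketched in Python as follows. This is a simplified model rather than the product's implementation: it uses the valid characters from Table 31-32, treats the invalid set \S\w as "any non-whitespace character that is not explicitly valid," and accepts whitespace or a missing neighbor. The names are hypothetical.

```python
import re
from typing import Optional

PATTERN = re.compile(r"\d{10}")   # the example data identifier pattern
PRE_VALID = set("!(,")            # valid pre-validator characters (Table 31-32)
POST_VALID = set("),")            # valid post-validator characters (Table 31-32)

def boundary_is_valid(neighbor: Optional[str], valid_chars: set) -> bool:
    """Check the single character immediately before or after a match."""
    if neighbor is None or neighbor.isspace():
        return True               # no neighbor, or white space, is accepted
    if neighbor in valid_chars:
        return True               # the valid field takes precedence
    return False                  # any other non-whitespace character is invalid

def pattern_matches(text: str) -> bool:
    """Return True if a 10-digit match has acceptable neighbors on both sides."""
    for found in PATTERN.finditer(text):
        before = text[found.start() - 1] if found.start() > 0 else None
        after = text[found.end()] if found.end() < len(text) else None
        if boundary_is_valid(before, PRE_VALID) and boundary_is_valid(after, POST_VALID):
            return True
    return False

for sample in ["A1234567890", "!1234567890", "1234567890}", "(1234567890)",
               "@1234567890", ",1234567890,", "1234567890"]:
    print(sample, "Match" if pattern_matches(sample) else "No match")
```

Only the character adjacent to the match is inspected, which is why a single invalid neighbor on either side is enough to reject the match.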

Best practices for using data identifiers


Data identifiers are algorithms that combine pattern matching with data validators to detect
content. Symantec Data Loss Prevention provides a number of system-defined data identifiers
for common data patterns, such as SSNs, Tax IDs, and more. In addition, you can define your
own custom data identifiers to match any data you can describe using the data identifier pattern
language. Data identifiers are commonly used to detect personally identifiable information
(PII).
This section provides best practices for implementing data identifier policies.
Table 31-34 summarizes the best practices in this section.

Table 31-34 Summary of data identifier best practices

Best practice Description

Use data identifiers instead of regular expressions when See “Use data identifiers instead of regular expressions
possible. to improve accuracy” on page 834.

Modify data identifier definitions when you want tuning to See “Modify data identifier definitions when you want tuning
apply globally. to apply globally” on page 835.

Clone system-defined data identifiers before modifying See “Clone system-defined data identifiers before
them. modifying to preserve original state” on page 835.

Consider using multiple data identifier breadths in parallel. See “Consider using multiple breadths in parallel to detect
different severities of confidential data” on page 836.

Avoid matching on the Envelope over HTTP. See “Avoid matching on the Envelope over HTTP to reduce
false positives” on page 836.

Use the Randomized US SSN data identifier to detect See “Use the Randomized US SSN data identifier to detect
traditional and randomized SSNs. SSNs” on page 836.

Use unique match counting to improve accuracy and ease See “Use unique match counting to improve accuracy and
remediation. ease remediation” on page 837.

Use data identifiers instead of regular expressions to improve accuracy
Data identifiers are designed to protect personally identifiable information (PII) with very good
accuracy (<10% false positive rate). If a data identifier is available for the type of content you
want to protect, you should use the data identifier instead of a regular expression because
data identifiers are more efficient than regular expressions. Out-of-the-box data identifier
patterns are tuned for accuracy, including region, industry, and country nuances. In addition,
data identifiers include validation checks to verify the data that is matched by the pattern. This
additional layer of intelligence screens out test data and other triggers of false positive incidents.
Regular expressions, on the other hand, can be computationally expensive and can lead to
increased false positives.
For example, if you want to detect Social Security numbers (SSNs), use the Randomized
US SSN data identifier instead of a regular expression pattern. The Randomized US SSN data
identifier is more accurate than any regular expression you can write and much easier and
quicker to implement.
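The difference between a bare pattern and a pattern backed by validation checks can be illustrated with a short Python sketch. It does not reproduce the Randomized US SSN data identifier. It applies only publicly documented structural rules for SSNs (area numbers 000, 666, and 900 through 999 are never assigned, nor are group 00 or serial 0000) plus a duplicate-digit test, and the function names are hypothetical.

```python
import re

SSN_PATTERN = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")

def regex_only(text: str) -> list:
    """Return every string that merely looks like an SSN."""
    return [found.group(0) for found in SSN_PATTERN.finditer(text)]

def regex_with_validation(text: str) -> list:
    """Return pattern matches that also pass simple structural checks."""
    accepted = []
    for found in SSN_PATTERN.finditer(text):
        area, group, serial = found.groups()
        if area in ("000", "666") or area.startswith("9"):
            continue              # area numbers that are never assigned
        if group == "00" or serial == "0000":
            continue              # group and serial values that are never assigned
        if len(set(area + group + serial)) == 1:
            continue              # all digits identical, likely test data
        accepted.append(found.group(0))
    return accepted

sample = "Test 000-12-3456, sample 111-11-1111, record 123-45-6789"
print(regex_only(sample))
print(regex_with_validation(sample))
```

The bare pattern reports all three strings, while the validated version keeps only the last one, which is the kind of screening that the validation checks in a data identifier provide.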

Note: The data identifier pattern language is a limited subset of the regular expression language.
Not all regular expression constructs or characters are supported for data identifier patterns.
See “Using the data identifier pattern language” on page 814.

Clone system-defined data identifiers before modifying to preserve original state
Before you modify a system data identifier or create a custom data identifier, consider the
following:
■ If you want to modify a system data identifier, manually clone it as a custom data identifier
and then modify the cloned copy. In this fashion you preserve the state of the original
system-defined data identifier.
■ Data identifiers do not export as part of a policy template. As such, you should add the
data identifier to a policy and export the policy as a template before modifying the data
identifier.
An exported template contains a reference to each data identifier that is implemented in
that policy. On import to a target system, the template uses a reference to select the local
data identifier. If the system data identifier is modified, on import it is not by the target
system.
See “Cloning a system data identifier before modifying it” on page 777.

Modify data identifier definitions when you want tuning to apply globally
Data identifiers offer two levels of configuration:
■ Definitions
■ Instances
Data identifier definitions are configured at the system level of the Enforce Server. At the
definition level you can tune the data that is supplied by any required validator that the definition
declares at this level, as well as what validators are used.
Data identifier instances are only configured at the policy rule level. Any configurations that
are made at the rule level are local in scope and applicable only to that policy. At the rule level
you use optional validators, such as require or exclude beginning or ending characters, to tune
the instance of the data identifier rule.
The general recommendation is to configure data identifier definitions so that the changes
apply globally to any instance of that data identifier definition. Such configurations are reusable
across policies. Rule-level optional validators should be used for tuning that is unique to an individual policy.

Consider using multiple breadths in parallel to detect different severities of confidential data
Matching data identifiers against content often requires fine-tuning as you adjust the
configuration to keep both false positives and false negatives to a minimum. After you configure
an instance of the Content Matches Data Identifier condition, study the matches and adjust
the configuration to ensure optimum data matching success.
Consider adjusting the data identifier breadth you use if the data identifier produces too many
false positives or false negatives. For example, if you use a wide breadth and receive many false
positives, consider using a medium breadth or narrow breadth.
See “About data identifier breadths” on page 731.
As an alternative approach, consider using multiple data identifier breadths in parallel in the
same policy, with a different severity level for each rule. For example, in a single policy that is
designed to detect credit card numbers, you can add three rules to the policy, each using a
different breadth (one wide, one medium, one narrow). You would then set the severity for the
narrow to be high severity incidents, and the wide to be low severity incidents. Using this
layered approach lets you survey the data flowing through the enterprise using a policy that
covers both ends of the spectrum. You can use this sampling-based approach to focus your
remediation efforts on the highest-priority incidents while still detecting and being able to review
low-severity incidents.
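The layered approach can be summarized in a small Python sketch. The mapping of narrow to high and wide to low follows the example above. The medium severity for the medium breadth, the rule of reporting the highest severity among the rules that matched, and the function name are assumptions made for illustration, not product behavior.

```python
# Assumption: one rule per breadth, each with its own severity.
SEVERITY_BY_BREADTH = {"narrow": "High", "medium": "Medium", "wide": "Low"}
SEVERITY_RANK = {"Low": 1, "Medium": 2, "High": 3}

def incident_severity(matched_breadths):
    """Return the highest severity among the rules that matched, or None."""
    severities = [SEVERITY_BY_BREADTH[breadth] for breadth in matched_breadths]
    if not severities:
        return None
    return max(severities, key=SEVERITY_RANK.__getitem__)

print(incident_severity(["wide"]))                      # only the wide rule matched
print(incident_severity(["wide", "medium", "narrow"]))  # all three rules matched
```

Content that satisfies only the wide breadth surfaces as a low-severity incident for later review, while content that also satisfies the narrow breadth is ranked for immediate remediation.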

Avoid matching on the Envelope over HTTP to reduce false positives


Sometimes HTTP transmissions contain session IDs in the header that can trigger false
positives for numeric data identifiers. For example, some social media sites such as Facebook
and LinkedIn contain a session ID that may at times match the CCN and SSN data identifiers
exactly, causing false positives.
To reduce false positives in connection with HTTP session IDs in the message header, the
best practice is not to match on the “Envelope” message component when you implement
numeric data identifiers, specifically the CCN or SSN data identifiers.

Use the Randomized US SSN data identifier to detect SSNs


In 2011, the United States Social Security Administration (SSA) began issuing randomized
SSNs. Under this scheme, the high group number (second part of the SSN) no longer
corresponds to the area number (first part of the SSN). Also, the range of the area number
can go up to 899 instead of 773. Randomization applies to SSNs issued on or after June 25,
2011. It does not apply to SSNs issued before that date.
To support the new randomized SSN scheme, Symantec Data Loss Prevention provides the
system-defined Randomized US Social Security Number (SSN) data identifier.
See “Randomized US Social Security Number (SSN)” on page 1414.
The Randomized US SSN data identifier detects both traditional and randomized SSNs. The
Randomized US SSN data identifier replaces the US SSN data identifier, which only detects
traditional SSNs.
Symantec recommends that you use the Randomized US SSN data identifier for all new
policies that you want to use to detect SSNs, and that you update your existing SSN policies
to use the Randomized US SSN data identifier. For your existing policies that already implement
the traditional US SSN data identifier, you can add the Randomized US SSN data identifier
as an OR'd rule so that both run in parallel as you test the policy to ensure it accurately detects
both styles of SSNs.
See “Updating policies to use the Randomized US SSN data identifier” on page 810.

Use unique match counting to improve accuracy and ease remediation
By default, the data identifier rule configuration counts only unique matches. With this option,
each distinct match is reported once, at the first place it is found in the message or message
component, and only unique matches are counted and highlighted. Alternatively, you can choose
the option that counts all matches.
The best practice is to use unique match counting when you only care about unique matches,
not duplicate matches. For example, if you are using the Credit Card Numbers data identifier
to protect credit card numbers, and you only care if a document contains 25 or more unique
numbers, you can use count all unique matches instead of the count all matches option. If you
counted all matches, a document containing 25 of the same CCNs would trigger the policy,
which is not the objective of your policy.
See “About unique match counting” on page 734.
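The difference between the two counting options can be sketched in a few lines of Python. This is only an illustration of the counting logic described above, not the detection engine's implementation; the function names and the placeholder card number are assumptions made for the example.

```python
def count_matches(matches, unique=True):
    """Count matches, collapsing duplicates when unique counting is on."""
    return len(set(matches)) if unique else len(matches)


def triggers_policy(matches, threshold, unique=True):
    """Return True when the match count reaches the configured threshold."""
    return count_matches(matches, unique) >= threshold


# Hypothetical document that repeats one placeholder number 25 times.
same_number_repeated = ["4111111111111111"] * 25

all_matches_result = triggers_policy(same_number_repeated, 25, unique=False)
unique_matches_result = triggers_policy(same_number_repeated, 25, unique=True)
```

With a threshold of 25, counting all matches triggers on the repeated number, while unique match counting does not, which is the behavior the example above is meant to avoid.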
Chapter 32
Detecting content using keyword matching
This chapter includes the following topics:

■ Introducing keyword matching

■ Configuring keyword matching

■ Best practices for using keyword matching

Introducing keyword matching


Symantec Data Loss Prevention provides the Content Matches Keyword policy condition for
keyword detection.
To detect data loss using keyword matching, the detection engine compares inbound messages
or message components against each keyword in a list of one or more keywords or keyword
phrases. Keyword matching supports both whole word and partial word matching, as well as
word proximity. Keyword matching is supported on the server and on the endpoint. Unique
match counting is supported for keywords.
See “Using unique match counting” on page 775.
Table 32-1 lists typical keyword matching use cases.

Table 32-1 Keyword matching use cases

Configuration Typical use

Whole word matching
■ Languages based on the Latin alphabet
■ UTF-8 characters
■ Chinese, Japanese, and Korean (CJK) languages with token verification enabled for the server
■ CJK keywords on the endpoint
See “About keyword matching for Chinese, Japanese, and Korean (CJK) languages”
on page 839.

Partial word matching
■ Languages based on the Latin alphabet
■ Mixed languages
See “Keyword matching examples” on page 841.

About keyword matching for Chinese, Japanese, and Korean (CJK) languages
Symantec Data Loss Prevention detection servers support natural language processing for
Chinese, Japanese, and Korean (CJK) keywords. When natural language processing for CJK
languages is enabled, the detection server validates CJK tokens before reporting a match.
For CJK languages, a token is a single character which constitutes a word. Thus, partial word
matching does not apply to CJK languages.
Token validation for CJK keywords is only supported for detection servers and is disabled by
default. You must enable token validation for each detection server. In addition you must match
on whole words for token validation to apply.
On the endpoint you can use whole word matching for CJK keywords.
Table 32-2 summarizes keyword matching use cases for CJK languages.

Table 32-2 Keyword matching use cases for CJK languages

Detection component Use case

Server Enable token verification on the detection server and use whole word matching

See “Enabling and using CJK token verification for server keyword matching” on page 847.

Endpoint Use whole word matching

See “Keyword matching examples for CJK languages” on page 842.


About keyword proximity


Using keyword proximity, a policy author can define a pair of keywords and specify a word
range between them. If the words occur within that range, a match is triggered. For example,
an instance of the Content Matches Keyword condition might require that any instance of
the words “confidential” and “information” occurring within 10 words of each other triggers a
match.
Alternatively, you can use keyword proximity to exclude matching words within a specified
distance by using the Content Matches Keyword condition as a detection exception. In this
case any occurrence of the words “confidential” and “information” within 10 words of each other
is excepted from matching.
For Chinese, Japanese, and Korean (CJK) languages, a single CJK character is counted as
one word.
See “Keyword matching syntax” on page 840.
See “Keyword matching examples” on page 841.
See “Configuring the Content Matches Keyword condition” on page 844.
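The proximity rule can be sketched in Python as follows. This is a simplified illustration rather than the detection engine's algorithm. It assumes that words are split on any character other than A-Z, a-z, and 0-9, that matching is case-insensitive, that the pair may occur in either order, and that the word distance counts only the words between the two keywords. The function names are hypothetical.

```python
import re


def tokenize(text):
    """Split text into lowercase words on non-alphanumeric characters."""
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]


def proximity_match(text, first, second, distance):
    """Return True if the two keywords occur with at most `distance`
    words between them (the keywords themselves are not counted)."""
    words = tokenize(text)
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        i != j and abs(i - j) - 1 <= distance
        for i in first_pos
        for j in second_pos
    )


# Ten filler words between the pair: a 12-word window in total.
within_range = "confidential " + " ".join(["x"] * 10) + " information"
# Eleven filler words between the pair: outside a distance of 10.
out_of_range = "confidential " + " ".join(["x"] * 11) + " information"
```

Here `proximity_match(within_range, "confidential", "information", 10)` is true and the same call on `out_of_range` is false, because the distance excludes the two keywords.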

Keyword matching syntax


When you define a keyword rule, the system evaluates every keyword in the condition list
against each message component (header, subject, body, attachment).
Consider the following syntactical guidelines when creating keyword lists.

Table 32-3 Keyword matching syntax

Behavior Description

Whole word matching With whole word matching, keywords match at word boundaries only (\W in the regular
expression lexicon). Any characters other than A-Z, a-z, and 0-9 are interpreted as word
boundaries.

With whole word matching, keywords must have at least one alphanumeric character (a letter
or a number). A keyword consisting of only white-space characters, such as "..", is ignored.

Quotation marks Do not use quotation marks when you enter keywords or phrases because quotes are interpreted
literally and will be required in the match.

White space The system strips out the white space before and after keywords or key phrases. Each
white space within a keyword phrase is counted. In addition to actual spaces, all characters
other than A-Z, a-z, and 0-9 are interpreted as white spaces.

Case sensitivity The case sensitivity option that you choose applies to all keywords in the list for that condition.

Plurals and verb inflections All plurals and verb inflections must be specifically listed. If the
number of enumerations becomes complicated use the wildcard character (asterisk [*]) to detect a
keyword suffix (in whole word mode only).

Keyword phrases You can enter keyword phrases, such as social security number (without quotes). The system
looks for the entire phrase without returning matches on individual constituent words (such as
social or security).

Keyword variants The system only detects the exact keyword or key phrase, not variants. For example, if you
specify the key phrase social security number, detection does not match a phrase that
contains two spaces between the words.

Matching multiple keywords The system implies an OR between keywords. That is, a message
component matches if it contains any of the keywords, not necessarily all of them. To perform an
ALL (or AND) keyword match, combine multiple keyword conditions in a compound rule or exception.

Alpha-numeric characters During keyword matching, only a letter or a digit is considered a valid
keyword start position. Special characters (non-alphanumeric) are treated as delimiters (ignored).
For example, the ampersand character ("&") and the underscore character ("_") are special
characters and are not considered for keyword start position.

For example, consider the following:

____keyword__

Keyword

&&akeyword&&

123Keyword__

For these examples, the valid keyword start positions are as follows: k, K, a, and 1.
Note: This same behavior applies to keyword validators implemented in data identifiers.

Proximity The word distance (proximity value) is exclusive of detected keywords. Thus, a word distance
of 10 allows for a proximity window of 12 words.
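The keyword start position rule from the "Alpha-numeric characters" row can be illustrated with a short Python sketch. It is an approximation for illustration only; the helper name `keyword_start` is hypothetical, and the sketch simply finds the first letter or digit in each sample string from the table.

```python
import re


def keyword_start(text):
    """Return the first letter or digit in text, or None if there is none.
    Non-alphanumeric characters are treated as delimiters and skipped."""
    found = re.search(r"[A-Za-z0-9]", text)
    return found.group(0) if found else None


samples = ["____keyword__", "Keyword", "&&akeyword&&", "123Keyword__"]
starts = [keyword_start(s) for s in samples]  # ['k', 'K', 'a', '1']
```

The result agrees with the start positions listed in the table: k, K, a, and 1.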

Keyword matching examples


To implement keyword matching, you can enter one or more keywords or phrases, each
separated by a comma or newline character. You can match on whole or partial words, and
specify case sensitivity. You can use the asterisk (*) wildcard character to detect a keyword
suffix (in whole word mode only).
See “Keyword matching syntax” on page 840.

Table 32-4 Keyword matching examples

Keyword type: keyword
Keyword(s): confidential
Matches:
■ confidential
■ -confidential;
■ ®"confidential"
■ ®Confidential
■ ®CONFIDENTIAL
Does Not Match:
■ confidentially (in whole word mode only, otherwise it would match)

Keyword type: key phrase
Keyword(s): internal use only
Matches:
■ internal use only
■ internal use ONLY (if case insensitive is selected)
Does Not Match:
■ internal use

Keyword type: keyword list
Keyword(s):
■ Newline delimited:
  hack
  hacker
  hacks
■ Comma delimited: hack, hacker, hacks
Matches:
■ hacks
■ hack
■ hacker
Does Not Match:
■ hackers
■ shack

Keyword type: keyword with wildcard
Keyword(s): priv*
Matches:
■ private
■ privilege
■ privy
■ privity
■ privs
■ priv
Does Not Match:
■ prize
■ prevent

Keyword type: keyword dictionary
Keyword(s): account number, account ps, american express, americanexpress, amex, bank
card, bankcard, card num, card number, cc #, cc#, ccn, check card, checkcard,
credit card, credit card #, credit card number, credit card#, debit card,
debitcard, diners club, dinersclub, discover, enroute, japanese card bureau,
jcb, mastercard, mc, visa, (etc....)
Matches: If any keyword or phrase is present, the data is matched:
■ amex
■ credit card
■ mastercard
Does Not Match:
■ amx
■ creditcard
■ master card
■ car
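The examples in Table 32-4 can be approximated with the following Python sketch. It is not the product's matching engine. It assumes that a word boundary is any character other than A-Z, a-z, and 0-9, and that a trailing asterisk matches zero or more further alphanumeric characters (so that priv* also matches priv, as the table shows). The names `parse_keywords`, `keyword_pattern`, and `matches` are hypothetical.

```python
import re

ALNUM = "A-Za-z0-9"


def parse_keywords(raw, separator="\n"):
    """Split a keyword list and strip white space around each entry."""
    return [k.strip() for k in raw.split(separator) if k.strip()]


def keyword_pattern(keyword, whole_word=True, case_sensitive=False):
    """Build a regular expression for one keyword or key phrase."""
    wildcard = keyword.endswith("*")
    if wildcard and not whole_word:
        raise ValueError("the * wildcard requires whole word matching")
    body = re.escape(keyword.rstrip("*"))
    if wildcard:
        body += f"[{ALNUM}]*"  # suffix characters covered by the wildcard
    if whole_word:
        # Any non-alphanumeric character (or the text edge) is a boundary.
        body = f"(?<![{ALNUM}]){body}(?![{ALNUM}])"
    return re.compile(body, 0 if case_sensitive else re.IGNORECASE)


def matches(text, keywords, **options):
    """An OR is implied between keywords: any single hit is a match."""
    return any(keyword_pattern(k, **options).search(text) for k in keywords)


hack_list = parse_keywords("hack, hacker, hacks", separator=",")
```

For example, `matches("hackers", hack_list)` is false and `matches("hacks", hack_list)` is true, while `matches("confidentially", ["confidential"], whole_word=False)` is true because partial word matching is in effect.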

Keyword matching examples for CJK languages


Table 32-5 provides keyword matching examples for Chinese, Japanese, and Korean
languages. All examples assume that the keyword condition is configured to match on whole
words only.

If token verification is enabled, the message size must be sufficient for the token validator to
recognize the language. For example, the message “東京都市部の人口” is too short for the
token validation process to recognize the language of the message. The
following message is a sufficient size for token validation processing:
今朝のニュースによると東京都市部の人口は増加傾向にあるとのことでした。 全国的な人口
減少の傾向の中、東京への一極集中を表しています。
See “About keyword matching for Chinese, Japanese, and Korean (CJK) languages”
on page 839.
Token validation for CJK language keywords is not available on the endpoint. To match CJK
on the endpoint, you configure the condition to match on whole words only.

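Table 32-5 Keyword matching examples for CJK

Language: Chinese
Keyword: 通信
Matches on server with token validation ON: 数字无线通信
Matches on server with token validation OFF: 数字无线通信, 交通信息网站
Matches on endpoint: 数字无线通信, 交通信息网站

Language: Japanese
Keyword: 京都市
Matches on server with token validation ON: 京都府京都市左京区
Matches on server with token validation OFF: 京都府京都市左京区, 東京都市部の人口
Matches on endpoint: 京都府京都市左京区, 東京都市部の人口

Language: Korean
Keyword: 정부
Matches on server with token validation ON: 정부의 방침
Matches on server with token validation OFF: 정부의 방침, 의정부 경전철
Matches on endpoint: 정부의 방침, 의정부 경전철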

About updates to the Drug, Disease, and Treatment keyword lists


The Drug, Disease, and Treatment keyword lists are updated with current terminology based
on information from the U.S. Food and Drug Administration (FDA) and other sources. The Drug,
Disease, and Treatment keyword lists are used by the HIPAA and HITECH (including
PHI) and Caldicott Report policy templates.
When you upgrade your Data Loss Prevention system, the generic, system-defined HIPAA
and Caldicott policy templates are updated with the recent Drug, Disease, and Treatment
keyword lists. However, policies you have created based on the HIPAA or Caldicott policy
templates are not automatically updated. This behavior is expected so that any changes or
customizations you have made to your HIPAA or Caldicott policy templates are not overwritten
by updates to the system-defined templates. Updating the Drug, Disease, and Treatment
keyword lists for your HIPAA and Caldicott policy templates is a manual process that you
should perform to ensure your HIPAA or Caldicott policies are up to date.
See “Updating the Drug, Disease, and Treatment keyword lists for your HIPAA and Caldicott
policies” on page 848.
See “Keep the keyword lists for your HIPAA and Caldicott policies up to date” on page 850.
See “HIPAA and HITECH (including PHI) policy template” on page 1690.
See “Caldicott Report policy template” on page 1561.

Configuring keyword matching


Table 32-6 describes the components for implementing keyword matching.

Table 32-6 Implementing keyword matching

Keyword matching feature Description

Match on whole or partial keywords and key phrases Separate each keyword or phrase by a newline or comma.
See “Keyword matching examples” on page 841.

Match on the wildcard asterisk (*) character Match the wildcard at the end of a keyword, in whole word mode only.
See “Keyword matching examples” on page 841.

Keyword proximity matching Match across a range of keywords.

See “About keyword proximity” on page 840.

Find keywords Implement one or more keywords in data identifiers to refine the scope of
detection.

See “Introducing data identifiers” on page 717.

Policy rules and exceptions You can implement keyword matching conditions in policy rules and exceptions.

See “Configuring the Content Matches Keyword condition” on page 844.

Cross-component matching Keyword matching detects on one or more message components.

See “Detection messages and message components” on page 391.

Keyword dictionary If you have a large dictionary of keywords, you can index the keyword list.

See “Use VML to generate and maintain large keyword dictionaries” on page 851.

CJK token verification Enable on the detection server for CJK languages and match on whole words
only.

See Table 32-2 on page 839.

Configuring the Content Matches Keyword condition


The Content Matches Keyword condition lets you match content using keywords and key
phrases.
See “Introducing keyword matching” on page 838.


You can implement keyword matching conditions in policy rules and exceptions.
See “Configuring policies” on page 413.
To configure the Content Matches Keyword condition
1 Add a new keyword condition to a policy rule or exception, or modify an existing one.
See “Configuring policy rules” on page 417.
See “Configuring policy exceptions” on page 426.
2 Configure the keyword matching parameters.
See Table 32-7 on page 845.
See “Keyword matching syntax” on page 840.
3 Save the policy.

Table 32-7 Configure the Content Matches Keyword condition

Action Description

Enter the match type. Select if you want the keyword match to be:

Case Sensitive or Case Insensitive

Case insensitive is the default.

Choose the keyword separator. Select the keyword separator you want to use to delimit multiple keywords:
Newline or Comma.

Newline is the default.

Match any keyword. Enter the keyword(s) or key phrase(s) you want to match. Use the separator you have selected
(newline or comma) to delimit multiple keyword or key phrase entries.

You can use the asterisk (*) wildcard character at the end of any keyword to match one or more
suffix characters in that keyword. If you use the asterisk wildcard character, you must match
on whole words only. For example, a keyword entry of confid* would match on "confidential"
and "confide," but not "confine." As long as the keyword prefix matches, the detection engine
matches on the remaining characters using the wildcard.

See “Keyword matching syntax” on page 840.

See “Keyword matching examples” on page 841.


Configure keyword proximity matching (optional). Keyword proximity matching lets you specify a
range of detection among keyword pairs.
See “About keyword proximity” on page 840.
To implement keyword proximity matching:

■ Select (check) the Keyword Proximity matching option in the "Conditions" section of the
rule builder interface.
■ Click Add Pair of Keywords.
■ Enter a pair of keywords.
■ Specify the Word distance.
The maximum distance between keywords is 999, as limited by the three-digit length of the
“Word distance” field. The word distance is exclusive of detected keywords. For example,
a word distance of 10 allows for a range of 12 words, including the two words comprising
the keyword pair.
■ Repeat the process to add additional keyword pairs.
The system connects multiple keyword pair entries with the OR Boolean operator, meaning that
the detection engine evaluates each keyword pair independently.

Match on whole or partial keywords. Select the option On whole words only to match on whole
keywords only (by default this option is selected).

You must match on whole words only if you use the asterisk (*) wildcard character in any
keyword you enter in the list.

See “Keyword matching examples” on page 841.


You must match on whole words only if you have enabled token validation for the server.

See “Keyword matching examples for CJK languages” on page 842.

Configure match conditions. Keyword matching lets you specify how you want to count condition matches.
Select one of the following options:

■ Check for existence


The system reports one incident for all matches.
■ Count all matches and only report incidents with at least 1 matches (default)
With the default setting the system reports one incident for each match. Alternatively, you
can configure the match threshold by changing the default value from 1 to another value.

See “Configuring match counting” on page 421.


Select components to match on. Keyword matching detection supports matching across message components.
See “Selecting components to match on” on page 423.
Select one or more message components to match on:

■ Envelope – Header metadata used to transport the message


■ Subject – Email subject of the message (only applies to SMTP)
■ Body – The content of the message
■ Attachments – Any files attached to or transferred by the message

Note: On the endpoint the DLP Agent matches on the entire message, not individual
components.

See “Detection messages and message components” on page 391.

Also match one or more additional conditions. Select this option to create a compound condition.
All conditions must be met to report a match.
You can Add any available condition from the list.
See “Configuring compound match conditions” on page 429.

Enabling and using CJK token verification for server keyword matching
To use token verification for Chinese, Japanese, and Korean (CJK) languages you must enable
it on the server and you must use whole word matching for the keyword condition. In addition,
there must be a sufficient amount of message text for the system to recognize the language.
See “Keyword matching examples for CJK languages” on page 842.
Table 32-8 lists and describes the detection server parameter that lets you enable token
verification for CJK languages.

Table 32-8 Keyword token verification parameter

Setting Default Description

Keyword.TokenVerifierEnabled false Default is disabled ("false").

If enabled ("true"), the server validates tokens for Chinese,


Japanese, and Korean language keywords.

Enable keyword token verification for CJK describes how to enable and use token verification
for CJK keywords.
Enable keyword token verification for CJK


1 Log on to the Enforce Server as an administrative user.
2 Navigate to the System > Servers and Detectors > Overview > Server/Detector Detail
- Advanced Settings screen for the detection server or detector you want to configure.
See “Advanced server settings” on page 285.
3 Locate the parameter Keyword.TokenVerifierEnabled.
4 Change the value from false (default) to true.
Setting the server parameter Keyword.TokenVerifierEnabled = true enables token
validation for CJK keyword detection.
5 Save the detection server configuration.
6 Recycle the detection server.
7 Configure a keyword condition using whole word matching.
In the condition, the On whole words only option must be checked.
See “Configuring the Content Matches Keyword condition” on page 844.

Updating the Drug, Disease, and Treatment keyword lists for your
HIPAA and Caldicott policies
If you have created a policy derived from the HIPAA or Caldicott template and have not made
any changes or customizations to the derived policy, after upgrade you can create a new policy
from the appropriate template and remove the old policy from production. If you have made
changes to a policy derived from either the HIPAA or Caldicott policy template and you want
to preserve these changes, you can copy the updated keyword lists from either the HIPAA or
Caldicott policy template and use the copied keyword lists to update your HIPAA or Caldicott
policies.
See “About updates to the Drug, Disease, and Treatment keyword lists” on page 843.
See “Keep the keyword lists for your HIPAA and Caldicott policies up to date” on page 850.
To update the Drug, Disease, and Treatment keyword lists for HIPAA and Caldicott policies
provides instructions for updating the keyword lists for your HIPAA and Caldicott policies.
To update the Drug, Disease, and Treatment keyword lists for HIPAA and Caldicott policies
1 Create a new policy from a template and choose either the HIPAA or Caldicott template.
See “Creating a policy from a template” on page 397.
2 Edit the detection rules for the policy.
See “Configuring policy rules” on page 417.
3 Select the Patient Data and Drug Keywords (Keyword Match) rule.
4 Select the Content Matches Keyword condition.
5 Select all the keywords in the Match any Keyword data field and copy them to the
Clipboard.
6 Paste the copied keywords to a text file named Drug Keywords.txt.
7 Cancel the rule edit operation to return to the policy Detection tab.
8 Repeat the same process for the Patient Data and Treatment Keywords (Keyword
Match) rule.
9 Copy and paste the keywords from the condition to a text file named Treatment Keywords.txt.
10 Repeat the same process for the Patient Data and Disease Keywords (Keyword Match)
rule.
11 Copy and paste the keywords from the condition to a text file named Disease Keywords.txt.

12 Update your HIPAA and Caldicott policies derived from the HIPAA or Caldicott templates
using the keyword *.txt files you created.
13 Test your updated HIPAA and Caldicott policies.
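Before you paste the copied lists into a customized policy, it can help to see what actually changed. The following Python sketch compares the keyword list from an existing policy with the list copied from the updated template. It runs outside the product and is only a convenience. The function names and the sample keywords are illustrative assumptions, and the comparison ignores case and surrounding white space.

```python
def parse_list(raw):
    """Parse a newline- or comma-delimited keyword list into a set."""
    items = raw.replace("\n", ",").split(",")
    return {item.strip().lower() for item in items if item.strip()}


def diff_keywords(current_raw, updated_raw):
    """Return (added, removed) keywords between two lists, each sorted."""
    current = parse_list(current_raw)
    updated = parse_list(updated_raw)
    return sorted(updated - current), sorted(current - updated)


# Made-up sample data standing in for the contents of the text files.
current_policy_keywords = "aspirin, ibuprofen"
template_keywords = "aspirin\nnaproxen"

added, removed = diff_keywords(current_policy_keywords, template_keywords)
```

In practice you would read the two strings from the text files created in the steps above, for example Drug Keywords.txt and a file holding the keywords from your existing policy.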

Best practices for using keyword matching


The Content Matches Keyword condition lets you match content using keywords, key phrases,
and keyword lists or dictionaries. On the server, the keyword rule matches on the header,
subject, body and attachment message components, and it supports cross-component matching.
On the endpoint the keyword condition matches on the entire message.
Table 32-9 summarizes the keyword matching best practices in this section.

Table 32-9 Summary of keyword matching best practices

Best practice More information

Enable linguistic validation for CJK keyword detection on the server.
See “Enable token verification on the server to reduce false positives for CJK keyword detection”
on page 850.

Update keyword lists for your Caldicott and HIPAA policies.
See “Keep the keyword lists for your HIPAA and Caldicott policies up to date” on page 850.

Tune keyword validators to improve data identifier accuracy.
See “Tune keywords lists for data identifiers to improve match accuracy” on page 851.

Use VML to profile long keyword lists and dictionaries.
See “Use VML to generate and maintain large keyword dictionaries” on page 851.

Use keyword matching for metadata detection.
See “Use keyword matching to detect document metadata” on page 851.

Enable token verification on the server to reduce false positives for CJK keyword detection
Symantec Data Loss Prevention provides token validation for Chinese, Japanese, and Korean
(CJK) languages. Token validation is supported for detection servers and must be enabled.
See “About keyword matching for Chinese, Japanese, and Korean (CJK) languages”
on page 839.
Token validation lets you match CJK keywords using whole word matching, and improves
overall match accuracy for CJK languages. Although there may be a slight performance hit,
you should enable token verification for each detection server where CJK keyword conditions
are deployed. Once enabled you can use whole word matching for CJK keywords.
See “Enabling and using CJK token verification for server keyword matching” on page 847.

Keep the keyword lists for your HIPAA and Caldicott policies up to
date
For each Symantec Data Loss Prevention release, the Drug, Disease, and Treatment keyword
lists are updated based on information from the U.S. Food and Drug Administration (FDA) and
other sources. These keyword lists are used in the HIPAA and HITECH (including PHI) and
Caldicott Report policy templates.
See “About updates to the Drug, Disease, and Treatment keyword lists” on page 843.
If you have upgraded to the latest Data Loss Prevention version and you have existing policies
derived from either the HIPAA or Caldicott policy template, consider updating your HIPAA and
Caldicott policies to use the Drug, Disease, and Treatment keyword lists provided with this
Data Loss Prevention version.
See “Updating the Drug, Disease, and Treatment keyword lists for your HIPAA and Caldicott
policies” on page 848.
Tune keywords lists for data identifiers to improve match accuracy


Many data identifier definitions contain required keyword validators with pre-populated keyword
lists. In addition, you can add your own list of keywords to a data identifier rule. The best
practice is to tune the keyword list using a keyword matching condition before you add the keyword
list to the data identifier condition as a required or optional validator.
See “Using pattern validators” on page 818.
To tune the keyword list, take the keywords you want to use for the validator and put them into
a separate keyword matching rule condition and policy. Then test the policy using data that
should and should not match the keywords. The keyword rule will let you see match highlighting
and tune the keyword list. Once tested, you can add the keywords to the data identifier and
then test the data identifier policy to ensure accuracy.

Use keyword matching to detect document metadata


Symantec Data Loss Prevention supports metadata detection for certain document formats,
such as DOCX and PDF. Detection servers and DLP Agents support metadata detection.
If you want to detect document metadata, the recommendation is to enable it for the server or
endpoint and use the Content Matches Keyword condition to match metadata tags.

Use VML to generate and maintain large keyword dictionaries


Sometimes you may want to protect a long list or dictionary of keywords. An example might
be a list of project code names. You can use Vector Machine Learning (VML) to automate the
detection of long keyword lists that are difficult to generate, tune, and maintain. For example,
you could generate a VML profile based on a collection of documents containing the keywords
you want to detect. If you want to detect common words, remove them from the VML stopword
file.
See “Best practices for using VML” on page 687.
Chapter 33
Detecting content using regular expressions
This chapter includes the following topics:

■ Introducing regular expression matching

■ About the updated regular expression engine

■ About writing regular expressions

■ Configuring the Content Matches Regular Expression condition

■ Best practices for using regular expression matching

Introducing regular expression matching


Data Loss Prevention provides the Content Matches Regular Expression policy match
condition to match message content using the regular expression pattern language.
Regular expressions provide a mechanism for identifying strings of text, such as particular
characters, words, or patterns of characters. You can use the regular expression condition to
match (or exclude from matching) characters, patterns, and strings. Unique match counting
is supported for regular expressions.
See “Using unique match counting” on page 775.
See “Configuring the Content Matches Regular Expression condition” on page 854.
See “Best practices for using regular expression matching” on page 855.
About the updated regular expression engine


Detection servers and endpoint agents use a common regular expression engine. This common
engine performs regular expression evaluation at a faster rate than previous engines. You will
also notice performance improvements when you have DLP policy sets with many regex rules,
since adding more rules does not incur much of a performance cost.

About writing regular expressions


Symantec Data Loss Prevention implements the PCRE-compatible regular expression syntax
for policy condition matching. Table 33-1 provides some reference constructs for writing regular
expressions to match or exclude characters in messages or message components.
See “Introducing regular expression matching” on page 852.

Note: Data Identifier pattern matching is based on the regular expression syntax. However,
not all regular expression constructs listed in the table below are supported by Data Identifier
patterns. See “About data identifier patterns” on page 732.

Table 33-1 Regular expression constructs

Regular expression construct Description

. Any single character (except for newline characters)


Note: The use of the dot (.) character is not supported for data identifier patterns.

\d Any digit (0-9)

\s Any white space

\w Any word character (a-z, A-Z, 0-9, _)


Note: The use of the \w construct does not match the underscore (_) character when
implemented in a data identifier pattern.

\D Anything other than a digit

\S Anything other than white space

[] Elements inside brackets are a character class (For example, [abc] matches 1 character:
a, b, or c.)

^ At the beginning of a character class, negates it (For example, [^abc] matches anything
except a, b, or c.)

+ Following a regular expression means 1 or more (For example, \d+ means 1 or more digits.)

? Following a regular expression means 0 or 1 (For example, \d? means 0 or 1 digit.)

* Following a regular expression means any number (For example, \d* means 0, 1, or more
digits.)

(?i) At the beginning of a regular expression makes the expression case-insensitive (Regular
expressions are case-sensitive by default.)

(?: ) Groups regular expressions together (The ?: is a slight performance enhancement.)

(?u) Makes a period (.) match even newline characters

| Means OR (For example, A|B means regular expression A or regular expression B.)
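
Several of the constructs in Table 33-1 can be tried outside the product before you add them to a policy. The following is a minimal sketch that uses Python's re module, which is an assumption made for illustration only: it is not the Data Loss Prevention detection engine, and it shows only constructs that behave the same way in Python. The sample strings are hypothetical.

```python
import re

# Hypothetical sample text; not taken from any real message.
text = "Order A-17 shipped to Bob_Smith on 2019-08-19, ref abc"

# \d+ : one or more digits
digits = re.findall(r"\d+", text)

# [abc] : exactly one character from the class a, b, or c
abc_chars = re.findall(r"[abc]", "cab!")

# [^abc] : any single character except a, b, or c
not_abc = re.findall(r"[^abc]", "cab!")

# (?i) : makes the whole expression case-insensitive
case_insensitive = re.findall(r"(?i)bob", text)

# (?: ) with | : a non-capturing group that contains an OR
either = re.findall(r"(?:shipped|returned)", text)

# \w : word characters, including the underscore in a plain regular expression
word = re.search(r"Bob\w+", text).group(0)

print(digits)            # ['17', '2019', '08', '19']
print(abc_chars)         # ['c', 'a', 'b']
print(not_abc)           # ['!']
print(case_insensitive)  # ['Bob']
print(either)            # ['shipped']
print(word)              # Bob_Smith
```

Because engines differ in detail, treat a result from a sketch like this as a first check and confirm the expression in a test policy.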

Configuring the Content Matches Regular Expression condition
You use the Content Matches Regular Expression condition to match (or exclude from
matching) characters, patterns, and strings using regular expressions.
See “Introducing regular expression matching” on page 852.
To configure the Content Matches Regular Expression condition
1 Add a Content Matches Regular Expression condition to a policy, or edit an existing
one.
See “Configuring policies” on page 413.
See “Configuring policy rules” on page 417.
See “Configuring policy exceptions” on page 426.
2 Configure the Content Matches Regular Expression condition parameters.
See Table 33-2 on page 855.
3 Save the policy configuration.

Table 33-2 Content Matches Regular Expression parameters

Action Description

Match regex. Specify a regular expression to be matched.


See “About writing regular expressions” on page 853.

Configure match counting.    Configure how you want to count matches.

See “Configuring match counting” on page 421.

Check for existence reports a match count of 1 if there are one or more matches. For
compound rules or exceptions, all conditions must be configured this way.

Count all matches reports the sum of all matches; applies if any condition uses this
parameter.

Match on one or more message components.    Configure cross-component matching by selecting one or more message components to match on.

■ Envelope – The header of the message, transport metadata.
■ Subject – The email subject (only applies to email messages).
■ Body – The content of the message.
■ Attachments – The content of any files that are attached to or transported by the
message.

See “Selecting components to match on” on page 423.

Also match one or more additional conditions.    Select this option to create a compound condition. All conditions must match to trigger or except an incident.

You can Add any available condition from the list.


See “Configuring compound match conditions” on page 429.
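
The two match counting options in Table 33-2 differ only in how the number of hits is reported. The following minimal sketch illustrates that difference with Python's re module; the helper name count_matches and the sample content are hypothetical, and the code is not how the product computes counts.

```python
import re

def count_matches(pattern: str, content: str, check_for_existence: bool) -> int:
    """Hypothetical helper that mimics the two counting options.

    check_for_existence=True  -> report 1 if there is at least one match, else 0
    check_for_existence=False -> report the total number of matches
    """
    total = len(re.findall(pattern, content))
    if check_for_existence:
        return 1 if total >= 1 else 0
    return total

sample = "IDs: 1001, 1002, 1003"  # hypothetical content

existence = count_matches(r"\d{4}", sample, check_for_existence=True)
all_matches = count_matches(r"\d{4}", sample, check_for_existence=False)

print(existence)    # 1
print(all_matches)  # 3
```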

Best practices for using regular expression matching


This section provides considerations for implementing the Content Matches Regular
Expression match condition in your Data Loss Prevention policies.
See “Introducing regular expression matching” on page 852.
Table 33-3 summarizes the regular expression matching best practices in this section.

Table 33-3 Regular expressions best practices

Best practice    Description

Use Data Identifiers instead of regular expressions where possible.    See “When to use regular expression matching” on page 856.

Use regular expressions sparingly to support efficient policy performance.    See “Use regular expressions sparingly to support efficient performance” on page 857.

Use look ahead and behind characters to improve regular expression performance.    See “Use look ahead and look behind characters to improve regular expression accuracy” on page 856.

Test regular expressions for accuracy and performance.    See “Test regular expressions before deployment to improve accuracy” on page 857.

When to use regular expression matching


Data Identifiers are more efficient than regular expressions because the Data Identifier patterns
are tuned for accuracy and the data is validated. For example, if you want to search for social
security numbers, use the US Social Security Number (SSN) Data Identifier instead of a regular
expression.
The regular expression condition is useful for matching or excepting unique data types for
which there are no system-provided Data Identifiers. Examples of these include internal account
numbers and data types that can vary greatly in length, such as email addresses.

Use look ahead and look behind characters to improve regular expression accuracy
Symantec Data Loss Prevention implements a significant enhancement to improve the
performance of regular expressions. To achieve improved regular expression performance,
the look ahead and look behind sections must exactly match one of the supported standard
sections.
Table 33-4 lists the standard look ahead and look behind sections that this performance
improvement supports. If either section differs even slightly, that section is executed as part
of the regular expression without the performance improvement.
See “About writing regular expressions” on page 853.

Table 33-4 Look ahead and look behind standard sections

Operation Construct

Look ahead (?=(?:[^-\w])|$)


Look behind (?<=(^|(?:[^)+\d][^-\w+])))

and

(?<=(^|(?:[^)+\d][^-\w+])|\t))
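
To see why boundary checks of this kind improve accuracy, consider the following minimal sketch in Python's re module. It is an illustration only: the boundary expressions are simplified stand-ins chosen so that they run in Python, they are not the standard sections from Table 33-4, and the six-digit pattern and sample text are hypothetical. In a policy, use the standard sections exactly as listed to get the performance improvement.

```python
import re

# Hypothetical content with one standalone six-digit number and two
# six-digit runs that are part of longer tokens.
text = "acct 123456, part 123456-77, id X123456"

# Without boundaries, every six-digit run matches.
plain = re.findall(r"\d{6}", text)

# Simplified boundaries (assumption, not the Table 33-4 sections):
#   (?<![-\w])    the match must not follow a hyphen or word character
#   (?=[^-\w]|$)  the match must be followed by a character that is not a
#                 hyphen or word character, or by the end of the content
bounded = re.findall(r"(?<![-\w])\d{6}(?=[^-\w]|$)", text)

print(plain)    # ['123456', '123456', '123456']
print(bounded)  # ['123456']
```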

Use regular expressions sparingly to support efficient performance


Regular expressions can be computationally expensive. If you add a regular expression
condition, observe the system for one hour. Make sure that the system does not slow down
and that there are no false positives.

Test regular expressions before deployment to improve accuracy


If you implement regular expression matching, consider using a third-party tool to test the
regular expressions before you deploy the policy rules to production. The recommended tool
is RegexBuddy. Another good tool for testing your regular expressions is RegExr.
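
In addition to an interactive tool, a small script can check an expression against samples that should and should not match. The following is a minimal sketch, assuming Python's re module as a stand-in for the detection engine; the account number format, the sample strings, and the helper name check are hypothetical.

```python
import re

# Hypothetical internal account number format: "ACCT-" followed by 8 digits.
ACCOUNT_PATTERN = re.compile(r"ACCT-\d{8}")

should_match = ["ACCT-12345678", "ref ACCT-00000001 closed"]
should_not_match = ["ACCT-1234", "ACCOUNT-12345678", "acct-12345678"]

def check(pattern, positives, negatives):
    """Return (missed positives, false positives) for the given samples."""
    missed = [s for s in positives if not pattern.search(s)]
    false_positives = [s for s in negatives if pattern.search(s)]
    return missed, false_positives

missed, false_positives = check(ACCOUNT_PATTERN, should_match, should_not_match)
print("missed:", missed)
print("false positives:", false_positives)
```

Both lists should be empty before you deploy the rule. Keep the sample lists with the policy documentation so that the same check can be repeated when the expression changes.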
Chapter 34
Detecting content using classification matching
This chapter includes the following topics:

■ Introducing classification matching

■ Supported file types

■ How tag matching works

■ Configuring the Content Matches Classification condition

Introducing classification matching


Symantec Data Loss Prevention provides the Content Matches Classification condition to
detect Information Centric Tagging tags that have been applied to various files and email
content.
A tag comprises three components: organization, scope, and sensitivity level. An example
could be: Symantec-Marketing-Confidential, or written in tag form, SYMC-MKTG-CONF. An
organization can be the entire company or logical divisions within one company. Scope is
typically a functional group, such as Payroll or Engineering. The level of the sensitivity of the
data being tagged ranges from 1 through 9, with 1 being the least sensitive. An ICT administrator
defines these tags and can use terms that make sense to the organization. For example, the
admin might call Level 1 PUBLIC, Level 4 CONFIDENTIAL, and Level 9 TOPSECRET. The
collection of all of the tags comprises the classification taxonomy.
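
As an illustration of the three-part structure, the following minimal sketch splits the tag form shown above into its components. It assumes that the hyphen-separated form (for example, SYMC-MKTG-CONF) is the only input; the class and function names are hypothetical, and the sketch does not describe how tags are stored in file or email metadata.

```python
from typing import NamedTuple

class ClassificationTag(NamedTuple):
    organization: str
    scope: str
    sensitivity: str

def parse_tag(tag_text: str) -> ClassificationTag:
    """Split a hyphen-separated tag such as SYMC-MKTG-CONF into three parts."""
    parts = tag_text.split("-")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected ORGANIZATION-SCOPE-SENSITIVITY, got {tag_text!r}")
    return ClassificationTag(*parts)

tag = parse_tag("SYMC-MKTG-CONF")
print(tag.organization, tag.scope, tag.sensitivity)  # SYMC MKTG CONF
```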
To make use of this ICT taxonomy in Data Loss Prevention, you must import it into the Data
Loss Prevention database. The taxonomy is then available to you as you define your detection
rule with the Content Matches Classification option.
In the Conditions area for this rule option, you have three choices for detection criteria:
Content is classified, Content is not classified, and Content matches. If you choose
Content matches, the taxonomy is available to you to select from drop-down menus under
Organization, Scope, and Level. You can also select Any organization or scope. To complete
the detection formula, you choose the search Operator, such as Not Equals or Is Less Than
or Equals. Multiple operators can be combined ("OR'd" together).

Note: The Content is classified expression is triggered only if the classified file or email
message has been classified within the imported taxonomy. If a file or email message has
been classified using some other taxonomy that has not been imported into Enforce, then this
expression does not evaluate as true. Similarly, something that has been classified within
another Information Centric Tagging taxonomy that is not known to Enforce evaluates as
Content is not classified.

To detect these tags, the Data Loss Prevention detection engine searches the metadata of
supported emails and files. The search finds only tags that end users applied to emails and
files before the search runs.
See “About integrating Information Centric Tagging with Data Loss Prevention” on page 226.

Supported file types


Data Loss Prevention searches for tags only in supported file types and email messages. For
the supported file types, see Table 34-1.

Table 34-1 Supported file types for classification matching

File type Supported formats

Microsoft Office ■ Pre-Office 2007 (CFB)


■ Office 2007 and later (XML)

Images .png, .gif

PDF .pdf

Files include email attachments.

No tag detection takes place:


■ On file types natively supported by Information Centric Tagging, but unreadable by Data
Loss Prevention (.jpg, .tiff).
■ Against file types not natively supported by Information Centric Tagging where the
classification tag resides in the Alternate Data Stream.
■ On encrypted data, unless DLP is configured to inspect Microsoft Rights Management
protected files, which include Microsoft Office and PDF documents for policy evaluation.

Note: Even though tags can be detected in the (unencrypted) metadata, a common scenario
for using the Content Matches Classification option is to join this option with other options,
such as using keyword matching or regular expressions to detect sensitive content, such
as Social Security Numbers. Then, if a file is detected with a Level 1 (PUBLIC) tag, for
example, but the document content is sensitive, an incident could be generated. If content
is encrypted, that type of policy using compound rules fails.

How tag matching works


For the Content Matches Classification option, you have three choices:
■ Content is classified
■ Content is not classified
■ Content matches (Select Operator, Select Organization, Select Scope, Select Level)
To understand how tag matching works when supported email or file types are searched, see
the appropriate table below.

Note: In the tables, the term this taxonomy refers to the taxonomy that's been
imported/synchronized on this Enforce Server.

Table 34-2 Search results for the Content is classified condition

Incidents are generated when Incidents are not generated when

The tag belongs to this taxonomy. ■ The tag belongs to a different taxonomy.
■ There is no classification tag applied to the
content.
■ The tag is in the wrong format.

Table 34-3 Search results for the Content is not classified condition

Incidents are generated when Incidents are not generated when

■ The tag belongs to a different taxonomy. The tag belongs to this taxonomy.
■ There is no classification tag applied to the
content.
■ The tag is in the wrong format.

Table 34-4 Search results for the Content matches [specific operator and selected tags]
condition

Incidents are generated when Incidents are not generated when

The ICT tag matches the criteria. ■ The tag in this taxonomy does not match the
criteria.
■ The tag belongs to a different taxonomy.
■ There is no classification tag applied to the
content.
■ The tag is in the wrong format.
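
The outcomes in Table 34-2 and Table 34-3 can be summarized as a membership test against the imported taxonomy. The following minimal sketch models that reading; the taxonomy contents, the tuple representation of a tag, and the function names are hypothetical, and None stands in for both a missing tag and a tag in the wrong format.

```python
from typing import Optional, Set, Tuple

Tag = Tuple[str, str, int]  # (organization, scope, level)

# Hypothetical stand-in for the taxonomy imported on this Enforce Server.
THIS_TAXONOMY: Set[Tag] = {("CORE", "FIN", 5), ("CORE", "HR", 1)}

def content_is_classified(tag: Optional[Tag]) -> bool:
    """True only when a well-formed tag belongs to the imported taxonomy."""
    return tag is not None and tag in THIS_TAXONOMY

def content_is_not_classified(tag: Optional[Tag]) -> bool:
    """True for no tag, a malformed tag (None), or a tag from another taxonomy."""
    return not content_is_classified(tag)

known = ("CORE", "FIN", 5)
foreign = ("OTHER", "ENG", 3)  # tag from a taxonomy that was not imported

print(content_is_classified(known))        # True
print(content_is_classified(foreign))      # False
print(content_is_not_classified(foreign))  # True
print(content_is_not_classified(None))     # True
```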

Table 34-5 lists an example of an imported classification taxonomy, displayed on the System
> Settings > Information Centric Tagging page.
Table 34-6 shows the results of running various combinations of operators and tag selections
against that taxonomy, either from the Configure Policy - Add Rule page or the Configure
Policy - Edit Rule page, when defining a detection rule of Content Matches Classification
type.

Table 34-5 Sample imported ICT classification taxonomy

Organization   Scope   Sensitivity Level

CLOUD          ENG     CONFID 4
                       RESTRICT 3
                       INTERNAL 2

CORE           FIN     SECRET 5
               HR      PUB 1
               MKTG    CONFID 4
                       PUB 1

OUTSRC         ENG     SECRET 4
                       CONFID 3
                       DEPTONLY 2

Table 34-6 Incidents that evaluate to true, based on operator and matching requirements

Operator                    Organization   Scope   Level

Equals                      CLOUD          Any     2

Evaluates to true if content classified as:

                                                   (2) INTERNAL

Equals                      CORE           Any     (4) CONFID

Evaluates to true if content classified as:

                            CORE           MKTG    (4) CONFID

Not Equals                  CORE           MKTG    1

Evaluates to true if content classified as:

                            CLOUD          ENG     (4) CONFID
                            CLOUD          ENG     (3) RESTRICT
                            CLOUD          ENG     (2) INTERNAL
                            CORE           FIN     (5) SECRET
                            OUTSRC         ENG     (4) SECRET
                            OUTSRC         ENG     (3) CONFID
                            OUTSRC         ENG     (2) DEPTONLY

Is Less Than or Equals      CORE           FIN     (5) SECRET

Evaluates to true if content classified as:

                            CORE           FIN     (5) SECRET

Is Greater Than or Equals   OUTSRC         ENG     (3) CONFID

Evaluates to true if content classified as:

                            OUTSRC         ENG     (4) SECRET
                            OUTSRC         ENG     (3) CONFID
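
The Equals, Is Less Than or Equals, and Is Greater Than or Equals rows of Table 34-6 can be reproduced with a short sketch over the sample taxonomy in Table 34-5. The reading of the operators used here (filter on Organization and Scope first, then compare the numeric level) is an assumption inferred from those rows; the Not Equals operator is not modeled, and the function and variable names are hypothetical.

```python
from typing import List, Optional, Tuple

Row = Tuple[str, str, str, int]  # (organization, scope, sensitivity name, level)

# Sample taxonomy rows as read from Table 34-5.
TAXONOMY: List[Row] = [
    ("CLOUD", "ENG", "CONFID", 4),
    ("CLOUD", "ENG", "RESTRICT", 3),
    ("CLOUD", "ENG", "INTERNAL", 2),
    ("CORE", "FIN", "SECRET", 5),
    ("CORE", "HR", "PUB", 1),
    ("CORE", "MKTG", "CONFID", 4),
    ("CORE", "MKTG", "PUB", 1),
    ("OUTSRC", "ENG", "SECRET", 4),
    ("OUTSRC", "ENG", "CONFID", 3),
    ("OUTSRC", "ENG", "DEPTONLY", 2),
]

ANY = None  # stands in for the Any selection

def matches(tag: Row, operator: str, organization: Optional[str],
            scope: Optional[str], level: int) -> bool:
    """Assumed semantics: the level is compared only after Organization and Scope match."""
    org, scp, _name, lvl = tag
    if organization is not ANY and org != organization:
        return False
    if scope is not ANY and scp != scope:
        return False
    if operator == "Equals":
        return lvl == level
    if operator == "Is Less Than or Equals":
        return lvl <= level
    if operator == "Is Greater Than or Equals":
        return lvl >= level
    raise ValueError(f"operator not modeled in this sketch: {operator}")

def evaluate(operator, organization, scope, level):
    return [t for t in TAXONOMY if matches(t, operator, organization, scope, level)]

equals_cloud = evaluate("Equals", "CLOUD", ANY, 2)
equals_core = evaluate("Equals", "CORE", ANY, 4)
le_core_fin = evaluate("Is Less Than or Equals", "CORE", "FIN", 5)
ge_outsrc = evaluate("Is Greater Than or Equals", "OUTSRC", "ENG", 3)

print(equals_cloud)  # [('CLOUD', 'ENG', 'INTERNAL', 2)]
print(equals_core)   # [('CORE', 'MKTG', 'CONFID', 4)]
print(le_core_fin)   # [('CORE', 'FIN', 'SECRET', 5)]
print(ge_outsrc)     # [('OUTSRC', 'ENG', 'SECRET', 4), ('OUTSRC', 'ENG', 'CONFID', 3)]
```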

Configuring the Content Matches Classification condition
To configure the Content Matches Classification condition
1 Add a Content Matches Classification condition to a policy, or edit an existing one.
2 In the Conditions area, set the parameters:
■ Configure the Content Matches Classification condition (Table 34-7).
■ For the Matches on parameter, Envelope and Attachments are always selected;
Subject and Body are never selected.

3 Save the policy.



Table 34-7 Content Matches Classification parameters

Parameter                   Description

Content is classified       See “How tag matching works” on page 860.

Content is not classified   See “How tag matching works” on page 860.

Content matches:            See “How tag matching works” on page 860.

Select Operator             Choose an operator: Equals, Not Equals, Is Less Than or Equals, or Is Greater Than or Equals.

                            Note: As your ICT classification taxonomy evolves, using "Is less than..." or "Is greater than..." makes your detection rule option more durable. These comparative terms allow for searching current and future taxonomies. If you write every rule using "Equals," you may have to revise your rules often.

Select Organization         Choose an Organization from the drop-down menu, which contains the organizations imported in the ICT taxonomy. You can also choose Any.

                            Note that the term Organization in Data Loss Prevention is called Company in Information Centric Tagging.

Select Scope                Choose a Scope from the drop-down menu, which contains the scopes imported in the ICT taxonomy. You can also choose Any.

Select Level                Choose a Level from the drop-down menu, which contains the sensitivity levels imported in the ICT taxonomy.

                            ■ If you selected a specific Organization and Scope, the Level menu includes the sensitivity level and name, such as (1) PUBLIC, (4) CONF, and (9) TOPSECRET.
                            ■ If you selected Any Organization and Scope, the Level menu displays only the level numbers, 1 - 9, since the level names could differ among scopes.
                            ■ The Level is compared only if the Organization and Scope requirements are met.

OR                          Click OR to add another selection of Operator, Organization, Scope, and Level. You can add multiple OR statements for one rule. OR statements are evaluated individually; they do not all need to be true to create an incident.
Chapter 35
Detecting international language content
This chapter includes the following topics:

■ Detecting non-English language content

■ Best practices for detecting non-English language content

Detecting non-English language content


Symantec Data Loss Prevention detection features support many localized versions of Microsoft
Windows operating systems. To use international character sets, the Windows system on
which you view the Enforce Server administration console must have the appropriate
capabilities.
See “About support for character sets, languages, and locales” on page 91.
See “Working with international characters” on page 93.
You can create policies and detect violations using any supported language. You can use
localized keywords, regular expressions, and Data Profiles to detect data loss. In addition,
Symantec Data Loss Prevention offers several international data identifiers and policy templates
for protecting confidential data.
See “Supported languages for detection” on page 92.
See “Use international policy templates for policy creation” on page 867.
See “Use custom keywords for system data identifiers” on page 869.

Best practices for detecting non-English language content
This section provides some best practices for implementing non-English language content
detection.

Use international policy templates for policy creation


Symantec Data Loss Prevention provides several international policy templates that you can
quickly deploy in your enterprise.
See “Creating a policy from a template” on page 397.

Table 35-1 International policy templates

Policy template Description

Canadian Social Insurance Numbers This policy detects patterns indicating Canadian social insurance numbers.

See “Canadian Social Insurance Numbers policy template” on page 1562.

Caldicott Report This policy protects UK patient information.

See “Caldicott Report policy template” on page 1561.

Data Protection Act 1998 This policy protects personal identifiable information.

See “Data Protection Act 1998 policy template” on page 1568.

EU Data Protection Directives This policy detects personal data specific to the EU directives.
See “Data Protection Directives (EU) policy template” on page 1570.

General Data Protection Regulation (Banking and Finance)    This policy protects personal identifiable information related to banking and finance.

See “General Data Protection Regulation (Banking and Finance)” on page 1583.

General Data Protection Regulation (Digital Identity)    This policy protects personal identifiable information related to digital identity.

See “General Data Protection Regulation (Digital Identity)” on page 1617.

General Data Protection Regulation (Government Identification)    This policy protects personal identifiable information related to government identification.

See “General Data Protection Regulation (Government Identification)” on page 1618.

General Data Protection Regulation (Healthcare and Insurance)    This policy protects personal identifiable information related to healthcare and insurance.

See “General Data Protection Regulation (Healthcare and Insurance)” on page 1656.

General Data Protection Regulation (Personal Profile)    This policy protects personal identifiable information related to personal profile data.

See “General Data Protection Regulation (Personal Profile)” on page 1672.

General Data Protection Regulation (Travel)    This policy protects personal identifiable information related to travel.

See “General Data Protection Regulation (Travel)” on page 1675.

Human Rights Act 1998 This policy enforces Article 8 of the act for UK citizens.

See “Human Rights Act 1998 policy template” on page 1694.

PIPEDA (Canada) This policy detects Canadian citizen customer data.

See “PIPEDA policy template” on page 1711.

SWIFT Codes (International banking)    This policy detects codes that banks use to transfer money across international borders.

See “SWIFT Codes policy template” on page 1726.

UK Drivers License Numbers This policy detects UK Drivers License Numbers.

See “UK Drivers License Numbers policy template” on page 1727.

UK Electoral Roll Numbers This policy detects UK Electoral Roll Numbers.

See “UK Electoral Roll Numbers policy template” on page 1727.

UK National Insurance Numbers This policy detects UK National Insurance Numbers.

See “UK National Insurance Numbers policy template” on page 1728.

UK National Health Service Number This policy detects personal identification numbers issued by the NHS.

See “UK National Health Service (NHS) Number policy template” on page 1728.

UK Passport Numbers This policy detects valid UK passports.

See “UK Passport Numbers policy template” on page 1728.

UK Tax ID Numbers This policy detects UK Tax ID Numbers.

See “UK Tax ID Numbers policy template” on page 1729.



Use custom keywords for system data identifiers


Data identifiers offer broad support for detecting international content.
See “Introducing data identifiers” on page 717.
Some international data identifiers offer only a wide breadth of detection. In this case, you can
implement the Find Keywords optional validator to narrow the scope of detection. Implementing
this optional validator may help you eliminate false positives that your policy matches.
See “Selecting a data identifier breadth” on page 739.
Table 35-2 provides keywords for several international data identifiers.
To use international keywords for system data identifiers
1 Create a policy using one of the system-provided international data identifiers that is listed
in Table 35-2.
2 Select the Find Keywords optional validator.
See “Configuring the Content Matches data identifier condition” on page 737.
3 Copy and paste the appropriate comma-separated keywords from the list to the Find
Keywords optional validator field.
See “Configuring optional validators” on page 763.
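
The effect of narrowing a wide pattern with keywords can be sketched as follows. This is an illustration under stated assumptions, not the validator's implementation: it assumes that a match counts only when at least one keyword appears somewhere in the same content, that keyword comparison ignores case, and that a nine-digit number stands in for a wide-breadth pattern. The function names and sample strings are hypothetical; the keyword list is the Austria Tax Identification Number entry from Table 35-2.

```python
import re
from typing import List

# Hypothetical wide-breadth pattern: any standalone nine-digit number.
WIDE_PATTERN = re.compile(r"\b\d{9}\b")

def parse_keywords(field_value: str) -> List[str]:
    """Split a comma-separated keyword list into trimmed, non-empty keywords."""
    return [k.strip() for k in field_value.split(",") if k.strip()]

def has_keyword(content: str, keywords: List[str]) -> bool:
    """Assumption: a keyword counts if it appears anywhere, ignoring case."""
    lowered = content.casefold()
    return any(k.casefold() in lowered for k in keywords)

def narrowed_matches(content: str, keyword_field: str) -> List[str]:
    """Return wide-pattern matches only when a keyword is also present."""
    if not has_keyword(content, parse_keywords(keyword_field)):
        return []
    return WIDE_PATTERN.findall(content)

keyword_field = "Österreich, Steuernummer"

with_keyword = narrowed_matches("Steuernummer: 123456789", keyword_field)
without_keyword = narrowed_matches("Order 123456789", keyword_field)

print(with_keyword)     # ['123456789']
print(without_keyword)  # []
```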

Table 35-2 International data identifiers and keyword lists

Data Identifier Language Keywords English Translation

Argentina Tax Spanish Número de Identificación Fiscal, Tax identification number,


Identification Number número de contribuyente, taxpayer number, Argentina tax
Número de identificación fiscal identification number, Argentina
Argentina, Argentina número de taxpayer number
contribuyente

Austria Passport German REISEPASS, ÖSTERREICHISCH Passport, Austrian passport


Number REISEPASS, reisepass

Austria Tax German Österreich, Steuernummer Austria, tax number


Identification Number

Austria Value Added German MwSt, Umsatzsteuernummer, VAT, sales tax number, VAT
Tax (VAT) Number MwSt Nummer, number, VAT identification
Ust.-Identifikationsnummer, number, sales tax, UID number
umsatzsteuer, Umsatzsteuer-
Identifikationsnummer

Austrian Social German sozialversicherungsnummer, Social insurance number, social


Security Number soziale sicherheit security number, insurance
kein,Versicherungsnummer, number, Austrian SSN, Austrian
Österreichischen SSN, social insurance
Österreichischen
Sozialversicherungs

Belgian National French Numéro national, numéro de National number, security number,
Number sécurité, numéro d'assuré, number of insured, national
identifiant national, identification, national
identifiantnational#, identification #, national number
Numéronational# #

Belgium Driver's German, French, Führerschein, Fuhrerschein, Driver's license, driver's license
License Number Frisian Fuehrerschein, number, driving permit, driving
Führerscheinnummer, permit number
Fuhrerscheinnummer,
Fuehrerscheinnummer,
Führerscheinnummer,
Fuhrerscheinnummer,
Fuehrerscheinnummer,
Führerschein- Nr, Fuhrerschein-
Nr, Fuehrerschein- Nr, permis de
conduire,
rijbewijs,Rijbewijsnummer,
Numéro permis conduire

Belgium Passport Dutch, German, Paspoort, paspoort, Passport, passport number,


Number French paspoortnummer, Reisepass passport book, passport card
kein, Reisepass, Passnummer,
Passeport, Passeport livre,
Passeport carte, numéro
passeport

Belgium Tax Dutch, German, Numéro de registre national, National registry number, tax
Identification Number French numéro d'identification fiscale, identification number, tax number
belasting aantal,Steuernummer

Belgium Value Added German, French Numéro T.V.A, VAT number, tax identification
Tax (VAT) Number Umsatzsteuer-Identifikationsnummer, number
Umsatzsteuernummer

Brazilian Election Brazilian número identificação, Identification number, voter


Identification Number Portuguese identificação do eleitor, ID eleitor identification, electoral
eleição, número identificação identification number, Brazilian
eleitoral, Número identificação electoral identification number,
eleitoral brasileira,
IDeleitoreleição#

Brazilian Brazilian Portuguese Brasileira ID Legal, entidades


National jurídicas ID,Registro Nacional
Registry of de Pessoas Jurídicas n º,
Legal Entities BrasileiraIDLegal#
Number

Brazilian Natural Brazilian Portuguese Cadastro de Pessoas Físicas,


Person Registry Brasileiro Pessoa Natural
Number Número de Registro, pessoa
natural número de registro,
pessoas singulares registro NO

British Columbia French MSP nombre, soins de santé no, MSP Number, MSP no, personal
Personal Healthcare soins de santé personnels healthcare number, Healthcare
Number nombre, MSPNombre#, No, PHN
soinsdesanténo#

Bulgaria Value Added Bulgarian номер на таксата, ДДС, ДДС#, Fee number, VAT, VAT number,
Tax (VAT) Number ДДС номер., ДДС номер.#, value added tax
номер на данъка върху
добавената стойност, данък
върху добавената стойност,
ДДС номер

Bulgarian Uniform Civil Bulgarian Униформ граждански номер, Uniform civil number, Uniform ID,
Number - EGN Униформ ID, Униформ Uniform civil ID, Bulgarian uniform
граждански ID, Униформ civil number
граждански не., български
Униформ граждански номер,
УниформгражданскиID#,
Униформгражданскине.#

Burgerservicenummer Dutch Persoonsnummer, sofinummer, person number, social-fiscal


sociaal-fiscaal nummer, number (abbreviation),
persoonsgebonden social-fiscal number,
person-related number

Canada Driver's French permis de conduire Driver's license


License Number

Canada Passport French numéro passeport, No passeport, Passport number, passport no.,
Number passeport# passport#

Canada Permanent French numéro résident permanent, permanent resident number,


Resident (PR) Number résident permanent non, résident permanent resident no, permanent
permanent no., carte résident resident number, permanent
permanent, numéro carte résident resident card, permanent resident
permanent, pr non card number, pr no

Chilean National Spanish Chilena número identificación, Chilean identification number,


Identification Number nacional identidad, número national identity, identification
identificación, número number, national identification
identificación nacional, identidad number, identity number, Unique
número, National Role
NúmerodeIdentificación#,
Identidadchilenano#, Rol Único
Nacional, RolÚnicoNacional#,
nacionalidentidad#

China Passport Number Chinese 中国护照, 护照, 护照本 Chinese passport, passport,
passport book

Codice Fiscale Italian codice fiscal, dati anagrafici, tax code, personal data, VAT
partita I.V.A., p. iva number, VAT number

Columbian Addresses Spanish Calle, Cll, Carrera, Cra, Cr, Street, St, Career, Avenue,
Avenida, Av, Dg, Diagonal, Diag, Diagonal, Transversal, sidewalk
Tv, Trans, Transversal, vereda

Columbian Cell Phone Spanish numero celular, número de Cellular number, telephone
Number teléfono, teléfono celular no., number, cellular telephone
numero celular# number

Columbian Personal Spanish cedula, cédula, c.c., c.c,C.C., C.C, Identification card, citizenship
Identification Number cc, CC, NIE., NIE, nie., nie, cedula card, identification document
de ciudadania, cédula de
ciudadanía, cc#, CC #, documento
de identificacion, documento de
identificación, Nit.

Columbian Tax Spanish NIT., NIT, nit., nit, Nit. TIN (tax identification number)
Identification Number

Croatia National Croatian Osobna iskaznica, Nacionalni Personal ID, national identification
Identification Number identifikacijski broj, osobni ID, number, personal ID, personal
osobni identifikacijski broj, porez identification number, tax
iskaznica, porezni broj, porezni identification card, tax number, tax
identifikacijski broj, porez kod, identification number, tax code,
šifra poreznog obveznika taxpayer code

Cyprus Tax Turkish, Greek αριθμός φορολογικού μητρώου, Tax identification number, tax
Identification Number Vergi Kimlik Numarası, vergi number, TIN number, Cyprus TIN
numarası, Kıbrıs TIN numarası number

Cyprus Value Added Turkish, Greek KDV, kdv#, KDV numarası, Katma VAT, VAT number, value added
Tax (VAT) Number değer Vergisi, Φόρος tax,
Προστιθέμενης Αξίας

Czech Republic Driver's Czech řidičský průkaz, řidičský prúkaz, Driving license, driver's license
Licence Number číslo řidičského průkazu, řidičské number, driving license number,
číslo řidičů, ovladače lic., Číslo driver's lic., driver license number,
licence řidiče, Řidičský průkaz, driver's permit
povolení řidiče, řidiči povolení,
povolení k jízdě, číslo licence

Czech Republic Czech Česká Osobní identifikační číslo, Czech Personal Identification
Personal Identification Osobní identifikační číslo., Number, personal identification
Number identifikační číslo, čeština number, Czech identification
identifikační číslo number

Czech Republic Tax Czech osobní kód, Národní identifikační Personal code, national
Identification Number číslo, osobní identifikační číslo, identification number, personal
cínové číslo, daňové identifikačné identification number, TIN number,
číslo, daňový poplatník id tax identification number, taxpayer
ID

Czech Republic Value Czech číslo DPH, Daň z přidané VAT number, value added tax,
Added Tax (VAT) hodnoty, Dan z pridané hodnoty, VAT
Number Daň přidané hodnoty, Dan
pridané hodnoty, DPH, DIC, DIČ

Denmark Personal Danish Nationalt identifikationsnummer, National identification number,
Identification Number personnummer, unikt personal number, unique
identifikationsnummer, identification number, identification
identifikationsnummer, centrale number, central registry of
personregister, persons, CPR number
cpr,cpr-nummer,cpr#,
cpr-nummer#,
identifikationsnummer#,
personnummer#

Denmark Value Added Danish moms, momsnummer, moms VAT number, vat, value added tax
Tax (VAT) Number identifikationsnummer, number, vat identification number
merværdiafgift

Estonia Driver's Estonian juhiluba, JUHILUBA, juhiluba Driving license, driving license
Licence Number number, juhiloa number, number, driver's license number,
Juhiluba, juhi litsentsi number license number

Estonia Passport Estonian Pass, pass, passi number, pass Passport, passport number,
Number nr, pass#, Pass nr, Eesti passi Estonian passport number
number

Estonia Personal Estonian isikukood, isikukood#, IK, IK#, Personal identification code, tax
Identification Code maksu ID, maksukohustuslase ID, taxpayer identification number,
identifitseerimisnumber, tax identification number, tax
maksukood, maksukood#, code, taxpayer code
maksuID#, maksumaksja kood,
maksumaksja
identifitseerimisnumber

Estonia Value Added Estonian käibemaksu VAT registration number, VAT,
Tax (VAT) Number registreerimisnumber, VAT number
käibemaksu, Käibemaksu
number, käibemaks, käibemaks#,
käibemaksu#

European Health Croatian, Danish, numero conto medico, tessera Medical account number, health
Insurance Card Number Estonian, Finnish, sanitaria assicurazione numero, insurance card number, insurance
French, German, carta assicurazione numero, card number, health insurance
Irish, Italian, Krankenversicherungsnummer, number, medical account number,
Luxembourgish, assicurazione sanitaria numero, health card number, health card,
Polish, Slovenian, medisch rekeningnummer, insurance number, EHIC number,
Spanish ziekteverzekeringskaartnummer,
verzekerings kaart nummer,
gezondheidskaart nummer,
gezondheidskaart, medizinische
Kontonummer,
Krankenversicherungskarte
Nummer, Versicherungsnummer,
Gesundheitskarte Nummer,
Gesundheitskarte, arstliku konto
number, ravikindlustuse kaardi
number, tervisekaart,
tervisekaardi number, Uimhir
ehic, tarjeta salud, broj kartice
zdravstvenog osiguranja, kartice
osiguranja broj, zdravstvenu
karticu, zdravstvene kartice broj,
ehic broj, numero tessera
sanitaria, numero carta di
assicurazione, tessera sanitaria,
numero ehic, Gesondheetskaart,
ehic nummer, numer rachunku
medycznego, numer karty
ubezpieczenia zdrowotne, numer
karty ubezpieczenia, karta
zdrowia, numer karty zdrowia,
numer ehic,
sairausvakuutuskortin numero,
vakuutuskortin numero,
terveyskortti, terveyskortin
numero, medicinsk
kontonummer, ehic numeris,
medizinescher Konto Nummer,
zdravstvena izkaznica

Finland Driver's Finnish, Swedish permis de conduire, ajokortti, Driver's license, driver's license
License Number ajokortin numero, kuljettaja lic., number, driver's lic.
körkort, körkort nummer, förare
lic.

Finland European Finnish Suomi EHIC-numero, Finland EHIC number, sickness
Health Insurance Sairausvakuutuskortti, insurance card, health insurance
Number sairaanhoitokortin, card, EHIC, Finnish health
Sjukförsäkringskort, ehic, insurance card, Health Card,
sairaanhoitokortin, Suomen Survival Card, health insurance
sairausvakuutuskortti, Finska number
sjukförsäkringskort,
Terveyskortti, Hälsokort, ehic#,
sairausvakuutusnumero,
sjukförsäkring nummer

Finland Passport Finnish Suomen passin numero, Finnish passport number, Finnish
Number suomalainen passi, passin passport, passport number,
numero, passin numero.#, passin passport number, passport #
numero#, passin numero, passin
numero., passin numero#, passi#

Finland Tax Finnish verotunniste, verokortti, Tax identification number, tax
Identification Number verotunnus, veronumero card, tax ID, tax number

Finland Value Added Finnish arvonlisäveronumero, ALV, VAT number, VAT, VAT
Tax (VAT) Number arvonlisäverotunniste, ALV nro, identification number
ALV numero, alv

Finnish Personal Finnish tunnistenumero, henkilötunnus, Identification number, personal
Identification Number yksilöllinen henkilökohtainen identification number, unique
tunnistenumero, Ainutlaatuinen personal identification number,
henkilökohtainen tunnus, identity number, Finnish personal
identiteetti numero, Suomen identification number, national
kansallinen henkilötunnus, identification number
henkilötunnusnumero#,
kansallisen tunnistenumero,
tunnusnumero,kansallinen
tunnus numero

France Driver's License French permis de conduire Driver's license
Number

France Health French carte vitale, carte d'assuré social Health card, social insurance card
Insurance Number

France Tax French numéro d'identification fiscale Tax identification number
Identification Number

France Value Added French Numéro d'identification taxe sur Value added tax identification
Tax (VAT) Number valeur ajoutée, Numéro taxe number, value added tax number,
valeur ajoutée, taxe valeur value added tax, VAT number,
ajoutée, Taxe sur la valeur French VAT number, SIREN
ajoutée, Numéro de TVA identification number
intracommunautaire, n° TVA,
numéro de TVA, Numéro de TVA
en France, français numéro de
TVA, Numéro d'identification
SIREN

French INSEE Code French INSEE, numéro de sécu, code INSEE, social security number,
sécu social security code

French Passport French Passeport français, Passeport, French passport, passport,
Number Passeport livre, Passeport carte, passport book, passport card,
numéro passeport passport number

French Social Security French sécurité sociale non., sécurité Social security number, social
Number sociale numéro, code sécurité security code, insurance number
sociale, numéro d'assurance,
sécuritésocialenon.#,
sécuritésocialeNuméro#

German Passport German Reisepass kein, Reisepass, Passport number, passport,
Number Deutsch Passnummer, German passport number,
Passnummer, Reisepasskein#, passport number
Passnummer#

German Personal ID German persönliche Personal identification number, ID
Number identifikationsnummer, number, German personal ID
ID-Nummer, Deutsch number, personal ID number,
persönliche-ID-Nummer, clear ID number, personal
persönliche ID Nummer, number, identity number,
eindeutige ID-Nummer, insurance number
persönliche Nummer,identität
nummer, Versicherungsnummer,
persönlicheNummer#,
IDNummer#

Germany Driver's German Führerschein, Fuhrerschein, Driver's license, driver's license
License Number Fuehrerschein, number
Führerscheinnummer,
Fuhrerscheinnummer,
Fuehrerscheinnummer,
Führerscheinnummer,
Fuhrerscheinnummer,
Fuehrerscheinnummer,
Führerschein- Nr, Fuhrerschein-
Nr, Fuehrerschein- Nr

Germany Value Added German Mehrwertsteuer, MwSt, Value added tax, value added tax
Tax (VAT) Number Mehrwertsteuer identification number, value added
Identifikationsnummer, tax number
Mehrwertsteuer nummer

Greece Passport Greek Ελλάδα pasport αριθμός, Ελλάδα Greece passport number, Greece
Number pasport όχι., Ελλάδα Αριθμός passport no., passport, Greece
Διαβατηρίου, διαβατήριο, passport, passport book
Διαβατήριο, ΕΛΛΑΔΑ
ΔΙΑΒΑΤΗΡΙΟ, Ελλάδα
Διαβατήριο, ελλάδα διαβατήριο,
Διαβατήριο Βιβλίο, βιβλίο
διαβατηρίου

Greece Social Security Greek Αριθμού Μητρώου Κοινωνικής Social security number
Number (AMKA) Ασφάλισης

Greece Value Added Greek FPA, fpa, Foros Prostithemenis VAT, value added tax, tax
Tax (VAT) Number Axias, arithmós dexamenís, Fóros identification number
Prostithémenis Axías, μέγας
κάδος, ΦΠΑ, Φ Π Α, Φόρος
Προστιθέμενης Αξίας, ΦΟΡΟΣ
ΠΡΟΣΤΙΘΕΜΕΝΗΣ ΑΞΙΑΣ, φόρος
προστιθέμενης αξίας, Arithmos
Forologikou Mitroou, Α.Φ.Μ, ΑΦΜ

Greek Tax Identification Greek Αριθμός Φορολογικού Μητρώου, Tax identification number, TIN, tax
Number AΦΜ, Φορολογικού Μητρώου registry number
Νο., τον αριθμό φορολογικού
μητρώου

Hong Kong ID Chinese 身份證 , 三顆星 Identity card, Hong Kong
(Traditional) permanent resident ID Card

Hungary Driver's Hungarian jogosítvány, Illesztőprogramok License, driver's lic, driver's
Licence Number Lic, jogsi, licencszám, vezetői license, number of licenses,
engedély, VEZETŐI ENGEDÉLY, driving license
vezető engedély, VEZETŐ
ENGEDÉLY

Hungary Passport French, útlevél, Magyar útlevélszám, Passport, Hungarian passport
Number Hungarian útlevél könyv, nombre, numéro number, passport book, number,
de passeport, hongrois, numéro passport number
de passeport hongrois

Hungarian Social Hungarian Magyar társadalombiztosítási Hungarian social security number,
Security Number szám, Társadalombiztosítási social security number, social
szám, társadalombiztosítási ID, security ID, social security code
szociális biztonsági kódot,
szociális biztonság nincs.,
társadalombiztosításiID#

Hungarian Tax Hungarian Magyar adóazonosító jel no, Hungarian tax identification
Identification Number adóazonosító szám, magyar number, tax identification number,
adószám, Magyar adóhatóság Hungarian tax number, Hungarian
no., azonosító szám, tax authority number, tax number,
adóazonosító no., adóhatóság no tax authority number

Hungarian VAT Number Hungarian Közösségi adószám, Általános Value added tax identification
forgalmi adó szám, number, sales tax number, value
hozzáadottérték adó, magyar added tax, Hungarian value added
Közösségi adószám tax number

Iceland National Icelandic kennitala, persónuleg kennitala, Social security number, personal
Identification Number galdur númer, skattanúmer, identification number, magic
skattgreiðenda kóða, kennitala number, tax code, taxpayer code,
skattgreiðenda taxpayer ID number

Iceland Passport Icelandic vegabréf, vegabréfs númer, Passport, passport number,
Number Vegabréf Nei, vegabréf# passport no.

Iceland Value Added Icelandic virðisaukaskattsnúmer, vsk VAT number
Tax (VAT) Number númer

Indonesian Identity Indonesian, Kartu Tanda Penduduk nomor, Identity card number, card
Card Number Portuguese número do cartão, Kartu identitas number, Indonesian identity card
Indonesia no, kartu no., Kartu number, card no., Indonesian
identitas Indonesia nomor, Nomor identity card number, ID number
Induk Kependudukan,
númerodocartão,kartuno.,
KartuidentitasIndonesiano

International Bank French Code IBAN, numéro IBAN IBAN Code, IBAN number
Account Number (IBAN)
Central

International Bank French Code IBAN, numéro IBAN IBAN Code, IBAN number
Account Number (IBAN)
East

International Bank French Code IBAN, numéro IBAN IBAN Code, IBAN number
Account Number (IBAN)
West

Ireland Passport Irish irelande passeport, Éire pas, no Ireland passport, passport
Number de passeport, pas uimh, uimhir number, passport
pas, numéro de passeport

Ireland Tax Irish uimhir carthanachta, Uimhir Charity number, charity
Identification Number chláraithe charthanais, uimhir registration number, CHY number,
CHY, CHY uimh., uimhir thagartha tax reference number, Ireland tax
cánach, uimhir aitheantais identification number, Irish tax
cánach ireland, aitheantais identification, tax identification
cánach irish, uimhir aitheantais number, tax id, TIN, Ireland tin
cánach, id cánach, uimhir
chánach, cáin #, STÁIN, cáin id
uimh.

Ireland Value Added Irish cáin bhreisluacha, CBL, CBL aon, Ireland VAT number, VAT
Tax (VAT) Number Uimhir CBL, Uimhir CBL number, VAT no, VAT#, value
hÉireann, bhreisluacha uimhir added tax number, value added
chánach tax, irish VAT

Irish Personal Public Gaelic Gaeilge Uimhir Phearsanta Irish personal public service
Service Number Seirbhíse Poiblí, PPS Uimh., number, PPS no., personal public
uimhir phearsanta seirbhíse service number, service no., PPS
poiblí, seirbhíse Uimh, PPS Uimh, no., PPS service one
PPS seirbhís aon

Israel Personal Hebrew, Arabic ‫זהות‬,‫מספר זיהוי ישראלי‬,‫מספר זיהוי‬ Israeli identity number, identity
Identification Number ‫هوية‬,‫هويةاسرائيلية عدد‬,‫ישראלית‬ number, unique identity number,
‫عدد هوية فريدة من نوعها‬,‫رقم الهوية‬,‫ إسرائيلية‬personal ID, unique personal ID,
unique ID

Italy Driver's License Italian patente guida numero, patente di Driver's license number, driver's
Number guida numero, patente di guida, license
patente guida

Italy Health Insurance Italian TESSERA SANITARIA, tessera Health insurance card, Italian
Number sanitaria, tessera sanitaria health insurance card
italiana

Italian Passport Italian Repubblica Italiana Passaporto, Italian Republic passport,
Number Passaporto, Passaporto Italiana, passport, Italian passport, Italian
passport number, Italiana passport number, passport
Passaporto numero, Passaporto number
numero, Numéro passeport
italien, numéro passeport

Italy Value Added Tax Italian IVA, numero partita IVA, IVA#, VAT, VAT number, VAT#, VAT
(VAT) Number numero IVA number

Japan Driver's License Japanese 公安委員会, 番号, 免許, 交付, 運転 Public Security Committee,
Number 免許, 運転免許証, ドライバライセ driver's license, driving license,
ンス, ドライバーズライセンス, ラ driver license, driver's license
イセンス, 運転免許証番号 number, driving license number,
driver license number, license

Japanese Juki-Net ID Japanese 住基ネット識別番号, 住基ネット番 Juki-Net identification number,
Number 号, 識別番号, 個人識別番号 Juki-Net number, identification
number, personal identification
number

Japanese My Number - Japanese マイナンバー, 共通番号 My number, common number
Corporate

Japanese My Number - Japanese マイナンバー, 個人番号, 共通番号 My number, personal number,
Personal common number

Japan Passport Japanese 日本国旅券, パスポート, パスポー Japanese passport, passport,
Number ト数 passport number

Kazakhstan Passport Kazakh төлқұжат, төлқұжат нөмірі, Passport, passport number,
Number номер паспорта, заграничный passport ID, international
пасспорт, национальный passport, national passport
паспорт

Korea Passport Number Korean 한국어 여권, 여권, 여권 번호, 대한 Korean passport, passport,
민국 passport number, Republic of
Korea

Korea Residence Korean 외국인 등록 번호, 주민번호 Foreigner registration number,
Registration Number social security number
for Foreigners

Korean Residence Korean 주민등록번호, 주민번호 Resident registration number,
Registration Number social security number
for Korean

Latvia Driver's Licence Latvian licences numurs, vadītāja License number, driver's license,
Number apliecība, autovadītāja apliecība, driver's license number, driver's
vadītāja apliecības numurs, lic.
Vadītāja licences numurs, vadītāji
lic., vadītāja atļauja

Latvia Passport Latvian LATVIJA, LETTONIE, Pases Nr., Latvia, passport no., passport
Number Pases Nr, Pase, pase, pases number, passport book, passport
numurs, Pases Nr, pases #, passport card
grāmata, pase#, pases karte

Latvia Personal Latvian Personas kods, personas kods, Latvia personal code, personal
Identification Number latvijas personas kods, Valsts code, national identification
identifikācijas numurs, valsts number, identification number,
identifikācijas numurs, national ID, latvia TIN, TIN, tax
identifikācijas numurs, identification number, tax ID, TIN
nacionālais id, latvija alva, alva, number, tax number
nodokļu identifikācijas numurs,
nodokļu id, alvas nē, nodokļa
numurs

Latvia Value Added Tax Latvian PVN Nr, PVN maksātāja numurs, VAT no., VAT payer number, VAT
(VAT) Number PVN numurs, PVN#, pievienotās number, VAT#, value added tax,
vērtības nodoklis, pievienotās value added tax number
vērtības nodokļa numurs

Liechtenstein Passport German Reisepass, Pass Nr, Pass Nr., Passport, passport no.
Number Reisepass#, Pass Nr#

Lithuania Personal Lithuanian Nacionalinis ID, Nacionalinis National ID, national identification
Identification Number identifikavimo numeris, asmens number, personal ID
kodas

Lithuania Tax Lithuanian mokesčių identifikavimo Nr., tax identification number, tax ID,
Identification Number mokesčių identifikavimo numeris, tax ID number, tax ID number, tax
mokesčių ID, mokesčių id nr, ID #, tax number, tax no., fee #
mokesčių id nr., mokesčių ID#,
mokesčių numeris, mokestis Nr,
mokestis #, Mokesčių
identifikavimo numeris

Lithuania Value Added Lithuanian pridėtinės vertės mokesčio VAT number, VAT, VAT #, Value
Tax (VAT) Number numeris, PVM, PVM#, pridėtinės added tax, VAT registration
vertės mokestis, PVM numeris, number
PVM registracijos numeris

Luxembourg National German, French Eindeutige ID-Nummer, Unique ID number, unique ID,
Register of Individuals Eindeutige ID, ID personnelle, personal ID, personal identification
Number Numéro d'identification number
personnel, IDpersonnelle#,
Persönliche
Identifikationsnummer,
EindeutigeID#

Luxembourg Passport French and passnummer, ausweisnummer, Passport number, passport,
Number German passeport, reisepass, pass, pass Luxembourg pass, Luxembourg
net, pass nr, no de passeport, passport
passeport nombre, numéro de
passeport

Luxembourg Tax French, German Zinn, Zinn Nummer, Luxembourg TIN, TIN number, Luxembourg tax
Identification Number Tax Identifikatiounsnummer, identification number, tax number,
Steier Nummer, Steier ID, tax ID, social security ID,
Sozialversicherungsausweis, Luxembourg tax identification
Zinnzahl, Zinn nein, Zinn#, number, Social Security, Social
luxemburgische Security Card, tax identification
steueridentifikationsnummer, number
Steuernummer,Steuer ID, sécurité
sociale, carte de sécurité sociale,
étain,numéro d'étain, étain non,
étain#, Numéro d'identification
fiscal luxembourgeois, numéro
d'identification fiscale

Luxembourg Value German, TVA kee, TVA#, TVA Aschreiwung Luxembourg VAT number, VAT
Added Tax (VAT) Luxembourgish kee, T.V.A, stammnummer, number, VAT, value added tax
Number bleiwen, geheescht, gitt id, number, VAT ID, VAT registration
mehrwertsteuer, vat number, value added tax
registrierungsnummer,
umsatzsteuer-id, wat,
umsatzsteuernummer,
umsatzsteuer-identifikationsnummer,
id de la batterie, lëtzebuerg vat
nee, registréierung nummer,
numéro de TVA, numéro de
enregistrement vat

Macau National Chinese, 身份证号码, 唯一的识别号码 ID number, unique identification
Identification Number Portuguese number
número de identificação, número
cartão identidade, número cartão Identification number, identity card
identidade nacional, número number, national identity card
identificação pessoal, número number, personal identification
identificação único, id único não, number, unique identification
ID único# number, unique non-ID, unique ID
#

Malaysia Passport Malay pasport, nombor pasport, Passport, passport number,
Number pasport# passport #

Malaysian MyKad Malay nombor kad pengenalan, kad Identification card number,
Number (MyKad) pengenalan no, kad pengenalan identification card no., Malaysian
Malaysia, bilangan identiti unik, identification card, unique identity
nombor peribadi, number, personal number
nomborperibadi#,
kadpengenalanno#

Malta National Maltese numru identifikazzjoni nazzjonali, national identification number,
Identification Number ID nazzjonali, numru national ID, personal identification
identifikazzjoni personali, ID number, personal ID
personali, IDnazzjonali#,
IDpersonali#

Malta Tax Identification Maltese kodiċi tat-taxxa, numru tat-taxxa, Tax code, tax number, tax
Number numru identifikazzjoni tat-taxxa, identification number, taxid#
taxxaid#, numru identifikazzjoni taxpayer identification number,
kontribwent, kodiċi kontribwent, taxpayer code, tin, tin no
landa, landa nru

Malta Value Added Tax Maltese Numru tal-VAT, numru tal-VAT, VAT number, VAT, value added
(VAT) Number bettija,valur miżjud taxxa tax number, vat identification
in-numru, bettija identifikazzjoni number
in-numru

Mexican Personal Spanish Clave de Registro de Identidad Personal identity registration key,
Registration and Personal, Código de Mexican personal identification
Identification Number Identificación Personal mexicana, code, Mexican personal
número de identificación identification number
personal mexicana

Mexican Tax Spanish Registro Federal de Federal taxpayer registry, tax
Identification Number Contribuyentes, número de identification number, federal
identificación de impuestos, taxpayer registry number, RFC
Código del Registro Federal de number, RFC key
Contribuyentes, Número RFC,
Clave del RFC

Mexican Unique Spanish Única de registro de Población, Unique population registry, unique
Population Registry clave única, clave única de key, unique identity key, unique
Code identidad, clave personal personal identity, personal identity
Identidad, personal Identidad key
Clave, ClaveÚnica#,
clavepersonalIdentidad#

Mexico CLABE Number Spanish Clave Bancaria Estandarizada, Standardized banking code,
Estandarizado Banco número de standardized bank code number,
clave, número de clave, clave code number
número, clave#

Netherlands Bank Dutch, bancu aklarashon number, Bank account number, account
Account Number Papiamento aklarashon number, number
bankrekeningnummer,
rekeningnummer

Netherlands Driver's Dutch RIJMEWIJS, permis de conduire, Driver's license, driving permit,
License Number rijbewijs, Rijbewijsnummer, driver's license number
RIJBEWIJSNUMMER

Netherlands Passport Dutch Nederlanden paspoort nummer, Dutch passport number, passport,
Number Paspoort, paspoort, Nederlanden passport number
paspoortnummer,
paspoortnummer

Netherlands Tax Dutch, Nederlands belasting Dutch tax identification number,
Identification Number Papiamento, identificatienummer, tax identification number, Dutch
Norwegian identificatienummer van tax identification, Dutch tax
belasting, identificatienummer number, tax number
belasting, Nederlands belasting
identificatie, Nederlands belasting
id nummer, Nederlands
belastingnummer, btw nummer,
Nederlandse belasting
identificatie, Nederlands
belastingnummer, netherlands
tax identification tal, netherland's
tax identification tal, tax
identification tal, tax tal,
Nederlânske tax identification tal,
Hollânske tax identification,
Nederlânsk tax tal, Hollânske tax
id tal, netherlands impuesto
identification number,
netherland's impuesto
identification number, impuesto
identification number, impuesto
number, hulandes impuesto
identification number, hulandes
impuesto identification, hulandes
impuesto number, hulandes
impuesto id number

Netherlands Value Dutch, Frisian wearde tafoege tax getal, BTW Value added tax number, VAT
Added Tax (VAT) nûmer, BTW-nummer number
Number

New Zealand Driver's Maori raihana taraiwa Driving license
Licence Number

New Zealand Passport Maori uruwhenua, tau uruwhenua, Passport, passport no.
Number uruwhenua no, uruwhenua no.

Norway Driver's Norwegian førerkort, førerkortnummer Driver's license, driver's license
Licence Number number

Norway National Norwegian Nasjonalt ID, personlig ID, National ID, personal ID, national
Identification Number Nasjonalt ID#, personlig ID#, skatt ID #, personal ID #, tax ID, tax
id, skattenummer, skattekode, code, taxpayer ID, taxpayer
skattebetalers id, skattebetalers identification number
identifikasjonsnummer

Norway Value Added Norwegian mva, MVA, momsnummer, VAT, VAT number, VAT
Tax Number Momsnummer, registration number
momsregistreringsnummer

Norwegian Birth Norwegian fødsel nummer, Fødsel nr, fødsel Birth number
Number nei, fødselnei#, fødselnummer#

People's Republic of Chinese 身份证,居民信息,居民身份信息 Identity Card, Information of
China ID (Simplified) resident, Information of resident
identification

Poland Driver's Licence Polish Kierowcy Lic., prawo jazdy, Drivers license number, driving
Number numer licencyjny, zezwolenie na license, license number
prowadzenie, PRAWO JAZDY

Poland European Polish Numer EHIC, Karta Ubezpieczenia EHIC number, Health Insurance
Health Insurance Zdrowotnego, Europejska Karta Card, European Health Insurance
Number Ubezpieczenia Zdrowotnego, Card, health insurance number,
numer ubezpieczenia medical account number
zdrowotnego, numer rachunku
medycznego

Poland Passport French, Polish paszport#, numer paszportu, Nr Passport #, passport number,
Number paszportu, paszport, książka passport number, passport,
paszportowa passport book

passeport, nombre, numéro de Passport, number, passport
passeport, passeport#, No de number, passport #, passport
passeport number

Poland Value Added Polish Numer Identyfikacji Podatkowej, Tax identification number, tax ID
Tax (VAT) Number NIP, nip, Liczba VAT, podatek od number, VAT number, value
wartosci dodanej, faktura VAT, added tax, VAT invoice, VAT
faktura VAT# invoice #

Polish Identification Polish dowód osobisty, Tożsamości Identification card, national
Number narodowej, osobisty numer identity, identification card
identyfikacyjny, niepowtarzalny number, unique number, number
numer, numer

Polish REGON Number Polish numer statystyczny, REGON, Statistical number, REGON
numeru REGON, number
numerstatystyczny#,
numeruREGON#

Polish Social Security Polish PESEL Liczba, społeczny PESEL number, social security
Number (PESEL) bezpieczeństwo liczba, społeczny number, social security ID, social
bezpieczeństwo ID, społeczny security code
bezpieczeństwo kod,
PESELliczba#,
społecznybezpieczeństwoliczba#

Polish Tax Polish Numer Identyfikacji Podatkowej, Tax identification number, Polish
Identification Number Polski numer identyfikacji tax identification number
podatkowej,
NumerIdentyfikacjiPodatkowej#

Portugal Driver's Portuguese carteira de motorista, carteira driver's license, license number,
License Number motorista, carteira de habilitação, driving license, driving license
carteira habilitação, número de Portugal
licença, número licença,
permissão de condução,
permissão condução, Licença
condução Portugal, carta de
condução

Portugal National Portuguese bilhete de identidade, número de identity card, civil identification
Identification Number identificação civil, número de number, citizen's card number,
cartão de cidadão, documento de identification document, citizen's
identificação, cartão de cidadão, card, bi number of Portugal,
número bi de portugal, número document number
do documento

Portugal Passport French and passaporte, passeport, Passport number, passport,
Number Portuguese portuguese passport, portuguese Portuguese passport
passeport, portuguese
passaporte, passaporte nº,
passeport nº

Portugal Tax Portuguese número identificação fiscal Tax identification number
Identification Number

Portugal Value Added Portuguese imposto sobre valor Value added tax, VAT, VAT
Tax (VAT) Number acrescentado, VAT nº, número number, VAT code
iva, vat não, código iva

Romania Driver's Romanian permis de conducere, PERMIS DE Driving license, driving license
Licence Number CONDUCERE, Permis de number
conducere, numărul permisului
de conducere, Numărul
permisului de conducere

Romania National Romanian numărul de identificare fiscală, fiscal identification number, tax
Identification Number identificarea fiscală nr #, codul identification number, fiscal code
fiscal nr. number

Romania Value Added Romanian CIF, cif, CUI, cui, TVA, tva, TVA#, VAT, VAT #, value added tax,
Tax (VAT) Number tva#, taxa pe valoare adaugata, fiscal code, fiscal identification
cod fiscal, cod fiscal de code, unique registration code,
identificare, cod fiscal unique identification code, code
identificare, Cod Unic de unique registration
Înregistrare, cod unic de
identificare, cod unic identificare,
cod unic de înregistrare, cod unic
înregistrare

Romanian Numerical Romanian Cod Numeric Personal, cod Personal numeric code, personal
Personal Code identificare personal, cod unic identification code, unique
identificare, număr personal unic, identification code, identity
număr identitate, număr number, personal identification
identificare personal, number
număridentitate#,
CodNumericPersonal#,
numărpersonalunic#

Russian Passport Russian паспорт нет, паспорт, номер Passport no., passport, passport
Identification Number паспорта, паспорт ID, number, passport ID, Russian
Российской паспорт, Русский passport, Russian passport
номер паспорта, паспорт#, number
паспортID#, номерпаспорта#

Russian Taxpayer Russian НДС, номер TIN (tax identification number),
Identification Number налогоплательщика, taxpayer number, taxpayer ID, tax
Налогоплательщика ИД, налог number
число, налогчисло#, ИНН#,
НДС#

SEPA Creditor Identifier Number North
Language: Bulgarian, Finnish, French, German, Irish, Italian, Luxembourgish, Portuguese, Spanish
Keywords: SEPA-Gläubiger-Identifikator, Gläubiger-ID, SEPA-ID, Gläubiger-Kennung
English Translation: SEPA creditor identifier, creditor ID, SEPA ID, creditor ID
Keywords: ID créancier, ID SEPA, Identifiant du créancie
English Translation: Creditor ID, SEPA ID
Keywords: SEPA Krediter Identifizéierer, Kreditergeld, Krediter Identifizéierer
English Translation: SEPA creditor identifier, crediting, creditor identification
Keywords: SEPA kreditoridentifikator, Kreditoridentifikator
English Translation: SEPA creditor identifier, Creditor Identifier
Keywords: Velkojan tunnus, SEPA-tunnus, Velkojan tunniste
English Translation: Creditor ID, SEPA ID, Creditor identifier
Keywords: ID Creidiúnaí, Aithnitheoir Creidiúnaí
English Translation: Creditor ID, Creditor Identifier
Keywords: ID del creditore, Identificatore del creditore
English Translation: Creditor ID, Creditor Identifier
Keywords: Identificador de acreedor SEPA, ID del acreedor, ID de SEPA, Identificador del acreedor
English Translation: Creditor Identifier SEPA, Creditor ID, SEPA ID, Creditor Identifier
Keywords: Identificador Credor SEPA, Identificador do Credor
English Translation: SEPA Creditor Identifier, Creditor Identifier

Table 35-2 International data identifiers and keyword lists (continued)

Data Identifier Language Keywords English Translation

SEPA Creditor Identifier Number South
Language: Bulgarian, Finnish, French, German, Irish, Italian, Luxembourgish, Portuguese, Spanish

Keywords: SEPA-Gläubiger-Identifikator, Gläubiger-ID, SEPA-ID, Gläubiger-Kennung
English Translation: SEPA creditor identifier, creditor ID, SEPA ID, creditor ID

Keywords: ID créancier, ID SEPA, Identifiant du créancie
English Translation: Creditor ID, SEPA ID

Keywords: SEPA Krediter Identifizéierer, Kreditergeld, Krediter Identifizéierer
English Translation: SEPA creditor identifier, crediting, creditor identification

Keywords: SEPA kreditoridentifikator, Kreditoridentifikator
English Translation: SEPA creditor identifier, Creditor Identifier

Keywords: Velkojan tunnus, SEPA-tunnus, Velkojan tunniste
English Translation: Creditor ID, SEPA ID, Creditor identifier

Keywords: ID Creidiúnaí, Aithnitheoir Creidiúnaí
English Translation: Creditor ID, Creditor Identifier

Keywords: ID del creditore, Identificatore del creditore
English Translation: Creditor ID, Creditor Identifier

Keywords: Identificador de acreedor SEPA, ID del acreedor, ID de SEPA, Identificador del acreedor
English Translation: Creditor Identifier SEPA, Creditor ID, SEPA ID, Creditor Identifier

Keywords: Identificador Credor SEPA, Identificador do Credor
English Translation: SEPA Creditor Identifier, Creditor Identifier

Table 35-2 International data identifiers and keyword lists (continued)

Data Identifier Language Keywords English Translation

SEPA Creditor Identifier Number West
Language: Bulgarian, Finnish, French, German, Irish, Italian, Luxembourgish, Portuguese, Spanish

Keywords: SEPA-Gläubiger-Identifikator, Gläubiger-ID, SEPA-ID, Gläubiger-Kennung
English Translation: SEPA creditor identifier, creditor ID, SEPA ID, creditor ID

Keywords: ID créancier, ID SEPA, Identifiant du créancie
English Translation: Creditor ID, SEPA ID

Keywords: SEPA Krediter Identifizéierer, Kreditergeld, Krediter Identifizéierer
English Translation: SEPA creditor identifier, crediting, creditor identification

Keywords: SEPA kreditoridentifikator, Kreditoridentifikator
English Translation: SEPA creditor identifier, Creditor Identifier

Keywords: Velkojan tunnus, SEPA-tunnus, Velkojan tunniste
English Translation: Creditor ID, SEPA ID, Creditor identifier

Keywords: ID Creidiúnaí, Aithnitheoir Creidiúnaí
English Translation: Creditor ID, Creditor Identifier

Keywords: ID del creditore, Identificatore del creditore
English Translation: Creditor ID, Creditor Identifier

Keywords: Identificador de acreedor SEPA, ID del acreedor, ID de SEPA, Identificador del acreedor
English Translation: Creditor Identifier SEPA, Creditor ID, SEPA ID, Creditor Identifier

Keywords: Identificador Credor SEPA, Identificador do Credor
English Translation: SEPA Creditor Identifier, Creditor Identifier

Serbia Unique Master Serbian јединствен мајстор грађанин Unique master citizen number,
Citizen Number Број, Јединствен матични број, unique identification number,
јединствен број ид, Национални unique id number, National
идентификациони број identification number

Serbia Value Added Tax Serbian poreski identifikacioni broj, Tax identification number, VAT
(VAT) Number PORESKI IDENTIFIKACIONI number, value added tax, VAT,
BROJ, Poreski br., ПДВ број, identification number, tax number
Порез на додату вредност, PDV
broj, Porez na dodatu vrednost,
porez na dodatu vrednost, PDV,
pdv, ПДВ, порески
идентификациони број, PIB, pib,
пиб, poreski broj, порески број

Table 35-2 International data identifiers and keyword lists (continued)

Data Identifier Language Keywords English Translation

Slovakia Driver's Slovak vodičský preukaz, Vodičský Driving license, license number
Licence Number preukaz, VODIČSKÝ PREUKAZ,
číslo vodičského preukazu,
ovládače lic., povolenie vodiča,
povolenia vodičov, povolenie na
jazdu, povolenie jazdu, číslo
licencie

Slovakia National Hungarian, identifikačné číslo, személyi ID number, identity card number,
Identification Number Slovak igazolvány száma, national identity card number,
személyigazolvány szám, číslo national identification number,
občianského preukazu, identification number, ID card
identifikačná karta č, személyi number, identification card,
igazolvány szám, nemzeti national identity card
személyi igazolvány száma, číslo
národnej identifikačnej karty,
národná identifikačná karta č,
nemzeti személyazonosító
igazolvány, nemzeti azonosító
szám, národné identifikačné číslo,
národná identifikačná značka č,
nemzeti azonosító szám,
azonosító szám, identifikačné
číslo

Slovakia Passport French, Slovak PASSEPORT, passeport, Passport, passport number,
Number cestovný pas, číslo pasu, pas č, passport no
Číslo pasu, PAS, CESTOVNÝ
PAS, Passeport n°

Slovakia Value Added Slovak číslo DPH, číslo dane z pridanej VAT number, value added tax
Tax (VAT) Number hodnoty, identifikačné číslo vat, number, VAT, value added tax,
dph, DPH, daň z pridanej VAT identification number
hodnoty, daň pridanej hodnoty,
číslo dane pridanej hodnoty,
identifikačné číslo DPH

Slovenia Passport French, Slovenian številka potnega lista, potni list, Passport number, passport,
Number knjiga potnega lista, potni list #, passport book, passport #
passeport, Passeport

Slovenia Tax Slovenian identifikacijska številka davka, Tax identification number,
Identification Number Slovenska davčna številka, Slovenian tax number, tax number
Davčna številka

Table 35-2 International data identifiers and keyword lists (continued)

Data Identifier Language Keywords English Translation

Slovenia Unique Master Slovenian EMŠO, emšo, edinstvena številka Unique national number, unique
Citizen Number državljana, enotna identifikacijska identification number, uniform
številka, Enotna maticna številka registration number, unique
obcana, enotna maticna številka registration number, citizen's
obcana, številka državljana, number, unique identification
edinstvena identifikacijska number
številka

Slovenia Value Added Slovenian številka davka na dodano Value added tax number, VAT no,
Tax (VAT) Number vrednost, DDV št, slovenia vat št Slovenia vat no

South African Personal Afrikaans nasionale identifikasie nommer, National identification number,
Identification Number nasionale identiteitsnommer, national identity number,
versekering aantal, persoonlike insurance number, personal
identiteitsnommer, unieke identity number, unique identity
identiteitsnommer, number, identity number
identiteitsnommer,
identiteitsnommer#,
versekeringaantal#,
nasionaleidentiteitsnommer#

South Korea Resident Korean 주민등록번호, 주민번호 Resident Registration Number,
Registration Number Resident Number

Spain Driver's License Spanish permiso de conducción, permiso Driver's license, driver's license
Number conducción, Número licencia number, driving license, driving
conducir, Número de carnet de permit, driving permit number
conducir, Número carnet
conducir, licencia conducir,
Número de permiso de conducir,
Número de permiso conducir,
Número permiso conducir,
permiso conducir, licencia de
manejo, el carnet de conducir,
carnet conducir

Table 35-2 International data identifiers and keyword lists (continued)

Data Identifier Language Keywords English Translation

Spain Value Added Tax Spanish Número IVA españa, Número de Spain VAT number, Spanish VAT
(VAT) Number IVA español, español Número number, VAT Number, VAT, value
IVA, Número de valor agregado, added tax number, value added
IVA, Número IVA, Número tax
impuesto sobre valor añadido,
Impuesto valor agregado,
Impuesto sobre valor añadido,
valor añadido el impuesto, valor
añadido el impuesto numero

Spanish Customer Spanish número cuenta cliente, código Customer account number,
Account Number cuenta, cuenta cliente ID, número account code, customer account
cuenta bancaria cliente, código ID, customer bank account
cuenta bancaria number, bank account code

Spanish DNI ID Spanish NIE número, Documento Nacional NIE number, national identity
de Identidad, Identidad único, document, unique identity,
Número nacional identidad, DNI national identity number, DNI
Número number

Spanish Passport Spanish libreta pasaporte, número passport book, passport number,
Number pasaporte, Número Pasaporte, Spanish passport, passport
España pasaporte, pasaporte

Spanish Social Security Spanish Número de la Seguridad Social, Social security number
Number número de la seguridad social

Spanish Tax ID (CIF) Spanish número de contribuyente, número taxpayer number, corporate tax
de impuesto corporativo, número number, tax identification number,
de Identificación fiscal, CIF CIF number
número, CIFnúmero#

Sri Lanka National Sinhala See user interface ID, national identity number,
Identity Number personal identification number,
National Identity Card number

Sweden Driver's Finnish, Romani, ajokortti, permis de Driver's license, driver's license
License Number Swedish, Yiddish conducere,ajokortin numero, number, driving license number
kuljettajat lic., drivere lic., körkort,
numărul permisului de
conducere, ‫שאָפער דערלויבעניש‬
‫נומער‬, körkort nummer, förare lic.,
‫דריווערס דערלויבעניש‬,
körkortsnummer

Table 35-2 International data identifiers and keyword lists (continued)

Data Identifier Language Keywords English Translation

Sweden Personal Swedish personnummer ID, personligt ID number, personal ID number,
Identification Number id-nummer, unikt id-nummer, unique ID number, personal,
personnummer, identification number
identifikationsnumret,
personnummer#,
identifikationsnumret#

Sweden Tax Swedish skattebetalarens Tax identification number,
Identification Number identifikationsnummer, Sverige Swedish TIN, TIN number
TIN, TIN-nummer

Sweden Value Added Swedish moms#, sverige moms, sverige Swedish VAT, Swedish VAT
Tax (VAT) Number momsnummer, sverige moms nr, number, VAT registration number
sweden vat nummer, sweden
momsnummmer,
momsregistreringsnummer

Swedish Passport Swedish Passnummer, pass, sverige pass, Passport number, passport,
Number SVERIGE PASS, sverige Swedish passport, Swedish
Passnummer passport number

Switzerland Health German, Italian medizinische Kontonummer, Medical account number, health
Insurance Card Number Krankenversicherungskarte insurance card number, health
Nummer, numero conto medico, insurance number
tessera sanitaria assicurazione
numero, assicurazione sanitaria
numero

Table 35-2 International data identifiers and keyword lists (continued)

Data Identifier Language Keywords English Translation

Switzerland Passport Number
Language: French, German, Italian

Keywords: Passeport, passeport, numéro passeport, numéro de passeport, passeport#, No de passeport, No de passeport., Numéro de passeport, PASSEPORT, LIVRE DE PASSEPORT
English Translation: Passport, passport number, passport #, passport book

Keywords: Pass, Passnummer, Pass#, Pass Nr., Pass Nr, PASS
English Translation: Passport, passport Number, passport #

Keywords: Passaporto, Numero di passaporto, passaporto, Passaporto n, Passaporto n., passaporto#, Passaport, numero passaporto, numero di passaporto, numero passaporto, passaporto n, PASSAPORTO
English Translation: Passport, passport number, passport no., passport #

Keywords: Reisepass, Reisepass#, REISEPASS
English Translation: Passport, passport #

Switzerland Value Added Tax (VAT) Number
Language: French, German, Italian

Keywords: T.V.A, numéro TVA, T.V.A#, numéro taxe valeur ajoutée, T.V.A., taxe sur la valeur ajoutée, T.V.A#, numéro enregistrement TVA, Numéro TVA
English Translation: VAT, VAT number, VAT #, value added tax number, value added tax, VAT registration number

Keywords: I.V.A, Partita IVA, I.V.A#, numero IVA
English Translation: VAT, VAT number, VAT #

Keywords: MwSt, Umsatzsteuer-Identifikationsnummer, MwSt#, Mehrwertsteuer-Nummer, Mehrwertsteuer, VAT Registrierungsnummer, Umsatzsteuer-Identifikationsnummer
English Translation: VAT, VAT registration number, VAT #, VAT number

Swiss AHV Number
Language: French, German, Italian

Keywords: Numéro AVS, numéro d'assuré, identifiant national, numéro d'assurance vieillesse, numéro de sécurité soclale, Numéro AVH
English Translation: AVS number, insurance number, national identifier, national insurance number, social security number, AVH number

Keywords: AHV-Nummer, Matrikelnumme, Personenidentifikationsnummer
English Translation: AHV number, Swiss Registration number, PIN

Keywords: AVS, AVH
English Translation: AVS, AVH



Table 35-2 International data identifiers and keyword lists (continued)

Data Identifier Language Keywords English Translation

Swiss Social Security French, German, Identifikationsnummer, Identification number, social
Number (AHV) Italian sozialversicherungsnummer, security number, personal
identification personnelle ID, identification ID, tax identification
Steueridentifikationsnummer, number, tax ID, social security
Steuer ID, codice fiscale, number, tax number
Steuernummer

Taiwan ROC ID Chinese 中華民國國民身分證 Taiwan ID
(Traditional)

Thailand Passport Thai หนังสือเดิน ทาง Passport, passport number


Number ,หมายเลขหนังสือเดินทาง

Thailand Personal ID Thai ประกันภัยจำนวน, Insurance number, personal


Number หมายเลขประจำตัวส่วนบุคคล, identification, identification number
หมายเลขประจำตัวที่ไม่ซ้ำกัน,
ประกันภัยจำนวน#,
หมายเลขประจำตัวส่วนบุคคล#,
หมายเลขประจำตัวที่ไมซ้ำกัน#

Turkish Identification Turkish Kimlik Numarası, Türkiye Identification number, Turkish
Number Cumhuriyeti Kimlik Numarası, Republic identification number,
vatandaş kimliği, kişisel kimlik citizen identity, personal
no, kimlik Numarası#, vatandaş identification number, citizen
kimlik numarası, Kişisel kimlik identification number
Numarası

Ukraine Identity Card Ukrainian посвідчення особи України Ukraine identity card

Ukraine Passport Ukrainian паспорт, паспорт України, Passport, Ukraine passport,
Number (Domestic) номер паспорта, персональний passport number

Ukraine Passport Ukrainian паспорт, паспорт України, Passport, Ukraine passport,
Number (International) номер паспорта passport number

United Arab Emirates Arabic ‫فريدة‬,‫رقم التعريف الشخصي‬,‫ الهوية الشخصية رقم‬Personal ID Number, PIN, Unique
Personal Number ‫هوية‬,‫التأمينرقم‬,‫التأمين رقم‬,‫ من نوعها هوية رقم‬ID Number, Insurance Number,
‫فريدة‬# Unique Identity #

Venezuela National ID Spanish cédula de identidad número, National ID number, national
Number clave única de identidad, identification number, personal ID
personal de identidad clave, number, personal identification,
personal de identidad, número de unique identification number
identificación nacional, número
ID nacional

Enable token validation to match Chinese, Japanese, and Korean keywords on the server
The Content Matches Keyword condition supports both whole word and partial word matching.
Symantec Data Loss Prevention detection servers support natural language processing for
Chinese, Japanese, and Korean (CJK) language keywords. If you want to detect CJK keywords,
the recommendation is to enable token validation on the detection server and to use whole
word matching for the keyword condition.
The DLP Agent does not support token validation for CJK. On the endpoint, for CJK and
mixed-language keyword matching, consider using partial word matching.
With whole word matching, keywords match at word boundaries only (\W in the regular
expression lexicon). Any characters other than A-Z, a-z, and 0-9 are interpreted as word
boundaries. With whole word matching, keywords must have at least one alphanumeric
character (a letter or a number). A keyword that contains no alphanumeric characters, such
as "..", is ignored.
See “About keyword matching for Chinese, Japanese, and Korean (CJK) languages”
on page 839.
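The boundary rule described above can be illustrated with a short Python sketch. This is not the product's matching engine. The function names are hypothetical, and the sketch assumes case-sensitive comparison. It only shows how treating every character outside A-Z, a-z, and 0-9 as a boundary separates whole word matching from partial word matching.

```python
import re

# Characters that count as part of a word under the rule quoted above.
# Everything else is treated as a word boundary.
_WORD_CHARS = "A-Za-z0-9"


def whole_word_match(keyword: str, text: str) -> bool:
    """Return True if keyword occurs in text with a boundary on both sides."""
    # A keyword without any letter or digit is ignored.
    if not re.search(f"[{_WORD_CHARS}]", keyword):
        return False
    # The keyword must not be preceded or followed by a letter or digit.
    pattern = f"(?<![{_WORD_CHARS}]){re.escape(keyword)}(?![{_WORD_CHARS}])"
    return re.search(pattern, text) is not None


def partial_word_match(keyword: str, text: str) -> bool:
    """Return True if keyword occurs anywhere in text (hypothetical helper)."""
    return keyword in text
```

With these definitions, the keyword card matches as a whole word in "credit card number" but only as a partial word in "cardholder".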
Chapter 36
Detecting file properties
This chapter includes the following topics:

■ Introducing file property detection

■ Configuring file property matching

■ Best practices for using file property matching

Introducing file property detection


Symantec Data Loss Prevention provides various methods for detecting the context of
messages, files, and attachments. You can detect the type, size, and name of files and
attachments. You can also use these conditions to except files and attachments from matching.
See “About file type matching” on page 900.
See “About file size matching” on page 902.
See “About file name matching” on page 903.
See “Configuring file property matching” on page 903.

About file type matching


You use the Message Attachment or File Type Match condition to match the file type of a
message attachment. Symantec Data Loss Prevention supports the identification of over 300
file types.
See “Supported formats for file type identification” on page 964.
Example uses of message attachment and file type matching are as follows:
■ A certain type of document should never leave the organization (such as a PGP document
or AutoCAD file).
■ A certain type of match is likely to occur only in a document of a certain type, such as a
Word document.
The detection engine does not rely on the file name extension to identify the file format. Instead,
the engine checks the binary signature of supported file formats. For example, if a user changes
a .doc file's extension to .txt and emails the file, the detection engine can still register a match
because the binary signature of the file identifies it as a DOC file.
See “Supported formats for file type identification” on page 964.

Note: File type matching does not detect the content of the file; it only detects the file type
based on its binary signature. To detect content, use a content matching condition.
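As a rough illustration of signature-based identification in general, the following Python sketch inspects the leading bytes of a file instead of its extension. It is not how the detection engine is implemented. The function name and the small signature list are assumptions chosen for the example, and the leading bytes shown identify only the container format (for example, the OLE2 compound file used by legacy .doc files), not the specific application.

```python
from typing import Optional

# A few widely documented leading byte sequences ("magic numbers").
# This short list is illustrative only.
SIGNATURES = [
    (bytes.fromhex("D0CF11E0A1B11AE1"), "OLE2 compound file (legacy .doc, .xls, .ppt)"),
    (b"%PDF-", "PDF"),
    (b"PK\x03\x04", "ZIP container (also .docx, .xlsx)"),
]


def identify_by_signature(data: bytes) -> Optional[str]:
    """Return a format label based on the first bytes of data, or None."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return None
```

Because only the content is inspected, a legacy Word file that has been renamed to report.txt still returns the OLE2 label in this sketch.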

See “Configuring the Message Attachment or File Type Match condition” on page 904.
See “About custom file type identification” on page 901.

About file format support for file type matching


Symantec Data Loss Prevention supports over 300 file formats for file type identification using
the Message Attachment or File Type Match policy condition.
Refer to the following link for a complete list of file formats that can be recognized by this policy
condition.
See “Supported formats for file type identification” on page 964.

About custom file type identification


If the type of file you want to detect is not supported as a system default file type, Symantec
Data Loss Prevention provides you with the ability to identify custom file types using scripts.
To detect a custom file type, you use the Symantec Data Loss Prevention Scripting Language
to write a custom script that detects the binary signature of the file format that you want to
protect. To implement this match condition you need to enable it on the Enforce Server.
See “Enabling the Custom File Type Signature condition in the policy console” on page 908.
See “Configuring the Custom File Type Signature condition” on page 908.
Refer to the Symantec Data Loss Prevention Detection Customization Guide for the language
syntax and examples.

Note: The Symantec Data Loss Prevention Scripting Language only identifies custom file
formats; it does not extract content from custom file types.

About file size matching


Use Message Attachment or File Size Match to detect content based on the size of particular
email message components.
See “Detection messages and message components” on page 391.
You can also detect matches for the number of files attached to email for SMTP.
The condition you choose when you configure this rule determines how a match is detected.
You choose from these options:
■ Single – This condition detects a match when the body of an email message or an email
attachment meets or exceeds the file size you specify. Detection is based on each
component individually.
For example, you could specify a condition where the single file size is more than 50 KB
(kilobytes). An email message with a 20 KB body and a single 51 KB email attachment
matches because the attachment exceeds 50 KB. However, an email message
with a 20 KB body and two 20 KB email attachments does not match. Even though the
entire message is more than 50 KB, each component is less than 50 KB. This condition does
not combine the sizes of the body and the attached files.
■ Total Attachment File Size – This condition, for SMTP only, detects a match when the
size of a single email attachment, or the combined size of multiple attachments, meets or
exceeds the file size criteria you specify. Detection is based solely on the email attachments
and does not factor in the body of the email message.
For example, you could specify a condition where the total file size is more than 50 KB
(kilobytes). An email message with a 20 KB body and a single 40 KB email attachment
does not match because, while the total email exceeds 50 KB, the condition does not factor
in the body of the email message. However, an email message with a 20 KB body and
two 30 KB email attachments does match, because the two file attachments together exceed 50
KB. In addition, an email with a 40 KB ZIP archive file attached would not match, even if
the extracted size of the files in that archive exceeded 50 KB.
The default value for the Total Attachment File Size condition is zero. This condition has
a character limit of four digits. You will encounter validation errors if you include decimal
points or other characters when specifying this value.
■ Total Attachment File Count – This condition, for SMTP only, detects a match when the
number of combined email attachments meets or exceeds the file count criteria you specify.
Detection is based solely on the combined number of direct email attachments. For example,
you could specify a condition where the total file count is more than five files. An email with
six files attached would match this condition, but an email with a single ZIP archive file
attachment would not match, even if the ZIP archive contained 20 files.
The default value for the Total Attachment File Count condition is zero. This condition
has a character limit of seven digits. You will encounter validation errors if you include
decimal points or other characters when specifying this value.
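The three options above can be compared with a small Python model. This is a sketch under stated assumptions, not product code: the Message class and function names are hypothetical, sizes are the as-attached sizes (an archive counts as one attachment at its archived size), and the comparison is "more than" to follow the worked examples.

```python
from dataclasses import dataclass, field
from typing import List

KB = 1024  # assumed unit size for the example


@dataclass
class Message:
    """Hypothetical model of an email: body size plus as-attached file sizes."""
    body_size: int
    attachment_sizes: List[int] = field(default_factory=list)


def single_match(msg: Message, threshold: int) -> bool:
    # Each component (body, each attachment) is evaluated on its own.
    return any(size > threshold for size in [msg.body_size, *msg.attachment_sizes])


def total_attachment_size_match(msg: Message, threshold: int) -> bool:
    # Only attachments are summed; the body is not factored in.
    return sum(msg.attachment_sizes) > threshold


def total_attachment_count_match(msg: Message, limit: int) -> bool:
    # Only direct attachments are counted; archive contents are not expanded.
    return len(msg.attachment_sizes) > limit
```

For instance, a message with a 20 KB body and two 30 KB attachments does not match single_match at 50 KB, but it does match total_attachment_size_match at 50 KB.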

Note: If the Total Attachment File Size and Total Attachment File Count conditions are
ANDed together with a content matching rule, the rules will be applied to all message
components. Components will only match one condition in an incident, even if they violate
more than one of the conditions.

The Total Attachment File Size and Total Attachment File Count rules are available on
both Windows and Mac endpoints. On Windows, they apply to Microsoft Outlook and IBM
(Lotus) Notes events. On Mac, they apply to Outlook for Mac events.
See “Configuring the Message Attachment or File Size Match condition” on page 905.

About file name matching


You use the Message Attachment or File Name Match condition to detect the names of files
and attachments.
See “File name matching syntax” on page 907.
See “File name matching examples” on page 907.
See “Configuring the Message Attachment or File Name Match condition” on page 906.

Configuring file property matching


Table 36-1 lists the conditions available for implementing file property matching.

Table 36-1 File Properties match conditions

Match condition Description

Message Attachment or File Detect or except specific files and attachments by type.
Type Match
See “About file type matching” on page 900.

See “Configuring the Message Attachment or File Type Match condition” on page 904.

Message Attachment or File Detect or except specific files and attachments by size.
Size Match
See “About file size matching” on page 902.

See “Configuring the Message Attachment or File Size Match condition” on page 905.

Message Attachment or File Detect or except specific files and attachments by name.
Name Match
See “About file name matching” on page 903.

See “Configuring the Message Attachment or File Name Match condition” on page 906.

Custom File Type Signature Detect or except custom file types.



Configuring the Message Attachment or File Type Match condition


The Message Attachment or File Type Match condition matches the file type of an attachment
message component. You can configure an instance of this condition in policy rules and
exceptions.
See “About file type matching” on page 900.
To configure the Message Attachment or File Type Match condition
1 Add a Message Attachment or File Type Match condition to a policy rule or exception,
or edit an existing one.
See “Configuring policies” on page 413.
See “Configuring policy rules” on page 417.
See “Configuring policy exceptions” on page 426.
2 Configure the Message Attachment or File Type Match condition parameters.
See Table 36-2 on page 904.
3 Click Save to save the policy.

Table 36-2 Message Attachment or File Type Match condition parameters

Action Description

Select the file type or types Select all of the formats you want to match.
to match.
See “Supported formats for file type identification” on page 964.

Click select all or deselect all to select or deselect all formats.

To select all formats within a certain category (for example, all word-processing formats),
click the section heading.

The system implies an OR operator among all file types you select. For example, if you
select Microsoft Word and Microsoft Excel file type attachments, the system detects all
messages with either a Word or an Excel document attached. A message does not need to
contain both attachment types.

Match on attachments only. This condition only matches on the Message Attachments component.

See “Detection messages and message components” on page 391.

Also match on one or more Select this option to create a compound condition. All conditions must match to trigger
additional conditions. or except an incident.

You can Add any condition available from the list.

See “Configuring compound match conditions” on page 429.



Configuring the Message Attachment or File Size Match condition


The Message Attachment or File Size Match condition matches or excludes from matching
files of a specified size. You can configure an instance of this condition in policy rules and
exceptions.
See “About file size matching” on page 902.
To configure the Message Attachment or File Size Match condition
1 Add Message Attachment or File Size Match to a policy, or edit a policy that already
contains this rule.
See “Configuring policies” on page 413.
See “Configuring policy rules” on page 417.
See “Configuring policy exceptions” on page 426.
2 Configure the Message Attachment or File Size Match condition parameters.
See Table 36-3 on page 905.
3 Click Save to save the policy.

Table 36-3 Message Attachment or File Size Match parameters

Action Description

Single File Size Select More Than to specify the minimum file size to match, or Less Than to
specify the maximum file size to match.

Enter a number, and select the unit of measure: bytes, kilobytes (KB), megabytes (MB),
or gigabytes (GB).

Total Attachment File Size Enter a number, and select the unit of measure: bytes, kilobytes (KB), megabytes (MB),
or gigabytes (GB) to qualify a match.

Total Attachment File Enter a number to specify the number of files to qualify a match.
Count

Match on the. Select one or both of the following message components on which to base the match:

■ Envelope – This component is not applicable for this condition.
■ Subject – This component is not applicable for this condition.
■ Body – The content of the message. (This option applies only to Single File Size.)
■ Attachments – Any files that are attached to or transferred by the message.

See “Selecting components to match on” on page 423.



Table 36-3 Message Attachment or File Size Match parameters (continued)

Action Description

Also match one or more Select this option to create a compound condition. All conditions must match to trigger or
additional conditions. except an incident.

You can Add any condition available from the list.

See “Configuring compound match conditions” on page 429.

Configuring the Message Attachment or File Name Match condition


The Message Attachment or File Name Match condition matches based on the name of a
file attached to the message. You can configure an instance of this condition in policy rules
and exceptions.
See “About file name matching” on page 903.
To configure the Message Attachment or File Name Match condition
1 Add a Message Attachment or File Name Match condition to a policy, or edit an existing
one.
See “Configuring policies” on page 413.
See “Configuring policy rules” on page 417.
See “Configuring policy exceptions” on page 426.
2 Configure the Message Attachment or File Name Match condition parameters.
See Table 36-4 on page 906.
3 Click Save to save the policy.

Table 36-4 Message Attachment or File Name Match parameters

Action Description

Specify the File Name. Specify the file name to match using the DOS pattern matching language to represent
patterns in the file name.

Separate multiple matching patterns with commas or by placing them on separate lines.

See “File name matching syntax” on page 907.

See “File name matching examples” on page 907.

Match on attachments. This condition only matches on the Message Attachments component.

See “Detection messages and message components” on page 391.



Table 36-4 Message Attachment or File Name Match parameters (continued)

Action Description

Also match one or more Select this option to create a compound condition. All conditions must match to trigger or
additional conditions. except an incident.

You can Add any condition available from the list.

See “Configuring compound match conditions” on page 429.

File name matching syntax


For file name matching, the system supports the DOS pattern matching syntax to detect file
names, including wildcards.
See “About file name matching” on page 903.
Any characters you enter (other than the DOS operators) match exactly. To enter multiple file
names, separate them with commas or place them on separate lines.
Table 36-5 describes the syntax for the Message Attachment or File Name Match condition.

Table 36-5 DOS Operators for file name detection

Operator Description

. Use a dot to separate the file name and the extension.

* Use an asterisk as a wild card to match any number of characters (including none).

? Use a question mark to match a single character.

File name matching examples


Table 36-6 lists some examples for matching file names using the Message Attachment or
File Name Match condition.
See “About file name matching” on page 903.

Table 36-6 File name matching examples

Match objective Example

To match a Word file name that begins with ENG- followed ENG-????????.doc
by any eight characters:

If you are not sure that it is a Word document: ENG-????????.*

If you are not sure how many characters are in the name: ENG-*.*

To match all file names that begin with ENG- and all file names that begin with ITA-:

Enter as comma-separated values:

ENG-*.*,ITA-*

Or enter each pattern on its own line:

ENG-*.*

ITA-*
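
The wildcard semantics in Table 36-5 and the examples in Table 36-6 can be illustrated with a short Python sketch. This is not the product's matching engine: the function names are hypothetical, and matching the whole file name case-sensitively is an assumption that the text above does not confirm.

```python
import re


def dos_pattern_to_regex(pattern):
    """Translate a DOS-style pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any number of characters, including none
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # every other character matches literally
    return re.compile("".join(parts))


def split_patterns(field_value):
    """Split the condition field on commas and line breaks."""
    return [p.strip() for p in re.split(r"[,\n]", field_value) if p.strip()]


def file_name_matches(file_name, field_value):
    """Return True if the whole file name matches any of the listed patterns."""
    return any(
        dos_pattern_to_regex(p).fullmatch(file_name)
        for p in split_patterns(field_value)
    )
```

For example, file_name_matches("ENG-12345678.doc", "ENG-????????.doc") is true, while a name with only four characters after ENG- does not match that pattern.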

Enabling the Custom File Type Signature condition in the policy console


By default the Custom File Type Signature policy condition is not enabled. To implement the
Custom File Type Signature condition, you must first enable it.
See “About custom file type identification” on page 901.
To enable the Custom File Type Signature rule
1 Using a text editor, open the file \Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\Manager.properties

2 Set the value of the following parameter to "true":


com.vontu.manager.policy.showcustomscriptrule=true

3 Stop and then restart the Symantec DLP Manager service.


4 Log back on to the Enforce Server Administration Console and add a new blank policy.
5 Add a new detection rule or exception and beneath the File Properties heading you should
see the Custom File Type Signature condition.
6 Configure the condition with your custom script.
See “Configuring the Custom File Type Signature condition” on page 908.
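
If you prefer to script the edit in steps 1 and 2, the following Python sketch shows one way to set the parameter in the text of a properties file. It is an illustration only: the function name is hypothetical, it assumes each setting is written as key=value on its own line, and it does not restart the Symantec DLP Manager service. Back up Manager.properties before changing it.

```python
def set_property(properties_text, key, value):
    """Return the properties text with key set to value.

    Replaces the first uncommented "key=..." line, or appends a new
    line if the key is absent. Comments and other lines are untouched.
    Assumes the "key=value" form (an assumption made for this sketch).
    """
    lines = properties_text.splitlines()
    new_line = f"{key}={value}"
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith(("#", "!")):
            continue  # skip comment lines
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = new_line
            break
    else:
        # The loop finished without finding the key, so add it.
        lines.append(new_line)
    return "\n".join(lines) + "\n"
```

To use it, read the file contents into a string, call set_property(text, "com.vontu.manager.policy.showcustomscriptrule", "true"), and write the result back to the same file.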

Configuring the Custom File Type Signature condition


The Custom File Type Signature condition matches custom file types that you have scripted.
You can implement the Custom File Type Signature condition in policy rules and exceptions.
See “About custom file type identification” on page 901.
See “Enabling the Custom File Type Signature condition in the policy console” on page 908.

To configure a Custom File Type Signature condition


1 Add a Custom File Type Signature condition to a policy rule or exception, or edit an
existing one.
See “Configuring policy rules” on page 417.
See “Configuring policy exceptions” on page 426.
2 Configure the Custom File Type Signature condition parameters.
See Table 36-7 on page 909.
3 Click Save to save the policy.

Table 36-7 Custom File Type Signature parameters

Action Description

Enter the Script Name. Specify the name of the script. The name must be unique across policies.

Enter the custom file type script. Enter the File Type Matches Signature script for detecting the binary
signature of the custom file type.

See the Symantec Data Loss Prevention Detection Customization Guide for details on
writing custom scripts.

Match only on attachments. This condition only matches on the Message Attachments component.

See “Detection messages and message components” on page 391.

Also match one or more additional conditions. Select this option to create a compound condition. All conditions
must match to trigger or except an incident.

You can Add any condition available from the list.

See “Configuring compound match conditions” on page 429.

Best practices for using file property matching


This section provides best practices for using file property matching conditions to match file
formats, file size, and file name.

Use compound file property rules to protect design and multimedia files


You can use IDM to protect files, or you can use file property rules. Unless you must protect
an exact file, the general recommendation is to use the file property rules because there is
less overhead in setting up the rules.

For example, if you want to detect CAD files that contain IP diagrams, you could index these
files and apply IDM rules to detect them. Alternatively, you could create a policy that contains
a file type rule that detects on the CAD file format plus a file size rule that specifies a threshold
size. The file property approach is preferred because in this scenario all you really care about
is protecting large CAD files potentially leaving the company. There is no need to gather and
index these files for IDM if you can simply create rules that will detect on the file type and the
size.

Do not use file type matching to detect content


File type recognition does not crack the file and detect content; it only detects the file type
based on the file's binary signature. To detect content, use a content detection rule such as
EDM, IDM, Data Identifiers, or Keyword matching.
For custom file type detection, use the DLP Scripting Language. Refer to the Symantec Data
Loss Prevention Detection Customization Guide.
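
To make the distinction concrete, the following Python sketch shows what signature-based recognition amounts to: comparing a few bytes at a known offset without reading the rest of the file. It is not written in the DLP Scripting Language and does not reflect the product's implementation; the function name is hypothetical. The %PDF- marker is the standard header at the start of PDF files.

```python
def has_signature(data, signature, offset=0):
    """Return True if `signature` appears in `data` at exactly `offset`."""
    return data[offset:offset + len(signature)] == signature


PDF_SIGNATURE = b"%PDF-"  # standard header at the start of PDF files

sample = b"%PDF-1.7\n... confidential text ..."
print(has_signature(sample, PDF_SIGNATURE))         # True: the type is recognized
print(has_signature(b"plain text", PDF_SIGNATURE))  # False
```

The check recognizes the sample as a PDF but never inspects the confidential text inside it, which is why content detection needs a separate rule.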

Calculate file size properly to improve match accuracy


The file size method counts both the body and any attachments in the file size you specify.

Use expression patterns to match file names


The following DOS pattern matching expressions are provided as examples for configuring
the Message Attachment or File Name condition.

Table 36-8 File name detection examples

Example

Any characters you enter (other than the DOS operators) match exactly.

For example, to match a Word file name that begins with ENG- followed by any eight characters, enter:
ENG-????????.doc

If you are not sure that it is a Word document, enter: ENG-????????.*

If you are not sure how many characters follow ENG-, enter: ENG-*.*

To match all file names that begin with ENG- and all file names that begin with ITA-, enter: ENG-*.*,ITA-* (comma
separated), or enter each pattern on its own line.

Use scripts and plugins to detect custom file types


Symantec Data Loss Prevention provides two mechanisms for detecting custom file types: the
DLP Scripting Language and the Content Extraction SPI. If the only requirement is file type
recognition, it may be easier to write a script than an SPI plugin. However, there may be occasions
where a script is inadequate.
The scripting language does not support loops; you cannot iterate over the file type bytes and
do some processing. The scripting language is designed to detect a known signature at a
relatively known offset. You cannot use the scripting language to detect subtypes of the same
document type. For example, if you wanted to detect password-protected PDF files, you could
not use the scripting language. Or, if you wanted to detect only Word documents with track
changes enabled, you would have to write a plugin. On the other hand, you can deploy a script
to the endpoint; currently plugins are server-based only.
For more information, refer to the Symantec Data Loss Prevention Content Extraction
Plugin Developers Guide and the Symantec Data Loss Prevention Detection
Customization Guide on writing custom plugins and scripts, respectively.
Chapter 37
Detecting network incidents
This chapter includes the following topics:

■ Introducing protocol monitoring for network

■ Configuring the Protocol Monitoring condition for network detection

■ Best practices for using network protocol matching

Introducing protocol monitoring for network


Symantec Data Loss Prevention provides the Protocol Monitoring condition which lets you
detect network messages based on the communications transport method.
Table 37-1 lists the protocols that Data Loss Prevention supports for network detection.

Table 37-1 Supported protocols for network monitoring

Protocol Description

Email/SMTP Simple Mail Transfer Protocol (SMTP) is a protocol for sending email messages between servers.

FTP The file transfer protocol (FTP) is used on the Internet for transferring files from one computer
to another.

HTTP The hypertext transfer protocol (HTTP) is the underlying protocol that supports the World Wide
Web. HTTP defines how messages are formatted and transmitted, and what actions Web servers
and browsers should take in response to various commands.

HTTP/SSL Hypertext transfer protocol over Secure Sockets Layer (HTTPS) is a protocol for sending data
securely between a client and server.

NNTP Network News Transfer Protocol (NNTP) is used to send, distribute, and retrieve USENET
messages.

TCP:custom_protocol The Transmission Control Protocol (TCP) is used to reliably exchange data between computers
across the Internet. This option is only available if you have defined a custom TCP port.

See “Configuring the Protocol Monitoring condition for network detection” on page 913.

Configuring the Protocol Monitoring condition for network detection


You use the Protocol Monitoring condition to detect network incidents. You can implement an
instance of the Protocol Monitoring condition in one or more policy detection rules and
exceptions.

Table 37-2 Protocol Monitoring condition parameters for Network

Action Description

Add or modify the Protocol or Endpoint Monitoring condition. Add a new Protocol or Endpoint Monitoring condition
to a policy rule or exception, or modify an existing rule or exception condition.
See “Configuring policies” on page 413.

See “Configuring policy rules” on page 417.

See “Configuring policy exceptions” on page 426.

Select one or more protocols to match. To detect Network incidents, select one or more Protocols.
■ Email/SMTP
■ FTP
■ HTTP
■ HTTPS/SSL
■ NNTP

Configure a custom network protocol. Select one or more custom protocols: TCP:custom_protocol.

Configure endpoint monitoring. See “Configuring the Endpoint Monitoring condition” on page 918.

Match on the entire message. The Protocol Monitoring condition matches on the entire message, not individual
message components.

The Envelope option is selected by default. You cannot select individual message
components.

See “Detection messages and message components” on page 391.

Also match one or more additional conditions. Select this option to create a compound condition. All conditions
must match to trigger or except an incident.

You can Add any condition available from the list.

See “Configuring compound match conditions” on page 429.

Best practices for using network protocol matching


This section provides best practices for using the Protocol Monitoring condition to match
network traffic by protocol.

Use separate policies for specific protocols


You can use protocol matching detection to detect network traffic, such as Web mail, social
networking, and specific protocols. For protocol monitoring, consider implementing different
policies for each type of protocol, such as SMTP, TCP, HTTP, FTP, etc. Creating separate
policies for specific protocols may ease remediation and help you tune the policies.

Consider detection server network placement to support IP address matching


You can detect senders/users and recipients based on one or more IP addresses. However, to
do so you must carefully consider the placement of the detection server on your network.
If the detection server is installed between the Web proxy and the Internet, the IP address of
all Web traffic from individuals in your organization appears to come from the Web proxy. If
the detection server is installed between the Web proxy and the internal corporate network,
the IP address of all Web traffic from outside your organization appears to go to the Web proxy.
The best practice is to match on domain names instead of IP addresses.
Chapter 38
Detecting endpoint events
This chapter includes the following topics:

■ Introducing endpoint event detection

■ Configuring endpoint event detection conditions

■ Best practices for using endpoint detection

Introducing endpoint event detection


Endpoint detection matches events on endpoints where the Symantec DLP Agent is installed.
See “About Endpoint Prevent monitoring” on page 2296.
Symantec Data Loss Prevention provides several methods for detecting and excepting endpoint
events, and a collection of response rules for responding to them.
See “Response rule actions for endpoint detection” on page 1740.

About endpoint protocol monitoring


On the endpoint you can detect data loss based on the transport protocol, such as email
(SMTP), Web (HTTP), and file transfer (FTP).
See “Configuring the Endpoint Monitoring condition” on page 918.

Table 38-1 Supported protocols for endpoint monitoring

Protocol Description

Email/SMTP Simple Mail Transfer Protocol (SMTP) is a protocol for sending email messages between servers.

FTP The file transfer protocol (FTP) is used on the Internet for transferring files from one computer
to another.

HTTP The hypertext transfer protocol (HTTP) is the underlying protocol that supports the World Wide
Web. HTTP defines how messages are formatted and transmitted, and what actions Web servers
and browsers should take in response to various commands.

HTTP/SSL Hypertext transfer protocol over Secure Sockets Layer (HTTPS) is a protocol for sending data
securely between a client and server.

About endpoint destination monitoring


You can also detect endpoint data loss on the destination where data is copied or moved,
such as CD/DVD drive, USB device, or the clipboard.
See “Configuring the Endpoint Monitoring condition” on page 918.

Table 38-2 Supported destinations for endpoint monitoring

Destination Description

Local Drive Monitor the local disk.

CD/DVD The CD/DVD burner on the endpoint computer. This destination can be any type of
third-party CD/DVD burning software.

Removable Storage Device Detect data that is transferred to any eSATA, FireWire, or USB connected storage
device.

Copy to Network Share Detect data that is transferred to any network share or remote file access.

Printer/Fax Detect data that is transferred to a printer or to a fax that is connected to the endpoint
computer. This destination can also be print-to-file documents.

Clipboard The Windows Clipboard used to copy and paste data between Windows applications.

About endpoint global application monitoring


The DLP Agent monitors applications when they access sensitive files. The DLP Agent monitors
any third-party application you add and configure at the System > Agents > Global Application
Monitoring screen.
You can create exceptions for allowable use scenarios.
See “Adding a Windows application” on page 2468.
See “Configuring the Endpoint Monitoring condition” on page 918.
See “Changing global application monitoring settings” on page 2462.

About endpoint location detection


You can detect or except events based on the location of the endpoint.
Using the Endpoint Location detection method, you can choose to detect incidents only when
the endpoint is on or off the network.
For example, you might configure this condition to match only when users are off the corporate
network because you have other rules in place for detecting network incidents. In this case
implementing the Endpoint Location detection method would achieve this result.
See “Configuring the Endpoint Location condition” on page 919.

About endpoint device detection


Symantec Data Loss Prevention lets you detect or except specific endpoint devices based on
described device metadata. You can configure a condition to allow endpoint users to copy
files to a specific device class, such as USB drives from a single manufacturer.
For example, a policy author has a set of USB flash drives with serial numbers that range from
001 to 010. These are the only flash drives that should be allowed to access the company’s
endpoints. The policy administrator adds the serial number metadata to a policy exception
so that the policy applies to all USB flash drives except those whose serial numbers fall
within the 001 to 010 range. In this way, the device metadata ensures that only “trusted
devices” are allowed to carry company data.
See “Creating and modifying endpoint device configurations” on page 922.
The Endpoint Device Class or ID condition detects specific removable storage devices based
on their definitions. Endpoint Destination parameters in the Endpoint Monitoring condition
detect any removable storage device on the endpoint.
See “Configuring the Endpoint Device Class or ID condition” on page 920.

Configuring endpoint event detection conditions


Table 38-3 describes the various methods for implementing endpoint event monitoring.

Table 38-3 Detecting endpoint events

Endpoint match conditions Details

Endpoint Protocol Monitoring Detect endpoint data based on the protocol.

See “About endpoint protocol monitoring” on page 915.

See “Configuring the Endpoint Monitoring condition” on page 918.



Endpoint Destination Monitoring Detect endpoint data based on the destination.

See “About endpoint destination monitoring” on page 916.

See “Configuring the Endpoint Monitoring condition” on page 918.

Endpoint Application Monitoring Detect endpoint data based on the application.

See “About endpoint global application monitoring” on page 916.

See “Configuring the Endpoint Monitoring condition” on page 918.

Endpoint Device Class or ID Detect when users move endpoint data to a specific device.

See “About endpoint device detection” on page 917.

See “Configuring the Endpoint Device Class or ID condition” on page 920.

Endpoint Location Detect when the endpoint is on or off the corporate network.

See “About endpoint location detection” on page 917.

See “Configuring the Endpoint Location condition” on page 919.

Configuring the Endpoint Monitoring condition


The Endpoint Monitoring condition matches on endpoint message protocols, destinations, and
applications.
You can implement an instance of the Endpoint Monitoring condition in one or more policy
detection rules and exceptions.

Note: This topic does not address network protocol monitoring configuration.
See “Configuring the Protocol Monitoring condition for network detection” on page 913.

Table 38-4 Configure the Endpoint Monitoring condition

Action Description

Add or modify the Endpoint Monitoring condition. Add a new Protocol or Endpoint Monitoring condition to a
policy rule or exception, or modify an existing rule or exception condition.
See “Configuring policy rules” on page 417.

See “Configuring policy exceptions” on page 426.

See “Configuring policies” on page 413.



Select one or more endpoint protocols to match. To detect Endpoint incidents, select one or more Endpoint Protocols:
■ Email/SMTP
■ HTTP
■ HTTPS/SSL
■ FTP

See “About endpoint protocol monitoring” on page 915.

Select one or more endpoint destinations. To detect when users move data on the endpoint, select one or more
Endpoint Destinations:

■ Local Drive
■ CD/DVD
■ Removable Storage Device
■ Copy to Network Share
■ Printer/Fax
■ Clipboard

See “About endpoint destination monitoring” on page 916.

Monitor endpoint applications. To detect when endpoint applications access files, select the Application File
Access option.

See “About global application monitoring” on page 2461.

Match on the entire message. The DLP Agent evaluates the entire message, not individual message
components.

The Envelope option is selected by default. You cannot select the other
message components.

See “Detection messages and message components” on page 391.

Also match one or more additional conditions. Select this option to create a compound condition. All conditions
must match to trigger or except an incident.

You can Add any condition available from the list.

See “Configuring compound match conditions” on page 429.

Configuring the Endpoint Location condition


The Endpoint Location condition matches endpoint events based on the location of the endpoint
computer where the DLP Agent is installed.
You can implement an instance of the Endpoint Location condition in one or more policy
detection rules and exceptions.
See “Configuring policies” on page 413.

Table 38-5 Configure the Endpoint Location detection condition

Action Description

Add or modify the Endpoint Location condition. Add a new Endpoint Location detection condition to a policy rule
or exception, or modify an existing policy rule or exception.
See “Configuring policy rules” on page 417.

See “Configuring policy exceptions” on page 426.

Select the location to monitor. Select one of the following endpoint locations to monitor:
■ Off the corporate network
Select this option to detect or except events when the endpoint computer is
off the corporate network.
■ On the corporate network
Select this option to detect or except events when the endpoint computer is
on the corporate network.
This option is the default selection.

See “About endpoint location detection” on page 917.

Match on the entire message. The DLP Agent evaluates the entire message, not individual message
components.

The Envelope option is selected by default. The other message components are not selectable.

See “Detection messages and message components” on page 391.

Also match one or more additional conditions. Select this option to create a compound condition. All conditions
must match to trigger or except an incident.

You can Add any condition available from the list.

See “Configuring compound match conditions” on page 429.

See “About endpoint location detection” on page 917.


See “Configuring the Endpoint Location condition” on page 919.

Configuring the Endpoint Device Class or ID condition


The Endpoint Device Class or ID condition lets you detect when users move endpoint data to
specific devices.
You can implement the Endpoint Device Class or ID condition in one or more policy detection
rules or exceptions.
See “Configuring policies” on page 413.

Table 38-6 Configuring the Endpoint Device Class or ID condition

Action Description

Add or modify an Endpoint Device condition. Add a new Endpoint Device Class or ID condition to a policy rule
or exception, or modify an existing one.

See “Configuring policy rules” on page 417.

See “Configuring policy exceptions” on page 426.

Select one or more devices. The condition matches when users move data from an endpoint computer to the
selected device(s).

Click Create an endpoint device to define one or more devices.

See “Creating and modifying endpoint device configurations” on page 922.

Match on the entire message. The DLP Agent matches on the entire message, not individual message
components.

The Envelope option is selected by default. You cannot select other components.

See “Detection messages and message components” on page 391.

Also match one or more additional conditions. Select this option to create a compound condition. All conditions
must match to trigger or except an incident.

You can Add any condition available from the drop-down menu.

See “Configuring compound match conditions” on page 429.

See “About endpoint device detection” on page 917.

Gathering endpoint device IDs for removable devices


You add device metadata information to the Enforce Server and create one or more policy
detection methods that detect or except the specific device instance or class of device. The
system supports the regular expression syntax for defining the metadata. The system displays
the device metadata at the Incident Snapshot screen during remediation.
See “Creating and modifying endpoint device configurations” on page 922.
The metadata the system requires to define the device instance or device class is the Device
Instance ID. On Windows you can obtain the "Device Instance Id" from the Device Manager.
In addition, Symantec Data Loss Prevention provides DeviceID.exe for devices attached to
Windows endpoints and DeviceID for devices attached to Mac endpoints. You can use these
utilities to extract Device Instance ID strings and device regex information. These utilities also
report what devices the system can recognize for detection. These utilities are available with
the Enforce Server installation files.
See “About the Device ID utilities” on page 2496.

Note: The Device Instance ID is also used by Symantec Endpoint Protection.

To obtain the Device Instance ID (on Windows)


1 Right-click My Computer.
2 Select Manage.
3 Select the Device Manager.
4 Click the plus sign beside any device to expand its list of device instances.
5 Double-click the device instance. Or, right-click the device instance and select Properties.
6 Look in the Details tab for the Device Instance Id.
7 Use the ID to create device metadata expressions.
See “Creating and modifying endpoint device configurations” on page 922.
See “About endpoint device detection” on page 917.

Creating and modifying endpoint device configurations


You can configure one or more devices for specific endpoint detection. Once the device
expressions are configured, you implement the Endpoint Device Class or ID condition in one
or more policy rules or exceptions to deny or allow the use of the specific devices.
You might deny or allow the use of devices if endpoint users must copy sensitive information
to company-provided USB drives or SD cards.
See “Gathering endpoint device IDs for removable devices” on page 921.

Note: You can use the DeviceID utility for Windows and Mac endpoints to generate removable
storage device information. See “About the Device ID utilities” on page 2496.

To create and modify endpoint device ID expressions


1 Go to the System > Agents > Endpoint Devices screen.
2 Click Add Device.
3 Enter the Device Name.
4 Enter a Device Description.
5 Enter the Device Definition expression.
The device definition must conform to the regular expression syntax.
See Table 38-7 on page 923.
See “About writing regular expressions” on page 853.
6 Click Save to save the device configuration.


7 Implement the Endpoint Device Class or ID condition in a detection rule or exception.
See “Configuring the Endpoint Device Class or ID condition” on page 920.

Table 38-7 Example Windows endpoint device regular expressions

Example device class Expression example

Generic USB Device USBSTOR\\DISK&VEN_SANDISK&PROD_ULTRA_BACKUP&REV_8\.32\\3485731392112B52

iPod generic USBSTOR\\DISK&VEN_APPLE&PROD_IPOD&.*

Lexar generic USBSTOR\\DISK&VEN_LEXAR.*

CD Drive IDE\\DISKST9160412ASG__________________0002SDM1\\4&F4ACADA&0&0\.0\.0

Hard drive USBSTOR\\DISK&VEN_MAXTOR&PROD_ONETOUCH_II&REV_023D\\B60899082H____&0

Blackberry generic USBSTOR\\DISK&VEN_RIM&PROD_BLACKBERRY...&REV.*

Cell phone USBSTOR\\DISK&VEN_PALM&PROD_PRE&REV_000\\FBB4B8FF4CAEFEC1124DED689&0

Table 38-8 Example Mac endpoint regex information

Example device class Regex information example

SanDisk USB SanDisk&Cruzer Blade&20051535820CF1302C2E

SD Card SDC&346128262

External hard drive External&RAID&0000000000702293

See “About endpoint device detection” on page 917.
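
Before you enter a device definition, it can help to check the expression against a few Device Instance ID strings. The following Python sketch is one way to do that. It makes several assumptions: Python's regular expression dialect may differ from the one the product uses, whole-string matching is assumed, the Lexar Device Instance ID is made up for illustration, and the serial number expression is a hypothetical rendering of the 001 to 010 range discussed in “About endpoint device detection.”

```python
import re

# Expression copied from Table 38-7 ("Lexar generic").
LEXAR_GENERIC = r"USBSTOR\\DISK&VEN_LEXAR.*"

# Hypothetical expression for serial numbers 001 through 010.
TRUSTED_SERIALS = r"(00[1-9]|010)"


def device_matches(expression, device_id):
    """Return True if the whole device ID matches the expression."""
    return re.fullmatch(expression, device_id) is not None


# Made-up Device Instance ID, used only for illustration.
lexar_id = r"USBSTOR\DISK&VEN_LEXAR&PROD_JUMPDRIVE&REV_1.00\AA04012700008687&0"

print(device_matches(LEXAR_GENERIC, lexar_id))
print(device_matches(TRUSTED_SERIALS, "011"))
```

Note that the doubled backslash in the expression matches a single literal backslash in the Device Instance ID.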

Best practices for using endpoint detection


When implementing endpoint match conditions, keep in mind the following considerations:
■ Any detection method that executes on the endpoint matches on the entire message, not
individual message components.
See “Detection messages and message components” on page 391.
■ The Endpoint Destination and Endpoint Location methods are specific to the endpoint
computer and are not user-based.
See “Distinguish synchronized DGM from other types endpoint detection” on page 941.
■ You might often combine group and detection methods on the endpoint. Keep in mind that
the policy language ANDs detection and group methods, whereas methods of the same
type, two rules for example, are ORed.
See “Policy detection execution” on page 394.
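
The AND/OR behavior described in the last item can be sketched as a boolean expression. This Python fragment is only an illustration under simplifying assumptions: each rule outcome is modeled as a boolean, the policy is assumed to contain at least one rule of each type, exceptions are not modeled, and the function name is hypothetical.

```python
def policy_matches(detection_results, group_results):
    """Combine rule outcomes: OR within a type, AND across types.

    Each argument is a list of booleans, one per rule of that type.
    Assumes the policy has at least one rule of each type.
    """
    return any(detection_results) and any(group_results)
```

With two detection rules and one group rule, the policy matches when either detection rule matches and the group rule also matches.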
Chapter 39
Detecting described identities
This chapter includes the following topics:

■ Introducing described identity matching

■ Described identity matching examples

■ Configuring described identity matching policy conditions

■ Best practices for using described identity matching

Introducing described identity matching


Described identity detection matches patterns in messages from email senders and recipients,
Windows users, IM users, URL domains, and IP addresses.
See “Configuring described identity matching policy conditions” on page 926.
See “Configuring the Sender/User Matches Pattern condition” on page 927.
See “Configuring the Recipient Matches Pattern condition” on page 930.

Described identity matching examples


Table 39-1 lists some described identity matching examples.

Table 39-1 Pattern identity matching examples

Example Pattern Matches Does Not Match

Example pattern: fr, cu
Matches: All SMTP email that is addressed to a .fr (France) or .cu (Cuba) address. Any HTTP post to a .fr
address through a Web-based mail application, such as Yahoo mail.
Does not match: Any email that is addressed to a French company with the .com extension instead of .fr.

Example pattern: company.com
Matches: All SMTP email that is addressed to the specific domain URL, such as symantec.com.
Does not match: Any SMTP email that is not addressed to the specific domain URL.

Example pattern: 3rdlevel.company.com
Matches: All SMTP email that is addressed to the specific 3rd level domain, such as dlp.symantec.com.
Does not match: Any SMTP email that is not addressed to the specific 3rd level domain.

Example pattern: [email protected]
Matches: All SMTP email that is addressed to [email protected]. All SMTP email that is addressed
to [email protected] (the pattern is not case-sensitive).
Does not match: Any email not specifically addressed to [email protected], such as:
■ [email protected]
■ [email protected]
■ [email protected]

Example pattern: 192.168.0.*
Matches: All email, Web, or URL traffic specifically addressed to 192.168.0.[0-255]. This result assumes
that the IP address maps to the desired domain, such as web.company.com.
Note: If the IP address does not match, use one or more domain URLs instead.

Example patterns:
*/local/dom1/dom/dom2/Sym
*/Sym*
*/dlp/qa/test/local/Sym*
These are Lotus Notes example email addresses.
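
The email and IP address rows of Table 39-1 can be approximated with the following Python sketch. It is an illustration, not the product's matching logic. The function names are hypothetical; matching a domain pattern against the end of the recipient's domain on a label boundary is an assumption, as is treating the asterisk as a stand-in for exactly one field of an IPv4 address.

```python
def matches_email_pattern(address, pattern):
    """Check an email address against one described-identity pattern."""
    address = address.strip().lower()  # patterns are not case-sensitive
    pattern = pattern.strip().lower().lstrip(".")
    if "@" in pattern:
        return address == pattern      # full address: exact match only
    domain = address.rsplit("@", 1)[-1]
    # Domain or top-level extension: match on a label boundary (assumed).
    return domain == pattern or domain.endswith("." + pattern)


def matches_ip_pattern(ip, pattern):
    """Check a dotted IPv4 address against a pattern such as 192.168.0.*"""
    ip_fields = ip.split(".")
    pattern_fields = pattern.split(".")
    if len(ip_fields) != len(pattern_fields):
        return False
    return all(p == "*" or p == f for p, f in zip(pattern_fields, ip_fields))
```

Under these assumptions, the pattern fr matches an address at any .fr domain but not the same company's .com address, and 192.168.0.* matches 192.168.0.42 but not 192.168.1.42.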

Configuring described identity matching policy conditions


Table 39-2 lists and describes the two conditions that Symantec Data Loss Prevention provides
for matching described identities.
See “Described identity matching examples” on page 925.

Table 39-2 Implementing described identity matching

Match condition Description

Sender/User Matches Pattern Matches on an email address, domain address, IP address, Windows user
name, or IM screen name/handle.

See “Configuring the Sender/User Matches Pattern condition” on page 927.

Recipient Matches Pattern Matches on an email address, domain address, IP address, or newsgroup.

See “Configuring the Recipient Matches Pattern condition” on page 930.

About Reusable Sender/Recipient Patterns


You can create Reusable Sender/User and Recipient Patterns for use in your policies. Reusable
Sender/Recipient Patterns make policy creation and management easier for policies using
such patterns. For details about creating and using Reusable Sender/Recipient Patterns, refer
to the following topics.
See “Configuring a Reusable Sender Pattern” on page 929.
See “Configuring a Reusable Recipient Pattern” on page 931.

Configuring the Sender/User Matches Pattern condition


The Sender/User Matches Pattern condition matches described user and message sender
identities. You can use this condition in a policy detection rule or exception.
See “Introducing described identity matching” on page 925.
See “Best practices for using described identity matching” on page 932.
Table 39-3 describes the process for configuring the Sender/User Matches Pattern condition.

Table 39-3 Configuring the Sender/User Matches Pattern condition

Action Description

Enter one or more Sender Patterns to match one or more message senders.
Note: The Pattern field allows unlimited data (only limited by the browser).

Email Address Pattern:

■ To match a specific email address, enter the full email address:
[email protected]
■ To match multiple exact email addresses, enter a comma-separated list:
[email protected], [email protected], [email protected]
■ To match partial email addresses, enter one or more domain patterns:
  ■ Enter one or more top-level domain extensions, for example:
  .fr, .cu, .in, .jp
  ■ Enter one or more domain names, for example:
  company.com, symantec.com
  ■ Enter one or more third-level (or lower) domain names:
  web.company.com, mail.yahoo.com, smtp.gmail.com, dlp.security.symantec.com

Windows User Names

Enter the names of one or more Windows users, for example:

john.smith, jsmith

IM Screen Name

Enter one or more IM screen names that are used in instant messaging systems, for example:

john_smith, jsmith

IP Address

Enter one or more IP addresses that map to the domain you want to match, for example:

■ Exact IP address match, for example:
192.168.1.1 or for IPv6 fdda:c450:e808:3020:abcd:abcd:0000:5000
■ Wildcard match – The asterisk (*) character can substitute for one or more fields, for example:
192.168.1.* or 192.*.168.* or for IPv6 fdda:c450:e808:3:*:*:*:*

Note: For IPv6, use only long format addresses.

Select a Reusable Sender Pattern You can select a Sender Pattern that you have saved for reuse in your policies. Select
Reusable Sender Pattern, then choose the pattern you want from the dropdown list.

Match on the entire message. This condition matches on the entire message. The Envelope option is selected by
default. You cannot select any other message component.

See “Detection messages and message components” on page 391.

Also match additional conditions. Select this option to create a compound condition. All conditions must match to trigger
an incident.

You can Add any available condition from the list.

See “Configuring compound match conditions” on page 429.
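
The IP address wildcard rules in Table 39-3 can be illustrated with a short sketch. This is not product code: the function name ip_pattern_matches is hypothetical, and the field-by-field comparison is an assumption based on the statement that the asterisk (*) can substitute for one or more fields.

```python
def ip_pattern_matches(pattern: str, address: str) -> bool:
    """Return True if address matches pattern, where '*' stands for one whole field.

    Hypothetical helper for illustration only. IPv4 fields are separated
    by '.', IPv6 fields by ':'. IPv6 values are expected in long
    (uncompressed) format, as the note in the table requires.
    """
    separator = ":" if ":" in pattern else "."
    pattern_fields = pattern.strip().lower().split(separator)
    address_fields = address.strip().lower().split(separator)
    # Both values need the same number of fields (4 for IPv4, 8 for long IPv6).
    if len(pattern_fields) != len(address_fields):
        return False
    # Fields are compared as text, so "0000" and "0" are treated as different.
    return all(p == "*" or p == a for p, a in zip(pattern_fields, address_fields))
```

Because the sketch compares fields as text, a compressed IPv6 address (one that uses ::) has fewer fields than a long format pattern and never matches, which is one way to see why only long format addresses should be entered.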

Configuring a Reusable Sender Pattern


If you want to use a Sender Pattern in multiple policies, configure a Reusable Sender Pattern.
Reusable Sender Patterns can be selected for use in your policies from the Configure Policy
- Edit Rule page. You can create, edit, and manage your Reusable Sender Patterns from the
Sender/Recipient Patterns page. For example, if you use a Sender Pattern in 50 policies,
using a Reusable Sender Pattern lets you enter the Sender Pattern a single time, then select
it for each policy. In addition, if you need to update the Sender Pattern for those 50 policies,
you can edit it from the Configure Reusable Sender Pattern page and your changes will be
applied automatically to each policy using that pattern.
To configure a Reusable Sender Pattern
1 Take one of the following actions:
■ If you are configuring a policy with a Sender/User Matches Pattern rule, from the
Manage > Policies > Policy List > Configure Policy - Edit Rule page, click Create
Reusable Sender Pattern.
■ In the Enforce Server administration console, navigate to Manage > Policies >
Sender/Recipient Patterns, then click Add > Sender Pattern.

2 In the General section on the Configure Reusable Sender Pattern page, enter a Name
and Description for your Reusable Sender Pattern.
3 In the Sender Pattern section, enter the User Patterns and IP Addresses as described
in the "Configuring the Sender/User Matches Pattern condition table".
See Table 39-3 on page 928.
4 Click Save.

5 To edit a saved Reusable Sender Pattern, on the Manage > Policies > Sender/Recipient
Patterns page, click the dropdown arrow next to the name of the pattern you want to edit,
then select Edit.
6 To delete a saved Reusable Sender Pattern, on the Manage > Policies >
Sender/Recipient Patterns page, click the dropdown arrow next to the name of the
pattern you want to delete, then select Delete.

Note: You cannot delete a Reusable Sender Pattern that is currently in use in any policy.

Configuring the Recipient Matches Pattern condition


The Recipient Matches Pattern condition matches the described identity of message recipients.
You can use this condition in a policy detection rule or exception.
See “Introducing described identity matching” on page 925.
See “Define precise identity patterns to match users” on page 932.
Table 39-4 describes the parameters for configuring the Recipient Matches Pattern condition.

Table 39-4 Recipient Matches Pattern condition parameters

Action Description

Enter one or more Recipient Patterns to match one or more message recipients. Separate multiple entries with commas.
Note: The Pattern field allows unlimited data (only limited by the browser).

Email Address/Newsgroup Pattern

Enter one or more email or newsgroup addresses to match the desired recipients.
To match specific email addresses, enter the full address, such as
[email protected]. To match email addresses from a specific domain, enter
the domain name only, such as symantec.com.

IP Address

Enter one or more IP address patterns that resolve to the domain that you want to
match. You can use the asterisk (*) wildcard character for one or more fields. You can
enter both IPv4 and IPv6 addresses separated by commas.

URL Domain

Enter one or more URL Domains to match Web-based traffic, including Web-based
email and postings to a Web site. For example, if you want to prohibit the receipt of
certain types of data using Hotmail, enter hotmail.com.

Select a Reusable Recipient Pattern You can select a Recipient Pattern that you have saved for reuse in your policies.
Select Reusable Recipient Pattern, then choose the pattern you want from the
dropdown list.

Configure match counting. Select one of the following options to specify the number of email recipients that must
match:

■ All recipients must match (Email Only) does not count a match unless ALL email
message recipients match the specified pattern.
■ At least _ recipients must match (Email Only) lets you specify the minimum
number of email message recipients that must match to be counted.
Select one of the following options to specify how you want to count the matches:

■ Check for existence
Reports a match count of 1 if there are one or more matches.
■ Count all matches
Reports the sum of all matches.

See “Configuring match counting” on page 421.

Match on the entire message. This condition matches on the entire message. The Envelope option is selected by
default. You cannot select any other message component.

See “Detection messages and message components” on page 391.

Also match additional conditions. Select this option to create a compound condition. All conditions in a rule or exception
must match to trigger an incident.
You can Add any available condition from the list.

See “Configuring compound match conditions” on page 429.
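
The match counting options in Table 39-4 can be sketched as follows. This is a minimal illustration, not the product's implementation: the function name, its parameters, and the way the two option groups are combined are assumptions, and the example.com addresses used to exercise it are placeholders.

```python
def count_recipient_matches(recipients, matches_pattern, min_matches=None, count_all=True):
    """Return the match count that would be reported for one email message.

    Hypothetical sketch of the options in Table 39-4:
      min_matches=None  models "All recipients must match (Email Only)"
      min_matches=N     models "At least N recipients must match (Email Only)"
      count_all=True    models "Count all matches"
      count_all=False   models "Check for existence"
    A return value of 0 means the condition did not match.
    """
    matched = [r for r in recipients if matches_pattern(r)]
    if min_matches is None:
        # Every recipient has to match; an empty recipient list never matches.
        threshold_met = bool(recipients) and len(matched) == len(recipients)
    else:
        threshold_met = len(matched) >= max(min_matches, 1)
    if not threshold_met:
        return 0
    return len(matched) if count_all else 1


# Example: three recipients, two of them in the (placeholder) example.com domain.
recipients = ["a@example.com", "b@example.com", "c@example.org"]
in_example_com = lambda r: r.lower().endswith("@example.com")
print(count_recipient_matches(recipients, in_example_com, min_matches=2))
```

With "At least 2 recipients must match" and "Count all matches", the sketch reports the two matching recipients; switching to "Check for existence" reports 1 for the same message.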

Configuring a Reusable Recipient Pattern


If you want to use a Recipient Pattern in multiple policies, configure a Reusable Recipient
Pattern. Reusable Recipient Patterns can be selected for use in your policies from the
Configure Policy - Edit Rule page. You can create, edit, and manage your Reusable Recipient
Patterns from the Sender/Recipient Patterns page. For example, if you use a Recipient
Pattern in 50 policies, using a Reusable Recipient Pattern lets you enter the Recipient Pattern
a single time, then select it for each policy. In addition, if you need to update the Recipient
Pattern for those 50 policies, you can edit it from the Configure Reusable Recipient Pattern
page and your changes will be applied automatically to each policy using that pattern.
To configure a Reusable Recipient Pattern
1 Take one of the following actions:

■ If you are configuring a policy with a Recipient Matches Pattern rule, from the Manage
> Policies > Policy List > Configure Policy - Edit Rule page, click Create Reusable
Recipient Pattern.
■ In the Enforce Server administration console, navigate to Manage > Policies >
Sender/Recipient Patterns, then click Add > Recipient Pattern.

2 In the General section on the Configure Reusable Recipient Pattern page, enter a
Name and Description for your Reusable Recipient Pattern.
3 In the Recipient Pattern section, enter the Email Addresses, IP Addresses, and URL
Domains as described in the "Recipient Matches Pattern condition table".
See Table 39-4 on page 930.
4 Click Save.
5 To edit a saved Reusable Recipient Pattern, on the Manage > Policies >
Sender/Recipient Patterns page, click the dropdown arrow next to the name of the
pattern you want to edit, then select Edit.
6 To delete a saved Reusable Recipient Pattern, on the Manage > Policies >
Sender/Recipient Patterns page, click the dropdown arrow next to the name of the
pattern you want to delete, then select Delete.

Note: You cannot delete a Reusable Recipient Pattern that is currently in use in any policy.

Best practices for using described identity matching


This section provides considerations for implementing the Sender/User or Recipient Matches
Pattern conditions in policy detection rules or exceptions. Keep in mind these considerations
when you implement these conditions.

Define precise identity patterns to match users


Both the Sender/User and Recipient conditions match on the entire message, not individual
message components. If either condition is used as an exception, a match excludes the entire
message, not only the header.
See “Policy detection execution” on page 394.
For both described identity matching rules, the system implies an OR between all
comma-separated list items and between all fields. For example, if any single email address
among a list of email addresses matches, the condition reports (or excepts) an incident. Or,
if either an email address, a domain name, or an IP address matches, the condition reports
(or excepts) an incident.

See “Detection messages and message components” on page 391.
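
The implied OR between list items and between fields can be sketched as follows. This is only an illustration of the logic described above: the function name and the dictionary layout are hypothetical, the addresses are placeholders, and a simple case-insensitive equality test stands in for the product's actual pattern matching.

```python
def condition_matches(identity, patterns_by_field):
    """Return True if any list item in any field matches the identity.

    identity:          field name -> value seen in the message
    patterns_by_field: field name -> comma-separated pattern list
    Hypothetical sketch; exact, case-insensitive comparison is used
    in place of real domain or wildcard matching.
    """
    for field, pattern_list in patterns_by_field.items():
        value = identity.get(field)
        if value is None:
            continue
        items = [p.strip().lower() for p in pattern_list.split(",") if p.strip()]
        # OR between list items: one matching item is enough.
        if value.lower() in items:
            # OR between fields: one matching field is enough.
            return True
    return False


patterns = {"email": "alice@example.com, bob@example.com", "ip": "192.168.1.1"}
print(condition_matches({"email": "carol@example.com", "ip": "192.168.1.1"}, patterns))
```

In the usage line, the email address does not match but the IP address does, so the condition as a whole matches. This is why a broad entry in any one field widens the entire condition.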


Table 39-5 describes the types of patterns you can use for described identity matching.

Table 39-5 Patterns for identity matching

Pattern Sender/User Matches Pattern Recipient Matches Pattern

Email address: full and partial matches matches

Domain address: top-level and subdomains matches matches

IP address matches matches

Windows user name matches does not match

IM screen name / handle matches does not match

Newsgroup patterns does not match matches

Specify email addresses exactly to improve accuracy


An email address must match exactly. For example, [email protected] does not match
[email protected]. But, a domain name pattern such as company.com or
something.company.com matches [email protected].

The email address field does not match the sender or recipient of a Web post. For example,
the email address [email protected] does not match if Bob uses a Web browser to send or
receive email. In this case, you must use the domain pattern mail.yahoo.com to match
[email protected].

Match domains instead of IP addresses to improve accuracy


The URL Domain pattern matches HTTP traffic to particular URL domains. You do not enter
the entire URL. For example, you enter mail.yahoo.com not http://www.mail.yahoo.com.
The system does not resolve URL domains to IP addresses. For example, you specify an IP
address of 192.168.1.1 for a specific domain. If users access the domain URL using a Web
browser, the system does not match emails that are transmitted by the IP address. In this
case, use a domain pattern instead of an IP address, such as internalmemos.com.
You can detect senders/users and recipients based on one or more IP addresses. However, to
do so you must carefully consider the placement of the detection server on your network. If
the detection server is installed between the Web proxy and the Internet, the IP address of all
Web traffic from individuals in your organization appears to come from the Web proxy. If the
detection server is installed between the Web proxy and the internal corporate network, the

IP address of all Web traffic from outside your organization appears to go to the Web proxy.
The best practice is to match on domain names instead of IP addresses.
Chapter 40
Detecting synchronized identities
This chapter includes the following topics:

■ Introducing synchronized Directory Group Matching (DGM)

■ About two-tier detection for synchronized DGM

■ Configuring User Groups

■ Configuring synchronized DGM policy conditions

■ Best practices for using synchronized DGM

Introducing synchronized Directory Group Matching (DGM)
Symantec Data Loss Prevention provides synchronized Directory Group Matching (DGM) to
detect data based on the exact identities of users, senders, and recipients of that data. Using
synchronized DGM, you can connect the Enforce Server to a group directory server such as
Microsoft Active Directory and detect users based on their directory group affiliation. For
example, you may want to apply policies to staff only in the engineering department of your
company, but not to staff in the human resources department. Synchronized DGM enables
you to do this.
Synchronized DGM is based on a User Group configuration that you populate with users
synchronized from your directory server. When you create a synchronized DGM policy, you
reference the User Group in the policy. At runtime the synchronized DGM policy only applies
to identities in the User Group referenced by the policy. Or, consider an example where you
want to create a policy that applies to everyone in your organization except the CEO.
In this case you can create a User Group that contains the CEO's identity as a sole group
member. You then define a policy exception that references the CEO User Group. At runtime
the policy will ignore messages sent or received by the CEO.
See “User Groups” on page 376.

About two-tier detection for synchronized DGM


On the endpoint, the Recipient based on a Directory Server Group condition requires two-tier
detection for DLP Agents. The corresponding Sender/User based on a Directory Server
Group condition does not require two-tier detection.
Be sure you understand the implications of two-tier detection before you deploy the synchronized
DGM Recipient rule to one or more endpoints.
See “Two-tier detection for DLP Agents” on page 395.
To check if two-tier detection is being used, check the
c:\ProgramData\Symantec\DataLossPrevention\DetectionServer\15.5\Protect\logs\debug\FileReader.log
(Windows) or /var/log/Symantec/DataLossPrevention/DetectionServer/15.5/debug (Linux) on the
Endpoint Server.
See “Troubleshooting policies” on page 445.

Configuring User Groups


The Manage > Policies > User Groups screen displays configured User Groups and is the
starting point for creating a new User Group. User Groups are used for implementing
synchronized DGM.
See “Introducing synchronized Directory Group Matching (DGM)” on page 935.

Note: DLP Agents installed on Mac endpoints support User Groups that use Active Directory
(AD) group conditions in policies.

To create or modify a User Group


1 Establish a connection to the Active Directory server you want to synchronize with.
See “Configuring directory server connections” on page 156.
2 At the Manage > Policies > User Groups screen, click Create New Group.
Or, to edit an existing user group, select the group in the User Groups screen.

3 Configure the User Group parameters as required.


See Table 40-1 on page 937.

Note: If this is the first time you are configuring the User Group, you must select the option
Refresh the group directory index on Save to populate the User Group.

4 After you locate the users you want, use the Add and Remove options to include or
exclude them in the User Group.
5 Click Save.

Table 40-1 Configure a User Group

Action Description

Enter the group name. The Group Name is the name that you want to use to identify this group.

Use a descriptive name so that you can easily identify it later on.

Enter the group description. Enter a short Description of the group.

View which policies use the group. Initially, when you create a new User Group, the Used in Policy field displays None.

If the User Group already exists and you modify it, the system displays a list of the policies that
implement the User Group, assuming one or more group-based policies is created for this User
Group.

Refresh the group directory index on Save. Select (check) the Refresh the group directory index on Save option to synchronize the user
group profile with the most recent directory server index immediately on Save of the profile. If
you leave this box unselected (unchecked), the profile is synchronized with the directory server
index based on the Directory Connection setting.

See “Scheduling directory server indexing” on page 158.

If this is the first time you are configuring the User Group profile, you must select the Refresh
the group directory index on Save option to populate the profile with the latest directory server
index replication.

Select the directory server. Select the directory server you want to use from the Directory Server list.

You must establish a connection to the directory server before you create the User Group profile.

See “Configuring directory server connections” on page 156.

Include email aliases. Check the Include Mail Aliases box to index user email aliases along with primary email
addresses. For example, if a user has the primary email address "[email protected]"
and an email alias "[email protected]," checking this box will index both email
addresses. Be aware that indexing email aliases will increase your index size.

Search the directory for specific users. Enter the search string in the search field and click Search to search the directory for specific
users. You can search using literal text or wildcard characters (*).

The search results display the Common Name (CN) and the Distinguished Name (DN) of the
directory server that contains the user. These names give you the specific user identity. Results
are limited to 1000 entries.

Click Clear to clear the results and begin a new search of the directory.

Literal text search criteria options:

■ Name of individual node, such as "engineering" or "accounting"
■ Email address, such as "[email protected]"

Wildcard character search criteria options:

■ The supported wildcard character is an asterisk (*)
■ Proper wildcard search examples:
  ■ Gabriel *akha* returns "Gabriel Oakham"
  ■ j* jop* returns "Janice Joplin"
■ Improper wildcard search:
  ■ Do not begin the search string with a wildcard; this will hinder directory server search
  performance.
  ■ For example, the following search is not recommended: *Gabriel Oakham.

Browse the directory for user groups. You can browse the directory tree for groups and users by clicking on the individual nodes and
expanding them until you see the group or node that you want.

The browse results display the name of each node. These names give you the specific user
identity.

The results are limited to 20 entries by default. Click See More to view up to 1000 results.

Add a user group to the profile. To add a group or user to the User Group profile, select it from the tree and click Add.

After you select and add the node to the Added Groups column, the system displays the
Common Name (CN) and the Distinguished Name (DN).

Save the user group. Click Save to save the User Group profile you have configured.
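
The wildcard search behavior described in Table 40-1 can be approximated with a short sketch. It is an illustration only: the function names are hypothetical, the search runs over an in-memory list rather than a directory server, and case-insensitive matching is an assumption inferred from the example in which j* jop* returns "Janice Joplin".

```python
from fnmatch import fnmatchcase

MAX_RESULTS = 1000  # Table 40-1 states that results are limited to 1000 entries.


def is_recommended_query(query: str) -> bool:
    """A search string that begins with a wildcard is not recommended."""
    return not query.lstrip().startswith("*")


def search_directory(names, query):
    """Return the names that match the query, where '*' matches any run of characters.

    Hypothetical stand-in for the directory search. fnmatchcase also gives
    '?' and '[...]' a special meaning, which the guide does not mention.
    """
    lowered_query = query.strip().lower()
    hits = [name for name in names if fnmatchcase(name.lower(), lowered_query)]
    return hits[:MAX_RESULTS]


names = ["Gabriel Oakham", "Janice Joplin", "Gabriel Stone"]
print(search_directory(names, "Gabriel *akha*"))
print(is_recommended_query("*Gabriel Oakham"))
```

Keeping the recommendation check separate from the search mirrors the guidance in the table: a leading wildcard still works, but it hinders directory server search performance.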

Configuring synchronized DGM policy conditions


To implement synchronized DGM policies, you define a Directory Connection using the
Enforce Server administration console. The Directory Connection specifies the directory
server you want to use as source information for defining exact identity User Groups. You
then define one or more User Groups in the Enforce Server administration console and
populate the group by synchronizing the User Group with the directory server. You then

associate the User Groups with the Sender/User based on a Directory Server Group group
rule or the Recipient matches User Group based on a Directory Server group rule.
See “Introducing synchronized Directory Group Matching (DGM)” on page 935.
Table 40-2 describes the process for implementing synchronized DGM.

Table 40-2 Workflow for implementing synchronized DGM

Step Action Description

1 Create the connection to the directory server. Establish the connection from the Enforce Server to a directory server such
as Microsoft Active Directory.

See “Configuring directory server connections” on page 156.

2 Create the User Group. Create one or more User Groups on the Enforce Server and populate the
User Groups with the exact identities from the users, groups, and business
units that are defined in the directory server.

See “Configuring User Groups” on page 936.

3 Configure a new policy or edit an existing one. See “Configuring policies” on page 413.

4 Configure one or more group rules or exceptions. Choose the type of synchronized DGM rule you want to implement and
reference the User Group. After the policy and the group are linked, the
policy applies only to those identities in the referenced User Group.

See “Configuring the Sender/User based on a Directory Server Group condition” on page 939.
See “Configuring the Recipient based on a Directory Server Group condition” on page 940.

Configuring the Sender/User based on a Directory Server Group condition
The condition Sender/User based on a Directory Server Group matches policy violations
based on message senders and endpoint users synchronized from a directory group server.
You can implement this condition in a policy group (identity) rule or exception.
See “Configuring policies” on page 413.

Note: If the identity being detected is a user, the user must be actively logged on to a DLP
Agent-enabled system for the policy to match.

Table 40-3 Sender/User matches User Group condition parameters

Parameter Description

Select User Groups to include in this policy Select one or more User Groups that you want this policy to detect.

If you have not created a User Group, click Create a new User Group.

See “Configuring User Groups” on page 936.

Match On This condition matches on the entire message. The Envelope option is selected by default.
You cannot select any other message component.

See “Detection messages and message components” on page 391.

Also Match Select this option to create a compound condition. All conditions in a rule or exception
must match to trigger an incident.

You can Add any available condition from the list.

See “Configuring compound match conditions” on page 429.

See “Introducing synchronized Directory Group Matching (DGM)” on page 935.

Configuring the Recipient based on a Directory Server Group condition
The Recipient based on a Directory Server Group condition matches policy violations based
on specific message recipients synchronized from a directory server. You can implement this
condition in a policy group rule or exception.
See “Introducing synchronized Directory Group Matching (DGM)” on page 935.

Note: The Recipient based on a Directory Server Group condition requires two-tier detection.
See “About two-tier detection for synchronized DGM” on page 936.

Table 40-4 Configuring the Recipient based on a Directory Server Group condition

Step Action Description

1 Select User Groups to include in this policy Select the User Group(s) that you want this policy to match on.

If you have not created a User Group, click the Create a new Endpoint User Group option.

See “Configuring User Groups” on page 936.

2 Match On This rule detects the entire message, not individual components. The Envelope
option is selected by default. You cannot select any other message component.

See “Detection messages and message components” on page 391.



3 Also Match Select this option to create a compound condition. All conditions in a rule or
exception must match to trigger an incident.

You can Add any available condition from the list.

See “Configuring compound match conditions” on page 429.

Best practices for using synchronized DGM


This section contains a few considerations to keep in mind when implementing synchronized
DGM conditions in your policies.

Refresh the directory on initial save of the User Group


To execute a policy rule based on an Active Directory group, the index that you define on the
Enforce Server must first be populated. When you first define the User Group, the
recommendation is to select the option "Refresh the group directory index on Save." This
ensures proper synchronization of Active Directory with the Enforce Server. Once the User
Group is populated, you can then set up scheduling to keep the user group on Enforce in sync
with the Active Directory server.
One use case for not indexing immediately is where you are creating multiple User Groups
and you want to index after you have defined all the groups. In this case you can use scheduling,
but keep in mind that any policies based on these indices will not execute until they are
populated.
See “Introducing synchronized Directory Group Matching (DGM)” on page 935.
See “Configuring User Groups” on page 936.

Distinguish synchronized DGM from other types of endpoint detection


When synchronized DGM policies are deployed to endpoint servers, identity-based detection
applies to the users in a configured group of DLP Agent-based endpoints. With endpoint-based
user groups, many different users can log on to the same computer depending on business
practices. The response that each user sees on that endpoint varies depending on how the
users are grouped. Contrast this style of endpoint detection with the Endpoint Protocol
Destination or Endpoint Location methods, which are specific to the endpoint and are not
user-based.
See “Introducing synchronized Directory Group Matching (DGM)” on page 935.
Chapter 41
Detecting profiled identities
This chapter includes the following topics:

■ Introducing profiled Directory Group Matching (DGM)

■ About two-tier detection for profiled DGM

■ Configuring Exact Data profiles for DGM

■ Configuring profiled DGM policy conditions

■ Best practices for using profiled DGM

Introducing profiled Directory Group Matching (DGM)


Profiled Directory Group Matching (DGM) leverages Exact Data Matching (EDM) technology
to detect identities that you have indexed from your database or directory server using an
Exact Data Profile. For example, you can use profiled DGM to identify network user activity
or to analyze content associated with particular users, senders, or recipients. Or, you can
exclude certain email addresses from analysis. Or, you might want to prevent certain people
from sending confidential information by email.
See “Configuring Exact Data profiles for DGM” on page 943.
Profiled DGM is distinguished from synchronized DGM, which uses a connection to a directory
server (such as Microsoft Active Directory) to match identities.
See “Introducing synchronized Directory Group Matching (DGM)” on page 935.

About two-tier detection for profiled DGM


Profiled DGM relies on an EDM index, which is server-based. Profiled DGM requires two-tier
detection for DLP Agents on the endpoint.
See “About two-tier detection for EDM on the endpoint” on page 533.

You cannot combine either type of profiled DGM condition with an Endpoint: Block or
Endpoint: Notify response rule in a policy. If you do, the system reports that the policy is
misconfigured.
See “Troubleshooting policies” on page 445.

Configuring Exact Data profiles for DGM


To implement profiled DGM, you export identity records from a directory server or database,
index the data, and create an Exact Data Profile. You then reference this profile in the
corresponding Sender/User or Recipient condition.
See “Introducing profiled Directory Group Matching (DGM)” on page 942.
Table 41-1 describes the procedure for configuring Exact Data profiles for DGM policies.

Table 41-1 Workflow for implementing profiled DGM

Step Action Description

1 Create the data source file. Create a data source file from the directory server or database you want to
profile. Make sure the data source file contains the appropriate fields.
The following fields are supported for profiled DGM:

■ Email address
■ IP address
■ Windows user name (in the format domain\user)
■ IM screen name

See “Creating the exact data source file for profiled DGM for EDM” on page 537.

2 Prepare the data source file for indexing. See “Configuring Exact Data profiles for EDM” on page 534.

See “Preparing the exact data source file for indexing for EDM” on page 537.

3 Create the Exact Data Profile. This includes uploading the data source file to the Enforce Server, mapping
the data fields, and indexing the data source.

See “Uploading exact data source files for EDM to the Enforce Server” on page 539.

See “Creating and modifying Exact Data Profiles for EDM” on page 541.

See “Mapping Exact Data Profile fields for EDM” on page 545.

See “Scheduling Exact Data Profile indexing for EDM” on page 548.

4 Define the profiled DGM condition. See “Configuring the Sender/User based on a Profiled Directory condition” on page 944.

See “Configuring the Recipient based on a Profiled Directory condition” on page 945.

5 Test the profiled DGM policy. Use a test policy group and verify that the matches the policy generates are
accurate.

See “Test and tune policies to improve match accuracy” on page 453.
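
Before you index a data source file, it can help to check each record against the supported fields listed in step 1 of Table 41-1. The following sketch shows one way to do that outside the product. The field keys, the function name, and the specific checks are assumptions for illustration; the data source file requirements themselves are covered in the EDM topics referenced in the table.

```python
import re

# Hypothetical keys for the four supported profiled DGM fields.
FIELDS = ("email", "ip", "windows_user", "im_screen_name")

# domain\user: two non-empty parts separated by a single backslash.
WINDOWS_USER = re.compile(r"^[^\\\s]+\\[^\\\s]+$")


def check_record(record):
    """Return a list of problems found in one data source record (a dict)."""
    problems = []
    if not any(record.get(key) for key in FIELDS):
        problems.append("row has no identity fields")
    user = record.get("windows_user")
    if user and not WINDOWS_USER.match(user):
        problems.append("Windows user name is not in domain\\user format")
    email = record.get("email")
    if email and email.count("@") != 1:
        problems.append("email address should contain exactly one @")
    return problems


print(check_record({"email": "user@example.com", "windows_user": "CORP\\jsmith"}))
print(check_record({"windows_user": "jsmith"}))
```

The first record passes both checks; the second is flagged because the Windows user name lacks the domain part.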

Configuring profiled DGM policy conditions


Symantec Data Loss Prevention provides two match conditions for profiled DGM: sender/user
and recipient. Both conditions can be used as policy rules or exceptions. For example, consider
a scenario where you index a list of email addresses and author profiled DGM policies based
on this indexed data. You could write a rule that requires the message sender to be from the
indexed list to violate the policy. Or, you could write an exception so that the policy is not
violated if the recipient of an email is from the indexed list.
See “Creating the exact data source file for profiled DGM for EDM” on page 537.

Table 41-2 Profiled DGM conditions

Group rule Description

Sender/User based on a Directory from <EDM Profile>
If this condition is implemented as a policy rule, a match occurs only if the sender or user of
the data is contained in the index profile. If this condition is implemented as a policy exception,
the data will be excepted from matching if it is sent by a sender/user listed in the index profile.

Recipient based on a Directory from <EDM Profile>
If this condition is implemented as a policy rule, a match occurs only if the recipient of the
data is contained in the index profile. If this condition is implemented as a policy exception,
the data will be excepted from matching if it is received by a recipient listed in the index profile.

Configuring the Sender/User based on a Profiled Directory condition


The Sender/User based on a Directory from an EDM Profile condition lets you create detection
rules based on sender identity or (for endpoint incidents) user identity. This condition requires an
Exact Data Profile.
See “Creating the exact data source file for profiled DGM for EDM” on page 537.

After you select the Exact Data Profile, when you configure the rule, the directory you selected
and the sender identifier(s) appear at the top of the page.
Table 41-3 describes the parameters for configuring the Sender/User based on a Directory
from an EDM Profile condition.

Table 41-3 Configuring the Sender/User based on a Directory from an EDM Profile condition

Parameter Description

Where Select this option to have the system match on the specified field values. Specify the values by
selecting a field from the drop-down list and typing the values for that field in the adjacent text box.
If you enter more than one value, separate the values with commas.

For example, for an Employees directory group profile that includes a Department field, you would
select Where, select Department from the drop-down list, and enter Marketing,Sales in the text
box. If the condition is implemented as a rule, in this example a match occurs only if the sender or
user works in Marketing or Sales (as long as the other input content meets all other detection criteria).
If the condition is implemented as an exception, in this example the system excludes from matching
any messages from a sender or user who works in Marketing or Sales.

Is Any Of Enter or modify the information you want to match. For example, if you want to match any sender
in the Sales department, select Department from the drop-down list, and then enter Sales in this
field (assuming that your data includes a Department column). Use a comma-separated list if you
want to specify more than one value.

Configuring the Recipient based on a Profiled Directory condition


The Recipient based on a Directory from an EDM Profile condition lets you create detection
rules based on the identity of the recipient. This condition requires an Exact Data Profile.
See “Creating the exact data source file for profiled DGM for EDM” on page 537.
After you select the Exact Data Profile, when you configure the rule, the directory you selected
and the recipient identifier(s) appear at the top of the page.
Table 41-4 describes the parameters for configuring the Recipient based on a Directory from
an EDM profile condition.

Table 41-4 Configuring the Recipient based on a Directory from an EDM profile condition

Parameter Description

Where Select this option to have the system match on the specified field values. Specify the values by
selecting a field from the drop-down list and typing the values for that field in the adjacent text box.
If you enter more than one value, separate the values with commas.

For example, for an Employees directory group profile that includes a Department field, you would
select Where, select Department from the drop-down list, and enter Marketing, Sales in the text
box. For a detection rule, this example causes the system to capture an incident only if at least one
recipient works in Marketing or Sales (as long as the input content meets all other detection criteria).
For an exception, this example prevents the system from capturing an incident if at least one recipient
works in Marketing or Sales.

Is Any Of Enter or modify the information you want to match. For example, if you want to match any recipient
in the Sales department, select Department from the drop-down list, and then enter Sales in this
field (assuming that your data includes a Department column). Use a comma-separated list if you
want to specify more than one value.
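
The Where semantics described for the sender/user and recipient conditions can be modeled with a short Python sketch. This is an illustrative model only, not the product's matching engine: the function names, the dictionary that stands in for the indexed profile, and the case-insensitive comparison are all assumptions made for the example.

```python
def parse_values(text):
    """Split the comma-separated text box entry into trimmed values."""
    # Lower-casing is an assumption of this sketch, not documented behavior.
    return {v.strip().lower() for v in text.split(",") if v.strip()}


def where_matches(profile_rows, identities, field, values_text):
    """Return True if at least one identity maps to a listed field value.

    profile_rows is a hypothetical stand-in for the indexed profile:
    a dict keyed by lower-cased identity (for example, an email address).
    """
    wanted = parse_values(values_text)
    for identity in identities:
        row = profile_rows.get(identity.lower())
        if row and row.get(field, "").strip().lower() in wanted:
            return True
    return False


def policy_violated(content_matches, condition_matches, as_exception):
    """Combine the identity condition with the other detection criteria."""
    if as_exception:
        # An exception suppresses the incident when the condition matches.
        return content_matches and not condition_matches
    # A rule requires the condition to match as well.
    return content_matches and condition_matches
```

For example, with a profile in which `ann@example.com` belongs to Marketing, a message to `raj@example.com` and `ann@example.com` satisfies `where_matches(..., "Department", "Marketing, Sales")` because at least one recipient is in a listed department. Implemented as a rule, that contributes to an incident; implemented as an exception, it prevents one.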

Best practices for using profiled DGM


Keep in mind the considerations in this section when implementing profiled Directory Group
Matching (DGM).

Follow EDM best practices when implementing profiled DGM


Profiled DGM leverages EDM technology. Follow the EDM procedures and best practices
when implementing profiled DGM.
See “About two-tier detection for profiled DGM” on page 942.

Include an email address field in the Exact Data Profile for profiled DGM
You must include the appropriate fields in the Exact Data Profile to implement profiled DGM.
See “Creating the exact data source file for profiled DGM for EDM” on page 537.
If you include the email address field in the Exact Data Profile for profiled DGM and map it to
the email data validator, email address will appear in the Directory EDM drop-down list (at
the remediation page).

Use profiled DGM for Network Prevent for Web identity detection
If you want to implement DGM for Network Prevent for Web, use one of the profiled DGM
conditions to implement identity matching. For example, you may want to use identity matching
to block all web traffic for specific users. For Network Prevent for Web, you cannot use
synchronized DGM conditions for this use case.
See “Creating the exact data source file for profiled DGM for EDM” on page 537.
See “Configuring the Sender/User based on a Profiled Directory condition” on page 944.
Chapter 42
Using contextual attributes
for Application Detection
This chapter includes the following topics:

■ Introducing contextual attributes for cloud applications

■ Configuring contextual attribute conditions

Introducing contextual attributes for cloud applications

You can include contextual attribute conditions in policy detection rules for Application Detection
incidents. These contextual attributes specify the attributes that are associated with cloud
applications monitored or inspected by the Cloud Detection Service. For example, you can
create a policy detection rule that includes the Application Name: Gatelet > Salesforce
condition to specify that the detection rule applies to incidents that are associated with the
Symantec CloudSOC Salesforce Gatelet.
Contextual attributes are organized by category: General, User, Data Exposure, Data
Transfer, and Custom.
See “Contextual attribute categories” on page 949.
See “Configuring contextual attribute conditions” on page 948.

Configuring contextual attribute conditions


You configure contextual attribute conditions as part of a policy rule or exception. The following
procedure presumes that you are familiar with policy configuration. Refer to the following topics
for detailed information about policy configuration:

See “Configuring policies” on page 413.


See “Configuring policy rules” on page 417.
See “Configuring policy exceptions” on page 426.
To configure a policy rule with a contextual attribute condition, follow this procedure:
To configure contextual attribute conditions
1 Add a Contextual Attributes (Cloud Applications and API Detection Appliance only)
condition to a policy rule or exception, or edit an existing one.
2 Select a contextual attribute condition from the Attributes drop-down list.
See “Contextual attribute categories” on page 949.
3 Configure the appropriate contextual attribute values.
4 Click OK.

Contextual attribute categories


Contextual attributes are grouped into categories: General, User, Data Exposure, Data
Transfer, and Custom.
The following tables provide more details about the attributes and attribute values available
in each category.

General attributes
General attributes apply to all data types and applications.

Table 42-1 General attributes

Attribute Value Description

Application Name
Description: Specifies the name of the cloud web proxy, Gatelet, or Securlet.
Value: One of the applications listed below.

Securlets:

■ Amazon S3
■ Amazon Web Services
■ Box
■ Cisco Spark
■ Dropbox
■ Facebook Workplace
■ Google Calendar
■ Google Drive
■ Gmail
■ Microsoft Azure
■ Microsoft Teams
■ Office 365 Email
■ Office 365 OneDrive
■ Office 365 SharePoint
■ Salesforce
■ SAP
■ ServiceNow
■ Slack
■ Workday
■ Yammer
Gatelets:

■ 4Shared
■ 4Sync
■ Acrobat.com
■ AIM Mail
■ Alfresco
■ Amazon CloudDrive
■ Amazon Web Services
■ Amazon WorkDocs
■ BitCasa
■ Box
■ BV ShareX
■ cCloud
■ CentralDesktop
■ CloudMe
■ CloudProvider

■ Confluence
■ Copy
■ Cubby
■ DigitalBucket
■ Digital Ocean
■ DocuSign
■ Dropbox
■ Dynamics
■ Egnyte
■ FilesAnywhere
■ Flow
■ Ftopia
■ Gmail
■ GroupDocs
■ Hightail
■ Huddle
■ IBM Connections
■ iCloud
■ iDrive
■ Intralinks
■ Jive
■ Joyent
■ Just Cloud
■ MailerLite
■ MediaFire
■ Microsoft Azure
■ Office 365
■ OneDrive
■ OneHub
■ OneUbuntu
■ Outlook.com
■ OwnCloud
■ Oxygen
■ Podio
■ Rackspace
■ RapidShare
■ SafeSync
■ Salesforce

■ SeaCloud
■ ShareFile
■ Sites
■ Slack
■ SmartFile
■ Soonr
■ SugarSync
■ SurveyMonkey
■ Syncplicity
■ Uploaded
■ WatchDocs
■ WebCargo
■ Workshare
■ Wuala
■ Xero
■ Yahoo Mail
■ Yammer
■ Zoho Docs
Bluecoat WSS:

■ Bluecoat WSS (Symantec Web Security Service)

Custom:

■ Custom

Application Type
Value:
■ Web Security Services (Cloud Proxy)
■ Gatelet
■ Securlet
■ Custom
Description: Specifies the type of application: Symantec Web Security Services, Symantec CloudSOC
Gatelets, Symantec CloudSOC Securlets, or a custom application.

Data Type
Value:
■ Data-at-Rest
■ Data-in-Motion
■ Custom
Description: Specifies the data type: data at rest (stored in a cloud repository), data in motion
(data traveling over the network), or custom.

User attributes
User attributes address specific information about the user that is associated with an incident.

Table 42-2 User attributes

Attribute Value Description

Activity Type
Value:
■ Create
■ Edit
■ Rename
■ Upload
■ Download
■ Custom
Description: Specifies the type of action that was taken by the user on the data of the incident.
Symantec Web Security Service does not use this attribute.

Client Tenant Domain
Value: Enter the name in the Match field.
Description: Specifies the client tenant domain of the user. You can match exactly with or without
case sensitivity, or match on a regular expression.

Client Tenant User ID
Value: Enter the user identifier in the Match field.
Description: Specifies the client tenant identifier of the user. You can match exactly with or
without case sensitivity, or match on a regular expression.

Exposed Document Count
Value:
■ Is Greater Than
■ Is Less Than
■ Is Greater Than or Equals
■ Is Less Than or Equals
■ Equals
■ Range
Description: Specifies the users with a number of exposed documents above or below a certain value,
or within a range you specify. Symantec Web Security Service does not use this attribute.

User ID
Value:
■ Match
■ Match Type
Description: Specifies a user identifier that you provide. You can match exactly with or without
case sensitivity, or match on a regular expression.

User Name
Value:
■ Match
■ Match Type
Description: Specifies a user identifier that you provide. You can match exactly with or without
case sensitivity, or match on a regular expression. Symantec Web Security Service does not use this
attribute.

User Threat Score
Value:
■ Is Greater Than
■ Is Less Than
■ Is Greater Than or Equals
■ Is Less Than or Equals
■ Equals
■ Range
Description: Specifies the Shadow IT threat score of the user, above or below a certain value, or
within a range you specify. This attribute applies only to Securlet policies.

User is Internal
Value:
■ True
■ False
Description: Specifies whether or not the user is part of your organization. Symantec Web Security
Service does not use this attribute.
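
Several attributes in this chapter offer three ways to match a text value: exactly with case sensitivity, exactly without case sensitivity, or on a regular expression. The Python sketch below models those options. The match-type names (`exact`, `exact_ignore_case`, `regex`) are hypothetical labels, and treating a regular expression as a search anywhere in the value (rather than a full-string match) is an assumption of this example.

```python
import re


def attribute_matches(value, match, match_type):
    """Compare an attribute value against a Match entry.

    match_type uses hypothetical names chosen for this sketch:
    "exact", "exact_ignore_case", or "regex".
    """
    if match_type == "exact":
        return value == match
    if match_type == "exact_ignore_case":
        return value.casefold() == match.casefold()
    if match_type == "regex":
        # Assumption: the pattern may match anywhere in the value.
        return re.search(match, value) is not None
    raise ValueError(f"unknown match type: {match_type}")
```

With this model, `attribute_matches("svc-backup-01", r"^svc-", "regex")` is true, while the same pattern does not match `"alice"`. Anchoring the pattern explicitly (`^` and `$`) avoids depending on the search-versus-full-match assumption.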

Data exposure attributes


Data exposure attributes specify information about the documents that are stored in cloud data
repositories ("data at rest"). Symantec Web Security Services does not use any data exposure
attributes.

Table 42-3 Data exposure attributes

Attribute Value Description

Document Creation Date
Value:
■ After
■ Before
■ On or After
■ On or Before
■ On
■ Range
Description: Specifies the date the document was created.

Document Last Accessed
Value:
■ After
■ Before
■ On or After
■ On or Before
■ On
■ Range
Description: Specifies the date the document was last accessed.

Document Last Modified
Value:
■ After
■ Before
■ On or After
■ On or Before
■ On
■ Range
Description: Specifies the date the document was last modified.

Document Owner
Value:
■ Match
■ Match Type
Description: Specifies the name of the document owner. You can match exactly with or without case
sensitivity, or match on a regular expression.

Document Tag
Value:
■ Match
■ Match Type
Description: Specifies the metadata tag of the document. You can match exactly with or without case
sensitivity, or match on a regular expression.

Document Type
Value:
■ Match
■ Match Type
Description: Specifies the type of document. You can match exactly with or without case sensitivity,
or match on a regular expression.

Document is Exposed
Value:
■ True
■ False
Description: Specifies if the document is shared or accessible. The document is "exposed" when
shared with or accessible to everyone within your organization, or shared with or accessible to
anyone outside of your organization. If the document is only shared with certain members of your
organization, it is not considered an exposed document.

Document is Internal
Value:
■ True
■ False
Description: Specifies if the document is "internal." A document is considered internal if a member
of your organization created it.

Document is Internally Shared
Value:
■ True
■ False
Description: Specifies if the document is shared with or accessible to everyone within your
organization.

Document is Publically Exposed
Value:
■ True
■ False
Description: Specifies if the document is shared with or accessible to everyone outside your
organization. Such documents are available to everyone on the Internet.

Job ID
Value:
■ Match
■ Match Type
Description: Specifies the job identifier that is associated with the document. You can match
exactly with or without case sensitivity, or match on a regular expression.

Service Classification
Value:
■ Match
■ Match Type
Description: Specifies the Shadow IT service classification. You can match exactly with or without
case sensitivity, or match on a regular expression. Symantec Web Security Service does not use this
attribute.

Service Rating
Value:
■ Is Greater Than
■ Is Less Than
■ Is Greater Than or Equals
■ Is Less Than or Equals
■ Equals
■ Range
Description: Specifies the Shadow IT service score rating, above or below a certain value, or within
a range you specify. Symantec Web Security Service does not use this attribute.

SharePoint Site Name
Value:
■ Match
■ Match Type
Description: Specifies the name of a SharePoint Site. You can match exactly with or without case
sensitivity, or match on a regular expression. Symantec Web Security Service does not use this
attribute.
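
The relationship between the Document is Exposed, Document is Internally Shared, and Document is Publically Exposed attributes can be summarized in a small Python model. This sketch only restates the definitions given in the table; the `Sharing` class and its field names are hypothetical and do not correspond to any product data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sharing:
    """Hypothetical summary of how a cloud document is shared."""
    everyone_in_org: bool = False     # shared with or accessible to the whole organization
    anyone_outside_org: bool = False  # shared with or accessible to people outside it
    named_members: tuple = ()         # shared only with certain members


def is_internally_shared(sharing):
    return sharing.everyone_in_org


def is_publicly_exposed(sharing):
    return sharing.anyone_outside_org


def is_exposed(sharing):
    """Exposed means organization-wide or external sharing, per the table."""
    return is_internally_shared(sharing) or is_publicly_exposed(sharing)
```

Under these definitions, a document shared only with `named_members` is not exposed, which matches the last sentence of the Document is Exposed description.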

Data transfer attributes


Data transfer attributes specify information about data moving over the network ("data in
motion").

Table 42-4 Data transfer attributes

Attribute Value Description

Browser
Value:
■ Match
■ Match Type
Description: Specifies the name of the web browser that is associated with the detection request.
You can match exactly with or without case sensitivity, or match on a regular expression.

Country
Value: Select a country from the drop-down list of country names.
Description: Specifies the name of the country that is associated with the detection request.
Symantec Web Security Service does not use this attribute.

Device Inside Office
Value:
■ True
■ False
Description: Specifies if the device associated with the detection request is located within your
office. Symantec Web Security Service does not use this attribute.

Device OS
Value:
■ Match
■ Match Type
Description: Specifies the operating system of the device that is associated with the detection
request. You can match exactly with or without case sensitivity, or match on a regular expression.
Symantec Web Security Service does not use this attribute.

Device Type
Value:
■ Match
■ Match Type
Description: Specifies the type of device that is associated with the detection request. You can
match exactly with or without case sensitivity, or match on a regular expression. Symantec Web
Security Service does not use this attribute.

Device is Compliant
Value:
■ True
■ False
Description: Specifies whether or not the device is compliant, based on information from your mobile
device management system. Symantec Web Security Service does not use this attribute.

Device is Managed
Value:
■ True
■ False
Description: Specifies whether or not your organization manages the device, based on information
from your mobile device management system. Symantec Web Security Service does not use this
attribute.

Device is Personal
Value:
■ True
■ False
Description: Specifies whether or not the user owns the device, based on information from your
mobile device management system. Symantec Web Security Service does not use this attribute.

Device is Trusted
Value:
■ True
■ False
Description: Specifies whether or not the device is trusted, based on information from your mobile
device management system. Symantec Web Security Service does not use this attribute.

HTTP Method
Value:
■ GET
■ PUT
■ DELETE
■ POST
■ Custom
Description: Specifies the method that is used in the HTTP traffic that is submitted for inspection.

Network Direction
Value:
■ Upload
■ Download
■ Custom
Description: Specifies the network direction of the message that is submitted for inspection.

Recipient IP
Value:
■ Match
■ Match Type
Description: Specifies the IP address of the message recipient. You can match exactly with or
without case sensitivity, or match on a regular expression.

Recipient Port
Value:
■ Is Greater Than
■ Is Less Than
■ Is Greater Than or Equals
■ Is Less Than or Equals
■ Equals
■ Range
Description: Specifies the network port of the message recipient.

Sender IP
Value:
■ Match
■ Match Type
Description: Specifies the IP address of the message sender. You can match exactly with or without
case sensitivity, or match on a regular expression.

Sender Port
Value:
■ Is Greater Than
■ Is Less Than
■ Is Greater Than or Equals
■ Is Less Than or Equals
■ Equals
■ Range
Description: Specifies the network port of the message sender. Symantec Web Security Service does
not use this attribute.

Site Classification
Value:
■ Match
■ Match Type
Description: Specifies the type of site that is associated with the detection request, such as
"Social Media." You can match exactly with or without case sensitivity, or match on a regular
expression.

Site Risk Score
Value:
■ Is Greater Than
■ Is Less Than
■ Is Greater Than or Equals
■ Is Less Than or Equals
■ Equals
■ Range
Description: Specifies a numeric value indicating the risk level of the target site.

Source Protocol
Value:
■ Match
■ Match Type
Description: Specifies the OSI Level 7 network protocol for the detection request. For example,
SMTP, HTTP, FTP, and so on. You can match exactly with or without case sensitivity, or match on a
regular expression.

User Agent
Value:
■ Match
■ Match Type
Description: Specifies the user agent for the detection request that is related to HTTP traffic. You
can match exactly with or without case sensitivity, or match on a regular expression.

Custom attributes
Custom attributes let you enter any attributes for your Application Detection policies that are
not provided by default.

Table 42-5 Custom attributes

Attribute Value Description

String Attribute
Value:
■ Name
■ Match
■ Match Type
Description: Specifies a custom string attribute. Name your attribute, then specify the match and
match type for your string. You can match exactly with or without case sensitivity, or match on a
regular expression.

Numeric Attribute
Value:
■ Name
■ Is Greater Than
■ Is Less Than
■ Is Greater Than or Equals
■ Is Less Than or Equals
■ Equals
■ Range
Description: Specifies a custom numeric attribute. Name your attribute, then specify the numeric
property and value.

Boolean Attribute
Value:
■ Name
■ True
■ False
Description: Specifies a custom Boolean attribute. Name your attribute, then specify the Boolean
value.

Date Attribute
Value:
■ Name
■ After
■ Before
■ On or After
■ On or Before
■ On
■ Range
Description: Specifies a custom date attribute. Name your attribute, then specify the date property
and value.
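
The numeric and date comparison options that recur in these tables (Is Greater Than, On or Before, Range, and so on) can be modeled as ordinary comparisons. The Python sketch below is an illustration under stated assumptions, not the product's evaluation logic: the `evaluate` function is hypothetical, and treating Range as inclusive at both ends is an assumption.

```python
import operator
from datetime import date

# Maps the option labels from the tables to comparison functions.
OPERATORS = {
    "Is Greater Than": operator.gt,
    "Is Less Than": operator.lt,
    "Is Greater Than or Equals": operator.ge,
    "Is Less Than or Equals": operator.le,
    "Equals": operator.eq,
    "After": operator.gt,
    "Before": operator.lt,
    "On or After": operator.ge,
    "On or Before": operator.le,
    "On": operator.eq,
}


def evaluate(value, option, operand, upper=None):
    """Apply a numeric or date option to an attribute value."""
    if option == "Range":
        if upper is None:
            raise ValueError("Range needs a lower and an upper bound")
        # Assumption: both ends of the range are inclusive.
        return operand <= value <= upper
    return OPERATORS[option](value, operand)
```

The same function covers numbers and `datetime.date` values because both support the comparison operators, for example `evaluate(date(2019, 8, 19), "On or Before", date(2019, 8, 19))`.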
Chapter 43
Supported file formats for
detection
This chapter includes the following topics:

■ Overview of detection file format support

■ Supported formats for file type identification

■ Supported formats for content extraction

■ Supported encapsulation formats for subfile extraction

■ Supported file formats for metadata extraction

Overview of detection file format support


Symantec Data Loss Prevention detection supports various file formats for performing the
following operations:
■ File type identification
■ File contents extraction
■ Subfile extraction
■ Document metadata extraction
Table 43-1 summarizes the file formats that Symantec Data Loss Prevention supports for file
type identification and content, subfile and metadata extraction.
You configure the system to identify individual file formats using the Message Attachment
or File Type Match condition. This condition performs a context-based match that only identifies
the file format type; it does not extract file contents. In addition, you must explicitly select the
individual file format(s) you want to detect.

See “About file type matching” on page 900.


When you use a content-based detection condition in a policy (such as Content Matches
Keyword), the system automatically extracts file contents for supported file formats (such as
DOCX, PPTX, XLSX, PDF). In addition, the system automatically extracts subfiles from
supported encapsulation file formats (such as ZIP, RAR, TAR).
See “Content matching conditions” on page 387.
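
To make the idea of subfile extraction concrete, the following Python sketch walks a ZIP archive, descends into nested ZIP files, and reports which extracted subfiles contain a keyword. It uses only the Python standard library and is a conceptual illustration, not a description of how Symantec Data Loss Prevention cracks files. The function names and the depth limit are assumptions of this example, and it handles only ZIP, not RAR or TAR.

```python
import io
import zipfile


def iter_payloads(name, data, depth=0, max_depth=5):
    """Yield (path, bytes) for each non-archive file, descending into ZIPs."""
    if depth <= max_depth and zipfile.is_zipfile(io.BytesIO(data)):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                yield from iter_payloads(
                    f"{name}/{info.filename}",
                    archive.read(info),
                    depth + 1,
                    max_depth,
                )
    else:
        yield name, data


def find_keyword(name, data, keyword):
    """Return the paths of extracted subfiles whose bytes contain keyword."""
    needle = keyword.lower().encode("utf-8")
    return [
        path
        for path, payload in iter_payloads(name, data)
        if needle in payload.lower()
    ]
```

Each subfile is evaluated individually, so a keyword buried in an archive inside another archive is still reported with its full path. Formats such as DOCX are themselves ZIP containers, so this sketch would descend into them and search their raw XML parts rather than their rendered text.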
Lastly, you can enable metadata extraction for a limited number of document formats (such
as DOCX), and use keyword matching to detect document metadata.
See “About document metadata detection” on page 989.

Note: While there is some overlap among file types supported for extraction and for identification
(because if the system can crack the file it must be able to identify its type), the supported
formats for each operation are distinct and implemented using different match conditions. Far more
file formats are supported for type identification than for content extraction.

Table 43-1 File format support for detection operations

Operation type Description Configuration Supported formats

File type identification
Description: Symantec Data Loss Prevention does not rely on file extensions to identify the format.
File type is identified by the unique binary signature of the file format.
Configuration: Explicitly using the Message Attachment or File Type Match file property condition.
Supported formats: See “Supported formats for file type identification” on page 964.

File contents extraction
Description: File contents is any text-based content that can be viewed through the native or
source application.
Configuration: Implicitly using one or more content match conditions, including EDM, IDM, VML,
data identifiers, keyword, regular expressions.
Supported formats: See “Supported formats for content extraction” on page 980.

Subfile extraction (Subfile)
Description: Subfiles are files encapsulated in a parent file. Subfiles are extracted and processed
individually for identification and content extraction. If the subfile format is not supported by
default, a custom method can be used to detect and crack the file.
Configuration: Implicitly using one or more content match conditions, including EDM, IDM, VML,
data identifiers, keyword, regular expressions.
Supported formats: See “Supported encapsulation formats for subfile extraction” on page 987.

Metadata extraction (Metadata)
Description: Metadata is information about the file, such as author, version, or user-defined tags.
Generally limited to Microsoft Office documents (OLE-enabled) and Adobe PDF files. Metadata support
may differ between agent and server.
Metadata includes data-security tags that were created in Information Centric Tagging (ICT).
Configuration: Available for content-based match conditions. Must be enabled.
Supported formats: See “Supported file formats for metadata extraction” on page 989.
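
Identifying a file by its binary signature rather than its extension can be illustrated with a few well-known leading byte sequences. The Python sketch below is a generic example of the technique and is not the signature database that Symantec Data Loss Prevention uses. The `identify` function and the short signature list are assumptions chosen for illustration.

```python
# A few widely documented leading byte sequences ("magic numbers").
SIGNATURES = [
    (b"%PDF-", "Adobe PDF"),
    (b"PK\x03\x04", "ZIP container"),
    (b"\x1f\x8b", "GZIP"),
    (b"7z\xbc\xaf\x27\x1c", "7-Zip Compressed File (7Z)"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
]


def identify(data):
    """Return a format label based on the first bytes of the file."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "Unknown"
```

Because the check looks only at content, a PDF renamed to `report.txt` is still reported as `Adobe PDF`, and data with no recognized signature falls through to `Unknown`.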

Supported formats for file type identification


Table 43-2 lists the file types you can identify using the Message Attachment or File Type
Match policy condition.
See “About file type matching” on page 900.
The Unknown file format identifies any format that is unknown to Symantec Data Loss
Prevention. The Unknown file format is only supported for file type identification. Use this type
in a file type rule to identify, and block, files that are not known to Data Loss Prevention.
If the file format you want to identify is not supported, you can use the Symantec Data Loss
Prevention Scripting Language to identify custom file types.
See “About custom file type identification” on page 901.

Note: The Message Attachment or File Type Match condition is a context-based match
condition that only supports file type identification. This condition does not support file contents
extraction. To extract file contents for policy evaluation you must use a content-based detection
rule. See “Supported formats for content extraction” on page 980.

See “Overview of detection file format support” on page 962.

Table 43-2 Formats supported for file type identification

Message Attachment or File Type Match formats

7-Zip Compressed File (7Z)

Ability Office (SS)



Ability Office (DB)

Ability Office (GR)

Ability Office (WP)

Ability Office (COM)

ACT

Adobe FrameMaker

Adobe Maker Interchange Format (FrameMaker)

Adobe FrameMaker Markup Language

Adobe PDF

AES Multiplus Comm

Aldus Freehand (Macintosh)

Aldus PageMaker (DOS)

Aldus PageMaker (Macintosh)

Amiga IFF-8SVX sound

Amiga MOD sound

ANSI

Apple Double

Apple Single

Applix Alis

Applix Asterix

Applix Graphics

Applix Presents

Applix Spreadsheets

Applix Words

ARC/PAK Archive

ASCII

ASCII-armored PGP encoded

ASCII-armored PGP Public Keyring

ASCII-armored PGP signed

Audio Interchange File Format

AutoCAD Drawing

AutoCAD Drawing Exchange

AutoDesk Animator FLIC Animation

AutoDesk Animator Pro FLIC Animation

AutoDesk WHIP

AutoShade Rendering

BinHex

CADAM Drawing (CDD) (server only)

CADAM Drawing Overlay

CATIA Drawing (CAT) (server only)

CCITT Group 3 1-Dimensional (G31D)

COMET TOP Word

Comma Separated Values

Compactor/Compact Pro Archive

Computer Graphics Metafile

Convergent Tech DEF Comm.

Corel Draw CMX

Corel Presentations

Corel Quattro Pro (WB2)

Corel Quattro Pro (WB3)



Corel WordPerfect Linux

Corel WordPerfect Macintosh

Corel WordPerfect Windows (WO)

Corel WordPerfect Windows (WPD)

CorelDRAW

cpio Archive (UNIX)

cpio Archive (VAX)

cpio Archive (SUN)

CPT Communication

Creative Voice (VOC) sound

Curses Screen Image (UNIX)

Curses Screen Image (VAX)

Curses Screen Image (SUN)

Data Interchange Format

Data Point VISTAWORD

dBase Database

DCX Fax

DCX Fax System

DEC WPS PLUS

DECdx

Desktop Color Separation (DCS)

Device Independent file (DVI)

DG CEOwrite

DG Common Data Stream (CDS)

DIF Spreadsheet

Digital Document Interchange Format (DDIF)

Disk Doubler Compression

DisplayWrite

Domino XML Language

EMC EmailXtender Container File (EMX)

ENABLE

ENABLE Spreadsheet (SSF)

Encapsulated PostScript (raster)

Enhanced Metafile

Envoy (EVY)

Executable- Other

Executable- UNIX

Executable- VAX

Executable- SUN

FileMaker (Macintosh)

File Share Encryption

Folio Flat File

Framework

Framework II

FTP Session Data

Fujitsu Oasys

GEM Bit Image

GIF

Graphics Environment Manager (GEM VDI)

GZIP

Haansoft Hangul (Hangul 2010 SE+)

Harvard Graphics

Hewlett-Packard

Honey Bull DSA101

HP Graphics Language (HPG) (server only)

HP Printer Control Language (PCL)

HTML

IBM 1403 Line Printer

IBM DCA/RFT(Revisable Form Text)

IBM DCA-FFT

IBM DCF Script

iCalendar

Informix SmartWare II

Informix SmartWare II Communication File

Informix SmartWare II Database

Informix SmartWare Spreadsheet

Interleaf

Java Archive

JPEG

JPEG File Interchange Format (JFIF)

JustSystems Ichitaro

KW ODA G31D (G31)

KW ODA G4 (G4)

KW ODA Internal G32D (G32)

KW ODA Internal Raw Bitmap (RBM)


Lasergraphics Language

Legato Extender

Link Library- Other

Link Library UNIX

Link Library VAX

Link Library SUN

Lotus 1-2-3 (123)

Lotus 1-2-3 (WK4)

Lotus 1-2-3 Charts

Lotus AMI Pro

Lotus AMI Professional Write Plus

Lotus AMIDraw Graphics

Lotus Freelance Graphics

Lotus Freelance Graphics 2

Lotus Notes Bitmap

Lotus Notes CDF

Lotus Notes database

Lotus Pic

Lotus Screen Cam

Lotus SmartMaster

Lotus Word Pro

Lyrix MacBinary

MacBinary

Macintosh Raster

MacPaint

Macromedia (Adobe) Director

Macromedia (Adobe) Flash

MacWrite

MacWrite II

MASS-11

Micrografx Designer

Microsoft Access

Microsoft Advanced Systems Format (ASF)

Microsoft Compressed Folder (LZH)

Microsoft Compressed Folder (LHA)

Microsoft Device Independent Bitmap

Microsoft Excel Charts

Microsoft Excel Macintosh

Microsoft Excel Windows

Microsoft Excel Windows XML

Microsoft Office Access (ACCDB)

Microsoft Office Drawing

Microsoft OneNote

Microsoft Outlook Personal Folder

Microsoft Outlook

Microsoft Outlook Express

Microsoft PowerPoint Macintosh

Microsoft PowerPoint PC

Microsoft PowerPoint Windows

Microsoft PowerPoint Windows XML


Microsoft PowerPoint Windows Macro-Enabled XML

Microsoft PowerPoint Windows XML Template

Microsoft PowerPoint Windows Macro-Enabled XML Template

Microsoft PowerPoint Windows XML Show

Microsoft PowerPoint Windows Macro-Enabled Show

Microsoft Project

Microsoft Publisher

Microsoft RMS Encrypted Office Binary File

Microsoft RMS Encrypted Open Packaging Conventions File

Microsoft Visio

Microsoft Visio 2013

Microsoft Visio 2013_Macro Format

Microsoft Visio 2013_Stencil Format

Microsoft Visio 2013_Stencil_Macro Format

Microsoft Visio 2013_Template Format

Microsoft Visio _Template_Macro

Microsoft Visio XML

Microsoft Wave Sound

Microsoft Windows Cursor (CUR) Graphics

Microsoft Windows Group File

Microsoft Windows Help File

Microsoft Windows Icon (ICO)

Microsoft Windows OLE 2 Encapsulation

Microsoft Windows Write

Microsoft Word (UNIX)


Microsoft Word Macintosh

Microsoft Word PC

Microsoft Word Windows

Microsoft Word Windows XML

Microsoft Word Windows Template XML

Microsoft Word Windows Macro-Enabled Template XML

Microsoft Works (Macintosh)

Microsoft Works

Microsoft Works Communication (Macintosh)

Microsoft Works Communication (Windows)

Microsoft Works Database (Macintosh)

Microsoft Works Database (PC)

Microsoft Works Database (Windows)

Microsoft Works Spreadsheet (S30)

Microsoft Works Spreadsheet (S40)

Microsoft Works Spreadsheet (Macintosh)

Microstation

MIDI

MORE Database Outliner (Macintosh)

MPEG-1 Audio layer 3

MPEG-1 Video

MPEG-2 Audio

MS DOS Batch File format

MS DOS Device Driver

MultiMate 4.0

Multiplan Spreadsheet

Navy DIF

NBI Async Archive Format

NBI Net Archive Format

Netscape Bookmark file

NeWS font file (SUN)

NeXT/Sun Audio

NIOS TOP

Nota Bene

Nurestor Drawing (NUR) (server only)

Oasis Open Document Format (ODT)

Oasis Open Document Format (ODS)

Oasis Open Document Format (ODP)

Object Module UNIX

Object Module VAX

Object Module SUN

ODA/ODIF

ODA/ODIF (FOD 26)

Office Writer

OLE DIB object

OLIDIF

OmniOutliner (OO3)

OpenOffice Calc (SXC)

OpenOffice Calc (ODS)

OpenOffice Impress (SXI)


OpenOffice Impress (SXP)

OpenOffice Impress (ODP)

OpenOffice Writer (SXW)

OpenOffice Writer (ODT)

Open PGP

OS/2 PM Metafile Graphics

Paradox (PC) Database

PC COM executable

PC Library Module

PC Object Module

PC PaintBrush

PC True Type Font

PCD Image

PeachCalc Spreadsheet

Persuasion Presentation

PEX Binary Archive (SUN)

PGP Compressed Data

PGP Encrypted Data

PGP Public Keyring

PGP Secret Keyring

PGP Signature Certificate

PGP Signed and Encrypted Data

PGP Signed Data

Philips Script

PKZIP

Plan Perfect

Portable Bitmap Utilities (PBM)

Portable Greymap Utilities (PGM)

Portable Network Graphics

Portable Pixmap Utilities (PPM)

PostScript File

PRIMEWORD

Program Information File

Q & A for DOS

Q & A for Windows

Quadratron Q-One (V1.93J)

Quadratron Q-One (V2.0)

Quark Express (Macintosh)

QuickDraw 3D Metafile (3DMF)

QuickTime Movie

RAR archive

Real Audio

Reflex Database

Rich Text Format

RIFF Device Independent Bitmap

RIFF MIDI

RIFF Multimedia Movie

SAMNA Word IV

Serialized Object Format (SOF) Encapsulation

SGI RGB Image


SGML

Simple Vector Format (SVF)

SMTP document

SolidWorks Drawing (SLDASM, SLDPRT, SLDDRW)

StarOffice Calc (SXC)

StarOffice Calc (ODS)

StarOffice Impress (SXI)

StarOffice Impress (SXP)

StarOffice Impress (ODP)

StarOffice Writer (SXW)

StarOffice Writer (ODT)

Stuff It Archive (Macintosh)

Sun Raster Image

SUN vfont definition

Supercalc Spreadsheet

SYLK Spreadsheet

Symphony Spreadsheet

Tagged Image File

Tape Archive

Targon Word (V 2.0)

Text Mail (MIME)

Transport Neutral Encapsulation Format

Truevision Targa

Ultracalc Spreadsheet

Unicode Text

Uniplex (V6.01)

Uniplex Ucalc Spreadsheet

UNIX Compress

UNIX SHAR Encapsulation

UNKNOWN

Usenet format

UUEncoding

Vcard

VCF

Volkswriter

VRML

Wang Office GDL Header Encapsulation

WANG PC

Wang WITA

WANG WPS Comm.

Windows Animated Cursor

Windows Bitmap

Windows C++ Object Storage

Windows Icon Cursor

Windows Metafile

Windows Micrografx Draw (DRW)

Windows Palette

Windows Media Video (WMV)

Windows Media Audio (WMA)

Windows Video (AVI)


WinZip (unzip reader)

WinZip

Word Connection

WordERA (V 1.0)

WordMARC word processor

WordPad

WordPerfect General File

WordPerfect Graphics 1

WordPerfect Graphics 2

WordStar

WordStar 2000

WordStar 6.0

WriteNow

Writing Assistant word processor

X Bitmap (XBM)

X Image

X Pixmap (XPM)

Xerox 860 Comm.

Xerox Writer word processor

XHTML

XML (generic)

XML Paper Specification

XyWrite

Supported formats for content extraction


Symantec Data Loss Prevention cracks more than 100 file formats for performing content
extraction. You use content-based detection conditions to crack a file and extract its contents.
See “Content matching conditions” on page 387.
Table 43-3 lists the various file format categories whose content Symantec Data Loss Prevention
can extract. Refer to the associated link for the individual file formats supported for that category.
See “Overview of detection file format support” on page 962.

Table 43-3 Supported file format categories for content extraction

File format category Default support list

Word-processing file formats See “Supported word-processing formats for content extraction” on page 980.

Presentation file formats See “Supported presentation formats for content extraction” on page 982.

Spreadsheet file formats See “Supported spreadsheet formats for content extraction” on page 983.

Text and markup file formats See “Supported text and markup formats for content extraction” on page 984.

Email file formats See “Supported email formats for content extraction” on page 985.

CAD file formats See “Supported CAD formats for content extraction” on page 985.

Graphics file formats See “Supported graphics formats for content extraction” on page 986.

Database file formats See “Supported database formats for content extraction” on page 986.

Microsoft Office Open XML formats See “About high-performance content extraction for Office Open XML formats” on page 996.

Other file formats See “Other file formats supported for content extraction” on page 986.

Encapsulation file formats See “Supported encapsulation formats for subfile extraction” on page 987.

Supported word-processing formats for content extraction


Table 43-4 lists the word-processing file formats whose content Symantec Data Loss Prevention
can extract for policy evaluation.

Table 43-4 Supported word-processing file formats for content extraction

Format Name Format Extension

Adobe Maker Interchange Format (FrameMaker) MIF

Apple iWork Pages PAGES


ApplixWords AW

Corel WordPerfect Linux WPS

Corel WordPerfect Macintosh WPS

Corel WordPerfect Windows WO

Corel WordPerfect Windows WPD

DisplayWrite IP

Folio Flat file FFF

Fujitsu Oasys OA2

Haansoft Hangul HWP

IBM DCA/RFT (Revisable Form Text) DC

JustSystems Ichitaro JTD

Lotus AMI Pro SAM

Lotus AMI Professional Write Plus AMI

Lotus Word Pro LWP

Lotus SmartMaster MWP

Microsoft Word PC DOC

Microsoft Word Windows DOC

Microsoft Word Windows XML DOCX

Microsoft Word Windows Template XML DOTX

Microsoft Word Windows Macro-Enabled Template XML DOTM

Microsoft Word Macintosh DOC

Microsoft Works WPS

Microsoft Windows Write WRI

Microsoft OneNote ONE

OpenOffice Writer SXW

OpenOffice Writer ODT

StarOffice Writer SXW

StarOffice Writer ODT

WordPad RTF

XML Paper Specification XPS

XyWrite XY4

Supported presentation formats for content extraction


Table 43-5 lists the presentation file formats whose content Symantec Data Loss Prevention
can extract for policy evaluation.

Table 43-5 Supported presentation file formats for content extraction

Format Name Format Extension

Apple iWork Keynote KEYNOTE

Applix Presents AG

Corel Presentations SHW

Lotus Freelance Graphics PRZ

Lotus Freelance Graphics 2 PRE

Macromedia Flash SWF

Microsoft PowerPoint Windows PPT

Microsoft PowerPoint PC PPT

Microsoft PowerPoint Windows XML PPTX

Microsoft PowerPoint Windows Macro-Enabled XML PPTM

Microsoft PowerPoint Windows XML Template POTX

Microsoft PowerPoint Windows Macro-Enabled XML Template POTM

Microsoft PowerPoint Windows XML Show PPSX


Microsoft PowerPoint Windows Macro-Enabled Show PPSM

Microsoft PowerPoint Macintosh PPT

OpenOffice Impress SXI

OpenOffice Impress SXP

OpenOffice Impress ODP

StarOffice Impress SXI

StarOffice Impress SXP

StarOffice Impress ODP

Supported spreadsheet formats for content extraction


Table 43-6 lists the spreadsheet file formats whose content Symantec Data Loss Prevention
can extract for policy evaluation.

Table 43-6 Supported spreadsheet file formats for content extraction

Format Name Format Extension

Apple iWork Numbers NUMBERS

Applix Spreadsheets AS

Comma Separated Values CSV

Corel Quattro Pro WB2

Corel Quattro Pro WB3

Data Interchange Format DIF

Lotus 1-2-3 123

Lotus 1-2-3 WK4

Lotus 1-2-3 Charts 123

Microsoft Excel Windows XLS

Microsoft Excel Windows XML XLSX


Microsoft Excel Charts XLS

Microsoft Excel 2007 Binary XLSB

Microsoft Excel Macintosh XLS

Microsoft Works Spreadsheet S30

Microsoft Works Spreadsheet S40

OpenOffice Calc SXC

OpenOffice Calc ODS

StarOffice Calc SXC

StarOffice Calc ODS

Supported text and markup formats for content extraction


Table 43-7 lists the text and markup file formats whose content Symantec Data Loss Prevention
can extract for policy evaluation.

Table 43-7 Supported text and markup file formats for content extraction

Format Name Format Extension

ANSI TXT

ASCII TXT

HTML HTM

Microsoft Excel Windows XML XML

Microsoft Word Windows XML XML

Microsoft Visio XML VDX

Oasis Open Document Format ODT

Oasis Open Document Format ODS

Oasis Open Document Format ODP

Rich Text Format RTF


Unicode Text TXT

XHTML HTM

XML (generic) XML

Supported email formats for content extraction


Table 43-8 lists the email file formats whose content Symantec Data Loss Prevention can
extract for evaluation.

Table 43-8 Supported email file formats for content extraction

Format Name Format Extension

Domino XML Language DXL

EMC EmailXtender Native Message ONM

Microsoft Outlook MSG

Microsoft Outlook Express EML

Text Mail (MIME) various

Transport Neutral Encapsulation Format various

Supported CAD formats for content extraction


Table 43-9 lists the computer-aided design (CAD) file formats whose content Symantec Data
Loss Prevention can extract for evaluation.

Table 43-9 Supported CAD file formats

Format Name Format Extension

AutoCAD Drawing DWG

AutoCAD Drawing Exchange DXF

Microsoft Visio 2013 VSD

Microsoft Visio XML VSDX

Microsoft Visio 2013_Macro VSDM


Microsoft Visio 2013_Stencil VSSX

Microsoft Visio 2013_Stencil_Macro VSSM

Microsoft Visio 2013_Template VSTX

Microsoft Visio 2013_Template_Macro VSTM

Microstation DGN

Supported graphics formats for content extraction


Table 43-10 lists the graphics file formats whose content Symantec Data Loss Prevention can
extract for evaluation.

Table 43-10 Supported graphics file formats for content extraction

Format Name Format Extension

Enhanced Metafile EMF

Lotus Pic PIC

Tagged Image File (metadata only) TIFF

Windows Metafile WMF

Supported database formats for content extraction


The following table lists the database file formats whose content Symantec Data Loss Prevention
can extract for policy evaluation.

Table 43-11 Crackable database file formats

Format Name Format Extension

Microsoft Access MDB

Microsoft Project MPP

Other file formats supported for content extraction


Table 43-12 lists other file formats whose content Symantec Data Loss Prevention can extract
for policy evaluation.

Table 43-12 Other supported formats for content extraction

Format name Format extension

Adobe PDF PDF

iCalendar ICS

MPEG-1 Audio layer 3 (metadata only) MP3

Microsoft Windows Backup Utility File BKF

Microsoft Rights Management protected files
■ PFILE
■ Microsoft Office 2003 and older
■ Files that use Open Packaging Conventions (OPC) file technology, including
Office Open XML (including Office 2007 and greater), and XML Paper
Specification (XPS)
Note: This type of content extraction is only supported on detection servers
running on Windows servers.

File Share Encryption (PGP Netshare) You can decrypt Symantec File Share encrypted files and extract file contents for
policy evaluation using the File Share plugin. Refer to the Symantec Data Loss
Prevention Encryption Insight Implementation Guide.
Note: Encryption Insight is only available with Network Discover.

Custom You can write a plug-in to perform content, subfile, and metadata extraction
operations on custom file formats. Refer to the Symantec Data Loss Prevention
Content Extraction Plug-in Developers Guide.
Note: Content extraction plug-ins are limited to detection servers.

Virtual Card File VCF and VCARD electronic business card files

Supported encapsulation formats for subfile extraction
Symantec Data Loss Prevention supports various encapsulation formats for subfile extraction,
such as ZIP, RAR, and TAR. The system automatically performs subfile extraction for supported
formats using content-based match conditions. Subfile extraction is a subset of content
extraction in that, if the system is successful in extracting a subfile from a supported
encapsulated file, the system automatically extracts the text-based subfile contents if the subfile
format is supported for content extraction.
See “Overview of detection file format support” on page 962.
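The relationship between subfile extraction and content extraction described above can be pictured with a short sketch. This is a conceptual illustration only, not the product's implementation: the two extension sets are small hypothetical subsets of the tables in this chapter, only ZIP archives are handled, and the names `extract_text` and `make_zip` are invented for the example.

```python
import io
import zipfile

# Hypothetical subset of the extensions supported for content extraction.
CONTENT_EXTENSIONS = {"txt", "csv", "htm", "xml", "rtf"}
# Hypothetical subset of the encapsulation formats; this sketch handles ZIP only.
ENCAPSULATION_EXTENSIONS = {"zip"}


def extension(name):
    """Return the lowercase extension of a file name, or '' if it has none."""
    return name.rsplit(".", 1)[-1].lower() if "." in name else ""


def extract_text(name, data, depth=0, max_depth=5):
    """Yield (subfile_name, text) for each subfile whose format is extractable."""
    ext = extension(name)
    if ext in ENCAPSULATION_EXTENSIONS and depth < max_depth:
        # Subfile extraction: open the container and recurse into each member.
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                member_name = f"{name}/{info.filename}"
                yield from extract_text(
                    member_name, archive.read(info), depth + 1, max_depth
                )
    elif ext in CONTENT_EXTENSIONS:
        # Content extraction: only formats on the supported list yield text.
        yield name, data.decode("utf-8", errors="replace")


def make_zip(members):
    """Build an in-memory ZIP archive from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for member_name, payload in members.items():
            archive.writestr(member_name, payload)
    return buffer.getvalue()


inner = make_zip({"notes.txt": b"confidential"})
outer = make_zip({"inner.zip": inner, "photo.bin": b"\x00\x01"})
results = list(extract_text("outer.zip", outer))
print(results)
```

In this sketch the nested text file is reached through two levels of encapsulation, while `photo.bin` produces no text because its extension is not in the content-extraction set.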

Table 43-13 lists the encapsulation file formats from which Symantec Data Loss Prevention can
extract subfiles for content evaluation.

Table 43-13 Supported encapsulation formats for subfile extraction

Format Name Format Extension

7-Zip 7Z

BinHex HQX

GZIP GZ

iCalendar ICS

Java Archive JAR

Microsoft Cabinet CAB

Microsoft Compressed Folder LZH

Microsoft Compressed Folder LHA

Microsoft Visio 2013 VSD

Microsoft Visio 2013 XML VSDX

Microsoft Visio 2013_Macro VSDM

Microsoft Visio 2013_Stencil VSSX

Microsoft Visio 2013_Stencil_Macro VSSM

Microsoft Visio 2013_Template VSTX

Microsoft Visio 2013_Template_Macro VSTM

PKZIP ZIP

WinZip ZIP

RAR archive RAR

Tape Archive TAR

UNIX Compress Z

UUEncoding UUE

Virtual Card File VCF and VCARD electronic business card files

YENC YENC (server only)


Supported file formats for metadata extraction


Table 43-14 lists some of the file formats that Symantec Data Loss Prevention supports for
metadata detection, and provides some example metadata fields returned for those formats.
This list is not exhaustive and is provided for quick reference only. Other file formats may be
supported, and other custom fields may be returned. The best practice is to always use the
filter utility to verify metadata support for each file format you want to detect.
See “Always use the filter utility to verify file format metadata support” on page 991.

Table 43-14 Supported file formats for metadata detection

File formats Metadata Description

File formats: Microsoft Office documents, for example:
■ Word (DOC, DOCX)
■ Excel (XLS, XLSX)
■ PowerPoint (PPT, PPTX)
Metadata: For Microsoft Office documents, the system extracts Object Linking and
Embedding (OLE) metadata.
Description: Example fields:
■ Title
■ Subject
■ Author
■ Keywords
■ Other custom fields

File formats: Adobe PDF files
Metadata: For Adobe PDF files, the system extracts Document Information Dictionary (DID)
metadata. The system does not support Adobe Extensible Metadata Platform (XMP) metadata
extraction.
Description: Example fields:
■ Author
■ Title
■ Subject
■ Creation
■ Update dates

File formats: Microsoft Visio
Metadata: Supported format extensions

File formats: Other file formats (including binary and text)
Metadata: Use the filter utility to verify metadata extraction for other file formats.
Description: See “Always use the filter utility to verify file format metadata support”
on page 991.

File formats: Custom file formats
Metadata: Custom file type metadata
Description: Content extraction plug-in that supports the metadata extraction operation.

About document metadata detection


In addition to file content and subfile extraction, Symantec Data Loss Prevention supports
metadata extraction for many file formats. File format metadata is data about a file that is
stored as file properties. By default metadata extraction is disabled because it can lead to false
positives. Used properly, metadata detection can enhance the accuracy of your content-based
policy rules.

For example, consider a business that uses Microsoft Office templates for their Word, Excel,
and PowerPoint documents. The business applies Microsoft OLE metadata properties in the
form of keywords to each template. The business has enabled metadata extraction and
deployed keyword policies to match on metadata keywords. These policies can detect keywords
in documents that are derived from the templates. The business also has the flexibility to use
policy exceptions to avoid generating incidents if certain metadata keywords are present.

Enabling server metadata detection


By default metadata extraction is disabled for detection servers.
To enable server metadata extraction
1 Log on to the Enforce Server administration console as a system administrator.
2 Navigate to the System > Servers and Detectors > Overview > Server/Detector Detail
- Advanced Settings screen for the detection server or cloud detector for which you want
to enable metadata extraction.
3 Click the Server Settings button.
4 Locate property ContentExtraction.EnableMetaData in the list.
5 Enter the value on for this property to enable metadata extraction.
6 Click Save to save the configuration.
7 Click Recycle the server at the Server Detail screen to restart the server.
8 Click Done at the Server Detail screen to complete the process.

Enabling endpoint metadata detection


By default metadata extraction is disabled for endpoints.
To enable endpoint metadata extraction
1 Log on to the Enforce Server administration console as a system administrator.
2 Navigate to the System > Agents > Agent Configuration screen for the endpoint server
for which you want to enable metadata extraction.
3 Create a new endpoint configuration for metadata detection, or select the default
configuration.
See “Create a separate endpoint configuration for metadata detection” on page 995.
4 Select the Advanced Agent Settings tab.
5 Locate property Detection.ENABLE_METADATA.str in the list.
6 Enter the value on for this property to enable metadata extraction.


7 Click Save and Apply to save the configuration change.

Best practices for using metadata detection


Table 43-15 lists best practices for implementing metadata detection, with links to
corresponding topics for detailed considerations.

Table 43-15 Considerations for implementing metadata detection

Consideration Topic

Always use filter to verify file format metadata support.
See “Always use the filter utility to verify file format metadata support” on page 991.

Enable metadata detection only if it is necessary.
See “Distinguish metadata from file content and application data” on page 993.

Avoid generating false positives by selecting keywords carefully.
See “Use and tune keyword lists to avoid false positives on metadata” on page 995.

Understand resource implications of endpoint metadata extraction.
See “Understand performance implications of enabling endpoint metadata detection” on page 995.

Create a separate endpoint configuration for metadata detection.
See “Create a separate endpoint configuration for metadata detection” on page 995.

Use response rules to add metadata tags to incidents.
See “Use response rules to tag incidents with metadata” on page 995.

Always use the filter utility to verify file format metadata support
To help you create policies that detect file format metadata, use the filter utility that is available
with any Symantec Data Loss Prevention detection or Endpoint Server installation. This utility
provides an easy way to determine which metadata fields the system returns for a given file
format. The utility generates output that contains the metadata the system will extract at runtime
for each file format you test using filter.
The procedure “To verify file format metadata extraction support using filter” describes how
to use the filter utility. It is recommended that you always follow this process so that you
can create and tune policies that accurately detect file format metadata.

Note: The data output by the filter utility is in ASCII format. Symantec Data Loss Prevention
processes data in Unicode format. Therefore, you may rely on the existence of the fields
returned by the filter utility, but the metadata detected by Symantec Data Loss Prevention may
not look identical to the filter output.

To verify file format metadata extraction support using filter


1 On the file system where a detection server is installed, start a command prompt session.
2 Change directory to where the filter utility is located.
For example, on a default 64-bit Windows installation you would issue the following
command:
cd \Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\plugins\contentextraction\Verity\x64

3 Issue the following command to run the filter program and display its syntax and optional
parameters.
filter -help

As indicated by the help, you use the following syntax to execute the filter utility:
filter [options] inputfile outputfile

The inputfile is an instance of the file format you want to verify. The outputfile is a
file the filter utility writes the extracted data to.
Note the following extraction options:
■ To verify metadata extraction, use the "get doc summary info" option: -i
■ To verify content extraction, use no options: filter inputfile outputfile

4 Execute filter against an instance of the file format to verify metadata extraction.
For example, on Windows you would issue the following command:
filter -i \temp\myfile.doc \temp\metadata_output.txt

Where myfile.doc is a file containing metadata you want to verify and have copied to the
\temp directory, and metadata_output.txt is the name of the file you want the system to
generate and write the extracted data to.
5 Review the filter output. The output data should be similar to the following:

1 2 1252 CodePage 1 1 "S" Title 0 0 (null) 1 1 "P" Author 0 0 (null)
0 0 (null) 0 1 "" (null) 1 1 "m" LastAuthor 1 1 "1" RevNumber
1 3 6300 Minutes EditTime 1 3 Mon Aug 27 11:53:07 2007 LastPrinted

6 Refer to the following tables for an explanation of each metadata extraction field output
by the filter utility.
Table 43-16 repeats the output from Step 5, formatted for readability.
Table 43-17 explains each column field.
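If you need to check many sample files, the command from Step 4 can be scripted. The following is a minimal sketch, assuming Python is available on the server and that the filter executable is invoked exactly as shown in the procedure (`filter -i inputfile outputfile`). The function names, the directory arguments, and the `.metadata.txt` suffix are hypothetical. The meaning of the utility's exit code is not documented here, so the sketch records it without interpreting it.

```python
import subprocess
from pathlib import Path


def build_filter_command(filter_exe, input_file, output_file, metadata=True):
    """Return the argument list for one filter run.

    metadata=True adds the -i ("get doc summary info") option;
    metadata=False uses no options, which verifies content extraction.
    """
    command = [str(filter_exe)]
    if metadata:
        command.append("-i")
    command += [str(input_file), str(output_file)]
    return command


def verify_samples(filter_exe, sample_dir, output_dir):
    """Run filter -i on every file in sample_dir.

    Returns {sample name: (output path, exit code)}. The exit code is
    recorded as-is because its semantics are an unknown in this sketch.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    outputs = {}
    for sample in sorted(Path(sample_dir).iterdir()):
        if not sample.is_file():
            continue
        target = output_dir / (sample.name + ".metadata.txt")
        completed = subprocess.run(build_filter_command(filter_exe, sample, target))
        outputs[sample.name] = (target, completed.returncode)
    return outputs


# Show the command that would be issued for the example in Step 4.
print(build_filter_command("filter", r"\temp\myfile.doc", r"\temp\metadata_output.txt"))
```

Review each generated output file as described in Step 5 before relying on a metadata field in a policy.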

Table 43-16 Example filter metadata output

Column 1 Column 2 Column 3 Column 4

1 2 1252 CodePage

1 1 "S" Title

0 0 (null)

1 1 "P" Author

0 0 (null)

0 0 (null)

0 1 "" (null)

1 1 "m" LastAuthor

1 1 "1" RevNumber

1 3 6300 Minutes EditTime

1 3 Mon Aug 27 11:53:07 2007 LastPrinted

Table 43-17 Metadata fields generated by the filter utility

Column Description

Column 1 1 = valid field
0 = invalid field
Note: You may ignore rows where the first column is 0.

Column 2 The type of data:
1 = String
2 = Integer
3 = Date/Time
5 = Boolean

Column 3 The data payload for the field.

Column 4 The name of the field (empty or null if the field is invalid).
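The column meanings can be applied mechanically once a row has been separated into its four columns, as in Table 43-16. The sketch below assumes the rows are already split; it does not attempt to split the raw filter output, because payloads such as dates contain spaces. The function name `describe_row` and the dictionary layout are hypothetical.

```python
# Type codes for Column 2, as listed in Table 43-17.
TYPE_NAMES = {1: "String", 2: "Integer", 3: "Date/Time", 5: "Boolean"}


def describe_row(valid_flag, type_code, payload, field_name):
    """Return a dict for a valid row, or None for a row that may be ignored."""
    if int(valid_flag) == 0:
        # Column 1 is 0: the field is invalid and the row can be skipped.
        return None
    return {
        "name": field_name,
        "type": TYPE_NAMES.get(int(type_code), "Unknown"),
        "value": payload,
    }


# A few rows from Table 43-16, already split into four columns.
rows = [
    ("1", "2", "1252", "CodePage"),
    ("1", "1", '"S"', "Title"),
    ("0", "0", "", "(null)"),
    ("1", "3", "Mon Aug 27 11:53:07 2007", "LastPrinted"),
]

fields = [d for d in (describe_row(*row) for row in rows) if d is not None]
for field in fields:
    print(field["name"], field["type"], field["value"])
```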

Distinguish metadata from file content and application data


Do not confuse metadata extraction with content extraction or application data. Some text that
may appear to be metadata is extracted as content or application data. Table 43-18 describes
some types of data that are not extracted as file format metadata, to help you determine if
and when you need to enable metadata detection.

Note: This list is not exhaustive and is provided for quick reference only. There may be other
types of data that are not extracted as metadata. The best practice is to use the filter utility to
verify file format metadata support. See “Always use the filter utility to verify file format metadata
support” on page 991.

Table 43-18 Data not extracted as metadata

Content type Extraction method

Application data Application data including message transport information is extracted separately from
file format extraction. For all inbound messages, the system extracts message envelope
(header) and subject information as text at the application layer. The type of application
data that is extracted depends on the channels supported by the detection server or
endpoint.

Headers and footers Document header and footer text is extracted as content, not metadata. To avoid false
positives, it is recommended that you remove or whitelist headers and footers from
documents.

See “Use white listing to exclude non-sensitive content from partial matching”
on page 651.

See the Indexed Document Matching (IDM) chapter in the Symantec Data Loss
Prevention Administration Guide for details.

Markup text Markup text is extracted as content, not metadata. Markup text extraction is supported
for HTML, XML, SGML, and more. Markup text extraction is disabled by default.

See “Advanced server settings” on page 285.


See “Advanced agent settings” on page 2372.

See the "Advanced Server Settings" topic in the Symantec Data Loss Prevention
Administration Guide to enable it.

Hidden text Hidden text is extracted as content, not metadata. Hidden text extraction in the form
of tracked changes is supported for some Microsoft Office file formats. Hidden text
extraction is disabled by default.

See “Advanced server settings” on page 285.

See “Advanced agent settings” on page 2372.

See the "Advanced Server Settings" topic in the Symantec Data Loss Prevention
Administration Guide to enable it.

Watermarks Text-based watermarks are extracted as content, not metadata. Text-based watermark
detection is supported for Microsoft Word documents (versions 2003 and 2007). It is
not supported for other file formats.

Use and tune keyword lists to avoid false positives on metadata


Enabling metadata extraction can cause false positives because more text is checked for a
match. For example, if you have a policy that detects keywords and metadata extraction is
enabled, the policy reports a match if a keyword is present in the content or in the metadata.
Once the system has extracted the content and the metadata, the text is normalized and
streamed to the detection component for matching. The detection component has no knowledge
of the source of the text, whether it is application data, content, or metadata.
To detect file format metadata, you define keyword conditions for rules or exceptions that
contain keywords that are specific to one or more file formats. To avoid generating false
positives, clearly define the keyword lists in your policies. The keywords you use to detect
metadata should be unique and distinct from keywords or phrases you use to detect content.
Test and tune keyword lists to improve metadata detection accuracy.

Understand performance implications of enabling endpoint metadata detection

On the endpoint, enabling metadata extraction does not add overhead if no content rules are
deployed. If content rules are deployed to the endpoint, enabling metadata extraction may
introduce minor overhead because there is extra data to inspect. Test and tune your endpoint
policy keyword lists to ensure that metadata detection is efficient.

Create a separate endpoint configuration for metadata detection


When you enable endpoint metadata detection, consider creating a custom endpoint
configuration specifically for metadata detection. By doing so you can easily revert to the
default configuration if necessary.

Use response rules to tag incidents with metadata


You cannot use metadata detection to apply tags to inbound files or documents that generate
incidents. To tag such files or documents, consider using a FlexResponse plug-in.
See “About response rules” on page 1738.
See the Symantec Data Loss Prevention Administration Guide for details.
Chapter 44
Supported Office Open XML formats for high-performance content extraction
This chapter includes the following topics:

■ About high-performance content extraction for Office Open XML formats

■ Enabling high-performance content extraction for Office Open XML files

■ About metadata extraction for Office Open XML files

■ About subfile extraction for Office Open XML files

About high-performance content extraction for Office Open XML formats

High-performance content extraction for Office Open XML formats is enabled by default on
Symantec Data Loss Prevention cloud detectors. You can enable Office Open XML
high-performance content extraction on your on-premises detection servers. Office Open XML
content extraction is not available on the endpoint DLP Agent.
Enabling Office Open XML high-performance content extraction on your on-premises detection
servers significantly improves content extraction performance for such files.

Warning: Do not enable Office Open XML high-performance content extraction on detection
servers using Indexed Document Matching (IDM) policies.

Table 44-1 Office Open XML formats for high-performance content extraction

Format name                                               Format extension

Office Open XML Word Processing                           DOCX
Office Open XML Word Processing Template                  DOTX
Office Open XML Macro-enabled Word Processing             DOCM
Office Open XML Macro-enabled Word Processing Template    DOTM
Office Open XML Spreadsheet                               XLSX
Office Open XML Spreadsheet Template                      XLTX
Office Open XML Macro-enabled Spreadsheet                 XLSM
Office Open XML Macro-enabled Spreadsheet Template        XLTM
Office Open XML Spreadsheet Add-in                        XLAM
Office Open XML Presentation                              PPTX
Office Open XML Presentation Template                     POTX
Office Open XML Presentation Slide Show                   PPSX
Office Open XML Macro-enabled Presentation                PPTM
Office Open XML Macro-enabled Presentation Template       POTM
Office Open XML Presentation Macro-enabled Slide Show     PPSM
Office Open XML Presentation Add-in                       PPAM



Enabling high-performance content extraction for Office Open XML files

Warning: Do not enable Office Open XML high-performance content extraction on detection
servers using Indexed Document Matching (IDM) policies.

The following procedure describes how to enable Office Open XML high-performance content
extraction on your on-premises detection servers. Note that PowerPoint content extraction is
not enabled by default. If you want to extract content from PowerPoint files, follow the optional
third step in this procedure.
To enable Office Open XML high-performance content extraction
1 On your detection server, open the manifest.xml file, located in one of these locations:
■ Linux:
/opt/Symantec/DataLossPrevention/ContentExtractionService/15.5/Plugins/Protect/
plugins/contentextraction/OfficeOpenXMLPlugin

■ Windows: \Program
Files\Symantec\DataLossPrevention\ContentExtractionService\15.5\Plugins\
Protect\plugins\contentextraction\OfficeOpenXMLPlugin

2 Locate the plugin id="OfficeOpenXMLPlugin" line, and set the disabled value to
false. The resulting line should read as follows (line breaks added for legibility):

<plugin id="OfficeOpenXMLPlugin"
version="1.0"
spiVersion="1.1"
disabled="false"
extractsAllSubfiles="true">

3 (Optional): To enable PowerPoint content extraction, add the following lines to the
manifest.xml file:

<documentType type="pptx">
<supportedOperations>
<operation type="FileTypeIdentification"/>
<operation type="TextExtraction"/>
<operation type="SubFileExtraction"/>
<operation type="MetadataExtraction"/>
</supportedOperations>
</documentType>

4 Save and close the manifest.xml file.


5 Restart your detection server to apply the change.
6 Repeat steps 1-5 on all detection servers on which you want to enable Office Open XML
content extraction.
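
If you repeat step 2 on many detection servers, you may prefer to script the edit. The following Python sketch is illustrative only and is not part of Symantec Data Loss Prevention. It assumes that the plugin element in manifest.xml carries no XML namespace. The enable_plugin function name is hypothetical. The standard-library parser discards XML comments, so back up the original file and review the result before you restart the detection server.

```python
import xml.etree.ElementTree as ET

PLUGIN_ID = "OfficeOpenXMLPlugin"


def enable_plugin(manifest_xml: str) -> str:
    """Return the manifest text with disabled="false" on the plugin element.

    Assumption: the element is named "plugin" and has no XML namespace.
    """
    root = ET.fromstring(manifest_xml)
    # iter() also yields the root element, so this works whether the
    # plugin element is the document root or nested inside it.
    for element in root.iter("plugin"):
        if element.get("id") == PLUGIN_ID:
            element.set("disabled", "false")
            return ET.tostring(root, encoding="unicode")
    raise ValueError("no plugin element with id=" + PLUGIN_ID)
```

To use the sketch, read manifest.xml into a string, pass it to enable_plugin, and write the returned text back to the file. Steps 3 through 5 of the procedure still apply.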

About metadata extraction for Office Open XML files


High-performance content extraction for Office Open XML formats supports metadata extraction
in all localized languages. The following table lists the extracted metadata properties:

Table 44-2 Office Open XML metadata

Property type Property

Core properties Author

Category

ContentStatus

ContentType

Create_DTM

Description

Identifier

Keywords

Language

LastAuthor

LastPrinted

LastSave_DTM

RevNumber

Subject

Title

Version

Application properties AppName

AppVersion

CharCount

CharactersWithSpaces

Company

EditTime

HyperlinkBase

HyperlinksChanged

LineCount

LinksDirty

Manager

PageCount

Parcount

ScaleCrop

Security

SharedDoc

Template

TitleOfParts

WordCount

Custom properties RightsWatchMark, used by Symantec Information Centric Tagging

All other custom properties

About subfile extraction for Office Open XML files


High-performance content extraction for Office Open XML formats supports subfile extraction
for image files, Object Linking and Embedding (OLE) Compound Files, and Open Packaging
Convention (OPC) container files.

Image file extraction


Image file extraction supports Symantec Data Loss Prevention's Form Recognition and Optical
Character Recognition (OCR) Sensitive Image Recognition features.
See “About Form Recognition detection” on page 695.
See “Server configuration—basic” on page 705.
Symantec Data Loss Prevention supports content extraction for the following image formats:
■ Bitmap (BMP)
■ Portable Network Graphics (PNG)
■ Joint Photographic Experts Group (JPEG or JPG extensions)
■ Enhanced Metafile (EMF)
■ Windows Metafile (WMF)
There are two categories of EMF/WMF files:
■ Files attached by users directly to Office Open XML documents.
■ Thumbnail or icon files created by Office applications to represent files attached to Office
Open XML documents.
All EMF and WMF files are counted as images, and therefore count against the maximum
image extraction limit. If you find that you are reaching the maximum image extraction limit
due to a large number of EMF/WMF files, you may want to disable EMF/WMF file extraction.

OLE and OPC file extraction


Symantec Data Loss Prevention can extract files embedded in Office Open XML documents.
The following table lists the supported file formats and embedding types:

Table 44-3
File format Embedding type

Adobe PDF OLE

Bitmap OLE

Excel 97 Worksheet OLE/OPC

Excel Binary OLE/OPC

Excel Chart OLE/OPC

Excel Macro-enabled Worksheet OLE/OPC

Excel Worksheet OLE/OPC



Graph Chart OLE

OpenDocument Presentation OLE

OpenDocument Slide OLE

OpenDocument Text OLE

Package 1 (Non-Office files, all formats) OLE

Package 2 (Non-Office files, all formats) OLE

PowerPoint 97 Presentation OLE/OPC

PowerPoint 97 Slide OLE/OPC

PowerPoint Macro-enabled Presentation OLE/OPC

PowerPoint Macro-enabled Slide OLE/OPC

PowerPoint Presentation OLE/OPC

PowerPoint Slide OLE/OPC

Visio OLE

Word OLE/OPC

Word 97 OLE/OPC

Word Macro OLE/OPC

WordPad OLE

Configuring plug-in settings


Symantec recommends using the default settings for high-performance Office Open XML
content extraction. You may encounter situations in which you want to adjust some settings,
however. This section documents the plugin_settings.txt configuration file, available in
one of the following locations on your detection server:
■ On Linux: /opt/Symantec/DataLossPrevention/ContentExtractionService/
15.5/Plugins/Protect/plugins/contentextraction/OfficeOpenXMLPlugin

■ On Windows: \Program
Files\Symantec\DataLossPrevention\ContentExtractionService\
15.5\Plugins\Protect\plugins\contentextraction\OfficeOpenXMLPlugin

The plugin_settings.txt file contains these settings (line breaks added for legibility):

dotnetcoreDir=/publish
extractEmfWmf=on
streamConfiguration=EmbeddedOdf,false,false;
CONTENTS,false,false;
Package,false,false;
AttachContents,false,false;
skipFilesWithSignatures=0x38,0x42,0x50,0x53;
imageSignatures=0x42,0x4d;
0xff,0xd8,0xff,0xe0;
0xff,0xd8,0xff,0xe1;
0xff,0xd8,0xff,0xe8;
0xff,0xd8,0xff,0xe2;
0xff,0xd8,0xff,0xe3;
0x89,0x50,0x4e,0x47,0x0d,0x0a,0x1a,0x0a;
0xd7,0xcd,0xc6,0x9a;

To disable EMF/WMF extraction, set extractEmfWmf=off.


The streamConfiguration settings specify the following:
■ The name of the stream in the OLE files that includes file content, such as EmbeddedOdf.
■ Whether to continue to the next stream if content is found in the current stream. This is
false by default, meaning that after finding the first valid content stream, the content
extractor will not continue evaluating the subsequent streams.
■ Whether to include the original OLE file as a subfile. This is also false by default.
The skipFilesWithSignatures setting specifies which file types to skip based on their hex
file signature. By default, the content extractor skips Photoshop Document (PSD) files, because
Symantec Data Loss Prevention cannot perform detection on these files. 0x38,0x42,0x50,0x53
is the hex file signature for PSD files.
The imageSignatures setting specifies which files should be treated as images based on their
hex file signature. By default, the list includes the hex file signatures for BMP, JPG, JPEG,
PNG, and WMF files.
Restart your detection server after editing the plugin_settings.txt file to apply your changes.
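
To check which file types a signature list covers before you edit it, you can decode the hex values. The following Python sketch is illustrative only. It assumes that each setting occupies a single line in the real file (the listing above adds line breaks for legibility) and that a signature is compared against the first bytes of a file. The function names are hypothetical.

```python
from typing import Dict, List


def parse_settings(text: str) -> Dict[str, str]:
    """Split key=value lines into a dictionary; other lines are ignored."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def parse_signatures(value: str) -> List[bytes]:
    """Turn '0x42,0x4d;0xff,0xd8,0xff,0xe0;' into a list of byte strings."""
    signatures = []
    for entry in value.split(";"):
        entry = entry.strip()
        if entry:
            signatures.append(bytes(int(part, 16) for part in entry.split(",")))
    return signatures


def matches_any(header: bytes, signatures: List[bytes]) -> bool:
    """Assumed semantics: a file matches if it starts with a signature."""
    return any(header.startswith(signature) for signature in signatures)
```

For example, parse_signatures applied to the default skipFilesWithSignatures value yields the ASCII bytes 8BPS, which is the PSD signature that the text above describes.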
Chapter 45
Library of system data identifiers
This chapter includes the following topics:

■ Library of system data identifiers

■ ABA Routing Number

■ Argentina Tax Identification Number

■ Australia Driver's License Number

■ Australian Business Number

■ Australian Company Number

■ Australian Medicare Number

■ Australian Passport Number

■ Australian Tax File Number

■ Austria Passport Number

■ Austria Tax Identification Number

■ Austria Value Added Tax (VAT) Number

■ Austrian Social Security Number

■ Belgian National Number

■ Belgium Driver's Licence Number

■ Belgium Passport Number



■ Belgium Tax Identification Number

■ Belgium Value Added Tax (VAT) Number

■ Brazilian Election Identification Number

■ Brazilian National Registry of Legal Entities Number

■ Brazilian Natural Person Registry Number (CPF)

■ British Columbia Personal Healthcare Number

■ Bulgaria Value Added Tax (VAT) Number

■ Bulgarian Uniform Civil Number - EGN

■ Burgerservicenummer

■ Canada Driver's License Number

■ Canada Passport Number

■ Canada Permanent Residence (PR) Number

■ Canadian Social Insurance Number

■ Chilean National Identification Number

■ China Passport Number

■ Codice Fiscale

■ Colombian Addresses

■ Colombian Cell Phone Number

■ Colombian Personal Identification Number

■ Colombian Tax Identification Number

■ Credit Card Magnetic Stripe Data

■ Credit Card Number

■ Croatia National Identification Number

■ CUSIP Number

■ Cyprus Tax Identification Number

■ Cyprus Value Added Tax (VAT) Number

■ Czech Republic Driver's Licence Number



■ Czech Republic Personal Identification Number

■ Czech Republic Tax Identification Number

■ Czech Republic Value Added Tax (VAT) Number

■ Denmark Personal Identification Number

■ Denmark Tax Identification Number

■ Denmark Value Added Tax (VAT) Number

■ Driver's License Number – CA State

■ Driver's License Number - FL, MI, MN States

■ Driver's License Number - IL State

■ Driver's License Number - NJ State

■ Driver's License Number - NY State

■ Driver's License Number - WA State

■ Driver's License Number - WI State

■ Drug Enforcement Agency (DEA) Number

■ Estonia Driver's Licence Number

■ Estonia Passport Number

■ Estonia Personal Identification Code

■ Estonia Value Added Tax (VAT) Number

■ European Health Insurance Card Number

■ Finland Driver's Licence Number

■ Finland European Health Insurance Number

■ Finland Passport Number

■ Finland Tax Identification Number

■ Finland Value Added Tax (VAT) Number

■ Finnish Personal Identification Number

■ France Driver's License Number

■ France Health Insurance Number



■ France Tax Identification Number

■ France Value Added Tax (VAT) Number

■ French INSEE Code

■ French Passport Number

■ French Social Security Number

■ German Passport Number

■ German Personal ID Number

■ Germany Driver's License Number

■ Germany Value Added Tax (VAT) Number

■ Germany Tax Identification Number

■ Greece Passport Number

■ Greece Social Security Number (AMKA)

■ Greek Tax Identification Number

■ Greece Value Added Tax (VAT) Number

■ Healthcare Common Procedure Coding System (HCPCS CPT Code)

■ Health Insurance Claim Number

■ Hong Kong ID

■ Hungary Driver's Licence Number

■ Hungary Passport Number

■ Hungarian Social Security Number

■ Hungarian Tax Identification Number

■ Hungarian VAT Number

■ IBAN Central

■ IBAN East

■ IBAN West

■ Iceland National Identification Number

■ Iceland Passport Number



■ Iceland Value Added Tax (VAT) Number

■ Indian Aadhaar Card Number

■ Indian Permanent Account Number

■ India RuPay Card Number

■ Indonesian Identity Card Number

■ International Mobile Equipment Identity Number

■ International Securities Identification Number

■ IP Address

■ IPv6 Address

■ Ireland Passport Number

■ Ireland Tax Identification Number

■ Ireland Value Added Tax (VAT) Number

■ Irish Personal Public Service Number

■ Israel Personal Identification Number

■ Italy Driver's Licence Number

■ Italy Health Insurance Number

■ Italy Passport Number

■ Italy Value Added Tax (VAT) Number

■ Japan Driver's License Number

■ Japan Passport Number

■ Japanese Juki-Net Identification Number

■ Japanese My Number - Corporate

■ Japanese My Number - Personal

■ Kazakhstan Passport Number

■ Korea Passport Number

■ Korea Residence Registration Number for Foreigners

■ Korea Residence Registration Number for Korean



■ Latvia Driver's Licence Number

■ Latvia Passport Number

■ Latvia Personal Identification Number

■ Latvia Value Added Tax (VAT) Number

■ Liechtenstein Passport Number

■ Lithuania Personal Identification Number

■ Lithuania Tax Identification Number

■ Lithuania Value Added Tax (VAT) Number

■ Luxembourg National Register of Individuals Number

■ Luxembourg Passport Number

■ Luxembourg Tax Identification Number

■ Luxembourg Value Added Tax (VAT) Number

■ Macau National Identification Number

■ Malaysia Passport Number

■ Malaysian MyKad Number (MyKad)

■ Malta National Identification Number

■ Malta Tax Identification Number

■ Malta Value Added Tax (VAT) Number

■ Medicare Beneficiary Identifier

■ Mexican Personal Registration and Identification Number

■ Mexican Tax Identification Number

■ Mexican Unique Population Registry Code

■ Mexico CLABE Number

■ National Drug Code (NDC)

■ National Provider Identifier Number

■ Netherlands Bank Account Number

■ Netherlands Driver's License Number



■ Netherlands Passport Number

■ Netherlands Tax Identification Number

■ Netherlands Value Added Tax (VAT) Number

■ New Zealand Driver's Licence Number

■ New Zealand National Health Index Number

■ New Zealand Passport Number

■ Norway Driver's Licence Number

■ Norway National Identification Number

■ Norway Value Added Tax Number

■ Norwegian Birth Number

■ People's Republic of China ID

■ Poland Driver's Licence Number

■ Poland European Health Insurance Number

■ Poland Passport Number

■ Poland Value Added Tax (VAT) Number

■ Polish Identification Number

■ Polish REGON Number

■ Polish Social Security Number (PESEL)

■ Polish Tax Identification Number

■ Portugal Driver's Licence Number

■ Portugal National Identification Number

■ Portugal Passport Number

■ Portugal Tax Identification Number

■ Portugal Value Added Tax (VAT) Number

■ Randomized US Social Security Number (SSN)

■ Romania Driver's Licence Number

■ Romania National Identification Number



■ Romania Value Added Tax (VAT) Number

■ Romanian Numerical Personal Code

■ Russian Passport Identification Number

■ Russian Taxpayer Identification Number

■ SEPA Creditor Identifier Number North

■ SEPA Creditor Identifier Number South

■ SEPA Creditor Identifier Number West

■ Serbia Unique Master Citizen Number

■ Serbia Value Added Tax (VAT) Number

■ Singapore NRIC data identifier

■ Slovakia Driver's Licence Number

■ Slovakia National Identification Number

■ Slovakia Passport Number

■ Slovakia Value Added Tax (VAT) Number

■ Slovenia Passport Number

■ Slovenia Tax Identification Number

■ Slovenia Unique Master Citizen Number

■ Slovenia Value Added Tax (VAT) Number

■ South African Personal Identification Number

■ South Korea Resident Registration Number

■ Spain Value Added Tax (VAT) Number

■ Spain Driver's Licence Number

■ Spanish Customer Account Number

■ Spanish DNI ID

■ Spanish Passport Number

■ Spanish Social Security Number

■ Spanish Tax Identification (CIF)



■ Sri Lanka National Identity Number

■ Sweden Driver's Licence Number

■ Sweden Tax Identification Number

■ Sweden Value Added Tax (VAT) Number

■ Swedish Passport Number

■ Sweden Personal Identification Number

■ SWIFT Code

■ Swiss AHV Number

■ Swiss Social Security Number (AHV)

■ Switzerland Health Insurance Card Number

■ Switzerland Passport Number

■ Switzerland Value Added Tax (VAT) Number

■ Taiwan ROC ID

■ Thailand Passport Number

■ Thailand Personal Identification Number

■ Turkish Identification Number

■ UK Bank Account Number Sort Code

■ UK Drivers Licence Number

■ UK Electoral Roll Number

■ UK National Health Service (NHS) Number

■ UK National Insurance Number

■ UK Passport Number

■ UK Tax ID Number

■ UK Value Added Tax (VAT) Number

■ Ukraine Identity Card

■ Ukraine Passport (Domestic)

■ Ukraine Passport (International)



■ United Arab Emirates Personal Number

■ US Individual Tax Identification Number (ITIN)

■ US Passport Number

■ US Social Security Number (SSN)

■ US ZIP+4 Postal Codes

■ Venezuela National Identification Number

Library of system data identifiers


This section lists all data identifiers provided by the Data Loss Prevention system.

ABA Routing Number


The American Bankers Association (ABA) routing number, also known as a routing transit
number (RTN), is used to identify financial institutions and process transactions.
The ABA Routing Number data identifier detects a nine-digit number that matches the ABA
Routing Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a nine-digit number with checksum validation.
See “ABA Routing Number wide breadth” on page 1013.
■ The medium breadth detects a nine-digit number with checksum validation, and eliminates
common test numbers.
See “ABA Routing Number medium breadth” on page 1014.
■ The narrow breadth detects a nine-digit number with checksum validation, eliminates
common test numbers, and requires the presence of related keywords.
See “ABA Routing Number narrow breadth” on page 1014.

ABA Routing Number wide breadth


The wide breadth detects a nine-digit number with checksum validation.

Table 45-1 ABA Routing Number wide-breadth patterns

Pattern

[0123678]\d{8}

[0123678]\d{3}-\d{4}-\d

Table 45-2 ABA Routing Number wide-breadth validators

Mandatory validator Description

ABA Checksum Every ABA routing number must start with one of the following
two-digit prefixes: 00-15, 21-32, 61-72, or 80, and pass an ABA-specific,
position-weighted checksum.
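
This guide does not spell out the checksum. The following Python sketch uses the publicly documented 3-7-1 position weighting for ABA routing numbers and assumes that the ABA Checksum validator behaves the same way. It is an illustration, not the product's implementation. The function name is hypothetical, and the two-digit prefix check is left out.

```python
def aba_checksum_ok(candidate: str) -> bool:
    """Check the position-weighted ABA checksum on a nine-digit candidate.

    Hyphens and other non-digit characters are ignored, so both the
    plain and the dddd-dddd-d forms are accepted.
    """
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    if len(digits) != 9:
        return False
    # Weights 3, 7, 1 repeat across the nine positions; the weighted
    # sum of a valid number is a multiple of 10.
    weights = (3, 7, 1) * 3
    return sum(w * d for w, d in zip(weights, digits)) % 10 == 0
```

For the candidate 021000021, the weighted sum is 7*(2+2) + (1+1) = 30, which is a multiple of 10, so the check passes.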

ABA Routing Number medium breadth


The medium breadth detects a nine-digit number with checksum validation, and eliminates
common test numbers.

Table 45-3 ABA Routing Number medium-breadth patterns

Pattern

[0123678]\d{8}

[0123678]\d{3}-\d{4}-\d

Table 45-4 ABA Routing Number medium-breadth validators

Mandatory validator Description

ABA Checksum Computes the checksum and validates the pattern against
it.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

123456789

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.
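
The following Python sketch shows one way the medium-breadth patterns and the non-checksum validators could fit together. It is illustrative only. The lookarounds are an assumed reading of the Number delimiter validator (a match may not be part of a longer run of digits), the function name is hypothetical, and checksum validation is left out to keep the sketch short.

```python
import re
from typing import List

# Both medium-breadth patterns, wrapped so that a match cannot be
# preceded or followed by another digit (assumed delimiter behavior).
ABA_PATTERN = re.compile(
    r"(?<!\d)(?:[0123678]\d{8}|[0123678]\d{3}-\d{4}-\d)(?!\d)"
)


def medium_breadth_candidates(text: str) -> List[str]:
    """Return nine-digit candidates that survive the simple validators."""
    results = []
    for match in ABA_PATTERN.finditer(text):
        digits = match.group().replace("-", "")
        if digits.startswith("123456789"):
            continue  # Exclude beginning characters
        if len(set(digits)) == 1:
            continue  # Duplicate digits: every digit is the same
        results.append(digits)
    return results
```

A candidate returned by this function would still need to pass the ABA Checksum validator before it counts as a match.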

ABA Routing Number narrow breadth


The narrow breadth detects a nine-digit number with checksum validation, eliminates common
test numbers, and requires the presence of related keywords.

Table 45-5 ABA Routing Number narrow-breadth patterns

Pattern

[0123678]\d{8}

[0123678]\d{3}-\d{4}-\d

Table 45-6 ABA Routing Number narrow-breadth validators

Mandatory validator Description

ABA Checksum Computes the checksum and validates the pattern against
it.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

123456789

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

aba, aba #, aba routing #, aba routing number, aba#,
abarouting#, abaroutingnumber, american bank
association routing #, american bank association
routing number, americanbankassociationrouting#,
americanbankassociationroutingnumber, bank routing
#, bank routing number, bankrouting#,
bankroutingnumber

Number delimiter Validates a match by checking the surrounding numbers.

Argentina Tax Identification Number


Argentina issues a DNI (Documento Nacional de Identidad) as its national form of identification.
It is assigned at birth by the National Registry for People. For tax purposes, Argentina issues
the CUIT and CUIL numbers, which are based on the DNI.
The Argentina Tax Identification Number data identifier detects an 11-digit number that matches
the Argentina Tax Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-digit number without checksum validation.
See “Argentina Tax Identification Number wide breadth” on page 1016.

■ The medium breadth detects an 11-digit number with checksum validation. It also checks
for common test numbers and duplicate digits.
See “Argentina Tax Identification Number medium breadth” on page 1016.
■ The narrow breadth detects an 11-digit number that passes checksum validation. It also
checks for common test numbers and duplicate digits, and requires the presence of related
keywords.
See “Argentina Tax Identification Number narrow breadth” on page 1017.

Argentina Tax Identification Number wide breadth


The wide breadth detects an 11-digit number without checksum validation.

Table 45-7 Argentina Tax Identification Number wide-breadth patterns

Pattern

20-\d{8}-\d

23-\d{8}-\d

27-\d{8}-\d

30-\d{8}-\d

33-\d{8}-\d

34-\d{8}-\d

Table 45-8 Argentina Tax Identification Number wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Argentina Tax Identification Number medium breadth


The medium breadth detects an 11-digit number with checksum validation. It also checks for
common test numbers and duplicate digits.

Table 45-9 Argentina Tax Identification Number medium-breadth patterns

Pattern

20-\d{8}-\d

23-\d{8}-\d

27-\d{8}-\d

30-\d{8}-\d

33-\d{8}-\d

34-\d{8}-\d

Table 45-10 Argentina Tax Identification Number medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Argentinian Tax Identity Number Validation Check Computes the checksum and validates the pattern against
it.
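
This guide does not describe the checksum itself. The following Python sketch uses the commonly published modulus-11 scheme for CUIT and CUIL numbers and assumes that the Argentinian Tax Identity Number Validation Check works the same way. It is an illustration, not the product's implementation. The function names are hypothetical, and treating a computed check value of 10 as invalid is also an assumption.

```python
import re
from typing import Optional

# The six prefixes come from the pattern tables in this section.
CUIT_PATTERN = re.compile(r"(20|23|27|30|33|34)-(\d{8})-(\d)")
CUIT_WEIGHTS = (5, 4, 3, 2, 7, 6, 5, 4, 3, 2)


def expected_check_digit(first_ten: str) -> Optional[int]:
    """Compute the modulus-11 check digit for the first ten digits."""
    total = sum(w * int(d) for w, d in zip(CUIT_WEIGHTS, first_ten))
    check = 11 - (total % 11)
    if check == 11:
        return 0
    if check == 10:
        return None  # assumption: no valid check digit exists
    return check


def is_valid_cuit(candidate: str) -> bool:
    """Match the dd-dddddddd-d pattern, then verify the check digit."""
    match = CUIT_PATTERN.fullmatch(candidate)
    if match is None:
        return False
    prefix, body, check = match.groups()
    return expected_check_digit(prefix + body) == int(check)
```

For 20-12345678-6, the weighted sum of the first ten digits is 148, 148 mod 11 is 5, and 11 - 5 = 6, which equals the final digit.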

Argentina Tax Identification Number narrow breadth


The narrow breadth detects an 11-digit number that passes checksum validation. It also checks
for common test numbers and duplicate digits, and requires the presence of related keywords.

Table 45-11 Argentina Tax Identification Number narrow-breadth patterns

Pattern

20-\d{8}-\d

23-\d{8}-\d

27-\d{8}-\d

30-\d{8}-\d

33-\d{8}-\d

34-\d{8}-\d

Table 45-12 Argentina Tax Identification Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.



Argentinian Tax Identity Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

Tax ID, tax number, Tax No., taxpayer ID, tax identity
number, tax identification no, tax identification number,
TaxID#, taxidnumber#, taxpayer number, Argentina
taxpayer ID

Número de Identificación Fiscal, número de
contribuyente

Australia Driver's License Number


In Australia, a driver's license is required before a person is permitted to drive a motor
vehicle of any description on a road.
The Australia Driver's License Number data identifier detects an 8-, 9-, or 10-digit number, or
a six-digit alphanumeric pattern that matches the Australia Driver's license number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight-, nine-, or 10-digit number, or a six-digit alphanumeric
pattern that matches the Australia Driver's license number format. It also checks for common
test numbers.
See “Australia Driver's License Number wide breadth” on page 1018.
■ The narrow breadth detects an eight-, nine-, or 10-digit number, or a six-digit alphanumeric
pattern that matches the Australia Driver's license number format. It also checks for common
test numbers, and requires the presence of related keywords.
See “Australia Driver's License Number narrow breadth” on page 1019.

Australia Driver's License Number wide breadth


The wide breadth detects an eight-, nine-, or 10-digit number, or a six-digit alphanumeric
pattern that matches the Australia Driver's license number format. It also checks for common
test numbers.

Table 45-13 Australia Driver's License Number wide-breadth patterns

Pattern

\d\d\d \d\d\d \d\d\d

\d\d \d\d\d \d\d\d

[A-Za-z]\d\d\d\d\d

\d\d\d[-]\d\d\d[-]\d\d\d\d

Table 45-14 Australia Driver's License Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000, 11111, 22222, 33333, 44444, 55555, 66666,
77777, 88888, 99999
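
The following Python sketch applies the four wide-breadth patterns and the Exclude ending characters validator to a single candidate string. It is illustrative only. The function name is hypothetical, and comparing the excluded endings against the candidate with spaces and hyphens removed is an assumption, because this guide does not say how separators are treated.

```python
import re

# The four wide-breadth patterns from Table 45-13.
PATTERNS = [
    re.compile(pattern)
    for pattern in (
        r"\d\d\d \d\d\d \d\d\d",
        r"\d\d \d\d\d \d\d\d",
        r"[A-Za-z]\d\d\d\d\d",
        r"\d\d\d[-]\d\d\d[-]\d\d\d\d",
    )
]

# 00000, 11111, ... 99999
EXCLUDED_ENDINGS = tuple(str(digit) * 5 for digit in range(10))


def is_wide_breadth_match(candidate: str) -> bool:
    """Return True if the candidate fits a pattern and is not a test number."""
    if not any(pattern.fullmatch(candidate) for pattern in PATTERNS):
        return False
    compact = candidate.replace(" ", "").replace("-", "")
    return not compact.endswith(EXCLUDED_ENDINGS)
```

With these assumptions, A12345 is accepted and A11111 is rejected as a common test number.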

Australia Driver's License Number narrow breadth


The narrow breadth detects an eight-, nine-, or 10-digit number, or a six-digit alphanumeric
pattern that matches the Australia Driver's license number format. It also checks for common
test numbers, and requires the presence of related keywords.

Table 45-15 Australia Driver's License Number narrow-breadth patterns

Pattern

\d\d\d \d\d\d \d\d\d

\d\d \d\d\d \d\d\d

[A-Za-z]\d\d\d\d\d

\d\d\d[-]\d\d\d[-]\d\d\d\d

Table 45-16 Australia Driver's License Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.



Exclude ending characters Data ending with any of the following list of values is not
matched:

00000, 11111, 22222, 33333, 44444, 55555, 66666,
77777, 88888, 99999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

driver license, drivers license, driving license, driver
license number, drivers license number, driving license
number, dlno#, drivers lic., driver's license number,
driver licence, drivers licence, driving licence, driver
permit, drivers permit, driving permit, license number,
licence number

Australian Business Number


The Australian Business Number, or ABN, is a unique identifier issued by the Australian
Business Register (ABR), operated by the Australian Taxation Office (ATO).
The Australian Business Number data identifier detects an 11-digit number that matches the
Australian Business Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-digit number without checksum validation.
See “Australian Business Number wide breadth” on page 1020.
■ The medium breadth detects an 11-digit number with checksum validation. It also eliminates
common test numbers and ranges reserved for future use.
See “Australian Business Number medium breadth” on page 1021.
■ The narrow breadth detects an 11-digit number that passes checksum validation. It also
eliminates common test numbers, ranges reserved for future use, and duplicate digits, and
requires the presence of ABN-related keywords.
See “Australian Business Number narrow breadth” on page 1021.

Australian Business Number wide breadth


The wide breadth detects an 11-digit number without checksum validation.

Table 45-17 Australian Business Number wide-breadth patterns

Pattern

\d{11}

\d{2}[ -]\d{3}[ -]\d{3}[ -]\d{3}

Table 45-18 Australian Business Number wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.
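
As a rough illustration of how the wide breadth behaves, the following Python sketch applies the two patterns from Table 45-17 and then a duplicate-digits check like the one described in Table 45-18. The function names are hypothetical, and the word-boundary anchoring is an assumption made for the sketch. The product's own matching engine is not described in this section.

```python
import re

# Patterns copied from Table 45-17. The \b anchors are an assumption added
# for this sketch so that longer digit runs are not partially matched.
ABN_WIDE_PATTERNS = [
    re.compile(r"\b\d{11}\b"),
    re.compile(r"\b\d{2}[ -]\d{3}[ -]\d{3}[ -]\d{3}\b"),
]


def has_duplicate_digits(candidate: str) -> bool:
    """Return True when every digit in the candidate is the same."""
    digits = [ch for ch in candidate if ch.isdigit()]
    return len(set(digits)) == 1


def find_wide_abn(text: str) -> list:
    """Hypothetical helper: pattern match, then drop all-same-digit strings."""
    hits = []
    for pattern in ABN_WIDE_PATTERNS:
        for match in pattern.finditer(text):
            if not has_duplicate_digits(match.group()):
                hits.append(match.group())
    return hits
```

In this sketch, a string such as 11111111111 matches the first pattern but is discarded by the duplicate-digits check.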

Australian Business Number medium breadth


The medium breadth detects an 11-digit number with checksum validation. It also eliminates
common test numbers, such as 123456789, and ranges reserved for future use.

Table 45-19 Australian Business Number medium-breadth patterns

Pattern

\d{11}

\d{2}[ -]\d{3}[ -]\d{3}[ -]\d{3}

Table 45-20 Australian Business Number medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Australian Business Number Validation Check Computes the checksum and validates the pattern against
it.
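
The guide does not spell out the checksum itself. The sketch below shows the publicly documented ABN check (subtract 1 from the first digit, apply fixed weights, and test divisibility by 89). It is an assumption that the product's Validation Check follows the same rule, and the function name is hypothetical.

```python
# Weights from the publicly documented ABN algorithm (not from this guide).
ABN_WEIGHTS = (10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)


def abn_checksum_ok(candidate: str) -> bool:
    """Return True when an 11-digit candidate passes the public ABN check."""
    digits = [int(ch) for ch in candidate if ch.isdecimal()]
    if len(digits) != 11:
        return False
    digits[0] -= 1  # the published algorithm subtracts 1 from the first digit
    total = sum(weight * digit for weight, digit in zip(ABN_WEIGHTS, digits))
    return total % 89 == 0
```

Spaces and hyphens are ignored, so both pattern forms from Table 45-19 can be passed in directly.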

Australian Business Number narrow breadth


The narrow breadth detects an 11-digit number that passes checksum validation. It also
eliminates common test numbers, such as 123456789, ranges reserved for future use, duplicate
digits, and requires the presence of ABN-related keywords.

Table 45-21 Australian Business Number narrow-breadth patterns

Pattern

\d{11}

\d{2}[ -]\d{3}[ -]\d{3}[ -]\d{3}

Table 45-22 Australian Business Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Australian Business Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

Australia Business No, Business No, BusinessNo#,


Business Number, Australia Business No., ABN, abn#,
businessID#, business ID, abn, ABN#, business
number, businessno#
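
A keyword requirement of this kind can be pictured with the short Python sketch below. It assumes case-insensitive matching of whole keywords anywhere in the inspected text. The product's actual keyword matching and proximity rules may differ, and the helper name is hypothetical.

```python
import re

# Keyword inputs copied from Table 45-22.
ABN_KEYWORDS = [
    "Australia Business No", "Business No", "BusinessNo#",
    "Business Number", "Australia Business No.", "ABN", "abn#",
    "businessID#", "business ID", "abn", "ABN#", "business number",
    "businessno#",
]

# Assumption: a keyword must not be embedded inside a longer word.
KEYWORD_REGEX = re.compile(
    "|".join(r"(?<!\w)" + re.escape(keyword) + r"(?!\w)" for keyword in ABN_KEYWORDS),
    re.IGNORECASE,
)


def has_abn_keyword(text: str) -> bool:
    """Return True when at least one keyword from the list appears in text."""
    return KEYWORD_REGEX.search(text) is not None
```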

Australian Company Number


An Australian Company Number (ACN) is a unique nine-digit number issued by the Australian
Securities and Investments Commission to every company registered under the Commonwealth
Corporations Act 2001.
The Australian Company Number data identifier detects a nine-digit number that matches the
Australian Company Number format.
The Australian Company Number data identifier provides three breadths of detection:
■ The wide breadth detects a nine-digit number without checksum validation.
See “Australian Company Number wide breadth” on page 1023.
■ The medium breadth detects a nine-digit number with checksum validation.
See “Australian Company Number medium breadth” on page 1023.
■ The narrow breadth detects a nine-digit number with checksum validation. It also requires
the presence of ACN-related keywords.
See “Australian Company Number narrow breadth” on page 1023.

Australian Company Number wide breadth


The wide breadth detects a nine-digit number without checksum validation.

Table 45-23 Australian Company Number wide-breadth pattern

Pattern

\d{3} \d{3} \d{3}

Table 45-24 Australian Company Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Australian Company Number medium breadth


The medium breadth detects a nine-digit number with checksum validation.

Table 45-25 Australian Company Number medium-breadth pattern

Pattern

\d{3} \d{3} \d{3}

Table 45-26 Australian Company Number medium-breadth validators

Mandatory validator Description

Australian Company Number Validation Check Computes the checksum and validates the pattern against
it.
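
For orientation, the publicly documented ACN check digit can be computed as in the Python sketch below: the first eight digits are weighted 8 down to 1, and the complement of the sum modulo 10 must equal the ninth digit. It is an assumption that the product's Validation Check applies this same rule. The function name is hypothetical.

```python
# Weights from the publicly documented ACN algorithm (not from this guide).
ACN_WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 1)


def acn_checksum_ok(candidate: str) -> bool:
    """Return True when a nine-digit candidate passes the public ACN check."""
    digits = [int(ch) for ch in candidate if ch.isdecimal()]
    if len(digits) != 9:
        return False
    total = sum(weight * digit for weight, digit in zip(ACN_WEIGHTS, digits[:8]))
    check_digit = (10 - total % 10) % 10
    return check_digit == digits[8]
```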

Australian Company Number narrow breadth


The narrow breadth detects a nine-digit number with checksum validation. It also requires
the presence of ACN-related keywords.

Table 45-27 Australian Company Number narrow-breadth pattern

Pattern

\d{3} \d{3} \d{3}



Table 45-28 Australian Company Number narrow-breadth validators

Mandatory validator Description

Australian Company Number Validation Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Australia Company Number, ACN, Australia Company
No., ACN No, ACN No#, Australia Company No#, ACN
Number

Australian Medicare Number


The Australian Medicare Number is a personal identifier allocated by the Australian Health
Insurance Commission to eligible persons under the Medicare scheme. This number appears
on the Australian Medicare card.
The Australian Medicare Number data identifier detects an eight- or nine-digit number that
matches the Australian Medicare Number format.
The Australian Medicare Number data identifier provides three breadths of detection:
■ The wide breadth detects an eight- or nine-digit number without checksum validation.
See “Australian Medicare Number wide breadth” on page 1024.
■ The medium breadth detects an eight- or nine-digit number with checksum validation.
See “Australian Medicare Number medium breadth” on page 1025.
■ The narrow breadth detects an eight- or nine-digit number with checksum validation. It also
requires the presence of related keywords.
See “Australian Medicare Number narrow breadth” on page 1026.

Australian Medicare Number wide breadth


The wide breadth detects an eight- or nine-digit number without checksum validation.

Table 45-29 Australian Medicare Number wide-breadth patterns

Pattern

[2-6]\d{10}

[2-6]\d{9}

[2-6]\d{3} \d{5} \d{1}

[2-6]\d{3}-\d{5}-\d{1}

[2-6]\d{9}[ -/]\d{1}

[2-6]\d{3} \d{5} \d{1}[ -/]\d{1}

[2-6]\d{3}-\d{5}-\d{1}[ -/]\d{1}

[2-6]\d{3} \d{5} \d \d

[2-6]\d{3}-\d{5}-\d-\d

Table 45-30 Australian Medicare Number wide-breadth validator

Validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Australian Medicare Number medium breadth


The medium breadth detects an eight- or nine-digit number with checksum validation.

Table 45-31 Australian Medicare Number medium breadth patterns

Pattern

[2-6]\d{10}

[2-6]\d{9}

[2-6]\d{3} \d{5} \d{1}

[2-6]\d{3}-\d{5}-\d{1}

[2-6]\d{9}[ -/]\d{1}

[2-6]\d{3} \d{5} \d{1}[ -/]\d{1}

[2-6]\d{3}-\d{5}-\d{1}[ -/]\d{1}

[2-6]\d{3} \d{5} \d \d

[2-6]\d{3}-\d{5}-\d-\d

Table 45-32 Australian Medicare Number medium breadth validators

Validator Description

Australian Medicare Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.
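
The sketch below illustrates the publicly documented Medicare check digit: the first eight digits are weighted 1, 3, 7, 9, 1, 3, 7, 9, and the sum modulo 10 must equal the ninth digit. The length test reflects the patterns in Table 45-31, which match 10 or 11 digits in total. Whether the product's Validation Check works exactly this way is an assumption, and the function name is hypothetical.

```python
# Weights from the publicly documented Medicare algorithm (not from this guide).
MEDICARE_WEIGHTS = (1, 3, 7, 9, 1, 3, 7, 9)


def medicare_checksum_ok(candidate: str) -> bool:
    """Return True when the ninth digit matches the weighted sum of the first eight."""
    digits = [int(ch) for ch in candidate if ch.isdecimal()]
    # The patterns in Table 45-31 start with 2-6 and carry 10 or 11 digits.
    if len(digits) not in (10, 11) or not 2 <= digits[0] <= 6:
        return False
    total = sum(weight * digit for weight, digit in zip(MEDICARE_WEIGHTS, digits[:8]))
    return total % 10 == digits[8]
```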

Australian Medicare Number narrow breadth


The narrow breadth detects an eight- or nine-digit number with checksum validation. It also
requires the presence of related keywords.

Table 45-33 Australian Medicare Number narrow breadth patterns

Pattern

[2-6]\d{10}

[2-6]\d{9}

[2-6]\d{3} \d{5} \d{1}

[2-6]\d{3}-\d{5}-\d{1}

[2-6]\d{9}[ -/]\d{1}

[2-6]\d{3} \d{5} \d{1}[ -/]\d{1}

[2-6]\d{3}-\d{5}-\d{1}[ -/]\d{1}

[2-6]\d{3} \d{5} \d \d

[2-6]\d{3}-\d{5}-\d-\d

Table 45-34 Australian Medicare Number narrow breadth validators

Validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Australian Medicare Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

Australian Medicare Number, Medicare Number,
Medicare No., Medicare No#, Australian Medicare No.,
Australian Medicare No#

Australian Passport Number


Australian passports are travel documents issued to Australian citizens by the Australian
Passport Office of the Department of Foreign Affairs and Trade.
The Australian Passport Number data identifier detects an eight-character alphanumeric pattern
that matches the Australian Passport Number format.
The Australian Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern without checksum
validation.
See “Australian Passport Number wide breadth” on page 1027.
■ The narrow breadth detects an eight-character alphanumeric pattern without checksum
validation. It requires the presence of related keywords.
See “Australian Passport Number narrow breadth” on page 1028.

Australian Passport Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern without checksum validation.

Table 45-35 Australian Passport Number wide-breadth patterns

Pattern

[XBCEGTHJLMNP]\d{7}

[XBCEGTHJLMNP] \d{7}

Table 45-36 Australian Passport Number wide-breadth validator

Mandatory validator Description

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444,


5555555, 6666666, 7777777, 8888888, 9999999
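
The effect of the Exclude ending characters validator can be pictured with the Python sketch below. It merges the two patterns from Table 45-35 into one expression with an optional space and rejects any match that ends with one of the listed values. The helper name and the word-boundary anchors are assumptions made for the sketch.

```python
import re

# The two wide-breadth patterns, merged with an optional space.
# The \b anchors are an assumption added for this sketch.
PASSPORT_PATTERN = re.compile(r"\b[XBCEGTHJLMNP] ?\d{7}\b")

# 0000000, 1111111, ... 9999999, as listed in Table 45-36.
EXCLUDED_ENDINGS = tuple(str(digit) * 7 for digit in range(10))


def find_wide_passport(text: str) -> list:
    """Hypothetical helper: keep matches that do not end with an excluded value."""
    return [
        match.group()
        for match in PASSPORT_PATTERN.finditer(text)
        if not match.group().endswith(EXCLUDED_ENDINGS)
    ]
```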

Australian Passport Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern without checksum
validation. It requires the presence of related keywords.

Table 45-37 Australian Passport Number narrow-breadth patterns

Pattern

[XBCEGTHJLMNP]\d{7}

[XBCEGTHJLMNP] \d{7}

Table 45-38 Australian Passport Number narrow-breadth validators

Mandatory validator Description

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Australian passport no., Australian Passport Number,
Australian passport number, Passport number,
passport number, passport#, passportno,
passportnumber#, australianpassportnumber,
passportno#

Australian Tax File Number


The Australian Tax File Number (TFN) is an eight- or nine-digit number issued by the Australian
Taxation Office (ATO) to taxpayers (individual, company, superannuation fund, partnership,
or trust) to identify their Australian tax dealings.
The Australian Tax File Number data identifier detects an eight- or nine-digit number that
matches the Australian Tax File Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight- or nine-digit number with checksum validation.
See Table 45-39 on page 1029.
■ The narrow breadth detects an eight- or nine-digit number with checksum validation. It also
requires the presence of related keywords.
See “Australian Tax File Number narrow breadth” on page 1029.

Australian Tax File Number wide breadth


The wide breadth detects an eight- or nine-digit number with checksum validation.

Table 45-39 Australian Tax File Number wide-breadth patterns

Patterns

\d{8}

\d{9}

Table 45-40 Australian Tax File Number wide-breadth validators

Mandatory validator Description

Australian Tax File validation check Computes the checksum and validates the pattern against
it.
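
As an illustration only, the publicly documented check for a nine-digit TFN weights the digits 1, 4, 3, 7, 5, 8, 6, 9, 10 and requires the sum to be divisible by 11. The Python sketch below covers that nine-digit case. Eight-digit numbers use a different weight set that is not shown here, and it is an assumption that the product's validation check follows the same public rule. The function name is hypothetical.

```python
# Weights for nine-digit TFNs from the publicly documented algorithm
# (not from this guide). Eight-digit TFNs are deliberately not handled.
TFN_WEIGHTS_9 = (1, 4, 3, 7, 5, 8, 6, 9, 10)


def tfn9_checksum_ok(candidate: str) -> bool:
    """Return True when a nine-digit candidate passes the public TFN check."""
    digits = [int(ch) for ch in candidate if ch.isdecimal()]
    if len(digits) != 9:
        return False
    total = sum(weight * digit for weight, digit in zip(TFN_WEIGHTS_9, digits))
    return total % 11 == 0
```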

Australian Tax File Number narrow breadth


The narrow breadth detects an eight- or nine-digit number with checksum validation. It also
requires the presence of related keywords.

Table 45-41 Australian Tax File Number narrow-breadth patterns

Patterns

\d{8}

\d{9}

Table 45-42 Australian Tax File Number narrow-breadth validators

Mandatory validators Description

Australian Tax File validation check Computes the checksum and validates the pattern
against it.

Find keywords At least one of the following keywords or key
phrases must be present for the data to be matched
when you use this option.

Inputs:

TFN, Tax File Number, Australia TFN, Australia
Tax File Number, ATO, ATO TFN, ATO tax file
number

Austria Passport Number


Austrian passports are travel documents issued to Austrian citizens, both in Austria and
overseas, and enable the passport holder to travel internationally.
The Austria Passport Number data identifier detects an eight-character alphanumeric pattern
that matches the Austria Passport Number format.
The Austria Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern without checksum
validation.
See “Austria Passport Number wide breadth” on page 1030.
■ The narrow breadth detects an eight-character alphanumeric pattern. It also requires the
presence of passport-related keywords.
See “Austria Passport Number narrow breadth” on page 1031.

Austria Passport Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern without checksum validation.

Table 45-43 Austria Passport Number wide-breadth patterns

Patterns

\l[ ]\d{7}

\l\d{7}

Table 45-44 Austria Passport Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Austria Passport Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern. It also requires the
presence of passport-related keywords.

Table 45-45 Austria Passport Number narrow-breadth patterns

Pattern

\l[ ]\d{7}

\l\d{7}

Table 45-46 Austria Passport Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

REISEPASS, passport, ÖSTERREICHISCH REISEPASS,
reisepass

Austria Tax Identification Number


Austria issues nine-digit tax identification numbers to individuals based on their area of residence
to identify taxpayers and facilitate national taxes.
The Austria Tax Identification Number data identifier detects a nine-digit number that matches
the Austria Tax Identification Number format.
The Austria Tax Identification Number data identifier provides two breadths of detection:
■ The wide breadth detects a nine-digit number without checksum validation.
See “Austria Tax Identification Number wide breadth” on page 1032.
■ The narrow breadth detects a nine-digit number. It also requires the presence of related
keywords.
See “Austria Tax Identification Number narrow breadth” on page 1032.

Austria Tax Identification Number wide breadth


The wide breadth detects a nine-digit number without checksum validation.

Table 45-47 Austria Tax Identification Number wide-breadth patterns

Pattern

\d{2}-\d{3}/\d{4}

\d{2} \d{3} \d{4}

\d{9}

Table 45-48 Austria Tax Identification Number wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Austria Tax Identification Number narrow breadth


The narrow breadth detects a nine-digit number. It also requires the presence of related
keywords.

Table 45-49 Austria Tax Identification Number narrow-breadth patterns

Patterns

\d{2}-\d{3}/\d{4}

\d{2} \d{3} \d{4}

\d{9}

Table 45-50 Austria Tax Identification Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Austria, TIN, tax identification number, tax number,
Austrian Tax Number, Österreich, Steuernummer

Austria Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. For Austria, the VAT number
is issued by the tax office for the region in which the business is established.
The Austria Value Added Tax (VAT) Number data identifier detects an 11-character
alphanumeric pattern that matches Austria Value Added Tax (VAT) Number format.
The Austria Value Added Tax (VAT) Number data identifier provides three breadths of detection:
■ The wide breadth detects an 11-character alphanumeric pattern preceded with ATU without
checksum validation.
See “Austria Value Added Tax (VAT) Number wide breadth” on page 1033.
■ The medium breadth detects an 11-character alphanumeric pattern preceded with ATU with
checksum validation.
See “Austria Value Added Tax (VAT) Number medium breadth” on page 1034.
■ The narrow breadth detects an 11-character alphanumeric pattern preceded with ATU with
checksum validation. It also requires the presence of related keywords.
See “Austria Value Added Tax (VAT) Number narrow breadth” on page 1035.

Austria Value Added Tax (VAT) Number wide breadth


The wide breadth detects an 11-character alphanumeric pattern preceded with ATU without
checksum validation.

Table 45-51 Austria Value Added Tax (VAT) Number wide-breadth patterns

Patterns

[Aa][Tt][Uu]\d{8}

[Aa][Tt] [Uu]\d{8}

[Aa][Tt][Uu] \d{8}

[Aa][Tt][Uu]\d{3} \d{4} \d

[Aa][Tt][Uu]\d{2} \d{4} \d{2}

Table 45-52 Austria Value Added Tax (VAT) Number wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

Austria Value Added Tax (VAT) Number medium breadth


The medium breadth detects an 11-character alphanumeric pattern preceded with ATU with
checksum validation.

Table 45-53 Austria Value Added Tax (VAT) Number medium-breadth patterns

Patterns

[Aa][Tt][Uu]\d{8}

[Aa][Tt] [Uu]\d{8}

[Aa][Tt][Uu] \d{8}

[Aa][Tt][Uu]\d{3} \d{4} \d

[Aa][Tt][Uu]\d{2} \d{4} \d{2}



Table 45-54 Austria Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Austria VAT Number Validation Check Computes the checksum and validates the pattern against
it.
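
The publicly documented check digit for an Austrian VAT number can be sketched in Python as follows. Of the first seven digits after ATU, every second digit is doubled and reduced to its digit sum, a constant 4 is added to the total, and the complement modulo 10 must equal the eighth digit. It is an assumption that the product's Validation Check follows this rule, and the function name is hypothetical.

```python
import re


def austria_vat_checksum_ok(candidate: str) -> bool:
    """Return True when an ATU number passes the public check-digit rule."""
    compact = candidate.replace(" ", "").upper()
    if re.fullmatch(r"ATU[0-9]{8}", compact) is None:
        return False
    digits = [int(ch) for ch in compact[3:]]
    total = 0
    for index, digit in enumerate(digits[:7]):
        if index % 2 == 1:
            # 2nd, 4th, and 6th digits are doubled, then reduced to a digit sum.
            doubled = digit * 2
            total += doubled // 10 + doubled % 10
        else:
            total += digit
    check_digit = (10 - (total + 4) % 10) % 10
    return check_digit == digits[7]
```

Removing spaces first lets the sketch accept every layout listed in Table 45-53.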

Austria Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects an 11-character alphanumeric pattern preceded with ATU with
checksum validation. It also requires the presence of VAT-related keywords.

Table 45-55 Austria Value Added Tax (VAT) Number narrow-breadth patterns

Patterns

[Aa][Tt][Uu]\d{8}

[Aa][Tt] [Uu]\d{8}

[Aa][Tt][Uu] \d{8}

[Aa][Tt][Uu]\d{3} \d{4} \d

[Aa][Tt][Uu]\d{2} \d{4} \d{2}

Table 45-56 Austria Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Austria VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

vat number, vat, vat#, austrian vat number, vat no.,
vatno#, value added tax number, austrian vat, MwSt,
Umsatzsteuernummer, MwStNummer,
Ust.-Identifikationsnummer, umsatzsteuer,
Umsatzsteuer-Identifikationsnummer, vat identification
number, atu number, uid number

Austrian Social Security Number


A 10-digit social security number is allocated to Austrian citizens who receive available social
security benefits. It is allocated by the umbrella association of the Austrian social security
authorities.
The Austrian Social Security Number data identifier detects a 10-digit number that matches
the Austrian Social Security Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 10-digit number without checksum validation.
See “Austrian Social Security Number wide breadth” on page 1036.
■ The medium breadth detects a 10-digit number that passes checksum validation. It also
eliminates common test numbers and ranges reserved for future use.
See “Austrian Social Security Number medium breadth” on page 1037.
■ The narrow breadth detects a 10-digit number that passes checksum validation. It also
eliminates common test numbers, ranges reserved for future use, duplicate digits, and
requires the presence of Austrian Social Security Number-related keywords.
See “Austrian Social Security Number narrow breadth” on page 1037.

Austrian Social Security Number wide breadth


The wide breadth detects a 10-digit number without checksum validation.

Table 45-57 Austrian Social Security Number wide-breadth patterns

Pattern

\d{10}

\d{4}-\d{6}

\d{4} \d{6}

Table 45-58 Austrian Social Security Number wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Austrian Social Security Number medium breadth


The medium breadth detects a 10-digit number that passes checksum validation. It also
eliminates common test numbers, such as 123456789, and ranges reserved for future use.

Table 45-59 Austrian Social Security Number medium-breadth patterns

Pattern

\d{10}

\d{4}-\d{6}

\d{4} \d{6}

Table 45-60 Austrian Social Security Number medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Austrian Social Security Number Validation Check Computes the checksum and validates the pattern against
it.
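
For reference, the publicly documented check for the Austrian number treats the fourth digit as the check digit. The other nine digits are weighted 3, 7, 9, 5, 8, 4, 2, 1, 6, and the sum modulo 11 must equal the fourth digit. The Python sketch below assumes the product's Validation Check uses this rule. The function name is hypothetical.

```python
# Public weights; the 0 in the fourth position skips the check digit itself.
AUSTRIAN_SSN_WEIGHTS = (3, 7, 9, 0, 5, 8, 4, 2, 1, 6)


def austrian_ssn_checksum_ok(candidate: str) -> bool:
    """Return True when the fourth digit matches the weighted sum modulo 11."""
    digits = [int(ch) for ch in candidate if ch.isdecimal()]
    if len(digits) != 10:
        return False
    remainder = sum(
        weight * digit for weight, digit in zip(AUSTRIAN_SSN_WEIGHTS, digits)
    ) % 11
    # A remainder of 10 can never equal a single digit, so it fails here.
    return remainder == digits[3]
```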

Austrian Social Security Number narrow breadth


The narrow breadth detects a 10-digit number that passes checksum validation. It also
eliminates common test numbers, ranges reserved for future use, duplicate digits, and requires
the presence of Austrian Social Security Number-related keywords.

Table 45-61 Austrian Social Security Number narrow-breadth patterns

Pattern

\d{10}

\d{4}-\d{6}

\d{4} \d{6}

Table 45-62 Austrian Social Security Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Austrian Social Security Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

social security no, social security number, social
security code, Austrian SSN, SSN#, ssn#, SSN, ssn,
socialsecurityno#,

sozialversicherungsnummer, soziale sicherheit kein,
sozialversicherungsnummer#, sozialesicherheitkein#

insurance number, insurance code, insurancecode#,
national insurance number, insurance no, health
insurance number, health insurance, health insurance
no, EHIC number, EHIC no

versicherungsnummer, versicherungscode, nationale
versicherungsnummer, krankenkassennummer,
krankenversicherung

zdravstveno zavarovanje

EHIC Nummer, Österreichischen SSN,
Österreichischen Sozialversicherungs kein

številka zavarovanja, biztosítási szám, zavarovalna
šifra, biztosítási kód, társadalombiztosítási azonosító
jel, nacionalna številka zavarovanja,
egészségbiztosítási szám, številka zdravstvenega
zavarovanja, egészségbiztosítás, EHIC szám, Številka
EHIC

Belgian National Number


All citizens of Belgium have a National Number. Belgians 12 years of age and older are issued
a Belgian identity card. The Belgian National Number is also used as a Belgian Social Security
Number for citizens.
The Belgian National Number data identifier detects an 11-digit number that matches the
Belgian National Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-digit number without checksum validation.
See “Belgian National Number wide breadth” on page 1040.
■ The medium breadth detects an 11-digit number with checksum validation.
See “Belgian National Number medium breadth” on page 1040.
■ The narrow breadth detects an 11-digit number with checksum validation. It also requires
the presence of related keywords.
See “Belgian National Number narrow breadth” on page 1041.

Belgian National Number wide breadth


The wide breadth detects an 11-digit number without checksum validation.

Table 45-63 Belgian National Number wide-breadth patterns

Pattern

\d{11}

\d{6} \d{3} \d{2}

\d{2}.\d{2}.\d{2}-\d{3}.\d{2}

\d{2}[ .][012345]\d[ .][0123]\d[ -.]\d{3}[ .-]\d{2}

Table 45-64 Belgian National Number wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Belgian National Number medium breadth


The medium breadth detects an 11-digit number with checksum validation.

Table 45-65 Belgian National Number medium-breadth patterns

Pattern

\d{11}

\d{6} \d{3} \d{2}

\d{2}.\d{2}.\d{2}-\d{3}.\d{2}

\d{2}[ .][012345]\d[ .][0123]\d[ -.]\d{3}[ .-]\d{2}



Table 45-66 Belgian National Number medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Belgian National Number Validation Check Computes the checksum and validates the pattern against
it.
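
The publicly documented Belgian check works on the first nine digits read as one integer: the last two digits must equal 97 minus that integer modulo 97. For people born in or after 2000, a 2 is prefixed to the nine digits first. The Python sketch below accepts either case. It is an assumption that the product's Validation Check applies the same rule, and the function name is hypothetical.

```python
def belgian_nn_checksum_ok(candidate: str) -> bool:
    """Return True when the last two digits match the public mod-97 rule."""
    digits = "".join(ch for ch in candidate if ch.isdecimal())
    if len(digits) != 11:
        return False
    base, check = digits[:9], int(digits[9:])
    # "" covers births before 2000; "2" covers births in or after 2000.
    for prefix in ("", "2"):
        if 97 - int(prefix + base) % 97 == check:
            return True
    return False
```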

Belgian National Number narrow breadth


The narrow breadth detects an 11-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-67 Belgian National Number narrow-breadth patterns

Pattern

\d{11}

\d{6} \d{3} \d{2}

\d{2}.\d{2}.\d{2}-\d{3}.\d{2}

\d{2}[ .][012345]\d[ .][0123]\d[ -.]\d{3}[ .-]\d{2}

Table 45-68 Belgian National Number narrow-breadth validators

Mandatory validator Description

Belgian National Number Validation Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.



Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Belgian national number, national number, social
security number, nationalnumber#, ssn#, ssn,
nationalnumber, bnn#, bnn, personal ID number,
personalIDnumber#

Numéro national, numéro de sécurité, numéro
d'assuré, identifiant national, identifiantnational#,
Numéronational#

Belgium Driver's Licence Number


The Belgium Driver's Licence Number is the identification number for an individual's driver's
licence issued by the Driver and Vehicle Licensing Agency of Belgium.
The Belgium Driver's Licence Number data identifier detects a 10-digit number that matches
the Belgium Driver's Licence Number format.
The Belgium Driver's Licence Number data identifier provides two breadths of detection:
■ The wide breadth detects a 10-digit number without checksum validation.
See “Belgium Driver's Licence Number wide breadth” on page 1042.
■ The narrow breadth detects a 10-digit number without checksum validation. It requires the
presence of related keywords.
See “Belgium Driver's Licence Number narrow breadth” on page 1043.

Belgium Driver's Licence Number wide breadth


The wide breadth detects a 10-digit number without checksum validation.

Table 45-69 Belgium Driver's Licence Number wide-breadth pattern

Pattern

\d{10}

Table 45-70 Belgium Driver's Licence Number wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.
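
The guide describes the Number delimiter validator only as checking the surrounding characters. One plausible reading, shown in the Python sketch below, is that a match is accepted only when it is not embedded in a longer run of letters or digits. That reading, the helper name, and the use of lookaround assertions are all assumptions made for illustration.

```python
import re

# Assumption: a 10-digit match counts only when the characters immediately
# before and after it are not ASCII letters or digits.
DELIMITED_TEN_DIGITS = re.compile(r"(?<![0-9A-Za-z])\d{10}(?![0-9A-Za-z])")


def find_delimited(text: str) -> list:
    """Hypothetical helper: return 10-digit runs that stand on their own."""
    return DELIMITED_TEN_DIGITS.findall(text)
```

Without such a check, the bare pattern \d{10} would also match the first ten digits of any longer number.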

Belgium Driver's Licence Number narrow breadth


The narrow breadth detects a 10-digit number without checksum validation. It requires the
presence of related keywords.

Table 45-71 Belgium Driver's Licence Number narrow-breadth pattern

Pattern

\d{10}

Table 45-72 Belgium Driver's Licence Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Führerschein, Fuhrerschein, Fuehrerschein,
Führerscheinnummer, Fuhrerscheinnummer,
Fuehrerscheinnummer, Führerscheinnummer,
Fuhrerscheinnummer, Fuehrerscheinnummer,
Führerschein- Nr, Fuhrerschein- Nr, Fuehrerschein-
Nr

DL#, Driver License, Driver License Number, driver
license number, Driver Licence, Drivers Lic., Drivers
License, Drivers Licence, Driver's License, Driver's
License Number, driver's license number, Driver's
Licence Number, Driving License number, driving
license number, DLNo#, dlno#

permis de conduire, rijbewijs, Rijbewijsnummer,
Numéro permis conduire

Belgium Passport Number


Belgian passports are issued by the Belgian state to its citizens to facilitate international travel.
The Federal Public Service Foreign Affairs, formerly known as the Ministry of Foreign Affairs,
is responsible for issuing and renewing Belgian passports.
The Belgium Passport Number data identifier detects an eight-character alphanumeric pattern
that matches the Belgium Passport Number format.
The Belgium Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern without checksum
validation.
See “Belgium Passport Number wide breadth” on page 1044.
■ The narrow breadth detects an eight-character alphanumeric pattern without checksum
validation. It requires the presence of related keywords.
See “Belgium Passport Number narrow breadth” on page 1044.

Belgium Passport Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern without checksum validation.

Table 45-73 Belgium Passport Number wide-breadth pattern

Pattern

\l{2}\d{6}

Table 45-74 Belgium Passport Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.
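
As an illustration only, the wide-breadth pattern and its Number delimiter validator can be approximated with an ordinary regular expression. The sketch assumes that `\l` stands for one ASCII letter and that "checking the surrounding characters" means the match must not touch another letter or digit. Neither assumption is stated in this section, and the function name is hypothetical.

```python
import re

# Assumption: \l in the data identifier pattern means one ASCII letter.
# Assumption: the Number delimiter validator rejects matches that touch
# another letter or digit; the product's real rules may differ.
PASSPORT_RE = re.compile(r"(?<![A-Za-z0-9])[A-Za-z]{2}\d{6}(?![A-Za-z0-9])")


def find_passport_candidates(text):
    """Return every delimited two-letter, six-digit candidate in text."""
    return PASSPORT_RE.findall(text)
```

Because the wide breadth has no checksum or keyword requirement, any delimited string of this shape is a candidate, which is why this breadth is the most prone to false positives.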

Belgium Passport Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern without checksum
validation. It requires the presence of related keywords.

Table 45-75 Belgium Passport Number narrow-breadth patterns

Patterns

\l{2}\d{6}

Table 45-76 Belgium Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

passport number

Paspoort, paspoort, paspoortnummer, Reisepass kein,
Reisepass, Passnummer, Passeport, Passeport livre,
Passeport carte, numéro passeport

Belgian Passport Number, belgian passport number,
passport no
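
The keyword requirement can be pictured as a second test applied to the text around a pattern match. The sketch below is a deliberate simplification: it uses only a subset of the inputs listed above, does a case-sensitive substring search over the whole text, and does not model any proximity rule the product may apply. The names `KEYWORDS` and `has_keyword` are hypothetical.

```python
# A subset of the keyword inputs from the table above.
KEYWORDS = (
    "passport number",
    "Paspoort",
    "paspoortnummer",
    "Reisepass",
    "Passeport",
    "passport no",
)


def has_keyword(text, keywords=KEYWORDS):
    """Return True if at least one keyword occurs anywhere in text."""
    return any(keyword in text for keyword in keywords)
```

Under these assumptions, a narrow-breadth match would need both a pattern hit and `has_keyword(text)` to be true.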

Belgium Tax Identification Number


Belgium issues a tax identification number to persons who are obliged to declare taxes
in Belgium.
The Belgium Tax Identification Number data identifier detects an 11-digit number that matches
the Belgium Tax Identification Number format.
The Belgium Tax Identification Number data identifier provides two breadths of detection:
■ The wide breadth detects an 11-digit number without checksum validation. It also requires
the presence of related keywords.
See “Belgium Tax Identification Number wide breadth” on page 1045.
■ The narrow breadth detects an 11-digit number with checksum validation. It also requires
the presence of related keywords.
See “Belgium Tax Identification Number narrow breadth” on page 1046.

Belgium Tax Identification Number wide breadth


The wide breadth detects an 11-digit number without checksum validation. It also requires the
presence of related keywords.

Table 45-77 Belgium Tax Identification Number wide-breadth patterns

Patterns

\d{2}[01]\d[0123]\d{6}

\d{2}[01]\d[0123]\d \d{3} \d{2}

\d{2}.[01]\d.[0123]\d-\d{3}.\d{2}

\d{2}[ .][01]\d[ .][0123]\d[ -.]\d{3}[ .-]\d{2}

Table 45-78 Belgium Tax Identification Number wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

tax number, national registration number, National
Registration Number, tax registration number, tax id,
Tax ID, TAX Number

Numéro de registre national, numéro d'identification
fiscale, belasting aantal, Steuernummer, NIF, nif, NIF#,
nif#

Belgium Tax Identification Number narrow breadth


The narrow breadth detects an 11-digit number that passes checksum validation. It also
requires the presence of related keywords.

Table 45-79 Belgium Tax Identification Number narrow-breadth patterns

Patterns

\d{2}[01]\d[0123]\d{6}

\d{2}[01]\d[0123]\d \d{3} \d{2}

\d{2}.[01]\d.[0123]\d-\d{3}.\d{2}

\d{2}[ .][01]\d[ .][0123]\d[ -.]\d{3}[ .-]\d{2}



Table 45-80 Belgium Tax Identification Number narrow-breadth validators

Mandatory validator Description

Belgian Tax Identification Number Validation Check Checksum validator for Belgium Tax Identification Number.

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

tax number, national registration number, National
Registration Number, tax registration number, tax id,
Tax ID, TAX Number

Numéro de registre national, numéro d'identification
fiscale, belasting aantal, Steuernummer, NIF, nif, NIF#,
nif#
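
This guide does not describe how the Belgian Tax Identification Number Validation Check works. The following sketch shows the publicly documented modulo-97 rule for the Belgian national register number, on the assumption that the validator applies the same rule. The function name is hypothetical.

```python
import re


def is_valid_belgian_tin(candidate):
    """Check the modulo-97 control number of an 11-digit candidate."""
    digits = re.sub(r"\D", "", candidate)  # drop spaces, dots, hyphens
    if len(digits) != 11:
        return False
    base, check = digits[:9], int(digits[9:])
    # Births before 2000 use the nine digits as they are; births from
    # 2000 onward prefix a 2 before the remainder is taken.
    for prefix in ("", "2"):
        if 97 - int(prefix + base) % 97 == check:
            return True
    return False
```

For example, `is_valid_belgian_tin("85.07.30-033.28")` returns `True`, because 850730033 leaves a remainder of 69 when divided by 97 and 97 - 69 = 28.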

Belgium Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. In Belgium, the VAT
number is issued by the VAT office for the region in which the business is established.
The Belgium Value Added Tax (VAT) Number data identifier detects a 12-character alphanumeric
pattern that matches the Belgium Value Added Tax (VAT) Number format.
The Belgium Value Added Tax (VAT) Number data identifier provides three breadths of
detection:
■ The wide breadth detects a 12-character alphanumeric pattern beginning with BE without
checksum validation.
See “Belgium Value Added Tax (VAT) Number wide breadth” on page 1048.
■ The medium breadth detects a 12-character alphanumeric pattern beginning with BE with
checksum validation.
See “Belgium Value Added Tax (VAT) Number medium breadth” on page 1048.
■ The narrow breadth detects a 12-character alphanumeric pattern beginning with BE with
checksum validation. It also requires the presence of related keywords.
See “Belgium Value Added Tax (VAT) Number narrow breadth” on page 1049.

Belgium Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 12-character alphanumeric pattern beginning with BE without
checksum validation.

Table 45-81 Belgium Value Added Tax (VAT) Number wide-breadth patterns

Patterns

[Bb][Ee][0][123456789]\d{8}

[Bb][Ee][0][123456789].\d{4}.\d{4}

[Bb][Ee][0][123456789]-\d{4}-\d{4}

[Bb][Ee][0][123456789] \d{4} \d{4}

Table 45-82 Belgium Value Added Tax (VAT) Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Belgium Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 12-character alphanumeric pattern beginning with BE with
checksum validation.

Table 45-83 Belgium Value Added Tax (VAT) Number medium-breadth patterns

Patterns

[Bb][Ee][0][123456789]\d{8}

[Bb][Ee][0][123456789].\d{4}.\d{4}

[Bb][Ee][0][123456789]-\d{4}-\d{4}

[Bb][Ee][0][123456789] \d{4} \d{4}

Table 45-84 Belgium Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Belgium VAT Number Validation Check Checksum validator for the Belgian Value Added Tax (VAT)
Number.

Belgium Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 12-character alphanumeric pattern beginning with BE with
checksum validation. It also requires the presence of related keywords.

Table 45-85 Belgium Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Bb][Ee][0][123456789]\d{8}

[Bb][Ee][0][123456789].\d{4}.\d{4}

[Bb][Ee][0][123456789]-\d{4}-\d{4}

[Bb][Ee][0][123456789] \d{4} \d{4}

Table 45-86 Belgium Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Belgium VAT Number Validation Check Checksum validator for the Belgian Value Added Tax (VAT)
Number.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Numéro T.V.A., BTW number, Nº TVA, BTW NR, VAT
Number, vat no, vat number, Numéro T.V.A,
Umsatzsteuer-Identifikationsnummer,
Umsatzsteuernummer, BTW, BTW#, VAT#, vat#
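
The algorithm behind the Belgium VAT Number Validation Check is not given here. A minimal sketch, assuming it follows the widely published rule that the last two digits equal 97 minus the remainder of the first eight digits divided by 97, might look as follows. The function name is hypothetical.

```python
import re


def is_valid_belgian_vat(candidate):
    """Check the modulo-97 control digits of a BE VAT number."""
    digits = re.sub(r"\D", "", candidate)  # strips "BE" and separators
    if len(digits) != 10:
        return False
    # The first eight digits determine the final two.
    return 97 - int(digits[:8]) % 97 == int(digits[8:])
```

The function accepts any of the separator styles in the pattern tables, for example `BE0123456749` or `be01.2345.6749`, because it discards everything except digits before it checks the control number.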

Brazilian Election Identification Number


In Brazil, voting is compulsory for all citizens between 18 and 70 years old. To vote, citizens
must be registered and should present an official identity document, usually the election
identification number card.
The Brazilian Election Identification Number data identifier detects a 9- to 14-digit number
that matches the Brazilian Election Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 9- to 14-digit number without checksum validation.
See “Brazilian Election Identification Number wide breadth” on page 1050.


■ The medium breadth detects a 9- to 14-digit number that passes checksum validation.
See “Brazilian Election Identification Number medium breadth” on page 1051.
■ The narrow breadth detects a 9- to 14-digit number that passes checksum validation, and
requires the presence of related keywords.
See “Brazilian Election Identification Number narrow breadth” on page 1052.

Brazilian Election Identification Number wide breadth


The wide breadth detects a 9- to 14-digit number without checksum validation.

Table 45-87 Brazilian Election Identification Number wide-breadth patterns

Patterns

\d{5}[0]\d{3}

\d{5}[12]\d\d{2}

\d{6}[0]\d{3}

\d{6}[0]\d[/]\d{2}

\d{6}[12]\d\d{2}

\d{6}[12]\d[/]\d{2}

\d{7}[0]\d{3}

\d{7}[0]\d[/]\d{2}

\d{7}[12]\d[/]\d{2}

\d{7}[12]\d\d{2}

\d{8}[0]\d{3}

\d{8}[0]\d[/]\d{2}

\d{8}[0]\d{3}[/]\d{2}

\d{8}[12]\d[/]\d{2}

\d{8}[12]\d\d{2}

\d{8}[12]\d\d{2}[/]\d{2}

Table 45-88 Brazilian Election Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.
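
The Duplicate digits validator is described only as ensuring that a string of digits is not all the same. One way to express that check, offered as a sketch rather than as the product's implementation, is to count the distinct digits in the candidate. The function name is hypothetical.

```python
def passes_duplicate_digits(candidate):
    """Return True unless every digit in candidate is the same digit."""
    digits = [ch for ch in candidate if ch.isdigit()]
    # More than one distinct digit means the string is not a simple
    # repetition such as 000000000 or 111111111.
    return len(set(digits)) > 1
```

A check like this filters out placeholder values such as `111111111`, which would otherwise satisfy a purely structural pattern.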

Brazilian Election Identification Number medium breadth


The medium breadth detects a 9- to 14-digit number that passes checksum validation.

Table 45-89 Brazilian Election Identification Number medium-breadth patterns

Patterns

\d{5}[0]\d{3}

\d{5}[12]\d\d{2}

\d{6}[0]\d{3}

\d{6}[0]\d[/]\d{2}

\d{6}[12]\d\d{2}

\d{6}[12]\d[/]\d{2}

\d{7}[0]\d{3}

\d{7}[0]\d[/]\d{2}

\d{7}[12]\d[/]\d{2}

\d{7}[12]\d\d{2}

\d{8}[0]\d{3}

\d{8}[0]\d[/]\d{2}

\d{8}[0]\d{3}[/]\d{2}

\d{8}[12]\d[/]\d{2}

\d{8}[12]\d\d{2}

\d{8}[12]\d\d{2}[/]\d{2}

Table 45-90 Brazilian Election Identification Number medium-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Brazil Election Identification Number Validation Check Computes the checksum that every Brazil Election
Identification Number must pass.

Brazilian Election Identification Number narrow breadth


The narrow breadth detects a 9- to 14-digit number that passes checksum validation. It also
requires the presence of related keywords.

Table 45-91 Brazilian Election Identification Number narrow-breadth patterns

Patterns

\d{5}[0]\d{3}

\d{5}[12]\d\d{2}

\d{6}[0]\d{3}

\d{6}[0]\d[/]\d{2}

\d{6}[12]\d\d{2}

\d{6}[12]\d[/]\d{2}

\d{7}[0]\d{3}

\d{7}[0]\d[/]\d{2}

\d{7}[12]\d[/]\d{2}

\d{7}[12]\d\d{2}

\d{8}[0]\d{3}

\d{8}[0]\d[/]\d{2}

\d{8}[0]\d{3}[/]\d{2}

\d{8}[12]\d[/]\d{2}

\d{8}[12]\d\d{2}

\d{8}[12]\d\d{2}[/]\d{2}

Table 45-92 Brazilian Election Identification Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Brazil Election Identification Number Validation Check Computes the checksum that every Brazil Election
Identification Number must pass.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

election ID, identification number, electrol no., voter
ID, electrol identification number, Voter ID, electrol
number, election voter ID, Electrol Number, Electrol
No., Identification Number, Election Identification No.

número de identificação, identificação do eleitor,
número de identificação eleitoral, ID eleitor eleição,
Número identificação eleitoral brasileira

Brazilian National Registry of Legal Entities Number


The Brazilian National Registry of Legal Entities (CNPJ) Number is a unique number assigned
by the Brazilian IRS (an agency of the Ministry of Finance) to identify a legal entity or other
legal arrangement without legal personality.
The Brazilian National Registry of Legal Entities (CNPJ) Number data identifier detects a
14-digit number that matches the Brazilian National Registry of Legal Entities (CNPJ) Number
format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 14-digit number without checksum validation.
See “Brazilian National Registry of Legal Entities Number wide breadth” on page 1054.
■ The medium breadth detects a 14-digit number with checksum validation.
See “Brazilian National Registry of Legal Entities Number medium breadth” on page 1054.
■ The narrow breadth detects a 14-digit number that passes checksum validation. It also
requires the presence of related keywords.
See “Brazilian National Registry of Legal Entities Number narrow breadth” on page 1055.

Brazilian National Registry of Legal Entities Number wide breadth


The wide breadth detects a 14-digit number without checksum validation.

Table 45-93 Brazilian National Registry of Legal Entities Number wide-breadth patterns

Pattern

\d{14}

\d{8}[/]\d{6}

\d{8}[/]\d{4}-\d{2}

\d{2}.\d{3}.\d{3}[/]\d{4}-\d{2}

Table 45-94 Brazilian National Registry of Legal Entities Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Brazilian National Registry of Legal Entities Number medium breadth


The medium breadth detects a 14-digit number with checksum validation.

Table 45-95 Brazilian National Registry of Legal Entities Number medium-breadth patterns

Pattern

\d{14}

\d{8}[/]\d{6}

\d{8}[/]\d{4}-\d{2}

\d{2}.\d{3}.\d{3}[/]\d{4}-\d{2}

Table 45-96 Brazilian National Registry of Legal Entities Number medium-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Brazilian National Registry of Legal Entities Number Validation Check Computes the checksum and validates the
pattern against it.

Brazilian National Registry of Legal Entities Number narrow breadth


The narrow breadth detects a 14-digit number that passes checksum validation. It also requires
the presence of related keywords.

Table 45-97 Brazilian National Registry of Legal Entities Number narrow-breadth patterns

Pattern

\d{14}

\d{8}[/]\d{6}

\d{8}[/]\d{4}-\d{2}

\d{2}.\d{3}.\d{3}[/]\d{4}-\d{2}

Table 45-98 Brazilian National Registry of Legal Entities Number narrow-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Brazilian National Registry of Legal Entities Number Validation Check Computes the checksum and validates the
pattern against it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Brazil legal entities number, legalnumber#, legal ID,
legal no., Brazilianlegalno#, legalnumber#, legal no.,
legal entities number, CNPJ, CNPJ:, CNPJ#, cnpj#,
cnpj

CNPJ n º, Registro Nacional de Pessoas Jurídicas
n º, entidades jurídicas ID
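
The validator table says only that the checksum is computed and the pattern validated against it. The sketch below shows the commonly published CNPJ rule (two modulo-11 check digits with descending weights), on the assumption that the validator follows it. The function and helper names are hypothetical.

```python
import re


def _check_digit(digits, weights):
    """Weighted modulo-11 check digit; remainders 0 and 1 map to 0."""
    total = sum(int(d) * w for d, w in zip(digits, weights))
    remainder = total % 11
    return 0 if remainder < 2 else 11 - remainder


def is_valid_cnpj(candidate):
    """Verify both check digits of a 14-digit CNPJ candidate."""
    digits = re.sub(r"\D", "", candidate)  # drop dots, slash, hyphen
    if len(digits) != 14:
        return False
    first_weights = [5, 4, 3, 2, 9, 8, 7, 6, 5, 4, 3, 2]
    second_weights = [6] + first_weights
    d1 = _check_digit(digits[:12], first_weights)
    d2 = _check_digit(digits[:13], second_weights)
    return digits[12:] == f"{d1}{d2}"
```

Stripping the separators first lets one function serve all four pattern layouts in the table, from `\d{14}` to the fully punctuated form.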

Brazilian Natural Person Registry Number (CPF)


The Cadastro de Pessoas Físicas (CPF, "Natural Person Register") is a number assigned by
the Brazilian Federal Revenue to both Brazilians and resident aliens who pay taxes or take
part, directly or indirectly, in activities that provide revenue for any of the dozens of different
types of taxes existing in Brazil.
The Brazilian Natural Person Registry Number (CPF) data identifier detects an 11-digit number
that matches the Brazilian Natural Person Registry Number (CPF) format.
This data identifier provides the following breadths of detection:


■ The wide breadth detects an 11-digit number without checksum validation.
See “Brazilian Natural Person Registry Number wide breadth” on page 1056.
■ The medium breadth detects an 11-digit number with checksum validation.
See “Brazilian Natural Person Registry Number medium breadth” on page 1056.
■ The narrow breadth detects an 11-digit number that passes checksum validation. It also
requires the presence of related keywords.
See “Brazilian Natural Person Registry Number narrow breadth” on page 1057.

Brazilian Natural Person Registry Number wide breadth


The wide breadth detects an 11-digit number without checksum validation.

Table 45-99 Brazilian Natural Person Registry Number wide-breadth patterns

Pattern

\d{11}

\d{9}[-]\d{2}

\d{3}[.]\d{3}[.]\d{3}[-]\d{2}

Table 45-100 Brazilian Natural Person Registry Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Brazilian Natural Person Registry Number medium breadth


The medium breadth detects an 11-digit number with checksum validation.

Table 45-101 Brazilian Natural Person Registry Number medium-breadth patterns

Pattern

\d{11}

\d{9}[-]\d{2}

\d{3}[.]\d{3}[.]\d{3}[-]\d{2}

Table 45-102 Brazilian Natural Person Registry Number medium-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Brazilian Natural Person Registry Number Validation Check Computes the checksum that every Brazilian Natural
Person Registry Number must pass.

Brazilian Natural Person Registry Number narrow breadth


The narrow breadth detects an 11-digit number that passes checksum validation. It also
requires the presence of related keywords.

Table 45-103 Brazilian Natural Person Registry Number narrow-breadth patterns

Pattern

\d{11}

\d{9}[-]\d{2}

\d{3}[.]\d{3}[.]\d{3}[-]\d{2}

Table 45-104 Brazilian Natural Person Registry Number narrow-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Brazilian Natural Person Registry Number Validation Check Computes the checksum that every Brazilian Natural
Person Registry Number must pass.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

registry of individuals, CPF#, cpf no, CPF no,
Registration number, natural persons registry no, cpf
no, natural persons record no, cpfno#, CPFno#

Cadastro de Pessoas Físicas, pessoas singulares
registro NO pessoa natural número de registro
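
The CPF checksum itself is not spelled out in this guide. As a sketch, assuming the validator uses the commonly published rule of two modulo-11 check digits, the test could be written as follows. The function name is hypothetical.

```python
import re


def is_valid_cpf(candidate):
    """Verify the two modulo-11 check digits of an 11-digit CPF."""
    digits = re.sub(r"\D", "", candidate)  # drop dots and hyphen
    if len(digits) != 11:
        return False
    for n in (9, 10):
        # Weights run from n + 1 down to 2 over the first n digits.
        weights = range(n + 1, 1, -1)
        total = sum(int(d) * w for d, w in zip(digits[:n], weights))
        remainder = total % 11
        expected = 0 if remainder < 2 else 11 - remainder
        if int(digits[n]) != expected:
            return False
    return True
```

Under this rule a repeated-digit value such as `11111111111` also passes the arithmetic, which suggests why the narrow breadth lists the Duplicate digits validator alongside the checksum.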

British Columbia Personal Healthcare Number


British Columbia (BC) residents are required by law to enroll in a Medical Service Plan (MSP)
to access basic medical care facilities.
The MSP membership card is called a Care Card and the MSP number is called a Personal
Healthcare Number.
The British Columbia Personal Healthcare Number data identifier detects a 10-digit number
that matches the format of the British Columbia Personal Healthcare Number.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 10-digit number without checksum validation.
See “British Columbia Personal Healthcare Number wide breadth” on page 1058.
■ The medium breadth detects a 10-digit number that passes checksum validation.
See “British Columbia Personal Healthcare Number medium breadth” on page 1058.
■ The narrow breadth detects a 10-digit number that passes checksum validation. It also
requires the presence of related keywords.
See “British Columbia Personal Healthcare Number narrow breadth” on page 1059.

British Columbia Personal Healthcare Number wide breadth


The wide breadth detects a 10-digit number without checksum validation.

Table 45-105 British Columbia Personal Healthcare Number wide-breadth patterns

Pattern

[9]\d{9}

[9]\d{3} \d{3} \d{3}

Table 45-106 British Columbia Personal Healthcare Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

British Columbia Personal Healthcare Number medium breadth


The medium breadth detects a 10-digit number that passes checksum validation.

Table 45-107 British Columbia Personal Healthcare Number medium-breadth patterns

Pattern

[9]\d{9}

[9]\d{3} \d{3} \d{3}

Table 45-108 British Columbia Personal Healthcare Number medium-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

British Columbia Personal Healthcare Number Validation Check Computes the checksum that every British Columbia
Personal Healthcare Number must pass.

British Columbia Personal Healthcare Number narrow breadth


The narrow breadth detects a 10-digit number that passes checksum validation. It also requires
the presence of related keywords.

Table 45-109 British Columbia Personal Healthcare Number narrow-breadth patterns

Pattern

[9]\d{9}

[9]\d{3} \d{3} \d{3}

Table 45-110 British Columbia Personal Healthcare Number narrow-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

British Columbia Personal Healthcare Number Validation Check Computes the checksum that every British Columbia
Personal Healthcare Number must pass.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

MSP Number, msp number, MSP no, personal healthcare
number, healthcare no, Healthcare No, PHN, phn, phn#,
msp#, mspno#, PHN#, healthcare number

MSP nombre, soins de santé no, soins de santé
personnels nombre, MSPNombre#, soinsdesanténo#
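
The guide does not state how the Personal Healthcare Number checksum is computed. The sketch below assumes the commonly documented modulo-11 scheme, in which digits two through nine are weighted 2, 4, 8, 5, 10, 9, 7, 3 and the tenth digit is the check digit. The function name is hypothetical.

```python
import re


def is_valid_bc_phn(candidate):
    """Verify the modulo-11 check digit of a 10-digit BC PHN."""
    digits = re.sub(r"\D", "", candidate)  # drop the optional spaces
    if len(digits) != 10 or digits[0] != "9":
        return False
    weights = [2, 4, 8, 5, 10, 9, 7, 3]
    total = sum(int(d) * w for d, w in zip(digits[1:9], weights))
    # A result of 10 or 11 can never equal a single digit, so such
    # numbers are rejected.
    return 11 - total % 11 == int(digits[9])
```

The leading-9 test mirrors the `[9]` at the start of both patterns in the tables above.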

Bulgaria Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. In Bulgaria, VAT is
administered by the National Revenue Agency, which is overseen by the Bulgarian Ministry
of Finance.
The Bulgaria Value Added Tax (VAT) Number data identifier detects a 9- or 10-character
alphanumeric pattern beginning with the letters BG that matches the Bulgaria VAT Number
format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 9- or 10-character alphanumeric pattern beginning with the
letters BG without checksum validation. It checks for common test numbers.
See “Bulgaria Value Added Tax (VAT) Number wide breadth” on page 1061.
■ The medium breadth detects a 9- or 10-character alphanumeric pattern beginning with the
letters BG with checksum validation.
See “Bulgaria Value Added Tax (VAT) Number medium breadth” on page 1061.
■ The narrow breadth detects a 9- or 10-character alphanumeric pattern beginning with the
letters BG with checksum validation. It also requires the presence of related keywords and
checks for common test numbers.
See “Bulgaria Value Added Tax (VAT) Number narrow breadth” on page 1062.

Bulgaria Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 9- or 10-character alphanumeric pattern beginning with the letters
BG without checksum validation. It checks for common test numbers.

Table 45-111 Bulgaria Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[bB][gG]\d{9}

[bB][gG] \d{9}

[bB][gG]\d{10}

[bB][gG] \d{10}

Table 45-112 Bulgaria Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999, 0000000000, 1111111111,
2222222222, 3333333333, 4444444444, 5555555555,
6666666666, 7777777777, 8888888888, 9999999999
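
The Exclude ending characters validator amounts to a suffix test against the listed values. A minimal sketch, which generates the same 20 values instead of typing them out, follows. The names are hypothetical, and the product's own matching rules may differ in detail.

```python
# The 20 listed values: each digit repeated nine times, then ten times.
EXCLUDED_ENDINGS = tuple(str(d) * n for n in (9, 10) for d in range(10))


def passes_exclude_ending(candidate):
    """Return False if candidate ends with one of the excluded values."""
    return not candidate.endswith(EXCLUDED_ENDINGS)
```

For instance, `passes_exclude_ending("BG111111111")` returns `False`, so a common test number of that kind would not be reported as a match.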

Bulgaria Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 9- or 10-character alphanumeric pattern beginning with the
letters BG with checksum validation.

Table 45-113 Bulgaria Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[bB][gG]\d{9}

[bB][gG] \d{9}

[bB][gG]\d{10}

[bB][gG] \d{10}

Table 45-114 Bulgaria Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Bulgaria Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern
against it.

Bulgaria Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 9- or 10-character alphanumeric pattern beginning with the
letters BG with checksum validation. It also requires the presence of related keywords and
checks for common test numbers.

Table 45-115 Bulgaria Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[bB][gG]\d{9}

[bB][gG] \d{9}

[bB][gG]\d{10}

[bB][gG] \d{10}

Table 45-116 Bulgaria Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999, 0000000000, 1111111111,
2222222222, 3333333333, 4444444444, 5555555555,
6666666666, 7777777777, 8888888888, 9999999999

Bulgaria Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern
against it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat number, vat, VAT, vat#, VAT#, vat no., vatno#, value
added tax number, vatin, VATIN, value added tax, vat
no

номер на таксата, ДДС, ДДС#, ДДС номер., ДДС
номер.#, номер на данъка върху добавената
стойност, данък върху добавената стойност, ДДС
номер

Bulgarian Uniform Civil Number - EGN


The uniform civil number (EGN) is a unique number assigned to each Bulgarian citizen or resident
foreign national. It serves as a national identification number. An EGN is assigned to Bulgarians
at birth, or when a birth certificate is issued.
The Bulgarian Uniform Civil Number - EGN data identifier detects a 10-digit number that
matches the Bulgarian Uniform Civil Number - EGN format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 10-digit number without checksum validation.
See “Bulgarian Uniform Civil Number - EGN wide breadth” on page 1063.
■ The medium breadth detects a 10-digit number that passes checksum validation.
See “Bulgarian Uniform Civil Number - EGN medium breadth” on page 1064.
■ The narrow breadth detects a 10-digit number that passes checksum validation. It also
requires the presence of related keywords.
See “Bulgarian Uniform Civil Number - EGN narrow breadth” on page 1065.

Bulgarian Uniform Civil Number - EGN wide breadth


The wide breadth detects a 10-digit number without checksum validation.

Table 45-117 Bulgarian Uniform Civil Number - EGN wide-breadth pattern

Pattern

\d\d[024][123456789]0[123456789]\d{4}

\d\d[135][012]0[123456789]\d{4}

\d\d[024][123456789][12]\d{5}

\d\d[135][012][12]\d{5}

\d\d[024][123456789]3[01]\d{4}

\d\d[135][012]3[01]\d{4}

Table 45-118 Bulgarian Uniform Civil Number - EGN wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Bulgarian Uniform Civil Number - EGN medium breadth


The medium breadth detects a 10-digit number that passes checksum validation.

Table 45-119 Bulgarian Uniform Civil Number - EGN medium-breadth pattern

Pattern

\d\d[024][123456789]0[123456789]\d{4}

\d\d[135][012]0[123456789]\d{4}

\d\d[024][123456789][12]\d{5}

\d\d[135][012][12]\d{5}

\d\d[024][123456789]3[01]\d{4}

\d\d[135][012]3[01]\d{4}

Table 45-120 Bulgarian Uniform Civil Number - EGN medium-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Bulgarian Uniform Civil Number Validation Check Computes the checksum and validates the pattern against
it.

Bulgarian Uniform Civil Number - EGN narrow breadth


The narrow breadth detects a 10-digit number that passes checksum validation. It also requires
the presence of related keywords.

Table 45-121 Bulgarian Uniform Civil Number - EGN narrow-breadth pattern

Pattern

\d\d[024][123456789]0[123456789]\d{4}

\d\d[135][012]0[123456789]\d{4}

\d\d[024][123456789][12]\d{5}

\d\d[135][012][12]\d{5}

\d\d[024][123456789]3[01]\d{4}

\d\d[135][012]3[01]\d{4}

Table 45-122 Bulgarian Uniform Civil Number - EGN narrow-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Bulgarian Uniform Civil Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

BUCN, uniform civil number, uniform civil ID, uniform
civil no, EGN, Bulgarian uniform civil number,
uniformcivilno#, BUCN#, EGN#, bucn, egn#, bucn#,
uniformcivilnumber#, personal number, personal no,
identification number, personal id, national id

Униформ граждански номер, Униформ ID, Униформ
граждански ID, Униформ граждански не., български
Униформ граждански номер,
УниформгражданскиID#, Униформгражданскине.#,
личен номер, лично не, идентификационен номер,
лична идентификация, национален номер
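
The Bulgarian Uniform Civil Number Validation Check is described only as computing a checksum. A sketch of the commonly published EGN rule, which the validator is assumed (not confirmed) to follow, weights the first nine digits and compares the result modulo 11 with the tenth digit. The function name is hypothetical.

```python
import re


def is_valid_egn(candidate):
    """Verify the modulo-11 check digit of a 10-digit EGN."""
    if not re.fullmatch(r"[0-9]{10}", candidate):
        return False
    weights = [2, 4, 8, 5, 10, 9, 7, 3, 6]
    total = sum(int(d) * w for d, w in zip(candidate, weights))
    # A remainder of 10 is recorded as check digit 0.
    check = total % 11 % 10
    return check == int(candidate[9])
```

This test is independent of the date-like structure that the patterns enforce on digits three through six, so a candidate has to satisfy both to match at the medium and narrow breadths.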

Burgerservicenummer
In the Netherlands, the Burgerservicenummer is used to uniquely identify citizens and is printed
on driving licenses, passports and international ID cards under the header Personal Number.
The Burgerservicenummer data identifier detects an eight- or nine-digit number that matches
the Burgerservicenummer format and passes checksum validation.
The Burgerservicenummer data identifier provides two breadths of detection:
■ The wide breadth detects an eight- or nine-digit number that passes checksum validation.
See “Burgerservicenummer wide breadth” on page 1066.
■ The narrow breadth detects an eight- or nine-digit number that passes checksum validation.
It also requires the presence of related keywords.
See “Burgerservicenummer narrow breadth” on page 1066.

Burgerservicenummer wide breadth


The wide breadth detects an eight- or nine-digit number that passes checksum validation.

Table 45-123 Burgerservicenummer wide-breadth pattern

Pattern

\d{9}

Table 45-124 Burgerservicenummer wide-breadth validator

Mandatory validator Description

Burgerservicenummer Check Computes the checksum and validates the pattern against
it.
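
The guide does not state which checksum the Burgerservicenummer Check uses. A minimal sketch, assuming the widely published "11-test" for these numbers (weights 9 down to 2 for the first eight digits, -1 for the last, total divisible by 11), might look as follows. The function name bsn_check_ok and the zero-padding of eight-digit input are assumptions for illustration.

```python
import re


def bsn_check_ok(number: str) -> bool:
    """Apply the public 11-test to an eight- or nine-digit candidate."""
    if not re.fullmatch(r"[0-9]{8,9}", number):
        return False
    # Assumption: an eight-digit number is treated as having a leading zero.
    digits = [int(c) for c in number.zfill(9)]
    weights = (9, 8, 7, 6, 5, 4, 3, 2, -1)
    total = sum(w * d for w, d in zip(weights, digits))
    return total % 11 == 0
```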

Burgerservicenummer narrow breadth


The narrow breadth detects an eight- or nine-digit number that passes checksum validation.
It also requires the presence of related keywords.

Table 45-125 Burgerservicenummer narrow-breadth pattern

Pattern

\d{9}

Table 45-126 Burgerservicenummer narrow-breadth validators

Mandatory validator Description

Burgerservicenummer Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Persoonsnummer, sofinummer, sociaal-fiscaal
nummer, persoonsgebonden, person number,
social-fiscal number, person-related number

Canada Driver's License Number


In Canada, driver's licenses are issued by the government of the province or territory in which
the driver is residing. Specific regulations relating to driver's licenses vary from province to province,
though they are all similar.
The Canada Driver's License Number data identifier detects a 9-, 10-, 12-, 13-, 14-, or
15-character alphanumeric pattern that matches the Canada Driver's License Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 9-, 10-, 12-, 13-, 14-, or 15-character alphanumeric pattern
that matches the Canada Driver's License Number format without checksum validation. It
checks for common test numbers.
See “Canada Driver's License Number wide breadth” on page 1067.
■ The medium breadth detects a 9-, 10-, 12-, 13-, 14-, or 15-character alphanumeric pattern
that matches the Canada Driver's License Number format with checksum validation.
See “Canada Driver's License Number medium breadth” on page 1068.
■ The narrow breadth detects a 9-, 10-, 12-, 13-, 14-, or 15-character alphanumeric pattern
that matches the Canada Driver's License Number format with checksum validation. It also
requires the presence of related keywords and checks for common test numbers.
See “Canada Driver's License Number narrow breadth” on page 1069.

Canada Driver's License Number wide breadth


The wide breadth detects a 9-, 10-, 12-, 13-, 14-, or 15-character alphanumeric pattern that
matches the Canada Driver's License Number format without checksum validation. It checks
for common test numbers.

Table 45-127 Canada Driver's License Number wide-breadth patterns

Pattern

\d\d\d\d\d\d-\d\d\d

[Dd]\d\d\d\d\d\d\d\d\d

[A-Za-z]{2}-[A-Za-z]{2}-[A-Za-z]{2}-[A-Za-z]\d\d\d[A-Za-z]{2}

[A-Za-z]\d\d\d\d-\d\d\d\d\d-\d\d\d\d\d

[A-Za-z]{5}\d\d\d\d\d\d\d\d\d

[A-Za-z]\d\d\d\d-\d\d\d\d\d\d-\d\d

Table 45-128 Canada Driver's License Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000000000, 11111111111111, 22222222222222,
33333333333333, 44444444444444, 55555555555555,
66666666666666, 77777777777777, 88888888888888,
99999999999999, 000000000000, 111111111111,
222222222222, 333333333333, 444444444444,
555555555555, 666666666666, 777777777777,
888888888888, 999999999999, 000000000, 111111111,
222222222, 333333333, 444444444, 555555555,
666666666, 777777777, 888888888, 999999999

Canada Driver's License Number medium breadth


The medium breadth detects a 9-, 10-, 12-, 13-, 14-, or 15-character alphanumeric pattern
that matches the Canada Driver's License Number format with checksum validation.

Table 45-129 Canada Driver's License Number medium-breadth patterns

Pattern

\d\d\d\d\d\d-\d\d\d

[Dd]\d\d\d\d\d\d\d\d\d

[A-Za-z]{2}-[A-Za-z]{2}-[A-Za-z]{2}-[A-Za-z]\d\d\d[A-Za-z]{2}

[A-Za-z]\d\d\d\d-\d\d\d\d\d-\d\d\d\d\d

[A-Za-z]{5}\d\d\d\d\d\d\d\d\d

[A-Za-z]\d\d\d\d-\d\d\d\d\d\d-\d\d

Table 45-130 Canada Driver's License Number medium-breadth validators

Mandatory validator Description

Canada Driver's License Number Check Computes the checksum and validates the pattern against
it.

Canada Driver's License Number narrow breadth


The narrow breadth detects a 9-, 10-, 12-, 13-, 14-, or 15-character alphanumeric pattern that
matches the Canada Driver's License Number format with checksum validation. It also requires
the presence of related keywords and checks for common test numbers.

Table 45-131 Canada Driver's License Number narrow-breadth patterns

Pattern

\d\d\d\d\d\d-\d\d\d

[Dd]\d\d\d\d\d\d\d\d\d

[A-Za-z]{2}-[A-Za-z]{2}-[A-Za-z]{2}-[A-Za-z]\d\d\d[A-Za-z]{2}

[A-Za-z]\d\d\d\d-\d\d\d\d\d-\d\d\d\d\d

[A-Za-z]{5}\d\d\d\d\d\d\d\d\d

[A-Za-z]\d\d\d\d-\d\d\d\d\d\d-\d\d

Table 45-132 Canada Driver's License Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.



Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000000000, 11111111111111, 22222222222222,
33333333333333, 44444444444444, 55555555555555,
66666666666666, 77777777777777, 88888888888888,
99999999999999, 000000000000, 111111111111,
222222222222, 333333333333, 444444444444,
555555555555, 666666666666, 777777777777,
888888888888, 999999999999, 000000000, 111111111,
222222222, 333333333, 444444444, 555555555,
666666666, 777777777, 888888888, 999999999

Canada Driver's License Number Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

driver license, drivers license, driving license, driver
license number, drivers license number, driving license
number, dlno#, drivers lic., driver's license number,
driver licence, drivers licence, driving licence, driver
permit, drivers permit, driving permit, license number,
licence number, drivers permit number, dl#

permis de conduire

Canada Passport Number


The Canadian passport is issued to citizens of Canada for the purposes of international travel.
The Canada Passport Number data identifier detects an eight- or nine-character alphanumeric
pattern that matches the Canada Passport Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight- or nine-character alphanumeric pattern that matches
the Canada Passport Number format. It checks for common test numbers.
See “Canada Passport Number wide breadth” on page 1071.
■ The narrow breadth detects an eight- or nine-character alphanumeric pattern that matches
the Canada Passport Number format. It checks for common test numbers, and also requires
the presence of related keywords.
See “Canada Passport Number narrow breadth” on page 1071.

Canada Passport Number wide breadth


The wide breadth detects an eight- or nine-character alphanumeric pattern that matches the
Canada Passport Number format. It checks for common test numbers.

Table 45-133 Canada Passport Number wide-breadth patterns

Pattern

[a-zA-Z]{2}\d{6}

[a-zA-Z]{2}\d{7}

Table 45-134 Canada Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555,
666666, 777777, 888888, 999999, 0000000, 1111111,
2222222, 3333333, 4444444, 5555555, 6666666,
7777777, 8888888, 9999999
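
To show how the wide-breadth patterns and validators might combine, here is a rough sketch. It assumes the Number delimiter validator can be approximated by rejecting matches that touch another letter or digit, and it reads "Exclude ending characters" literally as a suffix test. The names find_passport_candidates, PASSPORT_PATTERNS, and EXCLUDED_ENDINGS are hypothetical and not part of the product.

```python
import re

# Patterns copied from Table 45-133.
PASSPORT_PATTERNS = (r"[a-zA-Z]{2}\d{6}", r"[a-zA-Z]{2}\d{7}")

# Values from the "Exclude ending characters" row: six or seven repeats
# of a single digit.
EXCLUDED_ENDINGS = tuple(str(d) * n for n in (6, 7) for d in range(10))


def find_passport_candidates(text: str) -> list[str]:
    """Return pattern matches that survive both wide-breadth validators."""
    hits = []
    for pattern in PASSPORT_PATTERNS:
        # Assumption: lookarounds stand in for the Number delimiter check.
        delimited = rf"(?<![A-Za-z0-9]){pattern}(?![A-Za-z0-9])"
        for match in re.finditer(delimited, text):
            if not match.group().endswith(EXCLUDED_ENDINGS):
                hits.append(match.group())
    return hits
```

The narrow breadth would add a keyword test on top of this; how close a keyword must be to the match is not stated here, so it is left out of the sketch.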

Canada Passport Number narrow breadth


The narrow breadth detects an eight- or nine-character alphanumeric pattern that matches
the Canada Passport Number format. It checks for common test numbers, and also requires
the presence of related keywords.

Table 45-135 Canada Passport Number narrow-breadth patterns

Pattern

[a-zA-Z]{2}\d{6}

[a-zA-Z]{2}\d{7}

Table 45-136 Canada Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.



Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555,
666666, 777777, 888888, 999999, 0000000, 1111111,
2222222, 3333333, 4444444, 5555555, 6666666,
7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

passport, passport number, passport no, passportno,
passport no., passport#, passportno#

passeport, numéro passeport, No passeport,
passeport#

Canada Permanent Residence (PR) Number


The Canadian Permanent Resident card is an identification document for permanent residents
of Canada who are not Canadian citizens. This document is required for permanent residents
returning to Canada by air.
The Canada Permanent Residence (PR) Number data identifier detects a 9- or 12-character
alphanumeric pattern that matches the Canada Permanent Residence (PR) Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 9- or 12-character alphanumeric pattern that matches the
Canada Permanent Residence (PR) Number format. It checks for common test numbers.
See “Canada Permanent Residence (PR) Number wide breadth” on page 1072.
■ The narrow breadth detects a 9- or 12-character alphanumeric pattern that matches the
Canada Permanent Residence (PR) Number format. It checks for common test numbers,
and also requires the presence of related keywords.
See “Canada Permanent Residence (PR) Number narrow breadth” on page 1073.

Canada Permanent Residence (PR) Number wide breadth


The wide breadth detects a 9- or 12-character alphanumeric pattern that matches the Canada
Permanent Residence (PR) Number format. It checks for common test numbers.

Table 45-137 Canada Permanent Residence (PR) Number wide-breadth patterns

Pattern

[a-zA-Z]{2}\d{7}

[a-zA-Z]{2}\d{10}

Table 45-138 Canada Permanent Residence (PR) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999,
0000000000, 1111111111, 2222222222, 3333333333,
4444444444, 5555555555, 6666666666, 7777777777,
8888888888, 9999999999

Canada Permanent Residence (PR) Number narrow breadth


The narrow breadth detects a 9- or 12-character alphanumeric pattern that matches the Canada
Permanent Residence (PR) Number format. It checks for common test numbers, and also
requires the presence of related keywords.

Table 45-139 Canada Permanent Residence (PR) Number narrow-breadth patterns

Pattern

[a-zA-Z]{2}\d{7}

[a-zA-Z]{2}\d{10}

Table 45-140 Canada Permanent Residence (PR) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.



Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999,
0000000000, 1111111111, 2222222222, 3333333333,
4444444444, 5555555555, 6666666666, 7777777777,
8888888888, 9999999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

permanent resident number, permanent resident no,
permanent resident no., permanent resident card,
permanent resident card number, pr number, pr no,
pr no.

numéro résident permanent, résident permanent non,
résident permanent no., carte résident permanent,
numéro carte résident permanent, pr non

Canadian Social Insurance Number


The Canadian Social Insurance Number (SIN) is a personal identification number issued by
Human Resources and Skills Development Canada primarily for administering national pension
and employment plans.
The Canadian Social Insurance Number data identifier detects a nine-digit number that matches
the Canadian Social Insurance Number format.
The Canadian Social Insurance Number data identifier provides three breadths of detection:
■ The wide breadth detects nine-digit numbers with the format DDD-DDD-DDD separated
by dashes, spaces, periods, slashes, or without separators. It also performs Luhn-check
validation.
See “Canadian Social Insurance Number wide breadth” on page 1075.
■ The medium breadth detects nine-digit numbers with the format DDD-DDD-DDD separated
by dashes, spaces, or periods. It also performs Luhn check validation and eliminates
non-assigned numbers and common test numbers.
See “Canadian Social Insurance Number medium breadth” on page 1075.
■ The narrow breadth detects nine-digit numbers with the format DDD-DDD-DDD separated
by dashes or spaces. It also performs Luhn-check validation; eliminates non-assigned
numbers, fictitiously assigned numbers, and common test numbers; and requires the
presence of related keywords.
See “Canadian Social Insurance Number narrow breadth” on page 1076.

Canadian Social Insurance Number wide breadth


The wide breadth detects nine-digit numbers with the format DDD-DDD-DDD separated by
dashes, spaces, periods, slashes, or without separators. It also performs Luhn-check validation.
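
As an illustration of the Luhn check mentioned above, the following sketch applies the standard Luhn algorithm to a nine-digit candidate after stripping separators. The function name luhn_ok is hypothetical, and the product's validator may treat separators differently.

```python
def luhn_ok(number: str) -> bool:
    """Standard Luhn check on a nine-digit number, ignoring separators."""
    digits = [int(c) for c in number if c in "0123456789"]
    if len(digits) != 9:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0
```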

Table 45-141 Canadian Social Insurance Number wide-breadth patterns

Patterns

\d{3} \d{3} \d{3}

\d{9}

\d{3}/\d{3}/\d{3}

\d{3}.\d{3}.\d{3}

\d{3}-\d{3}-\d{3}

Table 45-142 Canadian Social Insurance Number wide-breadth validator

Mandatory validator Description

Luhn Check Validator computes the Luhn checksum, which every
Canadian Social Insurance Number must pass.

Canadian Social Insurance Number medium breadth


The medium breadth detects nine-digit numbers with the format DDD-DDD-DDD separated
by dashes, spaces, or periods. It also performs Luhn check validation and eliminates
non-assigned numbers and common test numbers.

Table 45-143 Canadian Social Insurance Number medium-breadth patterns

Patterns

\d{3} \d{3} \d{3}

\d{3}.\d{3}.\d{3}

\d{3}-\d{3}-\d{3}

Table 45-144 Canadian Social Insurance Number medium-breadth validators

Mandatory validators Description

Luhn Check Validator computes the Luhn checksum, which every
Canadian Social Insurance Number must pass.

Number delimiter Validates a match by checking the surrounding numbers.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

8, 123456789

Canadian Social Insurance Number narrow breadth


The narrow breadth detects nine-digit numbers with the format DDD-DDD-DDD separated by
dashes or spaces. It also performs Luhn-check validation; eliminates non-assigned numbers,
fictitiously assigned numbers, and common test numbers; and requires the presence of related
keywords.

Table 45-145 Canadian Social Insurance Number narrow-breadth patterns

Patterns

\d{3} \d{3} \d{3}

\d{3}-\d{3}-\d{3}

Table 45-146 Canadian Social Insurance Number narrow-breadth validators

Mandatory validators Description

Luhn Check Validator computes the Luhn checksum, which every
Canadian Social Insurance Number must pass.

Number delimiter Validates a match by checking the surrounding numbers.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:
0, 8, 123456789

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

pension, pensions, soc ins, ins #, social ins, CSIN,
SSN, social security, social insurance, Canada,
Canadian

Chilean National Identification Number


The Chilean National Identity Number or National Unique Role (RUN) is the only identifying
number assigned to all Chilean residents in or out of Chile, and to aliens residing temporarily
or permanently in the country.
The Chilean National Identification Number data identifier detects an eight- or nine-digit number
that matches the Chilean National Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight- or nine-digit number without checksum validation.
See “Chilean National Identification Number wide breadth” on page 1077.
■ The medium breadth detects an eight- or nine-digit number with checksum validation.
See “Chilean National Identification Number medium breadth” on page 1078.
■ The narrow breadth detects an eight- or nine-digit number that passes checksum validation.
It also requires the presence of related keywords.
See “Chilean National Identification Number narrow breadth” on page 1078.

Chilean National Identification Number wide breadth


The wide breadth detects an eight- or nine-digit number without checksum validation.

Table 45-147 Chilean National Identification Number wide-breadth patterns

Patterns

\d{7}[0123456789Kk]

\d{7}[-][0123456789Kk]

\d[.]\d{3}[.]\d{3}[-][0123456789Kk]

\d{8}[0123456789Kk]

\d{8}[-][0123456789Kk]

\d{2}[.]\d{3}[.]\d{3}[-][0123456789Kk]

Table 45-148 Chilean National Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Chilean National Identification Number medium breadth


The medium breadth detects an eight- or nine-digit number with checksum validation.
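
The guide does not describe the checksum itself. As a sketch only, the code below assumes the publicly documented modulo-11 scheme for the RUN/RUT: the body digits are weighted 2, 3, 4, 5, 6, 7 (repeating) from right to left, and the check character is 11 minus the sum modulo 11, with 11 written as 0 and 10 written as K. The function name run_check_ok is hypothetical.

```python
import re


def run_check_ok(run: str) -> bool:
    """Check the final character of a RUN against the public mod-11 scheme."""
    compact = re.sub(r"[.\-]", "", run).upper()
    if not re.fullmatch(r"[0-9]{7,8}[0-9K]", compact):
        return False
    body, check = compact[:-1], compact[-1]
    total = 0
    for index, char in enumerate(reversed(body)):
        total += int(char) * (2 + index % 6)  # weights 2..7, repeating
    value = 11 - total % 11
    expected = {11: "0", 10: "K"}.get(value, str(value))
    return check == expected
```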

Table 45-149 Chilean National Identification Number medium-breadth patterns

Patterns

\d{7}[0123456789Kk]

\d{7}[-][0123456789Kk]

\d[.]\d{3}[.]\d{3}[-][0123456789Kk]

\d{8}[0123456789Kk]

\d{8}[-][0123456789Kk]

\d{2}[.]\d{3}[.]\d{3}[-][0123456789Kk]

Table 45-150 Chilean National Identification Number medium-breadth validator

Mandatory validator Description

Chilean National Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Chilean National Identification Number narrow breadth


The narrow breadth detects an eight- or nine-digit number that passes checksum validation.
It also requires the presence of related keywords.

Table 45-151 Chilean National Identification Number narrow-breadth patterns

Patterns

\d{7}[0123456789Kk]

\d{7}[-][0123456789Kk]

\d[.]\d{3}[.]\d{3}[-][0123456789Kk]

\d{8}[0123456789Kk]

\d{8}[-][0123456789Kk]

\d{2}[.]\d{3}[.]\d{3}[-][0123456789Kk]

Table 45-152 Chilean National Identification Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Chilean National Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:
RUT, RUN, national identification number, Chilean
identity no., national unique role, rut#, run#,
identificationnumber, identityno.#, identity number

nationaluniqueroleID#, nacional identidad, número
identificación, número identificación nacional,
identidad número

China Passport Number


The People's Republic of China passport, commonly referred to as the Chinese passport, is
issued to nationals of the People's Republic of China who do not permanently reside in Hong
Kong or Macau for international travel.
The China Passport Number data identifier detects a 9- to 10-character identifier that matches
the China Passport Number format.
The China Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects a 9- to 10-character identifier.
See “China Passport Number wide breadth” on page 1080.
■ The narrow breadth detects a 9- to 10-character identifier. It also requires the presence of
related keywords.
See “China Passport Number narrow breadth” on page 1080.

China Passport Number wide breadth


The wide breadth detects a 9- to 10-character identifier.

Table 45-153 China Passport Number wide-breadth patterns

Patterns

\d{9}

\l\d{8}

\l{2}\d{8}

Table 45-154 China Passport Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding numbers.

China Passport Number narrow breadth


The narrow breadth detects a 9- to 10-character identifier. It also requires the presence of related
keywords.

Table 45-155 China Passport Number narrow-breadth patterns

Patterns

\d{9}

\l\d{8}

\l{2}\d{8}

Table 45-156 China Passport Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding numbers.



Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

中国护照, 护照, 护照本

passport, Passport, CHINA PASSPORT, China
Passport, china passport, Passport Book, passport
book

Codice Fiscale
The Codice Fiscale uniquely identifies an Italian citizen or permanent resident alien.
Issuance of the code is centralized at the Ministry of the Treasury. The Codice Fiscale is issued
to every Italian at birth.
The Codice Fiscale data identifier detects a 16-character identifier that matches the Codice
Fiscale format.
The Codice Fiscale data identifier provides two breadths of detection:
■ The wide breadth detects a 16-character identifier with checksum validation.
See “Codice Fiscale wide breadth” on page 1081.
■ The narrow breadth detects a 16-character identifier with checksum validation. It also
requires the presence of related keywords.
See “Codice Fiscale narrow breadth” on page 1082.

Codice Fiscale wide breadth


The wide breadth detects a 16-character identifier that passes checksum validation.

Table 45-157 Codice Fiscale wide-breadth patterns

Patterns

[A-Z]{6}[0-9LMNPQRSTUV]{2}[ABCDEHLMPRST][0-9LMNPQRSTUV]{2}[A-Z] [0-9LMNPQRSTUV]{3}[A-Z]

[A-Z]{3} [A-Z]{3} [0-9LMNPQRSTUV]{2}[ABCDEHLMPRST][0-9LMNPQRSTUV]{2}


[A-Z][0-9LMNPQRSTUV]{3}[A-Z]

Table 45-158 Codice Fiscale wide-breadth validator

Mandatory validator Description

Codice Fiscale Control Key Check Computes the control key and checks if it is valid.
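
The control key computation is not spelled out in this guide. The sketch below assumes the publicly documented scheme: characters in odd positions (1-based) of the first 15 are converted through a fixed lookup table, characters in even positions count as their digit value or alphabet index, and the total modulo 26 selects the final letter. The names ODD_VALUES and codice_fiscale_ok are hypothetical, and removing spaces before checking is an assumption.

```python
import re

# Odd-position values for A..Z; digits 0-9 share the values of A..J.
ODD_VALUES = (1, 0, 5, 7, 9, 13, 15, 17, 19, 21, 2, 4, 18,
              20, 11, 3, 6, 8, 12, 14, 16, 10, 22, 25, 24, 23)


def codice_fiscale_ok(code: str) -> bool:
    """Recompute the 16th character from the first 15 and compare."""
    code = code.replace(" ", "").upper()
    if not re.fullmatch(r"[A-Z0-9]{15}[A-Z]", code):
        return False
    total = 0
    for position, char in enumerate(code[:15], start=1):
        index = int(char) if char.isdigit() else ord(char) - ord("A")
        total += ODD_VALUES[index] if position % 2 == 1 else index
    return code[15] == chr(ord("A") + total % 26)
```

This sketch checks only the control character. The patterns in the table above are stricter about which characters may appear in each position.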

Codice Fiscale narrow breadth


The narrow breadth detects a 16-character identifier that passes checksum validation. It also
requires the presence of related keywords.

Table 45-159 Codice Fiscale narrow-breadth patterns

Patterns

[A-Z]{6}[0-9LMNPQRSTUV]{2}[ABCDEHLMPRST][0-9LMNPQRSTUV]{2}[A-Z] [0-9LMNPQRSTUV]{3}[A-Z]

[A-Z]{3} [A-Z]{3} [0-9LMNPQRSTUV]{2}[ABCDEHLMPRST][0-9LMNPQRSTUV]{2}


[A-Z][0-9LMNPQRSTUV]{3}[A-Z]

Table 45-160 Codice Fiscale narrow-breadth validators

Mandatory validators Description

Codice Fiscale Control Key Check Computes the control key and checks if it is valid.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

codice fiscal, dati anagrafici, partita I.V.A., p. iva, tax
code, personal data, VAT number

Colombian Addresses
The Colombian Addresses data identifier detects home addresses and physical locations in
Colombia.
The Colombian Addresses data identifier provides two breadths of detection:
■ The wide breadth detects an address without validation.
See “Colombian Addresses wide breadth” on page 1083.
■ The narrow breadth detects an address with keyword validation.
See “Colombian Addresses narrow breadth” on page 1084.

Colombian Addresses wide breadth


The wide breadth detects an address without validation.

Table 45-161 Colombian Addresses wide-breadth patterns

Patterns

\d{1,3} No. \d{1,3}-\d{1,3}

\d{1,3} \d{1,3}-\d{1,3}

\d{1,3} Bis \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] Bis \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3} \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] \d{1,3}-\d{1,3}

\d{1,3} Bis No \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3} Bis No. \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] Bis No. \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] Bis # \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] No. \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3} # \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3} No. \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] Bis No \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] No \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3} # \d{1,3}-\d{1,3}

\d{1,3}[A-Za-z] # \d{1,3}-\d{1,3}

\d{1,3} No \d{1,3}-\d{1,3}

\d{1,3}[A-Za-z] No. \d{1,3}-\d{1,3}

\d{1,3}[A-Za-z] No \d{1,3}-\d{1,3}

\d{1,3} Bis # \d{1,3}[A-Za-z]-\d{1,3}



\d{1,3}[A-Za-z] # \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3} No \d{1,3}[A-Za-z]-\d{1,3}

The wide breadth of the Colombian Addresses data identifier does not include a validator.

Colombian Addresses narrow breadth


The narrow breadth detects an address with keyword validation.

Table 45-162 Colombian Addresses narrow-breadth patterns

Patterns

\d{1,3} No. \d{1,3}-\d{1,3}

\d{1,3} \d{1,3}-\d{1,3}

\d{1,3} Bis \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] Bis \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3} \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] \d{1,3}-\d{1,3}

\d{1,3} Bis No \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3} Bis No. \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] Bis No. \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] Bis # \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] No. \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3} # \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3} No. \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] Bis No \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] No \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3} # \d{1,3}-\d{1,3}

\d{1,3}[A-Za-z] # \d{1,3}-\d{1,3}

\d{1,3} No \d{1,3}-\d{1,3}

\d{1,3}[A-Za-z] No. \d{1,3}-\d{1,3}

\d{1,3}[A-Za-z] No \d{1,3}-\d{1,3}

\d{1,3} Bis # \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3}[A-Za-z] # \d{1,3}[A-Za-z]-\d{1,3}

\d{1,3} No \d{1,3}[A-Za-z]-\d{1,3}

Table 45-163 Colombian Addresses narrow-breadth validator

Mandatory validator Description

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Calle, Cll, Carrera, Cra, Cr, Avenida, Av, Dg, Diagonal,
Diag, Tv, Trans, Transversal, vereda

Colombian Cell Phone Number


The Colombian Cell Phone Number data identifier detects Colombian cell phone numbers.
The Colombian Cell Phone Number data identifier provides two breadths of detection:
■ The wide breadth detects an 8- to 10-digit number with duplicate digit validation.
See “Colombian Cell Phone Number wide breadth” on page 1085.
■ The narrow breadth detects an 8- to 10-digit number with required characters at the
beginning. It also checks for duplicate digits, and it requires the presence of related
keywords.
See “Colombian Cell Phone Number narrow breadth” on page 1086.

Colombian Cell Phone Number wide breadth


The wide breadth detects an 8- to 10-digit number with duplicate digit validation.

Table 45-164 Colombian Cell Phone Number wide-breadth patterns

Patterns

\d{8}

\d{2}.\d{3}.\d{3}

\d{2} \d{3} \d{3}

\d{2}/\d{3}/\d{3}

\d{2}-\d{3}-\d{3}

\d{2},\d{3},\d{3}

\d{9}

\d{3} \d{3} \d{3}

\d{3}-\d{3}-\d{3}

\d{3},\d{3},\d{3}

\d{3}/\d{3}/\d{3}

\d{3}.\d{3}.\d{3}

\d{10}

\d{1}/\d{3}/\d{3}/\d{3}

\d{1},\d{3},\d{3},\d{3}

\d{1}.\d{3}.\d{3}.\d{3}

\d{1}-\d{3}-\d{3}-\d{3}

\d{1} \d{3} \d{3} \d{3}

Table 45-165 Colombian Cell Phone Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Colombian Cell Phone Number narrow breadth


The narrow breadth detects an 8- to 10-digit number with required characters at the beginning.
It also checks for duplicate digits, and it requires the presence of related keywords.
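
As a rough illustration of the beginning-character and duplicate-digit checks described above, the sketch below strips the separators used in the patterns and applies both tests. The prefixes are those listed for the Require beginning characters validator in Table 45-167. The names cell_candidate_ok and REQUIRED_PREFIXES are hypothetical, and the keyword and Number delimiter validators are left out.

```python
import re

# Prefixes listed for the "Require beginning characters" validator.
REQUIRED_PREFIXES = (
    "300", "301", "302", "310", "311", "312", "313", "314",
    "315", "316", "317", "318", "319", "320", "321", "350",
)


def cell_candidate_ok(number: str) -> bool:
    """Apply the prefix and duplicate-digit checks to one candidate."""
    digits = re.sub(r"[ .,/\-]", "", number)
    if not re.fullmatch(r"[0-9]{8,10}", digits):
        return False
    if len(set(digits)) == 1:  # Duplicate digits: all characters identical
        return False
    return digits.startswith(REQUIRED_PREFIXES)
```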

Table 45-166 Colombian Cell Phone Number narrow-breadth patterns

Patterns

\d{8}

\d{2}.\d{3}.\d{3}

\d{2} \d{3} \d{3}

\d{2}/\d{3}/\d{3}

\d{2}-\d{3}-\d{3}

\d{2},\d{3},\d{3}

\d{9}

\d{3} \d{3} \d{3}

\d{3}-\d{3}-\d{3}

\d{3},\d{3},\d{3}

\d{3}/\d{3}/\d{3}

\d{3}.\d{3}.\d{3}

\d{10}

\d{1}/\d{3}/\d{3}/\d{3}

\d{1},\d{3},\d{3},\d{3}

\d{1}.\d{3}.\d{3}.\d{3}

\d{1}-\d{3}-\d{3}-\d{3}

\d{1} \d{3} \d{3} \d{3}

Table 45-167 Colombian Cell Phone Number narrow-breadth validators

Mandatory validators Description

Require beginning characters This validator requires the following characters at the
beginning of the number:

300, 301, 302, 310, 311, 312, 313, 314, 315, 316, 317,
318, 319, 320, 321, 350

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

numero celular, número de teléfono, teléfono celular
no., numero celular#

Colombian Personal Identification Number


The Colombian Personal Identification Number is a unique 8- or 10-digit number assigned to
Colombian citizens at birth.
The Colombian Personal Identification Number data identifier detects an 8- or 10-digit number
that matches the Colombian Personal Identification Number format.
The Colombian Personal Identification Number data identifier provides two breadths of detection:
■ The wide breadth detects an 8- or 10-digit number with duplicate digit validation.
See “Colombian Personal Identification Number wide breadth” on page 1088.
■ The narrow breadth detects an 8- or 10-digit number with duplicate digit validation; prefix
and suffix exclusion; and beginning character exclusion. It also requires the presence of
related keywords.
See “Colombian Personal Identification Number narrow breadth” on page 1089.

Colombian Personal Identification Number wide breadth


The wide breadth detects an 8- or 10-digit number with duplicate digit validation.

Table 45-168 Colombian Personal Identification Number wide-breadth patterns

Patterns

\d{9}

\d{3} \d{3} \d{3}

\d{3}-\d{3}-\d{3}

\d{3},\d{3},\d{3}

\d{3}/\d{3}/\d{3}

\d{3}.\d{3}.\d{3}

Table 45-169 Colombian Personal Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Colombian Personal Identification Number narrow breadth


The narrow breadth detects an 8- or 10-digit number with duplicate digit validation; prefix and
suffix exclusion; and beginning character exclusion. It also requires the presence of related
keywords.

Table 45-170 Colombian Personal Identification Number narrow-breadth patterns

Patterns

\d{9}

\d{3} \d{3} \d{3}

\d{3}-\d{3}-\d{3}

\d{3},\d{3},\d{3}

\d{3}/\d{3}/\d{3}

\d{3}.\d{3}.\d{3}

Table 45-171 Colombian Personal Identification Number narrow-breadth validators

Mandatory validators Description

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

300, 301, 302, 310, 310, 312, 313, 314, 315, 316, 317,
318, 319, 320, 321, 350

Exclude prefix Excludes the following prefixes:

$ ,$

Exclude suffix Excludes the following suffix:

.00

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

cedula, cédula, c.c., c.c, C.C., C.C, cc, CC, NIE., NIE,
nie., nie, cedula de ciudadania, cédula de ciudadanía,
cc#, CC #, documento de identificacion, documento
de identificación, Nit.
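
The prefix and suffix exclusions appear intended to skip numbers that are written as currency amounts. The following Python sketch shows the idea for the unseparated nine-digit pattern only. It assumes that "prefix" and "suffix" mean the characters immediately before and after the match; the names `NINE_DIGITS` and `candidates` are hypothetical.

```python
import re

# Nine digits that are not part of a longer run of digits.
NINE_DIGITS = re.compile(r"(?<!\d)\d{9}(?!\d)")

def candidates(text: str, excluded_prefix: str = "$", excluded_suffix: str = ".00"):
    """Yield nine-digit matches that are not written like a currency amount."""
    for m in NINE_DIGITS.finditer(text):
        before = text[:m.start()]
        after = text[m.end():]
        if before.endswith(excluded_prefix) or after.startswith(excluded_suffix):
            continue  # for example "$987654321" or "555666777.00"
        yield m.group()

sample = "cedula 123456789, total $987654321 and 555666777.00"
print(list(candidates(sample)))  # ['123456789']
```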

Colombian Tax Identification Number


The Colombian Tax Identification Number is a nine-digit number assigned to persons who
must pay taxes in Colombia.
The Colombian Tax Identification Number data identifier detects a nine-digit number that
matches the Colombian Tax Identification Number format.
The Colombian Tax Identification Number data identifier provides two breadths of detection:
■ The wide breadth detects a nine-digit number with duplicate digit validation.
See “Colombian Tax Identification Number wide breadth” on page 1090.
■ The narrow breadth detects a nine-digit number with duplicate digit validation, required
beginning characters, and prefix exclusion. It also requires the presence of related keywords.
See “Colombian Tax Identification Number narrow breadth” on page 1091.

Colombian Tax Identification Number wide breadth


The wide breadth detects a nine-digit number with duplicate digit validation.

Table 45-172 Colombian Tax Identification Number wide-breadth patterns

Patterns

\d{9}

\d{3} \d{3} \d{3}

\d{3}-\d{3}-\d{3}

\d{3},\d{3},\d{3}

\d{3}/\d{3}/\d{3}

\d{3}.\d{3}.\d{3}

Table 45-173 Colombian Tax Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Colombian Tax Identification Number narrow breadth


The narrow breadth detects a nine-digit number with duplicate digit validation, required beginning
characters, and prefix exclusion. It also requires the presence of related keywords.

Table 45-174 Colombian Tax Identification Number narrow-breadth patterns

Patterns

\d{9}

\d{3} \d{3} \d{3}

\d{3}-\d{3}-\d{3}

\d{3},\d{3},\d{3}

\d{3}/\d{3}/\d{3}

\d{3}.\d{3}.\d{3}

Table 45-175 Colombian Tax Identification Number narrow-breadth validators

Mandatory validators Description

Require beginning characters Requires these characters at the beginning of the number:

800, 860, 890, 900


Exclude prefix Excludes the following prefix:


$

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

NIT., NIT, nit., nit, Nit.

Credit Card Magnetic Stripe Data


The magnetic stripe of a credit card contains information about the card. Storage of the complete
version of this data is a violation of the Payment Card Industry (PCI) Data Security Standard.
The Credit Card Magnetic Stripe Data data identifier detects the following raw data taken from
the credit card magnetic stripe:
■ Data from track one, format B, which typically contains account number, name, expiration
date, and possibly Card Verification Value or Card Verification Code 1 (CVV1/CVC1).
■ Data from track two, which typically contains account number and possibly expiration date,
service code, and Card Verification Value or Card Verification Code 1 (CVV1/CVC1).
The Credit Card Magnetic Stripe Data data identifier detects the characteristic data pattern for
track two data, which contains the start sentinel, format code, primary account number, name,
expiration date, service code, discretionary data, and the end sentinel. It also includes standard
field separators. It validates the data using a Luhn-check validator.
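
To make the sentinels and separators concrete, the following Python sketch pulls the account number out of raw track data. It is a simplified illustration, not the product's detection logic: the two expressions are looser than the patterns in the table that follows, and the names `TRACK_ONE`, `TRACK_TWO`, and `extract_account_numbers` are hypothetical.

```python
import re

# Track one, format B: "%B", the account number, then the "^" field separator.
TRACK_ONE = re.compile(r"%B(\d[\d -]{11,18})\^[A-Z]")
# Track two: ";", the account number, then the "=" field separator.
TRACK_TWO = re.compile(r";(\d[\d -]{11,18})=")

def extract_account_numbers(raw: str) -> list[str]:
    """Return account numbers found in track data, with separators removed."""
    found = []
    for pattern in (TRACK_ONE, TRACK_TWO):
        for m in pattern.finditer(raw):
            found.append(re.sub(r"[ -]", "", m.group(1)))
    return found

raw = ("%B4111111111111111^DOE/JOHN^25121010000000000?"
       ";4111111111111111=25121010000000000?")
print(extract_account_numbers(raw))  # ['4111111111111111', '4111111111111111']
```

Each extracted account number would then be passed to a Luhn check before it is counted as a match.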

Table 45-176 Credit Card Magnetic Stripe Data medium-breadth patterns

Patterns Patterns (continued)

%B3[068]\d{12}^[A-Z]{1}

%B3[068]\d{2} \d{6} \d{4}^[A-Z]{1}

%B3[068]\d{2}-\d{6}-\d{4}^[A-Z]{1}

%B4\d{12}^[A-Z]{1}

%B3[47]\d{2}-\d{6}-\d{5}^[A-Z]{1}

%B4\d{3} \d{4} \d{4} \d{4}^[A-Z]{1}

%B3[47]\d{2} \d{6} \d{5}^[A-Z]{1}

%B4\d{15}^[A-Z]{1}

%B3[47]\d{13}^[A-Z]{1}

%B5[1-5]\d{2}-\d{4}-\d{4}-\d{4}^[A-Z]{1}

%B4\d{3}-\d{4}-\d{4}-\d{4}^[A-Z]{1}

%B5[1-5]\d{2} \d{4} \d{4} \d{4}^[A-Z]{1}

%B5[1-5]\d{14}^[A-Z]{1}

%B2131\d{11}^[A-Z]{1}

%B3\d{3}-\d{4}-\d{4}-\d{4}^[A-Z]{1}

%B3\d{3} \d{4} \d{4} \d{4}^[A-Z]{1}

%B3\d{15}^[A-Z]{1}

%B2149\d{11}^[A-Z]{1}

%B2149 \d{6} \d{5}^[A-Z]{1}

%B2149-\d{6}-\d{5}^[A-Z]{1}

%B2014\d{11}^[A-Z]{1}

%B2014 \d{6} \d{5}^[A-Z]{1}

%B2014-\d{6}-\d{5}^[A-Z]{1}

;1800\d{11}=

;6011-\d{4}-\d{4}-\d{4}=

;6011 \d{4} \d{4} \d{4}=

;6011\d{12}=

;3[068]\d{12}=

;3[068]\d{2} \d{6} \d{4}=

;3[068]\d{2}-\d{6}-\d{4}=

;4\d{12}=

;3[47]\d{2}-\d{6}-\d{5}=

;4\d{3} \d{4} \d{4} \d{4}=

;3[47]\d{2} \d{6} \d{5}=

;4\d{15}=

;3[47]\d{13}=

;5[1-5]\d{2}-\d{4}-\d{4}-\d{4}=

;4\d{3}-\d{4}-\d{4}-\d{4}=

;5[1-5]\d{2} \d{4} \d{4} \d{4}=

;5[1-5]\d{14}=

;2131\d{11}=

;3\d{3}-\d{4}-\d{4}-\d{4}=

;3\d{3} \d{4} \d{4} \d{4}=

;3\d{15}=

;2149\d{11}=

;2149 \d{6} \d{5}=

;2149-\d{6}-\d{5}=

;2014\d{11}=

;2014 \d{6} \d{5}=

;2014-\d{6}-\d{5}=

%B1800\d{11}^[A-Z]{1}

%B6011-\d{4}-\d{4}-\d{4}^[A-Z]{1}

%B6011 \d{4} \d{4} \d{4}^[A-Z]{1}

%B6011\d{12}^[A-Z]{1}

Table 45-177 Credit Card Magnetic Stripe Data medium-breadth validator

Validator Description

Luhn Check Computes the Luhn checksum which every instance must
pass.

Credit Card Number


Account number needed to process credit card transactions. Often abbreviated as CCN. Also
known as a Primary Account Number (PAN).
The Credit Card Number data identifier detects valid credit card numbers that are separated
by spaces, dashes, or periods, or that have no separators.
The Credit Card Number data identifier offers three breadths of detection:
■ The wide breadth detects valid credit card numbers that are separated by spaces, dashes,
periods, or without separators. It also performs Luhn-check validation.
See “Credit Card Number wide breadth” on page 1095.
■ The medium breadth detects valid credit card numbers that are separated by spaces,
dashes, periods, or without separators. It also checks for common test numbers and
performs Luhn-check validation.
See “Credit Card Number medium breadth” on page 1096.
■ The narrow breadth detects valid credit card numbers that are separated by spaces, dashes,
periods, or without separators. It also checks for common test numbers, performs
Luhn-check validation and requires the presence of credit card number-related keywords.
See “Credit Card Number narrow breadth” on page 1100.

Credit Card Number wide breadth


The wide breadth detects valid credit card numbers that are separated by spaces, dashes,
or periods, or that have no separators.
This breadth includes formats for American Express, Diners Club, Discover, Japan Credit
Bureau (JCB), MasterCard, and Visa.
It performs Luhn-check validation.

Table 45-178 Credit Card Number wide-breadth patterns

Patterns

2149 \d{6} \d{5}

2149-\d{6}-\d{5}

2014\d{11}

2014.\d{6}.\d{5}

2014 \d{6} \d{5}

2014-\d{6}-\d{5}

3[47]\d{2}.\d{6}.\d{5}

3[068]\d{2}.\d{6}.\d{4}

3[47]\d{2}-\d{6}-\d{5}

3[068]\d{2}-\d{6}-\d{4}

3[47]\d{13}

3[068]\d{2} \d{6} \d{4}

3[47]\d{2} \d{6} \d{5}

3[068]\d{12}

4\d{12}

\d{16}

\d{4}.\d{4}.\d{4}.\d{4}

\d{4}-\d{4}-\d{4}-\d{4}

\d{4} \d{4} \d{4} \d{4}

1800\d{11}

2131\d{11}

2149\d{11}

2149.\d{6}.\d{5}

Table 45-179 Credit Card Number wide-breadth validator

Mandatory validator Description

Luhn Check Computes the Luhn checksum, which every credit card number must pass.
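
For readers unfamiliar with the Luhn checksum, the following Python sketch shows the standard algorithm. The function name `luhn_valid` is hypothetical, and ignoring separators is an assumption about how a match is normalized before the check.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]  # ignore separators
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return total % 10 == 0

print(luhn_valid("4111 1111 1111 1111"))  # True
print(luhn_valid("4111 1111 1111 1112"))  # False
```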

Credit Card Number medium breadth


The medium breadth detects valid credit card numbers that are separated by spaces, dashes,
or periods, or that have no separators. It performs Luhn-check validation and includes
formats for American Express, Diners Club, Discover, Japan Credit Bureau (JCB), MasterCard,
and Visa. It also eliminates common test numbers, including those reserved for testing by
credit card issuers.

Table 45-180 Credit Card Number medium-breadth patterns

Patterns

1800\d{11}

2131\d{11}

3\d{3}.\d{4}.\d{4}.\d{4}

3\d{3}-\d{4}-\d{4}-\d{4}

3\d{3} \d{4} \d{4} \d{4}

3\d{15}

4\d{3}.\d{4}.\d{4}.\d{4}

4\d{3}-\d{4}-\d{4}-\d{4}

4\d{3} \d{4} \d{4} \d{4}

4\d{15}

4\d{12}

5[1-5]\d{2}.\d{4}.\d{4}.\d{4}

5[1-5]\d{2}-\d{4}-\d{4}-\d{4}

5[1-5]\d{2} \d{4} \d{4} \d{4}

5[1-5]\d{14}

2149.\d{6}.\d{5}

2149 \d{6} \d{5}

2149-\d{6}-\d{5}

2149\d{11}

2014.\d{6}.\d{5}

2014 \d{6} \d{5}

2014-\d{6}-\d{5}

2014\d{11}

6011.\d{4}.\d{4}.\d{4}

6011-\d{4}-\d{4}-\d{4}

6011 \d{4} \d{4} \d{4}

6011\d{12}

3[068]\d{2}.\d{6}.\d{4}

3[068]\d{2}-\d{6}-\d{4}

3[068]\d{2} \d{6} \d{4}

3[068]\d{12}

3[47]\d{13}

3[47]\d{2}.\d{6}.\d{5}

3[47]\d{2} \d{6} \d{5}

3[47]\d{2}-\d{6}-\d{5}

2720.\d{4}.\d{4}.\d{4}

2720-\d{4}-\d{4}-\d{4}

2720 \d{4} \d{4} \d{4}

2720\d{12}

6221[2][6-8]\d{10}

6221.[2][6-8]\d{2}.\d{4}.\d{4}

6221-[2][6-8]\d{2}-\d{4}-\d{4}

6221 [2][6-8]\d{2} \d{4} \d{4}

622[2-8]\d{12}

622[2-8].\d{4}.\d{4}.\d{4}

622[2-8]-\d{4}-\d{4}-\d{4}

622[2-8] \d{4} \d{4} \d{4}

6229[2][0-5]\d{10}

6229.[2][0-5]\d{2}.\d{4}.\d{4}

6229-[2][0-5]\d{2}-\d{4}-\d{4}

6229 [2][0-5]\d{2} \d{4} \d{4}

222[1-9]\d{12}

222[1-9][.-]\d{4}[.-]\d{4}[.-]\d{4}

22[3-9]\d{13}

22[3-9]\d[.-]\d{4}[.-]\d{4}[.-]\d{4}

2[3-6]\d{14}

2[3-6]\d{2}.\d{4}.\d{4}.\d{4}

2[3-6]\d{2}-\d{4}-\d{4}-\d{4}

2[3-6]\d{2} \d{4} \d{4} \d{4}

27[0-1]\d{13}

27[0-1]\d.\d{4}.\d{4}.\d{4}

27[0-1]\d-\d{4}-\d{4}-\d{4}

27[0-1]\d \d{4} \d{4} \d{4}

Table 45-181 Credit Card Number medium-breadth validators

Mandatory validators Description

Exclude exact match Excludes anything that matches the specified text.

Inputs:

0111111111111111, 1234567812345670, 180025848680889, 180026939516875,
201400000000009, 201411032364438, 201431736711288, 210002956344412,
214906110040367, 30000000000004, 30175572836108, 30203642658706,
30374367304832, 30569309025904, 3088000000000000, 3088000000000009,
3088272824427380, 3096666928988980, 3158060990195830, 340000000000009,
341019464477148, 341111111111111, 341132368578216, 343510064010360,
344400377306201, 3530111333300000, 3566002020360500, 370000000000002,
371449635398431, 374395534374782, 378282246310005, 378282246310005,
378282246310005, 378734493671000, 38520000023237, 4007000000027,
4012888888881880, 4024007116284, 4111111111111110, 4111111111111111,
4222222222222, 4242424242424242, 4485249610564758, 4539399050593,
4539475158333170, 4539603277651940, 4539687075612974, 4539890911376230,
4556657397647250, 4716733846619930, 4716976758661, 4916437046413,
4916451936094420, 4916491104658550, 4916603544909870, 4916759155933,
5105105105105100, 5119301340696760, 5263386793750340, 5268196752489640,
5283145597742620, 5424000000000015, 5429800397359070, 5431111111111111,
5455780586062610, 5472715456453270, 5500000000000004, 5539878514522540,
5547392938355060, 5555555555554440, 5555555555554444, 5556722757422205,
6011000000000000, 6011000000000004, 6011000000000012, 6011000990139420,
6011111111111110, 6011111111111117, 6011312054074430, 6011354276117410,
6011601160116611, 6011905056260500, 869908581608894, 869933317208876,
869989278167071

Luhn Check Validator computes the Luhn checksum, which every credit card number must pass.

Number delimiter Validates a match by checking the surrounding number.
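
The "Exclude exact match" validator can be thought of as a lookup against a fixed list. The following Python sketch uses only three of the inputs listed above and assumes that separators are stripped before the comparison; the guide does not say whether the product normalizes the match this way. The names `TEST_NUMBERS` and `is_excluded` are hypothetical.

```python
import re

# A small subset of the "Exclude exact match" inputs, for illustration only.
TEST_NUMBERS = {"4111111111111111", "5555555555554444", "378282246310005"}

def is_excluded(candidate: str) -> bool:
    """Return True if the candidate is one of the known test numbers."""
    normalized = re.sub(r"[ .-]", "", candidate)  # assumption: drop separators
    return normalized in TEST_NUMBERS

print(is_excluded("4111-1111-1111-1111"))  # True
print(is_excluded("4539 5787 6362 1486"))  # False
```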


Credit Card Number narrow breadth


The narrow breadth detects valid credit card numbers that are separated by spaces, dashes,
or periods, or that have no separators. It performs Luhn-check validation and includes formats
for American Express, Diners Club, Discover, Japan Credit Bureau (JCB), MasterCard, and Visa.
It eliminates common test numbers, including those reserved for testing by credit card issuers,
and requires the presence of a credit card-related keyword.

Table 45-182 Credit Card Number narrow-breadth patterns

Patterns Patterns (continued)

222[1-9]\d{12}

222[1-9][.-]\d{4}[.-]\d{4}[.-]\d{4}

22[3-9]\d{13}

22[3-9]\d[.-]\d{4}[.-]\d{4}[.-]\d{4}

2[3-6]\d{14}

2[3-6]\d{2}.\d{4}.\d{4}.\d{4}

2[3-6]\d{2}-\d{4}-\d{4}-\d{4}

2[3-6]\d{2} \d{4} \d{4} \d{4}

27[0-1]\d{13}

27[0-1]\d.\d{4}.\d{4}.\d{4}

27[0-1]\d-\d{4}-\d{4}-\d{4}

27[0-1]\d \d{4} \d{4} \d{4}

2720.\d{4}.\d{4}.\d{4}

2720-\d{4}-\d{4}-\d{4}

2720 \d{4} \d{4} \d{4}

2720\d{12}

6221[2][6-8]\d{10}

6221.[2][6-8]\d{2}.\d{4}.\d{4}

6221-[2][6-8]\d{2}-\d{4}-\d{4}

6221 [2][6-8]\d{2} \d{4} \d{4}

622[2-8]\d{12}

622[2-8].\d{4}.\d{4}.\d{4}

622[2-8]-\d{4}-\d{4}-\d{4}

622[2-8] \d{4} \d{4} \d{4}

6229[2][0-5]\d{10}

6229.[2][0-5]\d{2}.\d{4}.\d{4}

6229-[2][0-5]\d{2}-\d{4}-\d{4}

6229 [2][0-5]\d{2} \d{4} \d{4}


2149 \d{6} \d{5}

2149-\d{6}-\d{5}

2014\d{11}

2014 \d{6} \d{5}

2014-\d{6}-\d{5}

6011-\d{4}-\d{4}-\d{4}

6011 \d{4} \d{4} \d{4}

6011\d{12}

3[068]\d{12}

3[068]\d{2} \d{6} \d{4}

3[068]\d{2}-\d{6}-\d{4}

3[47]\d{2}-\d{6}-\d{5}

3[47]\d{2} \d{6} \d{5}

3[47]\d{13}

4\d{3}-\d{4}-\d{4}-\d{4}

3\d{3}.\d{4}.\d{4}.\d{4}

2149.\d{6}.\d{5}

2014.\d{6}.\d{5}

6011.\d{4}.\d{4}.\d{4}

3[068]\d{2}.\d{6}.\d{4}

3[47]\d{2}.\d{6}.\d{5}

4\d{3}.\d{4}.\d{4}.\d{4}

1800\d{11}

4\d{12}

4\d{3} \d{4} \d{4} \d{4}

4\d{15}

5[1-5]\d{2}-\d{4}-\d{4}-\d{4}

5[1-5]\d{2} \d{4} \d{4} \d{4}

5[1-5]\d{14}

5[1-5]\d{2}.\d{4}.\d{4}.\d{4}

2131\d{11}

3\d{3}-\d{4}-\d{4}-\d{4}

3\d{3} \d{4} \d{4} \d{4}

3\d{15}

2149\d{11}

Table 45-183 Credit Card Number narrow-breadth validators

Mandatory validators Description

Exclude exact match Excludes anything that matches the specified text.

Inputs:

0111111111111111, 1234567812345670, 180025848680889, 180026939516875,
201400000000009, 201411032364438, 201431736711288, 210002956344412,
214906110040367, 30000000000004, 30175572836108, 30203642658706,
30374367304832, 30569309025904, 3088000000000000, 3088000000000009,
3088272824427380, 3096666928988980, 3158060990195830, 340000000000009,
341019464477148, 341111111111111, 341132368578216, 343510064010360,
344400377306201, 3530111333300000, 3566002020360500, 370000000000002,
371449635398431, 374395534374782, 378282246310005, 378282246310005,
378282246310005, 378734493671000, 38520000023237, 4007000000027,
4012888888881880, 4024007116284, 4111111111111110, 4111111111111111,
4222222222222, 4242424242424242, 4485249610564758, 4539399050593,
4539475158333170, 4539603277651940, 4539687075612974, 4539890911376230,
4556657397647250, 4716733846619930, 4716976758661, 4916437046413,
4916451936094420, 4916491104658550, 4916603544909870, 4916759155933,
5105105105105100, 5119301340696760, 5263386793750340, 5268196752489640,
5283145597742620, 5424000000000015, 5429800397359070, 5431111111111111,
5455780586062610, 5472715456453270, 5500000000000004, 5539878514522540,
5547392938355060, 5555555555554440, 5555555555554444, 5556722757422205,
6011000000000000, 6011000000000004, 6011000000000012, 6011000990139420,
6011111111111110, 6011111111111117, 6011312054074430, 6011354276117410,
6011601160116611, 6011905056260500, 869908581608894, 869933317208876,
869989278167071

Luhn Check Validator computes the Luhn checksum which every Credit Card Number must
pass.

Number delimiter Validates a match by checking the surrounding number.

Find keywords With this option selected, at least one of the following keywords or key phrases
must be present for the data to be matched.

Inputs:

account number, account ps, american express, americanexpress, amex,
bank card, bankcard, card num, card number, cc #, cc#, ccn, check card,
checkcard, credit card, credit card #, credit card number, credit card#, debit
card, debitcard, diners club, dinersclub, discover, enroute, japanese card
bureau, jcb, mastercard, mc, visa
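
The keyword requirement can be sketched in Python as follows. This is a simplification under stated assumptions: it uses a handful of the inputs above, matches them case-insensitively as plain substrings anywhere in the text, and does not model how close a keyword must be to the number. The product's matching rules may differ, and the names `KEYWORDS` and `has_keyword` are hypothetical.

```python
# A few of the "Find keywords" inputs, for illustration only.
KEYWORDS = ("credit card", "card number", "cc#", "amex", "mastercard", "visa")

def has_keyword(text: str, keywords=KEYWORDS) -> bool:
    """Return True if any keyword occurs in the text, ignoring case."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)

print(has_keyword("Visa: 4539 5787 6362 1486"))   # True
print(has_keyword("Order 4539 5787 6362 1486"))   # False
```

Plain substring matching also fires on words such as "advisable", so a stricter implementation would likely compare whole words.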

Croatia National Identification Number


The Croatian National Identification Number (Osobni identifikacijski broj, or OIB) is the permanent
personal and tax identifier for Croatian citizens and residents.
The Croatia National Identification Number data identifier detects an 11-digit number, optionally
preceded by the letters HR or hr, that matches the Croatia National Identification Number
format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-digit number, optionally preceded by the letters HR or hr,
that matches the Croatia National Identification Number format. It checks for duplicate
digits and common test numbers.
See “Croatia National Identification Number wide breadth” on page 1105.
■ The medium breadth detects an 11-digit number, optionally preceded by the letters HR or
hr, that matches the Croatia National Identification Number format with checksum validation.
See “Croatia National Identification Number medium breadth” on page 1105.
■ The narrow breadth detects an 11-digit number, optionally preceded by the letters HR or
hr, that matches the Croatia National Identification Number format with checksum validation.
It checks for duplicate digits and common test numbers, and requires the presence of
related keywords.
See “Croatia National Identification Number narrow breadth” on page 1105.

Croatia National Identification Number wide breadth


The wide breadth detects an 11-digit number, optionally preceded by the letters HR or hr, that
matches the Croatia National Identification Number format. It checks for duplicate digits and
common test numbers.

Table 45-184 Croatia National Identification Number wide-breadth patterns

Pattern

\d{11}

[Hh][Rr]\d{11}

Table 45-185 Croatia National Identification Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.
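
One way to picture the number delimiter check is as a test that the match is not embedded in a longer run of characters. The following Python sketch applies that idea to the two wide-breadth patterns by using lookarounds. It assumes that "surrounding characters" means the match must not touch another letter or digit; the guide does not define the rule precisely, and the names `OIB_CANDIDATE` and `find_candidates` are hypothetical.

```python
import re

# Optional HR/hr prefix plus 11 digits, not adjacent to other letters or digits.
OIB_CANDIDATE = re.compile(r"(?<![0-9A-Za-z])(?:[Hh][Rr])?\d{11}(?![0-9A-Za-z])")

def find_candidates(text: str) -> list[str]:
    """Return delimited matches of the wide-breadth patterns."""
    return [m.group() for m in OIB_CANDIDATE.finditer(text)]

print(find_candidates("OIB: HR12345678903."))   # ['HR12345678903']
print(find_candidates("ref 1234567890312345"))  # []: part of a longer number
```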

Croatia National Identification Number medium breadth


The medium breadth detects an 11-digit number, optionally preceded by the letters HR or hr,
that matches the Croatia National Identification Number format with checksum validation.

Table 45-186 Croatia National Identification Number medium-breadth patterns

Pattern

\d{11}

[Hh][Rr]\d{11}

Table 45-187 Croatia National Identification Number medium-breadth validators

Mandatory validator Description

Croatia National Identification Number Validation Check Computes the checksum and validates the pattern
against it.
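
The guide does not describe the checksum itself. The following Python sketch assumes the publicly documented OIB control digit, which is calculated with the ISO 7064 MOD 11,10 scheme; the product's validator may differ in detail. The function name `oib_checksum_valid` is hypothetical.

```python
def oib_checksum_valid(oib: str) -> bool:
    """Validate an OIB control digit (assumed scheme: ISO 7064 MOD 11,10)."""
    if oib[:2].upper() == "HR":
        oib = oib[2:]  # drop the optional country prefix
    if len(oib) != 11 or not (oib.isascii() and oib.isdigit()):
        return False
    remainder = 10
    for ch in oib[:10]:
        remainder = (remainder + int(ch)) % 10
        if remainder == 0:
            remainder = 10
        remainder = (remainder * 2) % 11
    control = (11 - remainder) % 10  # a result of 10 becomes 0
    return control == int(oib[10])

print(oib_checksum_valid("69435151530"))    # True
print(oib_checksum_valid("HR12345678903"))  # True
print(oib_checksum_valid("12345678901"))    # False
```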

Croatia National Identification Number narrow breadth


The narrow breadth detects an 11-digit number, optionally preceded by the letters HR or hr,
that matches the Croatia National Identification Number format with checksum validation. It
checks for duplicate digits and common test numbers, and requires the presence of related
keywords.

Table 45-188 Croatia National Identification Number narrow-breadth patterns

Pattern

\d{11}

[Hh][Rr]\d{11}

Table 45-189 Croatia National Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Croatia National Identification Number Validation Check Computes the checksum and validates the pattern
against it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

national ID, Osobna iskaznica, national identification
number, Nacionalni identifikacijski broj, personal ID,
osobni ID, personal identification number, osobni
identifikacijski broj, OIB, OIB#, nationalid#,
personalid#, tax id, tax number, tax identification
number, tax code, taxpayer code, taxpayer id, taxpayer
identification code, porez iskaznica, porezni broj,
porezni identifikacijski broj, porez kod, šifra poreznog
obveznika

CUSIP Number
The CUSIP number is a unique identifier assigned to North American stock or other securities.
This number is issued by the Committee on Uniform Security Identification Procedures (CUSIP)
to assist in clearing and settling trades. CINS is an extension of CUSIP used to identify securities
outside of North America.
The CUSIP Number data identifier detects a 9-character alphanumeric pattern that matches
the CUSIP Number format.
This data identifier provides three breadths of detection:
■ The wide breadth detects a 9-character alphanumeric pattern with checksum validation.
See “CUSIP Number wide breadth” on page 1107.
■ The medium breadth detects a 9-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.
See “CUSIP Number medium breadth” on page 1107.
■ The narrow breadth detects a 9-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords, excluding the NNA keyword.
See “CUSIP Number narrow breadth” on page 1108.

CUSIP Number wide breadth


The wide breadth detects a 9-character alphanumeric pattern with checksum validation. The
5th, 6th, 7th, and 8th character can be a letter or number, and all others are digits.

Table 45-190 CUSIP Number wide-breadth pattern

Pattern

\w\d\w{6}\d

\w\d\w{4} \w{2} \d

Table 45-191 CUSIP Number wide-breadth validator

Mandatory validator Description

Cusip Validation Validator checks for invalid CUSIP ranges and computes the CUSIP checksum
(Modulus 10 Double Add Double algorithm).
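
The Modulus 10 Double Add Double checksum can be sketched in Python as follows. The sketch covers only the check digit calculation for digits and letters; it does not model the invalid-range check that the validator also performs, and it omits the special characters that some CUSIP numbers use. The function name `cusip_checksum_valid` is hypothetical.

```python
def cusip_checksum_valid(cusip: str) -> bool:
    """Check the ninth character of a CUSIP against the first eight."""
    cusip = cusip.replace(" ", "").upper()
    if len(cusip) != 9 or not cusip.isascii() or not cusip[8].isdigit():
        return False
    total = 0
    for i, ch in enumerate(cusip[:8]):
        if ch.isdigit():
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10  # A=10, B=11, ... Z=35
        else:
            return False
        if i % 2 == 1:  # double the 2nd, 4th, 6th, and 8th characters
            value *= 2
        total += value // 10 + value % 10  # add the digits of the value
    return (10 - total % 10) % 10 == int(cusip[8])

print(cusip_checksum_valid("037833100"))  # True
print(cusip_checksum_valid("17275R102"))  # True
print(cusip_checksum_valid("037833101"))  # False
```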

CUSIP Number medium breadth


The medium breadth detects a 9-character alphanumeric pattern with checksum validation. It
also requires the presence of related keywords. The 5th, 6th, 7th, and 8th character can be a
letter or number, and all others are digits.

Table 45-192 CUSIP Number medium-breadth pattern

Pattern

\w\d\w{6}\d

\w\d\w{4} \w{2} \d

Table 45-193 CUSIP Number medium-breadth validator

Mandatory validator Description

Cusip Validation Validator checks for invalid CUSIP ranges and computes the CUSIP
checksum (Modulus 10 Double Add Double algorithm).

Find keywords With this option selected, at least one of the following keywords or key
phrases must be present for the data to be matched.

Inputs:

cusip, c.u.s.i.p., Committee on Uniform Security Identification
Procedures, American Bankers Association, Standard & Poor's, S&P,
National Numbering Association, NNA, National Securities
Identification Number

CUSIP Number narrow breadth


The narrow breadth detects a 9-character alphanumeric pattern with checksum validation. It
also requires the presence of related keywords, excluding the NNA keyword. The 5th, 6th,
7th, and 8th character can be a letter or number, and all others are digits.

Table 45-194 CUSIP Number narrow-breadth pattern

Pattern

\w\d\w{6}\d

\w\d\w{4} \w{2} \d

Table 45-195 CUSIP Number narrow-breadth validators

Mandatory validator Description

Cusip Validation Validator checks for invalid CUSIP ranges and computes the CUSIP checksum
(Modulus 10 Double Add Double algorithm).

Find keywords With this option selected, at least one of the following keywords or key phrases
must be present for the data to be matched.
Inputs:

cusip, c.u.s.i.p., Committee on Uniform Security Identification Procedures,
American Bankers Association, Standard & Poor's, S&P, National Numbering
Association, National Securities Identification Number

Cyprus Tax Identification Number


The Cyprus Tax Identification Number is a unique identifier for Cypriot taxpayers.
The Cyprus Tax Identification Number data identifier detects a nine-character alphanumeric
pattern that matches the Cyprus Tax Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a nine-character alphanumeric pattern that matches the Cyprus
Tax Identification Number format without checksum validation.
See “Cyprus Tax Identification Number wide breadth” on page 1109.
■ The medium breadth detects a nine-character alphanumeric pattern that matches the
Cyprus Tax Identification Number format with checksum validation.
See “Cyprus Tax Identification Number medium breadth” on page 1109.
■ The narrow breadth detects a nine-character alphanumeric pattern that matches the Cyprus
Tax Identification Number format with checksum validation. It also requires the presence
of related keywords.
See “Cyprus Tax Identification Number narrow breadth” on page 1110.

Cyprus Tax Identification Number wide breadth


The wide breadth detects a nine-character alphanumeric pattern that matches the Cyprus Tax
Identification Number format without checksum validation.

Table 45-196 Cyprus Tax Identification Number wide-breadth patterns

Pattern

\d{8}[A-Za-z]

Table 45-197 Cyprus Tax Identification Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Cyprus Tax Identification Number medium breadth


The medium breadth detects a nine-character alphanumeric pattern that matches the Cyprus
Tax Identification Number format with checksum validation.

Table 45-198 Cyprus Tax Identification Number medium-breadth patterns

Pattern

\d{8}[A-Za-z]

Table 45-199 Cyprus Tax Identification Number medium-breadth validators

Mandatory validator Description

Cyprus Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.
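
The guide does not specify how the check letter is calculated. The following Python sketch assumes the commonly described scheme for Cypriot tax numbers, in which digits in odd positions are mapped through a fixed table, digits in even positions are added as they are, and the total modulo 26 selects a letter. Treat the table and the function names as assumptions rather than as the product's implementation.

```python
# Assumed mapping for digits in the 1st, 3rd, 5th, and 7th positions.
ODD_POSITION_VALUES = {
    "0": 1, "1": 0, "2": 5, "3": 7, "4": 9,
    "5": 13, "6": 15, "7": 17, "8": 19, "9": 21,
}

def cyprus_check_letter(digits: str) -> str:
    """Compute the check letter for the eight leading digits."""
    total = sum(ODD_POSITION_VALUES[d] for d in digits[0::2])
    total += sum(int(d) for d in digits[1::2])
    return "ABCDEFGHIJKLMNOPQRSTUVWXYZ"[total % 26]

def cyprus_tin_valid(tin: str) -> bool:
    """Return True if the ninth character matches the computed check letter."""
    tin = tin.upper()
    if len(tin) != 9 or not tin.isascii() or not tin[:8].isdigit():
        return False
    return cyprus_check_letter(tin[:8]) == tin[8]

print(cyprus_tin_valid("10259033P"))  # True
print(cyprus_tin_valid("10259033Z"))  # False
```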

Cyprus Tax Identification Number narrow breadth


The narrow breadth detects a nine-character alphanumeric pattern that matches the Cyprus
Tax Identification Number format with checksum validation. It also requires the presence of
related keywords.

Table 45-200 Cyprus Tax Identification Number narrow-breadth patterns

Pattern

\d{8}[A-Za-z]

Table 45-201 Cyprus Tax Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Cyprus Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

tax identification number, tax number, tax id, cyprus
TIN number, taxid#, taxnumber#, αριθμός φορολογικού
μητρώου, Vergi Kimlik Numarası, vergi numarası,
Kıbrıs TIN numarası, tin, TIN, tin#, TIN#, tin no

Cyprus Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. For Cyprus, VAT is
administered by the tax office for the region in which the business is established.
The Cyprus Value Added Tax (VAT) Number data identifier detects an 11-character
alphanumeric pattern that matches the Cyprus VAT Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-character alphanumeric pattern that matches the Cyprus
VAT Number format without checksum validation.
See “Cyprus Value Added Tax (VAT) Number wide breadth” on page 1111.
■ The medium breadth detects an 11-character alphanumeric pattern that matches the Cyprus
VAT Number format with checksum validation.
See “Cyprus Value Added Tax (VAT) Number medium breadth” on page 1111.
■ The narrow breadth detects an 11-character alphanumeric pattern that matches the Cyprus
VAT Number format with checksum validation. It also requires the presence of related
keywords.
See “Cyprus Value Added Tax (VAT) Number narrow breadth” on page 1112.

Cyprus Value Added Tax (VAT) Number wide breadth


The wide breadth detects an 11-character alphanumeric pattern that matches the Cyprus VAT
Number format without checksum validation.

Table 45-202 Cyprus Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Cc][Yy]\d{8}[A-Za-z]

Table 45-203 Cyprus Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.
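
As a rough illustration, the wide-breadth pattern can be approximated outside the product with a standard regular expression. The following Python sketch is not Symantec Data Loss Prevention code. The function name find_cyprus_vat_candidates is hypothetical, and the lookarounds are only an assumed stand-in for the Number delimiter validator, whose exact rules are not described here.

```python
import re

# Wide-breadth pattern from Table 45-202. The lookarounds are an assumed
# approximation of the Number delimiter validator: they reject matches that
# are embedded in a longer run of letters or digits.
CYPRUS_VAT = re.compile(r"(?<![A-Za-z0-9])[Cc][Yy]\d{8}[A-Za-z](?![A-Za-z0-9])")


def find_cyprus_vat_candidates(text: str) -> list[str]:
    """Return substrings that match the wide-breadth pattern (no checksum)."""
    return CYPRUS_VAT.findall(text)
```

Because the wide breadth performs no checksum validation, every string of the right shape is returned as a candidate.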

Cyprus Value Added Tax (VAT) Number medium breadth


The medium breadth detects an 11-character alphanumeric pattern that matches the Cyprus
VAT Number format with checksum validation.

Table 45-204 Cyprus Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[Cc][Yy]\d{8}[A-Za-z]

Table 45-205 Cyprus Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Cyprus Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against
it.

Cyprus Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects an 11-character alphanumeric pattern that matches the Cyprus
VAT Number format with checksum validation. It also requires the presence of related keywords.

Table 45-206 Cyprus Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Cc][Yy]\d{8}[A-Za-z]

Table 45-207 Cyprus Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Cyprus Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat no, vat, vat number, vat#, VAT, VAT#, value added
tax, vatin, VATIN, KDV, kdv#, KDV numarası, Katma
değer Vergisi, Φόρος Προστιθέμενης Αξίας
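
The keyword requirement of the narrow breadth can be pictured as a presence test over the content being inspected. The following Python sketch is an illustration only. The function name has_cyprus_vat_keyword is hypothetical, and the case-sensitive, whole-word matching is an assumption; this section does not state how the Find keywords validator handles case or how close to the match a keyword must appear.

```python
import re

# Keywords copied from the Find keywords inputs listed above.
KEYWORDS = [
    "vat no", "vat", "vat number", "vat#", "VAT", "VAT#", "value added tax",
    "vatin", "VATIN", "KDV", "kdv#", "KDV numarası", "Katma değer Vergisi",
    "Φόρος Προστιθέμενης Αξίας",
]

# Assumption: a keyword counts only when it is not part of a longer word.
_KEYWORD_RE = re.compile(
    "|".join(r"(?<!\w)" + re.escape(k) + r"(?!\w)" for k in KEYWORDS)
)


def has_cyprus_vat_keyword(text: str) -> bool:
    """True if at least one listed keyword appears in the text."""
    return _KEYWORD_RE.search(text) is not None
```

With the whole-word assumption, the keyword vat does not fire on words such as "private".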

Czech Republic Driver's Licence Number


The Czech Republic Ministry of Transport grants driver's licenses in the Czech Republic,
confirming the rights of the holder to drive motor vehicles.
The Czech Republic Driver's Licence Number data identifier detects an eight-character
alphanumeric pattern that matches the Czech Republic Driver's Licence Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern that matches the Czech
Republic Driver's Licence Number format. It checks for common test patterns.
See “Czech Republic Driver's License Number wide breadth” on page 1113.
■ The narrow breadth detects an eight-character alphanumeric pattern that matches the
Czech Republic Driver's Licence Number format. It checks for common test patterns, and
also requires the presence of related keywords.
See “Czech Republic Driver's License Number narrow breadth” on page 1113.

Czech Republic Driver's License Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern that matches the Czech
Republic Driver's Licence Number format. It checks for common test patterns.

Table 45-208 Czech Republic Driver's License Number wide-breadth patterns

Pattern

[Ee][A-Za-z] \d{6}

Table 45-209 Czech Republic Driver's License Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555,
666666, 777777, 888888, 999999
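
The combination of the pattern and the Exclude ending characters validator can be sketched as follows. This Python example is an approximation for illustration, not the product's implementation. The function name is_czech_dl_candidate is hypothetical, and the pattern is copied as printed in Table 45-208, including the space.

```python
import re

# Pattern from Table 45-208, copied as printed (including the space).
CZ_DL = re.compile(r"[Ee][A-Za-z] \d{6}")

# Values from the Exclude ending characters validator: one digit repeated six times.
EXCLUDED_ENDINGS = tuple(str(d) * 6 for d in range(10))


def is_czech_dl_candidate(value: str) -> bool:
    """True if value matches the pattern and does not end in a listed test value."""
    if CZ_DL.fullmatch(value) is None:
        return False
    return not value.endswith(EXCLUDED_ENDINGS)
```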

Czech Republic Driver's License Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern that matches the Czech
Republic Driver's Licence Number format. It checks for common test patterns, and also requires
the presence of related keywords.

Table 45-210 Czech Republic Driver's License Number narrow-breadth patterns

Pattern

[Ee][A-Za-z] \d{6}

Table 45-211 Czech Republic Driver's License Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555,
666666, 777777, 888888, 999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

driver license, drivers license, driving license, driver
license number, drivers license number, driving license
number, DLNo#, dlno#, drivers lic., Driver's License,
Driver's License Number, driver's license number,
Driver's Licence Number, driver licence, drivers
licence, driving licence, Driver's Licence, driver permit,
drivers permit, driving permit, license number, licence
number

řidičský průkaz, řidičský prúkaz, číslo řidičského
průkazu, řidičské číslo řidičů, ovladače lic., Číslo
licence řidiče, Řidičský průkaz, povolení řidiče, řidiči
povolení, povolení k jízdě, číslo licence

Czech Republic Personal Identification Number


All citizens of the Czech Republic are issued a unique personal identification number by the
Ministry of Interior.
The Czech Republic Personal Identification Number data identifier detects a 9- or 10-digit
number that matches the Czech Personal Identification Number format.
This data identifier provides three breadths of validation:
■ The wide breadth detects a 9- or 10-digit number without checksum validation.
See “Czech Republic Personal Identification Number wide breadth” on page 1115.
■ The medium breadth detects a 9- or 10-digit number with checksum validation.
See “Czech Republic Personal Identification Number medium breadth” on page 1115.
■ The narrow breadth detects a 9- or 10-digit number with checksum validation. It also requires
the presence of related keywords.
See “Czech Republic Personal Identification Number narrow breadth” on page 1116.

Czech Republic Personal Identification Number wide breadth


The wide breadth detects a 9- or 10-digit number without checksum validation.

Table 45-212 Czech Republic Personal Identification Number wide-breadth patterns

Pattern

\d\d[0156]\d[0123]\d[/]\d\d\d

\d\d[0156]\d[0123]\d[/]\d\d\d\d

\d\d[0156]\d[0123]\d\d\d\d

\d\d[0156]\d[0123]\d\d\d\d\d

\d\d[0156]\d[012345678]\d[/]\d\d\d

\d\d[0156]\d[012345678]\d[/]\d\d\d\d

\d\d[0156]\d[012345678]\d\d\d\d

\d\d[0156]\d[012345678]\d\d\d\d\d

Table 45-213 Czech Republic Personal Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.
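
The Duplicate digits validator is described only as ensuring that a string of digits is not all the same. A minimal Python sketch of that idea follows. The function name is_all_same_digit is hypothetical, and ignoring the "/" separator is an assumption.

```python
def is_all_same_digit(value: str) -> bool:
    """True if every digit in value is the same digit (separators are ignored)."""
    digits = [c for c in value if c.isdigit()]
    return len(digits) > 0 and len(set(digits)) == 1
```

Under this sketch, a candidate such as 111111/1111 would be rejected, while 850615/1234 would be kept.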

Czech Republic Personal Identification Number medium breadth


The medium breadth detects a 9- or 10-digit number with checksum validation.

Table 45-214 Czech Republic Personal Identification Number medium-breadth pattern

Pattern

\d\d[0156]\d[0123]\d[/]\d\d\d

\d\d[0156]\d[0123]\d[/]\d\d\d\d

\d\d[0156]\d[0123]\d\d\d\d

\d\d[0156]\d[0123]\d\d\d\d\d

\d\d[0156]\d[012345678]\d[/]\d\d\d

\d\d[0156]\d[012345678]\d[/]\d\d\d\d

\d\d[0156]\d[012345678]\d\d\d\d

\d\d[0156]\d[012345678]\d\d\d\d\d

Table 45-215 Czech Republic Personal Identification Number medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Czech Personal Identity Number Validation Check Computes the checksum and validates the pattern against
it.

Exclude beginning characters 5555555555, 1111111111, 111111111
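
This guide does not describe how the checksum is computed. A commonly documented rule for ten-digit Czech personal identification numbers is that the number as a whole is divisible by 11, with a historical exception in which a remainder of 10 was recorded as check digit 0. The following Python sketch implements that public rule. Whether the Czech Personal Identity Number Validation Check uses exactly this rule is an assumption, the function name is hypothetical, and nine-digit values (which carry no check digit under the public rule) are outside the sketch and are reported as failing.

```python
def czech_pin_checksum_ok(value: str) -> bool:
    """Check a ten-digit number against the public modulus-11 rule (assumed)."""
    digits = value.replace("/", "")
    if not digits.isdigit() or len(digits) != 10:
        return False
    body, check = int(digits[:9]), int(digits[9])
    remainder = body % 11
    # Historical exception: a remainder of 10 was recorded as check digit 0.
    return check == (0 if remainder == 10 else remainder)
```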

Czech Republic Personal Identification Number narrow breadth


The narrow breadth detects a 9- or 10-digit number with checksum validation. It also requires
the presence of related keywords.

Table 45-216 Czech Republic Personal Identification Number narrow-breadth patterns

Pattern

\d\d[0156]\d[0123]\d[/]\d\d\d

\d\d[0156]\d[0123]\d[/]\d\d\d\d

\d\d[0156]\d[0123]\d\d\d\d

\d\d[0156]\d[0123]\d\d\d\d\d

\d\d[0156]\d[012345678]\d[/]\d\d\d

\d\d[0156]\d[012345678]\d[/]\d\d\d\d

\d\d[0156]\d[012345678]\d\d\d\d

\d\d[0156]\d[012345678]\d\d\d\d\d

Table 45-217 Czech Republic Personal Identification Number narrow-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Czech Personal Identity Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

personal ID number, PID, personal identity number,
Czech Personal ID Number, identity no, Czech Republic
ID, republic identity number, national number,
insurance number, unique identification number, PID#,
Czechidno#, identityno#

Osobní identifikační číslo, Pojištění číslo, unikátní
identifikační číslo, Osobní identifikační číslo,
identifikační číslo

Czech Republic Tax Identification Number


The Czech Republic Tax Identification Number is a unique identifier for taxpayers in the Czech
Republic.
The Czech Republic Tax Identification Number data identifier detects a 9- to 10-character
alphanumeric pattern that matches the Czech Tax Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 9- to 10-character alphanumeric pattern that matches the Czech
Tax Identification Number format without checksum validation. It checks for common test
patterns.
See “Czech Republic Tax Identification Number wide breadth” on page 1118.
■ The medium breadth detects a 9- to 10-character alphanumeric pattern that matches the
Czech Tax Identification Number format with checksum validation.
See “Czech Republic Tax Identification Number medium breadth” on page 1119.
■ The narrow breadth detects a 9- to 10-character alphanumeric pattern that matches the
Czech Tax Identification Number format with checksum validation. It checks for common
test patterns, and also requires the presence of related keywords.
See “Czech Republic Tax Identification Number narrow breadth” on page 1120.

Czech Republic Tax Identification Number wide breadth


The wide breadth detects a 9- to 10-character alphanumeric pattern that matches the Czech
Republic Tax Identification Number format without checksum validation. It checks for common
test patterns.

Table 45-218 Czech Republic Tax Identification Number wide-breadth patterns

Pattern

\d{2}[05][1-9][012]\d{4,5}

\d{2}[05][1-9]3[01]\d{3,4}

\d{2}[05][1-9][012]\d[/]\d{3,4}

\d{2}[05][1-9]3[01][/]\d{3,4}

\d{2}[16][012]{2}\d{4,5}

\d{2}[16][012]3[01]\d{3,4}

\d{2}[16][012]{2}\d[/]\d{3,4}

\d{2}[16][012]3[01][/]\d{3,4}

\d{2}[27][1-9][012]\d{5}

\d{2}[27][1-9]3[01]\d{4}

\d{2}[27][1-9][012]\d[/]\d{4}

\d{2}[27][1-9]3[01][/]\d{4}

\d{2}[38][012]{2}\d{5}

\d{2}[38][012]3[01]\d{4}

\d{2}[38][012]{2}\d[/]\d{4}

\d{2}[38][012]3[01][/]\d{4}

Table 45-219 Czech Republic Tax Identification Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.


Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 1111111, 2222222, 33333, 44444, 55555, 66666,
77777, 88888, 99999

Czech Republic Tax Identification Number medium breadth


The medium breadth detects a 9- to 10-character alphanumeric pattern that matches the Czech
Republic Tax Identification Number format with checksum validation.

Table 45-220 Czech Republic Tax Identification Number medium-breadth patterns

Pattern

\d{2}[05][1-9][012]\d{4,5}

\d{2}[05][1-9]3[01]\d{3,4}

\d{2}[05][1-9][012]\d[/]\d{3,4}

\d{2}[05][1-9]3[01][/]\d{3,4}

\d{2}[16][012]{2}\d{4,5}

\d{2}[16][012]3[01]\d{3,4}

\d{2}[16][012]{2}\d[/]\d{3,4}

\d{2}[16][012]3[01][/]\d{3,4}

\d{2}[27][1-9][012]\d{5}

\d{2}[27][1-9]3[01]\d{4}

\d{2}[27][1-9][012]\d[/]\d{4}

\d{2}[27][1-9]3[01][/]\d{4}

\d{2}[38][012]{2}\d{5}

\d{2}[38][012]3[01]\d{4}

\d{2}[38][012]{2}\d[/]\d{4}

\d{2}[38][012]3[01][/]\d{4}

Table 45-221 Czech Republic Tax Identification Number medium-breadth validators

Mandatory validator Description

Czech Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Czech Republic Tax Identification Number narrow breadth


The narrow breadth detects a 9- to 10-character alphanumeric pattern that matches the Czech
Republic Tax Identification Number format with checksum validation. It checks for common
test patterns, and also requires the presence of related keywords.

Table 45-222 Czech Republic Tax Identification Number narrow-breadth patterns

Pattern

\d{2}[05][1-9][012]\d{4,5}

\d{2}[05][1-9]3[01]\d{3,4}

\d{2}[05][1-9][012]\d[/]\d{3,4}

\d{2}[05][1-9]3[01][/]\d{3,4}

\d{2}[16][012]{2}\d{4,5}

\d{2}[16][012]3[01]\d{3,4}

\d{2}[16][012]{2}\d[/]\d{3,4}

\d{2}[16][012]3[01][/]\d{3,4}

\d{2}[27][1-9][012]\d{5}

\d{2}[27][1-9]3[01]\d{4}

\d{2}[27][1-9][012]\d[/]\d{4}

\d{2}[27][1-9]3[01][/]\d{4}

\d{2}[38][012]{2}\d{5}

\d{2}[38][012]3[01]\d{4}

\d{2}[38][012]{2}\d[/]\d{4}

\d{2}[38][012]3[01][/]\d{4}

Table 45-223 Czech Republic Tax Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 1111111, 2222222, 33333, 44444, 55555, 66666,
77777, 88888, 99999

Czech Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

personal code, personal id, national identification
number, national ID, personal ID, personal
identification number, nationalid#, personalid#,
personal identification code, PID#, tin, tax identification
number, tin#, tax id, tin no, tin number, tax number,
tax code, taxpayer id, taxpayer identification number

osobní kód, Národní identifikační číslo, osobní
identifikační číslo, cínové číslo, daňové identifikačné
číslo, daňový poplatník id

Czech Republic Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. In the Czech Republic, it
is also called DPH.
The Czech Republic Value Added Tax (VAT) Number data identifier detects a 10- to
15-character alphanumeric pattern that matches the Czech Republic Value Added Tax (VAT)
Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 10- to 15-character alphanumeric pattern that matches the
Czech Value Added Tax (VAT) Number format without checksum validation. It checks for
common test patterns.
See “Czech Republic Value Added Tax (VAT) Number wide breadth” on page 1122.
■ The medium breadth detects a 10- to 15-character alphanumeric pattern that matches the
Czech Value Added Tax (VAT) Number format with checksum validation.
See “Czech Republic Value Added Tax (VAT) Number medium breadth” on page 1123.
■ The narrow breadth detects a 10- to 15-character alphanumeric pattern that matches the
Czech Value Added Tax (VAT) Number format with checksum validation. It checks for
common test patterns, and also requires the presence of related keywords.
See “Czech Republic Value Added Tax (VAT) Number narrow breadth” on page 1124.

Czech Republic Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 10- to 15-character alphanumeric pattern that matches the Czech
Value Added Tax (VAT) Number format without checksum validation. It checks for common
test numbers.

Table 45-224 Czech Republic Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Cc][Zz]\d{8,13}

[Cc][Zz] \d{8,13}

[Cc][Zz] \d{2} \d{2} \d{2} \d{2}

[Cc][Zz] \d{2} \d{2} \d{2} \d{2} \d{2}

[Cc][Zz]\d{3} \d{2} \d{3}

[Cc][Zz]\d{3} \d{2} \d{2} \d{3}

[Cc][Zz] \d{3} \d{2} \d{3}

[Cc][Zz] \d{3} \d{2} \d{2} \d{3}

Table 45-225 Czech Republic Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.


Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999

0000000000, 1111111111, 2222222222, 3333333333,
4444444444, 5555555555, 6666666666, 7777777777,
8888888888, 9999999999

00000000000, 11111111111, 22222222222,
33333333333, 44444444444, 55555555555,
66666666666, 77777777777, 88888888888, 99999999999

000000000000, 111111111111, 222222222222,
333333333333, 444444444444, 555555555555,
666666666666, 777777777777, 888888888888,
999999999999

0000000000000, 1111111111111, 2222222222222,
3333333333333, 4444444444444, 5555555555555,
6666666666666, 7777777777777, 8888888888888,
9999999999999
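
The wide-breadth patterns and the excluded endings can be combined in a short sketch. The following Python example is illustrative only. The function name is_czech_vat_candidate is hypothetical, and applying the excluded endings to the matched text exactly as written (without removing spaces first) is an assumption about how the validator behaves.

```python
import re

# Patterns from Table 45-224, copied as printed.
PATTERNS = [
    r"[Cc][Zz]\d{8,13}",
    r"[Cc][Zz] \d{8,13}",
    r"[Cc][Zz] \d{2} \d{2} \d{2} \d{2}",
    r"[Cc][Zz] \d{2} \d{2} \d{2} \d{2} \d{2}",
    r"[Cc][Zz]\d{3} \d{2} \d{3}",
    r"[Cc][Zz]\d{3} \d{2} \d{2} \d{3}",
    r"[Cc][Zz] \d{3} \d{2} \d{3}",
    r"[Cc][Zz] \d{3} \d{2} \d{2} \d{3}",
]
CZ_VAT = re.compile("|".join(f"(?:{p})" for p in PATTERNS))

# Excluded endings listed above: each digit repeated 8 to 13 times.
EXCLUDED_ENDINGS = tuple(str(d) * n for n in range(8, 14) for d in range(10))


def is_czech_vat_candidate(value: str) -> bool:
    """True if value matches a listed pattern and has no excluded ending."""
    if CZ_VAT.fullmatch(value) is None:
        return False
    return not value.endswith(EXCLUDED_ENDINGS)
```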

Czech Republic Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 10- to 15-character alphanumeric pattern that matches the
Czech Value Added Tax (VAT) Number format with checksum validation.

Table 45-226 Czech Republic Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[Cc][Zz]\d{8,13}

[Cc][Zz] \d{8,13}

[Cc][Zz] \d{2} \d{2} \d{2} \d{2}

[Cc][Zz] \d{2} \d{2} \d{2} \d{2} \d{2}

[Cc][Zz]\d{3} \d{2} \d{3}


[Cc][Zz]\d{3} \d{2} \d{2} \d{3}

[Cc][Zz] \d{3} \d{2} \d{3}

[Cc][Zz] \d{3} \d{2} \d{2} \d{3}

Table 45-227 Czech Republic Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Czech Republic VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Czech Republic Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 10- to 15-character alphanumeric pattern that matches the
Czech Value Added Tax (VAT) Number format with checksum validation. It checks for common
test numbers, and also requires the presence of related keywords.

Table 45-228 Czech Republic Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Cc][Zz]\d{8,13}

[Cc][Zz] \d{8,13}

[Cc][Zz] \d{2} \d{2} \d{2} \d{2}

[Cc][Zz] \d{2} \d{2} \d{2} \d{2} \d{2}

[Cc][Zz]\d{3} \d{2} \d{3}

[Cc][Zz]\d{3} \d{2} \d{2} \d{3}

[Cc][Zz] \d{3} \d{2} \d{3}

[Cc][Zz] \d{3} \d{2} \d{2} \d{3}

Table 45-229 Czech Republic Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.


Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999

0000000000, 1111111111, 2222222222, 3333333333,
4444444444, 5555555555, 6666666666, 7777777777,
8888888888, 9999999999

00000000000, 11111111111, 22222222222,
33333333333, 44444444444, 55555555555,
66666666666, 77777777777, 88888888888, 99999999999

000000000000, 111111111111, 222222222222,
333333333333, 444444444444, 555555555555,
666666666666, 777777777777, 888888888888,
999999999999

0000000000000, 1111111111111, 2222222222222,
3333333333333, 4444444444444, 5555555555555,
6666666666666, 7777777777777, 8888888888888,
9999999999999

Czech Republic VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat number, value added tax, vat, VAT, VAT#, vat#,
VATIN, vatin

číslo DPH, Daň z přidané hodnoty, Dan z pridané
hodnoty, Daň přidané hodnoty, Dan pridané hodnoty,
DPH, DIC, DIČ

Denmark Personal Identification Number


In Denmark, every citizen has a national identification number. The number serves as proof
of identification for most purposes.
The Denmark Personal Identification Number data identifier detects a 10-digit number that
matches the Denmark Personal Identification Number format.
The Denmark Personal Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects a 10-digit number without checksum validation.
See “Denmark Personal Identification Number wide breadth” on page 1126.
■ The medium breadth detects a 10-digit number with checksum validation.
See “Denmark Personal Identification Number medium breadth” on page 1126.
■ The narrow breadth detects a 10-digit number with checksum validation. It also requires
the presence of related keywords.
See “Denmark Personal Identification Number narrow breadth” on page 1127.

Denmark Personal Identification Number wide breadth


The wide breadth detects a 10-digit number without checksum validation.

Table 45-230 Denmark Personal Identification Number wide-breadth patterns

Patterns

\d{6}[ -]\d{4}

\d{6}[ -]\l{4}

\d{10}

Table 45-231 Denmark Personal Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Denmark Personal Identification Number medium breadth


The medium breadth detects a 10-digit number with checksum validation.

Table 45-232 Denmark Personal Identification Number medium-breadth patterns

Patterns

\d{6}[ -]\d{4}

\d{6}[ -]\l{4}

\d{10}

Table 45-233 Denmark Personal Identification Number medium-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Denmark Personal Identification Number Validation Check Checksum validator for the Denmark Personal
Identification Number.
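
This guide does not spell out the checksum. The publicly documented modulus-11 check for Danish personal identification (CPR) numbers multiplies the ten digits by the weights 4, 3, 2, 7, 6, 5, 4, 3, 2, 1 and requires the sum to be divisible by 11. The following Python sketch implements that public rule. Treating it as the rule used by the Denmark Personal Identification Number Validation Check is an assumption, and the function name is hypothetical. Some more recently issued numbers are reported not to satisfy this check.

```python
CPR_WEIGHTS = (4, 3, 2, 7, 6, 5, 4, 3, 2, 1)


def cpr_checksum_ok(value: str) -> bool:
    """Check a CPR number against the public modulus-11 rule (assumed)."""
    digits = [int(c) for c in value if c.isdigit()]
    if len(digits) != 10:
        return False
    return sum(w * d for w, d in zip(CPR_WEIGHTS, digits)) % 11 == 0
```

For example, 070761-4285 gives a weighted sum of 154, which is 14 × 11, so it passes.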

Denmark Personal Identification Number narrow breadth


The narrow breadth detects a 10-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-234 Denmark Personal Identification Number narrow-breadth patterns

Patterns

\d{6}[ -]\d{4}

\d{6}[ -]\l{4}

\d{10}

Table 45-235 Denmark Personal Identification Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Denmark Personal Identification Number Validation Check Checksum validator for the Denmark Personal
Identification Number.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

national identification number, national identity
number, personal identity number, personal
identification number, nationalid#, personalidentityno#,
unique identity number, uniqueidentityno#

Nationalt identifikationsnummer, personnummer, unikt
identifikationsnummer, identifikationsnummer, centrale
personregister, cpr, cpr-nummer, cpr#, cpr-nummer#,
identifikationsnummer#, personnummer#

Denmark Tax Identification Number


Denmark issues a tax identification number for persons who have obligations to declare taxes
in Denmark. The tax identification number also serves as a personal health insurance number.
The Denmark Tax Identification Number data identifier detects a 10-digit number that matches
the Denmark Tax Identification Number format.
The Denmark Tax Identification Number data identifier offers three breadths of detection:
■ The wide breadth detects a 10-digit number without checksum validation.
See “Denmark Tax Identification Number wide breadth” on page 1128.
■ The medium breadth detects a 10-digit number with checksum validation.
See “Denmark Tax Identification Number medium breadth” on page 1129.
■ The narrow breadth detects a 10-digit number with checksum validation. It also requires
the presence of related keywords.
See “Denmark Tax Identification Number narrow breadth” on page 1129.

Denmark Tax Identification Number wide breadth


The wide breadth detects a 10-digit number without checksum validation.

Table 45-236 Denmark Tax Identification Number wide-breadth pattern

Pattern

\d{6}-\d{4}

Table 45-237 Denmark Tax Identification Number wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Denmark Tax Identification Number medium breadth


The medium breadth detects a 10-digit number with checksum validation.

Table 45-238 Denmark Tax Identification Number medium-breadth pattern

Pattern

\d{6}-\d{4}

Table 45-239 Denmark Tax Identification Number medium-breadth validator

Mandatory validator Description

Denmark Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Denmark Tax Identification Number narrow breadth


The narrow breadth detects a 10-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-240 Denmark Tax Identification Number narrow-breadth pattern

Pattern

\d{6}-\d{4}

Table 45-241 Denmark Tax Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Denmark Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

tax id, tax number, tax identification number, tax code

skat id, skattenummer, skat identifikationsnummer,
skat kode

cpr number, cpr#, taxid#, cpr, CPR, health insurance,
health insurance number, health card number, health
card, travel health insurance card, health insurance
card number

sygesikring, Sundhedsforsikringsnummer,
sundhedskortnummer, sundhedskort,
REJSESYGESIKRINGSKORT,
Sundhedsforsikringskort, sygesikringkortnummer,
Krankenkassennummer, Gesundheitskarte Nummer,
ReisekrankenversicherungskarteNummer,
GesundheitsVersicherungkarte Nummer

Denmark Value Added Tax (VAT) Number


VAT is a consumption tax that is borne by the end consumer. VAT is paid for each transaction
in the manufacturing and distribution process. For Denmark, the VAT number is issued by the
tax office for the region in which the business is established.
The Denmark Value Added Tax (VAT) Number data identifier detects a 10-character alphanumeric
pattern that matches the Denmark Value Added Tax (VAT) Number format.
The Denmark Value Added Tax (VAT) Number data identifier provides three breadths of
detection:
■ The wide breadth detects a 10-character alphanumeric pattern preceded by DK without
checksum validation.
See “Denmark Value Added Tax (VAT) Number wide breadth” on page 1131.
■ The medium breadth detects a 10-character alphanumeric pattern preceded by DK with
checksum validation.
See “Denmark Value Added Tax (VAT) Number medium breadth” on page 1131.
■ The narrow breadth detects a 10-character alphanumeric pattern preceded by DK with
checksum validation. It also requires the presence of related keywords.
See “Denmark Value Added Tax (VAT) Number narrow breadth” on page 1132.

Denmark Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 10-character alphanumeric pattern preceded by DK without
checksum validation.

Table 45-242 Denmark Value Added Tax (VAT) Number wide-breadth patterns

Patterns

[Dd][Kk]\d{8}

[Dd][Kk] \d{8}

[Dd][Kk] \d{3} \d{3} \d{2}

[Dd][Kk] \d{3}-\d{3}-\d{2}

[Dd][Kk] \d{3}.\d{3}.\d{2}

[Dd][Kk]-\d{8}

[Dd][Kk] \d{3},\d{3},\d{2}

Table 45-243 Denmark Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

Denmark Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 10-character alphanumeric pattern preceded by DK with checksum
validation.

Table 45-244 Denmark Value Added Tax (VAT) Number medium-breadth patterns

Patterns

[Dd][Kk]\d{8}

[Dd][Kk] \d{8}

[Dd][Kk] \d{3} \d{3} \d{2}

[Dd][Kk] \d{3}-\d{3}-\d{2}

[Dd][Kk] \d{3}.\d{3}.\d{2}

[Dd][Kk]-\d{8}

[Dd][Kk] \d{3},\d{3},\d{2}

Table 45-245 Denmark Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Denmark VAT Number Validation Check Computes the checksum and validates the pattern against
it.
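
This guide does not describe the checksum itself. The publicly documented check for the eight-digit Danish business registration number that follows the DK prefix multiplies the digits by the weights 2, 7, 6, 5, 4, 3, 2, 1 and requires the sum to be divisible by 11. The following Python sketch implements that public rule. Whether the Denmark VAT Number Validation Check works exactly this way is an assumption, and the function name is hypothetical.

```python
import re

VAT_WEIGHTS = (2, 7, 6, 5, 4, 3, 2, 1)


def denmark_vat_checksum_ok(value: str) -> bool:
    """Check the eight digits after DK against the public modulus-11 rule (assumed)."""
    # Drop the DK prefix and any separators, keeping only the digits.
    digits = [int(c) for c in re.sub(r"\D", "", value)]
    if len(digits) != 8:
        return False
    return sum(w * d for w, d in zip(VAT_WEIGHTS, digits)) % 11 == 0
```

For example, DK13585628 gives a weighted sum of 143, which is 13 × 11, so it passes.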

Denmark Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 10-character alphanumeric pattern preceded by DK with checksum
validation. It also requires the presence of related keywords.

Table 45-246 Denmark Value Added Tax (VAT) Number narrow-breadth patterns

Patterns

[Dd][Kk]\d{8}

[Dd][Kk] \d{8}

[Dd][Kk] \d{3} \d{3} \d{2}

[Dd][Kk] \d{3}-\d{3}-\d{2}

[Dd][Kk] \d{3}.\d{3}.\d{2}

[Dd][Kk]-\d{8}

[Dd][Kk] \d{3},\d{3},\d{2}

Table 45-247 Denmark Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.


Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

Denmark VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

vat number, vat, vat#, vat no., value added tax number,
vat identification number

moms, momsnummer, moms identifikationsnummer, merværdiafgift

Driver's License Number – CA State


The California (CA) state driver's license number is the identifier for an individual's driver's
license issued by the US state of California.
The Driver's License Number – CA State data identifier detects an eight-character
alphanumeric pattern that matches the Driver's License Number – CA State format.
This data identifier provides two breadths of validation:
■ The wide breadth detects an eight-character alphanumeric pattern beginning with a letter
followed by seven numerals.
See “Driver's License Number – CA State wide breadth” on page 1133.
■ The medium breadth validates a detected number against keywords.
See “Driver's License Number – CA State medium breadth” on page 1134.

Driver's License Number – CA State wide breadth


The wide breadth detects an eight-character alphanumeric pattern beginning with a letter
followed by seven numerals.

Note: This breadth option does not include any validators.



Table 45-248 Driver's License Number – CA State wide-breadth pattern

Pattern

\l\d{7}

Driver's License Number – CA State medium breadth


The medium breadth detects an eight-character alphanumeric pattern beginning with a letter
followed by seven numerals. It validates a detected number by requiring a driver's license
keyword AND a California-related keyword.

Table 45-249 Driver's License Number – CA State medium-breadth pattern

Pattern

\l\d{7}

Table 45-250 Driver's License Number – CA State medium-breadth validators

Mandatory validators Description

Find keywords With this option selected, at least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

driver license, drivers license, driver's license, driver licenses, drivers licenses,
driver's licenses, dl#, dls#, lic#, lics#

Find keywords With this option selected, at least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

ca, calif, california
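
The medium breadth combines the pattern with two keyword lists, and both lists must contribute a hit (the AND condition described above). The sketch below illustrates that logic in Python. It assumes that `\l` means a single ASCII letter, that keywords are matched case-insensitively as whole tokens anywhere in the text, and that candidates must not touch other letters or digits. The guide does not state these details, and all function names are hypothetical.

```python
import re

LICENSE_KEYWORDS = [
    "driver license", "drivers license", "driver's license",
    "driver licenses", "drivers licenses", "driver's licenses",
    "dl#", "dls#", "lic#", "lics#",
]
STATE_KEYWORDS = ["ca", "calif", "california"]

# \l\d{7}: one letter followed by seven digits (assumed reading of \l).
CA_LICENSE_RE = re.compile(r"(?<![A-Za-z0-9])[A-Za-z]\d{7}(?![A-Za-z0-9])")


def has_keyword(text: str, keywords: list) -> bool:
    """True if any keyword occurs as a whole token (case-insensitive)."""
    lowered = text.lower()
    for kw in keywords:
        pattern = r"(?<![a-z0-9])" + re.escape(kw) + r"(?![a-z0-9])"
        if re.search(pattern, lowered):
            return True
    return False


def ca_medium_matches(text: str) -> list:
    """Return candidate numbers only when both keyword groups are present."""
    if not (has_keyword(text, LICENSE_KEYWORDS) and has_keyword(text, STATE_KEYWORDS)):
        return []
    return CA_LICENSE_RE.findall(text)
```

With this sketch, a candidate number next to a license keyword but without a California keyword produces no match, which mirrors how the medium breadth narrows the wide breadth.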

Driver's License Number - FL, MI, MN States


The driver's license numbers for the states of Florida (FL), Michigan (MI), and Minnesota (MN)
are the identifiers for an individual's driver's license issued by one of those US states. These
states are grouped together because they share a common pattern for this number.
This data identifier detects a 13-character alphanumeric pattern that matches the Driver's
License Number - FL, MI, MN States format.
This data identifier provides two breadths of validation:
■ The wide breadth detects any 13-character alphanumeric pattern with a letter followed by
12 numbers.
See “Driver's License Number- FL, MI, MN States wide breadth” on page 1135.
■ The medium breadth narrows the scope by requiring the presence of keywords.
See “Driver's License Number- FL, MI, MN States medium breadth” on page 1135.

Driver's License Number- FL, MI, MN States wide breadth


The wide breadth of this data identifier detects any 13-character string with a letter followed
by 12 numbers.
For the MN license number, the following format is matched: L-DDD-DDD-DDD-DDD.

Note: This breadth option does not include any validators.

Table 45-251 Driver's License Number- FL, MI, MN States wide-breadth patterns

Patterns

\l \d{3} \d{3} \d{3} \d{3}

\l\d{12}

\l\d{3}-\d{3}-\d{2}-\d{3}-\d

\l-\d{3}-\d{3}-\d{3}-\d{3}
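
The patterns in this library use `\l` for a letter, which Python's `re` module does not accept as an escape. As an illustrative sketch, the helper below rewrites `\l` to `[A-Za-z]` (an assumed equivalent) and reports which of the four listed patterns a candidate string matches in full. The helper names are hypothetical.

```python
import re
from typing import Optional

# Patterns exactly as listed in the table above.
FL_MI_MN_PATTERNS = [
    r"\l \d{3} \d{3} \d{3} \d{3}",
    r"\l\d{12}",
    r"\l\d{3}-\d{3}-\d{2}-\d{3}-\d",
    r"\l-\d{3}-\d{3}-\d{3}-\d{3}",
]


def to_python_regex(pattern: str) -> str:
    """Rewrite the guide's \\l token as an ASCII letter class (assumption)."""
    return pattern.replace(r"\l", "[A-Za-z]")


def matching_pattern(candidate: str) -> Optional[str]:
    """Return the first listed pattern that matches the whole candidate."""
    for pattern in FL_MI_MN_PATTERNS:
        if re.fullmatch(to_python_regex(pattern), candidate):
            return pattern
    return None
```

Each layout carries one letter and 12 digits. Only the separators differ, which is why the four patterns can share one data identifier.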

Driver's License Number- FL, MI, MN States medium breadth


The medium breadth of this data identifier implements patterns to detect any 13-character string
with a letter followed by 12 numbers. For the MN license number, the following format is
matched: L-DDD-DDD-DDD-DDD.
This data identifier validates the number by requiring the presence of a driver's license keyword
AND a state-related keyword.

Table 45-252 Driver's License Number- FL, MI, MN States medium-breadth patterns

Patterns

\l \d{3} \d{3} \d{3} \d{3}

\l\d{12}

\l\d{3}-\d{3}-\d{2}-\d{3}-\d

\l-\d{3}-\d{3}-\d{3}-\d{3}

Table 45-253 Driver's License Number- FL, MI, MN States medium-breadth validators

Mandatory validators Description

Find keywords Requires at least one of the input keywords or key phrases to be present for the
data to be matched.

Inputs:

driver license, drivers license, driver's license, driver licenses, drivers
licenses, driver's licenses, dl#, dls#, lic#, lics#

Find keywords Requires at least one of the input keywords or key phrases to be present for the
data to be matched.

Inputs:

fla, fl, florida, michigan, mi, minnesota, mn

Driver's License Number - IL State


The Illinois (IL) state driver's license number is a 12-character alphanumeric string that identifies
an individual's driver's license issued by the US state of Illinois.
The Driver's License Number - IL State data identifier detects a 12-character alphanumeric
pattern that matches the Driver's License Number - IL State format.
This data identifier provides two breadths of validation:
■ The wide breadth detects the presence of a 12-character alphanumeric pattern without
validation.
See “Driver's License Number- IL State wide breadth” on page 1136.
■ The medium breadth narrows the scope by requiring the presence of keywords.
See “Driver's License Number- IL State medium breadth” on page 1137.

Driver's License Number- IL State wide breadth


The wide breadth detects a 12-character alphanumeric pattern, beginning with a letter (the
first letter of the person's last name) followed by 11 numbers.

Note: This breadth option does not include any validators.

Table 45-254 Driver's License Number- IL State wide-breadth patterns

Patterns

\l\d{3}-\d{4}-\d{4}

\l\d{11}

Driver's License Number- IL State medium breadth


The medium breadth detects a 12-character string, beginning with a letter (the first letter of
the person's last name) followed by 11 numbers.
This breadth also requires the presence of both a driver's license keyword AND an
Illinois-related keyword.

Table 45-255 Driver's License Number- IL State medium-breadth patterns

Patterns

\l\d{3}-\d{4}-\d{4}

\l\d{11}

Table 45-256 Driver's License Number- IL State medium-breadth validators

Mandatory validators Description

Find keywords Requires at least one of the input keywords or key phrases
to be present for the data to be matched.

Inputs:

driver license, drivers license, driver's license, driver
licenses, drivers licenses, driver's licenses, dl#, dls#,
lic#, lics#

Find keywords Requires at least one of the input keywords or key phrases
to be present for the data to be matched.
Inputs:

il, illinois

Driver's License Number - NJ State


The New Jersey (NJ) state driver's license number is a 15-character alphanumeric pattern
that identifies an individual's driver's license issued by the US state of New Jersey.
The Driver's License Number - NJ State data identifier detects a 15-character alphanumeric
pattern that matches the Driver's License Number - NJ State format.
This data identifier provides two breadths of validation:
■ The wide breadth detects a 15-character alphanumeric pattern without validation.
See “Driver's License Number- NJ State wide breadth” on page 1138.
■ The medium breadth narrows the scope by requiring the presence of related keywords.
See “Driver's License Number- NJ State medium breadth” on page 1138.

Driver's License Number- NJ State wide breadth


The wide breadth detects a 15-character alphanumeric pattern, beginning with a letter (the
first letter of the person's last name) followed by 14 numbers.

Note: The wide breadth option does not include any validators.

Table 45-257 Driver's License Number- NJ State wide-breadth patterns

Patterns

\l\d{4} \d{5} \d{5}

\l\d{14}

Driver's License Number- NJ State medium breadth


The medium breadth detects a 15-character alphanumeric pattern, beginning with a letter (the
first letter of the person's last name) followed by 14 numbers.
This breadth also requires the presence of both a driver's license keyword AND a New
Jersey-related keyword.

Table 45-258 Driver's License Number- NJ State medium-breadth patterns

Patterns

\l\d{4} \d{5} \d{5}

\l\d{14}

Table 45-259 Driver's License Number- NJ State medium-breadth validators

Mandatory validators Description

Find keywords Requires at least one of the input keywords or key phrases
to be present for the data to be matched.

Inputs:

driver license, drivers license, driver's license, driver
licenses, drivers licenses, driver's licenses, dl#, dls#,
lic#, lics#

Find keywords Requires at least one of the input keywords or key phrases
to be present for the data to be matched.

Inputs:

nj, new jersey, newjersey

Driver's License Number - NY State


The New York (NY) state driver's license number is a nine-digit identifier for an individual's
driver's license issued by the US state of New York.
The Driver's License Number - NY State data identifier detects a nine-digit number that matches
the Driver's License Number - NY State format.
The data identifier detects the presence of a New York driver's license number.
This data identifier provides two breadths of validation:
■ The wide breadth detects a string of nine digits without validation.
See “Driver's License Number- NY State wide breadth” on page 1139.
■ The medium breadth narrows the scope by requiring the presence of related keywords.
See “Driver's License Number - NY State medium breadth” on page 1140.

Driver's License Number- NY State wide breadth


The wide breadth detects a nine-digit string without validation.

Note: The wide breadth option does not include any validators.

Table 45-260 Driver's License Number- NY State wide-breadth patterns

Patterns

\d{3} \d{3} \d{3}



\d{9}

Driver's License Number - NY State medium breadth


The medium breadth detects a nine-digit number.
This breadth also requires the presence of both a driver's license keyword AND a New
York–related keyword.

Table 45-261 Driver's License Number- NY State medium-breadth patterns

Patterns

\d{3} \d{3} \d{3}

\d{9}

Table 45-262 Driver's License Number- NY State medium-breadth validators

Mandatory validators Description

Find keywords Requires at least one of the input keywords or key phrases to be present for the
data to be matched.

Inputs:

driver license, drivers license, driver's license, driver licenses, drivers
licenses, driver's licenses, dl#, dls#, lic#, lics#

Find keywords Requires at least one of the input keywords or key phrases to be present for the
data to be matched.

Inputs:

new york, ny, newyork

Driver's License Number - WA State


The Washington (WA) state driver's license number is the identification number for an
individual's driver's license issued by the US state of Washington.
The Driver's License Number - WA State data identifier detects alphanumeric patterns that
match the Driver's License Number - WA State format.
The Driver's License Number - WA State data identifier provides three breadths of detection.
■ The wide breadth detects a Washington State driver's license with no validation.
See “Driver's License Number - WA State wide breadth” on page 1141.
■ The medium breadth detects a Washington State driver's license with checksum validation.
See “Driver's License Number - WA State medium breadth” on page 1141.
■ The narrow breadth detects a Washington State driver's license with checksum validation.
It also requires the presence of related keywords.
See “Driver's License Number - WA State narrow breadth” on page 1142.

Driver's License Number - WA State wide breadth


The wide breadth detects a Washington State driver's license with no validation.

Table 45-263 Driver's License Number - WA State wide-breadth patterns

Pattern

\l{5}\l[A-Za-z*]\d{3}\w{2}

\l{4}[*]\l[A-Za-z*]\d{3}\w{2}

\l{3}[*]{2}\l[A-Za-z*]\d{3}\w{2}

\l{2}[*]{3}\l[A-Za-z*]\d{3}\w{2}

\l{1}[*]{4}\l[A-Za-z*]\d{3}\w{2}

The wide breadth of the Driver's License Number - WA State data identifier does not include
a validator.
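
The five patterns differ only in how many leading letters are present before asterisk padding fills the first five positions. A rough Python sketch of that structure follows. It assumes `\l` means one ASCII letter and narrows `\w` to ASCII letters and digits, neither of which is confirmed by this guide, and the function names are hypothetical.

```python
import re


def wa_layout_regexes() -> list:
    """Build one compiled regex per listed pattern (5 letters down to 1)."""
    regexes = []
    for letters in range(5, 0, -1):
        stars = 5 - letters
        # e.g. 3 letters + 2 asterisks corresponds to \l{3}[*]{2}
        prefix = f"[A-Za-z]{{{letters}}}"
        if stars:
            prefix += rf"\*{{{stars}}}"
        # \l[A-Za-z*]\d{3}\w{2}, with \w narrowed to letters and digits (assumption)
        tail = r"[A-Za-z][A-Za-z*]\d{3}[A-Za-z0-9]{2}"
        regexes.append(re.compile(prefix + tail))
    return regexes


def matches_wa_layout(candidate: str) -> bool:
    """True if the whole candidate fits any of the five layouts."""
    return any(rx.fullmatch(candidate) for rx in wa_layout_regexes())
```

Because the asterisks may appear only as trailing padding of the first five positions, a string such as LE*E* followed by the remaining characters is rejected by every layout.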

Driver's License Number - WA State medium breadth


The medium breadth detects a Washington State driver's license with checksum validation.

Table 45-264 Driver's License Number - WA State medium-breadth patterns

Pattern

\l{5}\l[A-Za-z*]\d{3}\w{2}

\l{4}[*]\l[A-Za-z*]\d{3}\w{2}

\l{3}[*]{2}\l[A-Za-z*]\d{3}\w{2}

\l{2}[*]{3}\l[A-Za-z*]\d{3}\w{2}

\l{1}[*]{4}\l[A-Za-z*]\d{3}\w{2}

Table 45-265 Driver's License Number - WA State medium-breadth validators

Mandatory validator Description

Driver's License Number - WA State Validation Check Computes the checksum and validates the pattern against
it.

Driver's License Number - WA State narrow breadth


The narrow breadth detects a Washington State driver's license with checksum validation. It
also requires the presence of related keywords.

Table 45-266 Driver's License Number - WA State narrow-breadth patterns

Pattern

\l{5}\l[A-Za-z*]\d{3}\w{2}

\l{4}[*]\l[A-Za-z*]\d{3}\w{2}

\l{3}[*]{2}\l[A-Za-z*]\d{3}\w{2}

\l{2}[*]{3}\l[A-Za-z*]\d{3}\w{2}

\l{1}[*]{4}\l[A-Za-z*]\d{3}\w{2}

Table 45-267 Driver's License Number - WA State narrow-breadth validators

Mandatory validator Description

Driver's License Number - WA State Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

driver license, drivers license, driver licenses, drivers
licenses, dl#, dls#, lic#, lics#, wash, washington, wa

Driver's License Number - WI State


The Driver's License Number - WI State is an identification number for an individual driver's
license issued by the US state of Wisconsin.
The Driver's License Number - WI State data identifier detects a 13-digit number that matches
the Driver's License Number - WI State format.
The Driver's License Number - WI State data identifier provides three breadths of detection.
■ The wide breadth detects a 13-digit number with ending-character exclusion validation.
See “Driver's License Number - WI State wide breadth” on page 1143.
■ The medium breadth detects a 13-digit number with ending-character exclusion and checksum
validation.
See “Driver's License Number - WI State medium breadth” on page 1143.
■ The narrow breadth detects a 13-digit number with ending-character exclusion and checksum
validation. It also requires the presence of related keywords.
See “Driver's License Number - WI State narrow breadth” on page 1144.

Driver's License Number - WI State wide breadth


The wide breadth detects a 13-digit number with ending-character exclusion validation.

Table 45-268 Driver's License Number - WI State wide-breadth patterns

Pattern

\l\d{3}-\d{4}-\d{4}-\d{2}

\l\d{13}

Table 45-269 Driver's License Number - WI State wide-breadth validator

Mandatory validator Description

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000000000, 1111111111111, 2222222222222,
3333333333333, 4444444444444, 5555555555555,
6666666666666, 7777777777777, 8888888888888,
9999999999999
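
The Exclude ending characters validator drops obvious test values such as a run of thirteen identical digits. The following Python is only a sketch of that idea. It assumes separators are ignored before the ending is compared, which the guide does not state, and the function name is hypothetical.

```python
# Ten excluded endings for this data identifier: "0" * 13 through "9" * 13.
WI_EXCLUDED_ENDINGS = [str(digit) * 13 for digit in range(10)]


def excluded_by_ending(match_text: str, excluded_endings: list) -> bool:
    """True if the match, with separators removed, ends in an excluded value."""
    # Assumption: hyphens and spaces are stripped before the comparison.
    normalized = "".join(ch for ch in match_text if ch.isalnum())
    return any(normalized.endswith(ending) for ending in excluded_endings)
```

A pattern match for which `excluded_by_ending` returns True would be discarded. All other matches continue to any remaining validators.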

Driver's License Number - WI State medium breadth


The medium breadth detects a 13-digit number with ending-character exclusion and checksum
validation.

Table 45-270 Driver's License Number - WI State medium-breadth patterns

Pattern

\l\d{3}-\d{4}-\d{4}-\d{2}

\l\d{13}

Table 45-271 Driver's License Number - WI State medium-breadth validators

Mandatory validator Description

Driver's License Number - WI State Validation Check Computes the checksum and validates the pattern against
it.

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000000000, 1111111111111, 2222222222222,
3333333333333, 4444444444444, 5555555555555,
6666666666666, 7777777777777, 8888888888888,
9999999999999

Driver's License Number - WI State narrow breadth


The narrow breadth detects a 13-digit number with ending-character exclusion and checksum
validation. It also requires the presence of related keywords.

Table 45-272 Driver's License Number - WI State narrow-breadth patterns

Pattern

\l\d{3}-\d{4}-\d{4}-\d{2}

\l\d{13}

Table 45-273 Driver's License Number - WI State narrow-breadth validators

Mandatory validator Description

Driver's License Number - WI State Validation Check Computes the checksum and validates the pattern against
it.

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000000000, 1111111111111, 2222222222222,
3333333333333, 4444444444444, 5555555555555,
6666666666666, 7777777777777, 8888888888888,
9999999999999

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

driver license, drivers license, driver licenses, drivers
licenses, dl#, dls#, lic#, lics#, wisc., wisconsin, wi

Drug Enforcement Agency (DEA) Number


A DEA number is a number assigned to a health care provider (such as a medical practitioner,
dentist, or veterinarian) by the U.S. Drug Enforcement Administration allowing them to write
prescriptions for controlled substances.
The Drug Enforcement Agency (DEA) Number data identifier detects an eight- or nine-character
alphanumeric pattern that matches the Drug Enforcement Agency (DEA) Number format.
The Drug Enforcement Agency (DEA) Number data identifier provides three breadths of
detection:
■ The wide breadth detects an eight- or nine-character alphanumeric pattern without validation.
See “Drug Enforcement Agency (DEA) Number wide breadth” on page 1145.
■ The medium breadth detects an eight- or nine-character alphanumeric pattern with ending
character exclusion and checksum validation.
See “Drug Enforcement Agency (DEA) Number medium breadth” on page 1146.
■ The narrow breadth detects an eight- or nine-character alphanumeric pattern with ending
character exclusion and checksum validation. It also requires the presence of related
keywords.
See “Drug Enforcement Agency (DEA) Number narrow breadth” on page 1146.

Drug Enforcement Agency (DEA) Number wide breadth


The wide breadth detects an eight- or nine-character alphanumeric pattern without validation.

Table 45-274 Drug Enforcement Agency (DEA) Number wide-breadth patterns

Pattern

[ABFGMPR]\l\d{7}

[ABFGMPR]\d{8}

The wide breadth of the Drug Enforcement Agency (DEA) Number data identifier includes no
validators.

Drug Enforcement Agency (DEA) Number medium breadth


The medium breadth detects an eight- or nine-character alphanumeric pattern with ending
character exclusion and checksum validation.

Table 45-275 Drug Enforcement Agency (DEA) Number medium-breadth patterns

Pattern

[ABFGMPR]\l\d{7}

[ABFGMPR]\d{8}

Table 45-276 Drug Enforcement Agency (DEA) Number medium-breadth validators

Mandatory validator Description

Drug Enforcement Agency Number Validation Check Computes the checksum and validates the pattern against
it.

Exclude ending characters Data ending with any of the following list of values is not
matched:

5555555, 55555555
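
This guide does not describe the checksum used by the Drug Enforcement Agency Number Validation Check. As a sketch, the Python below assumes the widely published DEA check-digit rule: add the first, third, and fifth digits to twice the sum of the second, fourth, and sixth digits, and compare the last digit of the total with the seventh digit. Applying the rule to the final seven digits of both listed patterns is also an assumption, and the function name is hypothetical.

```python
import re

# The two listed patterns, with \l read as one ASCII letter (assumption).
DEA_RE = re.compile(r"[ABFGMPR][A-Za-z]\d{7}|[ABFGMPR]\d{8}")


def dea_checksum_ok(candidate: str) -> bool:
    """True if candidate fits a listed pattern and passes the assumed check."""
    if not DEA_RE.fullmatch(candidate):
        return False
    d = [int(ch) for ch in candidate[-7:]]  # the final seven digits
    total = (d[0] + d[2] + d[4]) + 2 * (d[1] + d[3] + d[5])
    return total % 10 == d[6]
```

For AB1234563, the odd-position digits sum to 9 and the even-position digits sum to 12, giving 9 + 24 = 33. The last digit of the total, 3, equals the seventh digit, so the candidate passes.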

Drug Enforcement Agency (DEA) Number narrow breadth


The narrow breadth detects an eight- or nine-character alphanumeric pattern with ending
character exclusion and checksum validation. It also requires the presence of related keywords.

Table 45-277 Drug Enforcement Agency (DEA) Number narrow-breadth patterns

Pattern

[ABFGMPR]\l\d{7}

[ABFGMPR]\d{8}

Table 45-278 Drug Enforcement Agency (DEA) Number narrow-breadth validators

Mandatory validator Description

Drug Enforcement Agency Number Validation Check Computes the checksum and validates the pattern against
it.

Exclude ending characters Data ending with any of the following list of values is not
matched:

5555555, 55555555

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

dea number, DEA, DEA no., DEA Registration Number,
DEA registration no., DEA#, DEA No#, Drug Enforcement Agency Number,
Drug Enforcement Agency No.

Estonia Driver's Licence Number


The Estonian Road Administration issues driving licenses in Estonia, confirming the rights of
the holder to drive motor vehicles.
The Estonia Driver's Licence Number data identifier detects an eight-character alphanumeric
pattern that matches the Estonia Driver's Licence Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern that matches the Estonia
Driver's Licence Number format. It checks for common test patterns.
See “Estonia Driver's Licence Number wide breadth” on page 1147.
■ The narrow breadth detects an eight-character alphanumeric pattern that matches the
Estonia Driver's Licence Number format. It checks for common test patterns, and also
requires the presence of related keywords.
See “Estonia Driver's Licence Number narrow breadth” on page 1148.

Estonia Driver's Licence Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern that matches the Estonia
Driver's Licence Number format. It checks for common test patterns.

Table 45-279 Estonia Driver's Licence Number wide-breadth patterns

Pattern

[Ee][A-Za-z]\d{6}

Table 45-280 Estonia Driver's Licence Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555,
666666, 777777, 888888, 999999

Estonia Driver's Licence Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern that matches the Estonia
Driver's Licence Number format. It checks for common test patterns, and also requires the
presence of related keywords.

Table 45-281 Estonia Driver's Licence Number narrow-breadth patterns

Pattern

[Ee][A-Za-z]\d{6}

Table 45-282 Estonia Driver's Licence Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555,
666666, 777777, 888888, 999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

driver license, driver licence, drivers license, drivers
licence, driving license, driving licence, driver license
number, driver licence number, drivers license number,
drivers licence number, driving license number, driving
licence number, driver's license, driver's licence,
Driver's License, Driver's Licence, driver's license
number, driver's licence number, Driver's License
Number, Driver's Licence Number, DLNo#, dlno#,
drivers lic., driver permit, drivers permit, driving permit,
license number, licence number, licence

juhiluba, JUHILUBA, juhiluba number, juhiloa number,
Juhiluba, juhi litsentsi number

Estonia Passport Number


The Estonian passport is an international travel document issued to citizens of Estonia that
also serves as proof of Estonian citizenship. The Border Guard Board in Estonia and Estonian
foreign representations abroad are responsible for issuing Estonian passports.
The Estonia Passport Number data identifier detects an eight- or nine-character alphanumeric
pattern that matches the Estonia Passport Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight- or nine-character alphanumeric pattern that matches
the Estonia Passport Number format. It checks for common test patterns.
See “Estonia Passport Number wide breadth” on page 1149.
■ The narrow breadth detects an eight- or nine-character alphanumeric pattern that matches
the Estonia Passport Number format. It checks for common test patterns, and also requires
the presence of related keywords.
See “Estonia Passport Number narrow breadth” on page 1150.

Estonia Passport Number wide breadth


The wide breadth detects an eight- or nine-character alphanumeric pattern that matches the
Estonia Passport Number format. It checks for common test patterns.

Table 45-283 Estonia Passport Number wide-breadth patterns

Pattern

[Kk][A-Za-z]\d{7}

[Kk]\d{7}

[Vv][A-Za-z]\d{7}

[Vv]\d{7}

Table 45-284 Estonia Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999

Estonia Passport Number narrow breadth


The narrow breadth detects an eight- or nine-character alphanumeric pattern that matches
the Estonia Passport Number format. It checks for common test patterns, and also requires
the presence of related keywords.

Table 45-285 Estonia Passport Number narrow-breadth patterns

Pattern

[Kk][A-Za-z]\d{7}

[Kk]\d{7}

[Vv][A-Za-z]\d{7}

[Vv]\d{7}

Table 45-286 Estonia Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.


Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

Passport, passport number, passport, passport no,
passport#, passportno, Passport No., Passport No,
PASSPORT, Pass, pass, passi number, pass nr, pass#,
Pass nr, Eesti passi number

Estonia Personal Identification Code


In Estonia, the personal identification code is a number based on the sex and birth date of a
person. This code is used as a unique personal identifier by governmental and other systems
where identification is required, as well as for digital signatures using the national identity card
and its associated certificates. It also serves as a tax identification number.
The Estonia Personal Identification Code data identifier detects an 11-digit number that matches
the Estonia Personal Identification Code format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-digit number that matches the Estonia Personal Identification
Code format without checksum validation. It checks for common test numbers.
See “Estonia Personal Identification Code wide breadth” on page 1152.
■ The medium breadth detects an 11-digit number that matches the Estonia Personal
Identification Code format with checksum validation.
See “Estonia Personal Identification Code medium breadth” on page 1152.
■ The narrow breadth detects an 11-digit number that matches the Estonia Personal
Identification Code format with checksum validation. It also requires the presence of related
keywords.
See “Estonia Personal Identification Code narrow breadth” on page 1153.

Estonia Personal Identification Code wide breadth


The wide breadth detects an 11-digit number that matches the Estonia Personal Identification
Code format without checksum validation. It checks for common test numbers.

Table 45-287 Estonia Personal Identification Code wide-breadth patterns

Pattern

\d{3}[01]\d[0123]\d{5}

\d \d{2}[01]\d[0123]\d \d{4}

\d \d{2}[01]\d[0123]\d{4} \d

Table 45-288 Estonia Personal Identification Code wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of numbers is not all the same.
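
The two wide-breadth validators can be pictured as filters applied to each raw pattern match. The sketch below shows one plausible reading in Python: Number delimiter rejects a match that touches another letter or digit, and Duplicate digits rejects a match whose digits are all identical. The exact rules the product applies are not given in this guide, so both checks and all function names are assumptions.

```python
import re

# The three listed wide-breadth patterns combined into one alternation.
PIC_RE = re.compile(
    r"\d{3}[01]\d[0123]\d{5}"
    r"|\d \d{2}[01]\d[0123]\d \d{4}"
    r"|\d \d{2}[01]\d[0123]\d{4} \d"
)


def is_delimited(text: str, start: int, end: int) -> bool:
    """Assumed rule: the characters around the match are not letters or digits."""
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""
    return not before.isalnum() and not after.isalnum()


def all_same_digit(value: str) -> bool:
    """True if every digit in the value is the same digit."""
    digits = [ch for ch in value if ch.isdigit()]
    return len(set(digits)) == 1


def wide_matches(text: str) -> list:
    """Return pattern matches that survive both assumed validators."""
    results = []
    for m in PIC_RE.finditer(text):
        if is_delimited(text, m.start(), m.end()) and not all_same_digit(m.group()):
            results.append(m.group())
    return results
```

In this reading the validators only remove matches. They never add any, so the wide breadth stays a superset of the medium and narrow breadths.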

Estonia Personal Identification Code medium breadth


The medium breadth detects an 11-digit number that matches the Estonia Personal Identification
Code format with checksum validation.

Table 45-289 Estonia Personal Identification Code medium-breadth patterns

Pattern

\d{3}[01]\d[0123]\d{5}

\d \d{2}[01]\d[0123]\d \d{4}

\d \d{2}[01]\d[0123]\d{4} \d

Table 45-290 Estonia Personal Identification Code medium-breadth validators

Mandatory validator Description

Estonia Personal Identification Number Check Computes the checksum and validates the pattern against
it.
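
The checksum behind the Estonia Personal Identification Number Check is not described in this guide. The sketch below assumes the publicly documented two-stage modulus-11 rule for the Estonian personal code: weights 1 through 9 and then 1 are tried first, weights 3 through 9 and then 1, 2, 3 are tried if the remainder is 10, and the check digit is 0 if the second remainder is also 10. The function name and the removal of spaces before checking are assumptions for illustration.

```python
FIRST_WEIGHTS = (1, 2, 3, 4, 5, 6, 7, 8, 9, 1)
SECOND_WEIGHTS = (3, 4, 5, 6, 7, 8, 9, 1, 2, 3)


def estonia_pic_checksum_ok(candidate: str) -> bool:
    """True if the 11 digits pass the assumed two-stage modulus-11 check."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    if len(digits) != 11:
        return False

    def remainder(weights) -> int:
        # zip stops after ten weights, so the check digit itself is excluded.
        return sum(w * d for w, d in zip(weights, digits)) % 11

    check = remainder(FIRST_WEIGHTS)
    if check == 10:
        check = remainder(SECOND_WEIGHTS)
        if check == 10:
            check = 0
    return check == digits[10]
```

For 37605030299, the first weighted sum is 108 and 108 mod 11 is 9, which equals the final digit, so the value passes on the first stage.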

Estonia Personal Identification Code narrow breadth


The narrow breadth detects an 11-digit number that matches the Estonia Personal Identification
Code format with checksum validation. It checks for common test numbers, and also requires
the presence of related keywords.

Table 45-291 Estonia Personal Identification Code narrow-breadth patterns

Pattern

\d{3}[01]\d[0123]\d{5}

\d \d{2}[01]\d[0123]\d \d{4}

\d \d{2}[01]\d[0123]\d{4} \d

Table 45-292 Estonia Personal Identification Code narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of numbers is not all the same.

Estonia Personal Identification Number Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

national ID, national identification number, personal
ID, personal identification number, nationalid#,
personalid#, isikukood, isikukood#, IK, IK#, personal
identification code, PID#, maksu ID,
maksukohustuslase identifitseerimisnumber,
maksukood, tax id, tax number, tax identification
number, tax code, maksukood#, maksuID#, taxpayer
id, taxpayer identification number, maksumaksja kood,
maksumaksja identifitseerimisnumber

Estonia Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. For Estonia, VAT is
administered by the tax office for the region in which the business is established.

The Estonia Value Added Tax (VAT) Number data identifier detects an 11-character
alphanumeric pattern that matches the Estonia VAT Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-character alphanumeric pattern that matches the Estonia
VAT Number format without checksum validation. It checks for common test patterns.
See “Estonia Value Added Tax (VAT) Number wide breadth” on page 1154.
■ The medium breadth detects an 11-character alphanumeric pattern that matches the Estonia
VAT Number format with checksum validation.
See “Estonia Value Added Tax (VAT) Number medium breadth” on page 1154.
■ The narrow breadth detects an 11-character alphanumeric pattern that matches the Estonia
VAT Number format with checksum validation. It checks for common test patterns, and
also requires the presence of related keywords.
See “Estonia Value Added Tax (VAT) Number narrow breadth” on page 1155.

Estonia Value Added Tax (VAT) Number wide breadth


The wide breadth detects an 11-character alphanumeric pattern that matches the Estonia VAT
Number format without checksum validation. It checks for common test patterns.

Table 45-293 Estonia Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Ee][Ee]\d{9}

[Ee][Ee] \d{9}

Table 45-294 Estonia Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999
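
The interaction between the two wide-breadth patterns and the two validators can be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: the name ee_vat_wide_candidates is hypothetical, and the Number delimiter check is assumed to mean that no letter or digit directly touches the match.

```python
import re

# The two wide-breadth patterns: "EE" plus nine digits, with an optional space.
EE_VAT_PATTERN = re.compile(r"[Ee][Ee] ?\d{9}")

# Endings listed for the Exclude ending characters validator.
EXCLUDED_ENDINGS = tuple(str(d) * 9 for d in range(10))


def ee_vat_wide_candidates(text: str) -> list[str]:
    """Return pattern matches that survive both wide-breadth validators."""
    results = []
    for m in EE_VAT_PATTERN.finditer(text):
        # Number delimiter (assumed behavior): reject matches that are
        # embedded in a longer run of letters or digits.
        before = text[m.start() - 1] if m.start() > 0 else " "
        after = text[m.end()] if m.end() < len(text) else " "
        if before.isalnum() or after.isalnum():
            continue
        # Exclude ending characters: drop common test values.
        if m.group().endswith(EXCLUDED_ENDINGS):
            continue
        results.append(m.group())
    return results
```

For example, in the text "VAT EE100931558 and EE 111111111" only the first value would be kept, because the second ends with an excluded value.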

Estonia Value Added Tax (VAT) Number medium breadth


The medium breadth detects an 11-character alphanumeric pattern that matches the Estonia
VAT Number format with checksum validation.

Table 45-295 Estonia Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[Ee][Ee]\d{9}

[Ee][Ee] \d{9}

Table 45-296 Estonia Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Estonia Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against
it.
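
The guide does not describe the checksum itself. The sketch below uses the commonly published check-digit rule for Estonian VAT numbers (weights 3, 7, 1 repeated across the nine digits, with the weighted sum divisible by 10). Treat it as an assumption about what the validator computes; the function name ee_vat_checksum_ok is hypothetical.

```python
def ee_vat_checksum_ok(number: str) -> bool:
    """Check an Estonian VAT number such as 'EE100931558' or 'EE 100931558'.

    Assumption: weighted sum with weights 3,7,1,... must be divisible by 10.
    """
    compact = number.replace(" ", "").upper()
    if not compact.startswith("EE"):
        return False
    digits = compact[2:]
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        return False
    weights = (3, 7, 1, 3, 7, 1, 3, 7, 1)
    total = sum(int(d) * w for d, w in zip(digits, weights))
    return total % 10 == 0
```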

Estonia Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects an 11-character alphanumeric pattern that matches the Estonia
VAT Number format with checksum validation. It checks for common test patterns, and also
requires the presence of related keywords.

Table 45-297 Estonia Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Ee][Ee]\d{9}

[Ee][Ee] \d{9}

Table 45-298 Estonia Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999

Estonia Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat, vat number, vat#, käibemaksu
registreerimisnumber, käibemaksu, value added tax
number, Käibemaksu number, käibemaks, käibemaks#,
käibemaksu#, vat registration number

European Health Insurance Card Number


The European Health Insurance Card (EHIC) allows anyone insured by or covered by a
statutory social security scheme of the European Economic Area countries and Switzerland
to receive medical treatment in another member state free or at a reduced cost.
The European Health Insurance Card Number data identifier detects a 20-digit number that
matches the European Health Insurance Card Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 20-digit number that matches the European Health Insurance
Card Number format. It checks for common test numbers.
See “European Health Insurance Card Number wide breadth” on page 1156.
■ The narrow breadth detects a 20-digit number that matches the European Health Insurance
Card Number format. It checks for common test numbers, and also requires the presence
of related keywords.
See “European Health Insurance Card Number narrow breadth” on page 1160.

European Health Insurance Card Number wide breadth


The wide breadth detects a 20-digit number that matches the European Health Insurance Card
Number format. It checks for common test numbers.

Table 45-299 European Health Insurance Card Number wide-breadth patterns

Pattern

80040\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80826\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

38500\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80203\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

60189\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80246\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80276\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80300\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80021\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80380\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80440\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80442\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

30066\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80620\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80703\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80724\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80752\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80756\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80616\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

Table 45-300 European Health Insurance Card Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.


Exclude ending characters Data ending with any of the following list of values is not
matched:

80040000000000000000, 80040111111111111111,
80040222222222222222, 80040333333333333333,
80040444444444444444, 80040555555555555555,
80040666666666666666, 80040777777777777777,
80040888888888888888, 80040999999999999999

80826000000000000000, 80826111111111111111,
80826222222222222222, 80826333333333333333,
80826444444444444444, 80826555555555555555,
80826666666666666666, 80826777777777777777,
80826888888888888888, 80826999999999999999

38500000000000000000, 38500111111111111111,
38500222222222222222, 38500333333333333333,
38500444444444444444, 38500555555555555555,
38500666666666666666, 38500777777777777777,
38500888888888888888, 38500999999999999999

80203000000000000000, 80203111111111111111,
80203222222222222222, 80203333333333333333,
80203444444444444444, 80203555555555555555,
80203666666666666666, 80203777777777777777,
80203888888888888888, 80203999999999999999

60189000000000000000, 60189111111111111111,
60189222222222222222, 60189333333333333333,
60189444444444444444, 60189555555555555555,
60189666666666666666, 60189777777777777777,
60189888888888888888, 60189999999999999999

80246000000000000000, 80246111111111111111,
80246222222222222222, 80246333333333333333,
80246444444444444444, 80246555555555555555,
80246666666666666666, 80246777777777777777,
80246888888888888888, 80246999999999999999

80276000000000000000, 80276111111111111111,
80276222222222222222, 80276333333333333333,
80276444444444444444, 80276555555555555555,
80276666666666666666, 80276777777777777777,
80276888888888888888, 80276999999999999999

80300000000000000000, 80300111111111111111,
80300222222222222222, 80300333333333333333,
80300444444444444444, 80300555555555555555,
80300666666666666666, 80300777777777777777,
80300888888888888888, 80300999999999999999

80021000000000000000, 80021111111111111111,
80021222222222222222, 80021333333333333333,
80021444444444444444, 80021555555555555555,
80021666666666666666, 80021777777777777777,
80021888888888888888, 80021999999999999999

80380000000000000000, 80380111111111111111,
80380222222222222222, 80380333333333333333,
80380444444444444444, 80380555555555555555,
80380666666666666666, 80380777777777777777,
80380888888888888888, 80380999999999999999

80440000000000000000, 80440111111111111111,
80440222222222222222, 80440333333333333333,
80440444444444444444, 80440555555555555555,
80440666666666666666, 80440777777777777777,
80440888888888888888, 80440999999999999999

80442000000000000000, 80442111111111111111,
80442222222222222222, 80442333333333333333,
80442444444444444444, 80442555555555555555,
80442666666666666666, 80442777777777777777,
80442888888888888888, 80442999999999999999

30066000000000000000, 30066111111111111111,
30066222222222222222, 30066333333333333333,
30066444444444444444, 30066555555555555555,
30066666666666666666, 30066777777777777777,
30066888888888888888, 30066999999999999999
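
The excluded values follow a regular shape: a five-digit prefix followed by one digit repeated 15 times. The following sketch regenerates such a list for the 13 prefixes that appear in the exclusion list. The names LISTED_PREFIXES, excluded_test_numbers, and is_excluded are hypothetical, and the code only illustrates the structure of the list; it is not how the product stores it.

```python
# Prefixes that appear in the exclusion list (13 of the 19 pattern prefixes).
LISTED_PREFIXES = (
    "80040", "80826", "38500", "80203", "60189", "80246", "80276",
    "80300", "80021", "80380", "80440", "80442", "30066",
)


def excluded_test_numbers(prefixes=LISTED_PREFIXES):
    """Build prefix + one digit repeated 15 times, for digits 0 through 9."""
    return [prefix + str(d) * 15 for prefix in prefixes for d in range(10)]


def is_excluded(candidate: str) -> bool:
    """Return True if the candidate ends with one of the generated values."""
    return candidate.endswith(tuple(excluded_test_numbers()))
```

Generating the values this way also makes it easy to confirm that every entry is exactly 20 digits long.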

European Health Insurance Card Number narrow breadth


The narrow breadth detects a 20-digit number that matches the European Health Insurance
Card Number format. It checks for common test numbers, and also requires the presence of
related keywords.

Table 45-301 European Health Insurance Card Number narrow-breadth patterns

Pattern

80040\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80826\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

38500\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80203\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

60189\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80246\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80276\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80300\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80021\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80380\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80440\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80442\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

30066\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80620\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80703\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80724\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80752\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80756\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

80616\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

Table 45-302 European Health Insurance Card Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.


Exclude ending characters Data ending with any of the following list of values is not
matched:

80040000000000000000, 80040111111111111111,
80040222222222222222, 80040333333333333333,
80040444444444444444, 80040555555555555555,
80040666666666666666, 80040777777777777777,
80040888888888888888, 80040999999999999999

80826000000000000000, 80826111111111111111,
80826222222222222222, 80826333333333333333,
80826444444444444444, 80826555555555555555,
80826666666666666666, 80826777777777777777,
80826888888888888888, 80826999999999999999

38500000000000000000, 38500111111111111111,
38500222222222222222, 38500333333333333333,
38500444444444444444, 38500555555555555555,
38500666666666666666, 38500777777777777777,
38500888888888888888, 38500999999999999999

80203000000000000000, 80203111111111111111,
80203222222222222222, 80203333333333333333,
80203444444444444444, 80203555555555555555,
80203666666666666666, 80203777777777777777,
80203888888888888888, 80203999999999999999

60189000000000000000, 60189111111111111111,
60189222222222222222, 60189333333333333333,
60189444444444444444, 60189555555555555555,
60189666666666666666, 60189777777777777777,
60189888888888888888, 60189999999999999999

80246000000000000000, 80246111111111111111,
80246222222222222222, 80246333333333333333,
80246444444444444444, 80246555555555555555,
80246666666666666666, 80246777777777777777,
80246888888888888888, 80246999999999999999

80276000000000000000, 80276111111111111111,
80276222222222222222, 80276333333333333333,
80276444444444444444, 80276555555555555555,
80276666666666666666, 80276777777777777777,
80276888888888888888, 80276999999999999999

80300000000000000000, 80300111111111111111,
80300222222222222222, 80300333333333333333,
80300444444444444444, 80300555555555555555,
80300666666666666666, 80300777777777777777,
80300888888888888888, 80300999999999999999

80021000000000000000, 80021111111111111111,
80021222222222222222, 80021333333333333333,
80021444444444444444, 80021555555555555555,
80021666666666666666, 80021777777777777777,
80021888888888888888, 80021999999999999999

80380000000000000000, 80380111111111111111,
80380222222222222222, 80380333333333333333,
80380444444444444444, 80380555555555555555,
80380666666666666666, 80380777777777777777,
80380888888888888888, 80380999999999999999

80440000000000000000, 80440111111111111111,
80440222222222222222, 80440333333333333333,
80440444444444444444, 80440555555555555555,
80440666666666666666, 80440777777777777777,
80440888888888888888, 80440999999999999999

80442000000000000000, 80442111111111111111,
80442222222222222222, 80442333333333333333,
80442444444444444444, 80442555555555555555,
80442666666666666666, 80442777777777777777,
80442888888888888888, 80442999999999999999

30066000000000000000, 30066111111111111111,
30066222222222222222, 30066333333333333333,
30066444444444444444, 30066555555555555555,
30066666666666666666, 30066777777777777777,
30066888888888888888, 30066999999999999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

medical account number, health insurance card
number, insurance card number, health card, health
card number, ehic number, ehic, ehic#, numero conto
medico, tessera sanitaria assicurazione numero, carta
assicurazione numero, Krankenversicherungsnummer,
assicurazione sanitaria numero, medisch
rekeningnummer, ziekteverzekeringskaartnummer,
verzekerings kaart nummer, gezondheidskaart
nummer, gezondheidskaart, medizinische
Kontonummer, Krankenversicherungskarte Nummer,
Versicherungsnummer, Gesundheitskarte Nummer,
Gesundheitskarte, arstliku konto number,
ravikindlustuse kaardi number, tervisekaart,
tervisekaardi number, Uimhir ehic, tarjeta salud, broj
kartice zdravstvenog osiguranja, kartice osiguranja
broj, zdravstvenu karticu, zdravstvene kartice broj,
ehic broj, numero tessera sanitaria, numero carta di
assicurazione, tessera sanitaria, numero ehic,
Gesondheetskaart, ehic nummer, numer rachunku
medycznego, numer karty ubezpieczenia zdrowotne,
numer karty ubezpieczenia, karta zdrowia, numer karty
zdrowia, numer ehic, sairausvakuutuskortin numero,
vakuutuskortin numero, terveyskortti, terveyskortin
numero, medicinsk kontonummer, ehic numeris,
medizinescher Konto Nummer, zdravstvena izkaznica

Finland Driver's Licence Number


The Finland Driver's Licence Number is a 10-character alphanumeric identifier assigned to an
individual Finnish driver's license.
The Finland Driver's Licence Number data identifier detects a 10-character alphanumeric
pattern that matches the Finland Driver's Licence Number format.
The Finland Driver's Licence Number data identifier offers three breadths of detection:
■ The wide breadth detects a 10-character alphanumeric pattern without checksum validation.
See “Finland Driver's Licence Number wide breadth” on page 1166.
■ The medium breadth detects a 10-character alphanumeric pattern with checksum validation.
See “Finland Driver's Licence Number medium breadth” on page 1166.
■ The narrow breadth detects a 10-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.
See “Finland Driver's Licence Number narrow breadth” on page 1166.

Finland Driver's Licence Number wide breadth


The wide breadth detects a 10-character alphanumeric pattern without checksum validation.

Table 45-303 Finland Driver's Licence Number wide-breadth patterns

Patterns

\d{6}-\d{4}

\d{6}-\d{3}\l

Table 45-304 Finland Driver's Licence Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Finland Driver's Licence Number medium breadth


The medium breadth detects a 10-character alphanumeric pattern with checksum validation.

Table 45-305 Finland Driver's Licence Number medium-breadth patterns

Patterns

\d{6}-\d{4}

\d{6}-\d{3}\l

Table 45-306 Finland Driver's Licence Number medium-breadth validator

Mandatory validator Description

Finland Driver's Licence Number Validation Check Computes the checksum and validates the pattern against
it.

Finland Driver's Licence Number narrow breadth


The narrow breadth detects a 10-character alphanumeric pattern with checksum validation. It
also requires the presence of related keywords.

Table 45-307 Finland Driver's Licence Number narrow-breadth patterns

Patterns

\d{6}-\d{4}

\d{6}-\d{3}\l

Table 45-308 Finland Driver's Licence Number narrow-breadth validators

Mandatory validators Description

Finland Driver's Licence Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

driver license, driver license number, drivers lic.,
drivers license, drivers license number, driving license
number, DLNo#, dlno#, driving license

permis de conduire, ajokortti, ajokortin numero,
kuljettaja lic., körkort, körkort nummer, förare lic.

Finland European Health Insurance Number


The Finland European Health Insurance Number is a unique 20-digit numeric identifier that is
assigned to every person who uses health services in Finland.
The Finland European Health Insurance Number data identifier detects a 20-digit number that
matches the Finland European Health Insurance Number format.
The Finland European Health Insurance Number data identifier provides two breadths of
detection:
■ The wide breadth detects a 20-digit number without checksum validation.
See “Finland European Health Insurance Number wide breadth” on page 1168.
■ The narrow breadth detects a 20-digit number without checksum validation. It requires the
presence of related keywords.
See “Finland European Health Insurance Number narrow breadth” on page 1168.

Finland European Health Insurance Number wide breadth


The wide breadth detects a 20-digit number without checksum validation.

Table 45-309 Finland European Health Insurance Number wide-breadth patterns

Patterns

8024680246\d{10}

8024680246[- ]\d{10}

Table 45-310 Finland European Health Insurance Number wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

80246802460000000000, 80246802461111111111,
80246802462222222222, 80246802463333333333,
80246802464444444444, 80246802465555555555,
80246802466666666666, 80246802467777777777,
80246802468888888888, 80246802469999999999

Finland European Health Insurance Number narrow breadth


The narrow breadth detects a 20-digit number without checksum validation. It requires the
presence of related keywords.

Table 45-311 Finland European Health Insurance Number narrow-breadth patterns

Patterns

8024680246\d{10}

8024680246[- ]\d{10}

Table 45-312 Finland European Health Insurance Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.


Exclude beginning characters Data beginning with any of the following list of values is
not matched:

80246802460000000000, 80246802461111111111,
80246802462222222222, 80246802463333333333,
80246802464444444444, 80246802465555555555,
80246802466666666666, 80246802467777777777,
80246802468888888888, 80246802469999999999

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Suomi EHIC-numero, health insurance card,
Sairausvakuutuskortti, sairaanhoitokortin,
Sjukförsäkringskort, ehic, sairaanhoitokortin, Finland
health insurance card, Suomen sairausvakuutuskortti,
Finska sjukförsäkringskort, health card number,
Terveyskortti, Hälsokort, health card,
FinlandEHICNumber#, ehic#, EHIC,
sairausvakuutusnumero, health insurance number,
sjukförsäkring nummer, EHIC#

Finland Passport Number


Finnish passports are issued to nationals of Finland for the purpose of international travel.
They also facilitate the process of securing assistance from Finnish consular officials abroad.
The Finland Passport Number data identifier detects a nine-character alphanumeric pattern that
matches the Finland Passport Number format.
The Finland Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects a nine-character alphanumeric pattern without checksum validation.
See “Finland Passport Number wide breadth” on page 1170.
■ The narrow breadth detects a nine-character alphanumeric pattern without checksum validation.
It requires the presence of related keywords.
See “Finland Passport Number narrow breadth” on page 1170.

Finland Passport Number wide breadth


The wide breadth detects a nine-character alphanumeric pattern without checksum validation.

Table 45-313 Finland Passport Number wide-breadth pattern

Pattern

[A-Za-z]{2}\d{7}

Table 45-314 Finland Passport Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Finland Passport Number narrow breadth


The narrow breadth detects a nine-character alphanumeric pattern without checksum validation. It
requires the presence of related keywords.

Table 45-315 Finland Passport Number narrow-breadth pattern

Pattern

[A-Za-z]{2}\d{7}

Table 45-316 Finland Passport Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.


Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

finland passport number, finland passport no., finland
passport no#, finland passport#, finland passport
number#

Suomen passin numero, suomalainen passi, passin
numero, passin numero.#, passin numero#

passport number, passport no., passport no#,
passport#, passport number#

passin numero, passin numero., passin numero#,
passi#
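
How the narrow-breadth pattern and validators combine can be sketched as follows. This is a simplified illustration, not the product's detection engine: the function name finland_passport_narrow is hypothetical, only a few of the listed keywords are included, keyword matching is assumed to be case-insensitive over the whole text, and the Number delimiter check is assumed to reject matches that touch another letter or digit.

```python
import re

# Narrow-breadth pattern: two letters followed by seven digits.
PASSPORT_PATTERN = re.compile(r"[A-Za-z]{2}\d{7}")

# A small subset of the keywords listed in the table above.
KEYWORDS = (
    "finland passport number",
    "passport number",
    "passin numero",
    "suomalainen passi",
)


def finland_passport_narrow(text: str) -> list[str]:
    """Return passport-like values only when a keyword is also present."""
    lowered = text.lower()
    # Find keywords (assumed: a keyword anywhere in the text is enough).
    if not any(keyword in lowered for keyword in KEYWORDS):
        return []
    hits = []
    for m in PASSPORT_PATTERN.finditer(text):
        # Number delimiter (assumed): the match must not be embedded
        # in a longer alphanumeric run.
        before = text[m.start() - 1] if m.start() > 0 else " "
        after = text[m.end()] if m.end() < len(text) else " "
        if not (before.isalnum() or after.isalnum()):
            hits.append(m.group())
    return hits
```

With these assumptions, "Passin numero: XP1234567" yields one match, while the same value without any keyword nearby yields none.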

Finland Tax Identification Number


Finland issues a tax identification number for persons who have obligations to declare taxes
in Finland.
The Finland Tax Identification Number data identifier detects an 8- or 11-character alphanumeric
pattern that matches the Finland Tax Identification Number format.
The Finland Tax Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects an 8- or 11-character alphanumeric pattern without checksum
validation.
See “Finland Tax Identification Number wide breadth” on page 1171.
■ The medium breadth detects an 8- or 11-character alphanumeric pattern with checksum
validation.
See “Finland Tax Identification Number medium breadth” on page 1172.
■ The narrow breadth detects an 8- or 11-character alphanumeric pattern with checksum
validation. It also requires the presence of related keywords.
See “Finland Tax Identification Number narrow breadth” on page 1172.

Finland Tax Identification Number wide breadth


The wide breadth detects an 8- or 11-character alphanumeric pattern without checksum
validation.

Table 45-317 Finland Tax Identification Number wide-breadth patterns

Patterns

\d{6}[Aa+-]\d{3}\w

\d{7}[-]\d

Table 45-318 Finland Tax Identification Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Finland Tax Identification Number medium breadth


The medium breadth detects an 8- or 11-character alphanumeric pattern with checksum
validation.

Table 45-319 Finland Tax Identification Number medium-breadth patterns

Patterns

\d{6}[Aa+-]\d{3}\w

\d{7}[-]\d

Table 45-320 Finland Tax Identification Number medium-breadth validator

Mandatory validator Description

Finland Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Finland Tax Identification Number narrow breadth


The narrow breadth detects an 8- or 11-character alphanumeric pattern with checksum
validation. It also requires the presence of related keywords.

Table 45-321 Finland Tax Identification Number narrow-breadth patterns

Patterns

\d{6}[Aa+-]\d{3}\w

\d{7}[-]\d

Table 45-322 Finland Tax Identification Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Finland Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

tax identification number, tax number, tax id, taxid#,
taxnumber#

verotunniste, verokortti, verotunnus, veronumero

Finland Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process.
The Finland Value Added Tax (VAT) Number data identifier detects a 10-character alphanumeric
pattern that matches the Finland Value Added Tax (VAT) Number format.
The Finland Value Added Tax (VAT) Number data identifier provides three breadths of detection:
■ The wide breadth detects a 10-character alphanumeric pattern beginning with FI without
checksum validation.
See “Finland Value Added Tax (VAT) Number wide breadth” on page 1173.
■ The medium breadth detects a 10-character alphanumeric pattern beginning with FI with
checksum validation.
See “Finland Value Added Tax (VAT) Number medium breadth” on page 1174.
■ The narrow breadth detects a 10-character alphanumeric pattern beginning with FI with
checksum validation. It also requires the presence of related keywords.
See “Finland Value Added Tax (VAT) Number narrow breadth” on page 1175.

Finland Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 10-character alphanumeric pattern beginning with FI without
checksum validation.

Table 45-323 Finland Value Added Tax (VAT) Number wide-breadth patterns

Patterns

[Ff][Ii]\d{8}

[Ff][Ii] \d{8}

[Ff][Ii]\d{7}-\d

[Ff][Ii] \d{7}-\d

Table 45-324 Finland Value Added Tax (VAT) Number wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

Finland Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 10-character alphanumeric pattern beginning with FI with
checksum validation.

Table 45-325 Finland Value Added Tax (VAT) Number medium-breadth patterns

Patterns

[Ff][Ii]\d{8}

[Ff][Ii] \d{8}

[Ff][Ii]\d{7}-\d

[Ff][Ii] \d{7}-\d

Table 45-326 Finland Value Added Tax (VAT) Number medium-breadth validator

Mandatory validator Description

Finland VAT Number Validation Check Computes the checksum and validates the pattern against
it.
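
The guide does not specify the checksum. As a hedged illustration, the sketch below applies the commonly published rule for Finnish VAT numbers: the eight digits are weighted 7, 9, 10, 5, 8, 4, 2, 1 and the weighted sum must be divisible by 11. The function name fi_vat_checksum_ok is hypothetical, and it is an assumption that the product validator uses this rule.

```python
import re


def fi_vat_checksum_ok(number: str) -> bool:
    """Check a Finnish VAT number such as 'FI20774740' or 'FI 2077474-0'.

    Assumption: weighted sum (7,9,10,5,8,4,2,1) must be divisible by 11.
    """
    # Drop the optional space and hyphen allowed by the patterns.
    compact = re.sub(r"[ -]", "", number).upper()
    if not re.fullmatch(r"FI[0-9]{8}", compact):
        return False
    weights = (7, 9, 10, 5, 8, 4, 2, 1)
    total = sum(int(d) * w for d, w in zip(compact[2:], weights))
    return total % 11 == 0
```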

Finland Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 10-character alphanumeric pattern beginning with FI with
checksum validation. It also requires the presence of related keywords.

Table 45-327 Finland Value Added Tax (VAT) Number narrow-breadth patterns

Patterns

[Ff][Ii]\d{8}

[Ff][Ii] \d{8}

[Ff][Ii]\d{7}-\d

[Ff][Ii] \d{7}-\d

Table 45-328 Finland Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

Finland VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

vat, vat number, vat#

arvonlisäveronumero, ARVONLISÄVERO, ALV,
arvonlisäverotunniste, ALV nro, ALV numero, alv

Finnish Personal Identification Number


The Finnish Personal Identification Number or Personal Identity Code is a unique personal
identifier used for identifying citizens in government and many other transactions.
The Finnish Personal Identification Number data identifier detects an alphanumeric pattern
that matches the Finnish Personal Identification Number format.
The Finnish Personal Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects a Finnish Personal Identification Number without validation.
See “Finnish Personal Identification Number wide breadth” on page 1176.
■ The medium breadth detects a Finnish Personal Identification Number with checksum
validation.
See “Finnish Personal Identification Number medium breadth” on page 1176.
■ The narrow breadth detects a Finnish Personal Identification Number with checksum
validation. It also requires the presence of related keywords.
See “Finnish Personal Identification Number narrow breadth” on page 1176.

Finnish Personal Identification Number wide breadth


The wide breadth detects a Finnish Personal Identification Number without validation.

Table 45-329 Finnish Personal Identification Number wide-breadth pattern

Pattern

\d{6}[-+Aa]\d{3}\w

The wide breadth of the Finnish Personal Identification Number data identifier includes no
validators.

Finnish Personal Identification Number medium breadth


The medium breadth detects a Finnish Personal Identification Number with checksum validation.

Table 45-330 Finnish Personal Identification Number medium-breadth pattern

Pattern

\d{6}[-+Aa]\d{3}\w

Table 45-331 Finnish Personal Identification Number medium-breadth validators

Mandatory validator Description

Finnish Personal Identification Number Validation Check Computes the checksum and validates the pattern against
it.
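
The guide does not spell out the checksum itself. As a hedged sketch, the publicly documented rule for the Finnish personal identity code treats the six date digits and the three-digit individual number as one nine-digit integer, divides it by 31, and uses the remainder to look up the final check character. The function name and lookup table below are illustrative assumptions, not the product's code.

```python
import re

# Check characters indexed by remainder (assumed from the public identity-code rule).
CHECK_CHARS = "0123456789ABCDEFHJKLMNPRSTUVWXY"
PIN_PATTERN = re.compile(r"\d{6}[-+Aa]\d{3}\w")  # pattern from the table above


def finnish_pin_checksum_ok(candidate: str) -> bool:
    """Return True if the final character matches the assumed mod-31 check."""
    if not PIN_PATTERN.fullmatch(candidate):
        return False
    number = int(candidate[:6] + candidate[7:10])  # skip the century sign
    return CHECK_CHARS[number % 31] == candidate[10].upper()
```

This illustrates why the medium breadth produces fewer false positives than the wide breadth: the pattern alone accepts any trailing word character, while the checksum accepts only one.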

Finnish Personal Identification Number narrow breadth


The narrow breadth detects a Finnish Personal Identification Number with checksum validation.
It also requires the presence of related keywords.

Table 45-332 Finnish Personal Identification Number narrow-breadth pattern

Pattern

\d{6}[-+Aa]\d{3}\w

Table 45-333 Finnish Personal Identification Number narrow-breadth validators

Mandatory validator Description

Finnish Personal Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

identification number, personal ID, identity number,
Finnish national ID number, personalIDnumber#,
National Identification Number, id number, National id
no., National id number, id no

tunnistenumero, henkilötunnus, yksilöllinen
henkilökohtainen tunnistenumero, Ainutlaatuinen
henkilökohtainen tunnus, identiteetti numero, Suomen
kansallinen henkilötunnus, henkilötunnusnumero#,
kansallisen tunnistenumero, tunnusnumero,
kansallinen tunnus numero

France Driver's License Number


The France Driver's License Number is the 12-digit identifier for an individual's driver's license
issued by the Driver and Vehicle Licensing Agency of France.
The France Driver's License Number data identifier detects a 12-digit number that matches
the France Driver's License Number format.
The France Driver's License Number data identifier provides two breadths of detection:
■ The wide breadth detects a 12-digit number without checksum validation.
See “France Driver's License Number wide breadth” on page 1178.
■ The narrow breadth detects a 12-digit number without checksum validation. It also requires
the presence of related keywords.
See “France Driver's License Number narrow breadth” on page 1178.

France Driver's License Number wide breadth


The wide breadth detects a 12-digit number without checksum validation.

Table 45-334 France Driver's License Number wide-breadth pattern

Pattern

\d{12}

Table 45-335 France Driver's License Number wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number Delimiter Validates a match by checking the surrounding characters.
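
The two validators above can be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: it assumes that "checking the surrounding characters" means the match must not be directly adjacent to another letter or digit, and the function names are hypothetical.

```python
import re


def has_duplicate_digits_only(match_text: str) -> bool:
    """True when every digit in the match is the same (for example 111111111111)."""
    digits = re.sub(r"\D", "", match_text)
    return len(digits) > 0 and len(set(digits)) == 1


# Assumed delimiter rule: no letter or digit directly before or after the match.
DELIMITED_12_DIGITS = re.compile(r"(?<![0-9A-Za-z])\d{12}(?![0-9A-Za-z])")


def find_wide_candidates(text: str) -> list[str]:
    """Return 12-digit matches that pass both assumed validators."""
    return [
        m.group()
        for m in DELIMITED_12_DIGITS.finditer(text)
        if not has_duplicate_digits_only(m.group())
    ]
```

Under these assumptions a 12-digit run inside a longer number is rejected by the delimiter check, and a run of twelve identical digits is rejected by the duplicate-digits check.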

France Driver's License Number narrow breadth


The narrow breadth detects a 12-digit number without checksum validation. It also requires
the presence of related keywords.

Table 45-336 France Driver's License Number narrow-breadth pattern

Pattern

\d{12}

Table 45-337 France Driver's License Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.



Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

drivers licence number, drivers license number, driving
licence number, driving license number

permis de conduire

licence number, license number, licence numbers,
license numbers, drivers license, driving licence,
driving license, DL#, dl#, DLNO#, dlno#, Driver License,
Driver License Number, Drivers Lic., Drivers Licence,
Driver's License, Driver's License Number, driver's
license number, Driver's Licence Number

France Health Insurance Number


A Carte Vitale is a social insurance card used in France that contains medical information for
the card holder. It has a unique 21-digit serial number.
The France Health Insurance Number data identifier detects a 21-digit number that matches
the France Health Insurance Number format.
The France Health Insurance Number data identifier provides two breadths of detection:
■ The wide breadth detects a 21-digit number without checksum validation.
See “France Health Insurance Number wide breadth” on page 1179.
■ The narrow breadth detects a 21-digit number without checksum validation. It also requires
the presence of related keywords.
See “France Health Insurance Number narrow breadth” on page 1180.

France Health Insurance Number wide breadth


The wide breadth detects a 21-digit number without checksum validation.

Table 45-338 France Health Insurance Number wide-breadth patterns

Pattern

\d{10} \d{10} \d

\d{21}

Table 45-339 France Health Insurance Number wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

France Health Insurance Number narrow breadth


The narrow breadth detects a 21-digit number without checksum validation. It also requires
the presence of related keywords.

Table 45-340 France Health Insurance Number narrow-breadth patterns

Pattern

\d{10} \d{10} \d

\d{21}

Table 45-341 France Health Insurance Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

insurance card, social insurance card,

carte vitale, carte d'assuré social
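
The relationship between the patterns and the Find keywords validator can be sketched as follows. This is a simplified illustration: it assumes keyword matching is case-insensitive and applies to the whole text, and it leaves out the Duplicate digits and Number delimiter validators. The guide does not describe how the product decides that a keyword is present, so the function name and matching rule are assumptions.

```python
import re

# Patterns and keywords copied from the narrow-breadth tables above.
PATTERNS = [re.compile(r"\d{10} \d{10} \d"), re.compile(r"\d{21}")]
KEYWORDS = [
    "insurance card",
    "social insurance card",
    "carte vitale",
    "carte d'assuré social",
]


def narrow_matches(text: str) -> list[str]:
    """Return pattern matches only when at least one keyword is present."""
    lowered = text.lower()
    if not any(keyword in lowered for keyword in KEYWORDS):
        return []  # narrow breadth: no keyword, no match
    hits = []
    for pattern in PATTERNS:
        hits.extend(m.group() for m in pattern.finditer(text))
    return hits
```

The same number with no keyword nearby would still match at the wide breadth, which is why the narrow breadth is the more precise of the two.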



France Tax Identification Number


France issues a tax identification number to anyone who is obliged to declare taxes in
France.
The France Tax Identification Number data identifier detects a 13-digit number that matches
the France Tax Identification Number format.
The France Tax Identification Number data identifier provides two breadths of detection:
■ The wide breadth detects a 13-digit number without checksum validation.
See “France Tax Identification Number wide breadth” on page 1181.
■ The narrow breadth detects a 13-digit number without checksum validation. It also requires
the presence of related keywords.
See “France Tax Identification Number narrow breadth” on page 1181.

France Tax Identification Number wide breadth


The wide breadth detects a 13-digit number without checksum validation.

Table 45-342 France Tax Identification Number wide-breadth patterns

Patterns

[0123]\d{12}

[0123]\d{1} \d{2} \d{3} \d{3} \d{3}

Table 45-343 France Tax Identification Number wide-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

France Tax Identification Number narrow breadth


The narrow breadth detects a 13-digit number without checksum validation. It also requires
the presence of related keywords.

Table 45-344 France Tax Identification Number narrow-breadth patterns

Patterns

[0123]\d{12}

[0123]\d{1} \d{2} \d{3} \d{3} \d{3}

Table 45-345 France Tax Identification Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number Delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

tax identification number, tax number, tax id

numéro d'identification fiscale

France Value Added Tax (VAT) Number


The Value Added Tax (VAT) is a tax levied on goods and services provided in France and is
collected from the final customer. Companies must register with the Register of Commerce
and Companies in France to get a VAT number allocated.
The France Value Added Tax (VAT) Number data identifier detects a 13-character alphanumeric
pattern that matches the France Value Added Tax (VAT) Number format.
The France Value Added Tax (VAT) Number data identifier provides three breadths of detection:
■ The wide breadth detects a 13-character alphanumeric pattern without checksum validation.
See “France Value Added Tax (VAT) Number wide breadth” on page 1182.
■ The medium breadth detects a 13-character alphanumeric pattern with checksum validation.
See “France Value Added Tax (VAT) Number medium breadth” on page 1183.
■ The narrow breadth detects a 13-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.
See “France Value Added Tax (VAT) Number narrow breadth” on page 1184.

France Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 13-character alphanumeric pattern without checksum validation.

Table 45-346 France Value Added Tax (VAT) Number wide-breadth patterns

Patterns

[Ff][Rr][0-9A-Za-z]{2}\d{9}

[Ff][Rr][0-9A-Za-z]{2} \d{9}

[Ff][Rr] [0-9A-Za-z]{2}\d{9}

[Ff][Rr]-[0-9A-Za-z]{2}\d{9}

[Ff][Rr][0-9A-Za-z]{2} \d{3}-\d{3}-\d{3}

[Ff][Rr][0-9A-Za-z]{2} \d{3}.\d{3}.\d{3}

[Ff][Rr][0-9A-Za-z]{2} \d{3},\d{3},\d{3}

[Ff][Rr][0-9A-Za-z]{2} \d{3} \d{3} \d{3}

Table 45-347 France Value Added Tax (VAT) Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

France Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 13-character alphanumeric pattern with checksum validation.

Table 45-348 France Value Added Tax (VAT) Number medium breadth patterns

Patterns

[Ff][Rr][0-9A-Za-z]{2}\d{9}

[Ff][Rr][0-9A-Za-z]{2} \d{9}

[Ff][Rr] [0-9A-Za-z]{2}\d{9}

[Ff][Rr]-[0-9A-Za-z]{2}\d{9}

[Ff][Rr][0-9A-Za-z]{2} \d{3}-\d{3}-\d{3}

[Ff][Rr][0-9A-Za-z]{2} \d{3}.\d{3}.\d{3}

[Ff][Rr][0-9A-Za-z]{2} \d{3},\d{3},\d{3}

[Ff][Rr][0-9A-Za-z]{2} \d{3} \d{3} \d{3}



Table 45-349 France Value Added Tax (VAT) Number medium breadth validators

Mandatory validator Description

France VAT Number Validation Check Checksum validator for the France Value Added Tax (VAT)
Number.
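
The guide does not state the checksum rule. A minimal sketch, assuming the commonly documented rule for French VAT numbers with a numeric key, where the two-digit key equals (12 + 3 × (SIREN mod 97)) mod 97: the patterns above also allow letters in the key, and this sketch does not cover them. The function name is hypothetical.

```python
import re


def france_vat_checksum_ok(candidate: str) -> bool:
    """Check a French VAT candidate with a numeric key against the assumed rule."""
    compact = re.sub(r"[^0-9A-Za-z]", "", candidate).upper()  # drop separators
    m = re.fullmatch(r"FR([0-9]{2})([0-9]{9})", compact)
    if m is None:
        return False  # keys that contain letters are out of scope for this sketch
    key, siren = int(m.group(1)), int(m.group(2))
    return key == (12 + 3 * (siren % 97)) % 97
```

Because separators are stripped first, the sketch treats all eight pattern layouts in the table above the same way.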

France Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 13-character alphanumeric pattern with checksum validation. It
also requires the presence of related keywords.

Table 45-350 France Value Added Tax (VAT) Number narrow-breadth patterns

Patterns

[Ff][Rr][0-9A-Za-z]{2}\d{9}

[Ff][Rr][0-9A-Za-z]{2} \d{9}

[Ff][Rr] [0-9A-Za-z]{2}\d{9}

[Ff][Rr]-[0-9A-Za-z]{2}\d{9}

[Ff][Rr][0-9A-Za-z]{2} \d{3}-\d{3}-\d{3}

[Ff][Rr][0-9A-Za-z]{2} \d{3}.\d{3}.\d{3}

[Ff][Rr][0-9A-Za-z]{2} \d{3},\d{3},\d{3}

[Ff][Rr][0-9A-Za-z]{2} \d{3} \d{3} \d{3}

Table 45-351 France Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

France VAT Number Validation Check Checksum validator for the France Value Added Tax (VAT)
Number.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

france vat number, French vat number, VAT Number,
vat no, VAT#, value added tax number, value added
tax, SIREN identification no

Numéro d'identification taxe sur valeur ajoutée,
Numéro taxe valeur ajoutée, taxe valeur ajoutée, Taxe
sur la valeur ajoutée, Numéro de TVA
intracommunautaire, n° TVA, numéro de TVA, Numéro
de TVA en France, français numéro de TVA, Numéro
d'identification SIREN

French INSEE Code


The INSEE code in France is used as a social insurance number, a national identification
number, and for taxation and employment purposes.
The French INSEE Code data identifier detects a 15-digit number that matches the French
INSEE Code format.
The French INSEE Code data identifier provides two breadths of detection:
■ The wide breadth detects a 15-digit number that passes checksum validation.
■ The narrow breadth detects a 15-digit number that passes checksum validation. It also
requires the presence of related keywords.

French INSEE Code wide breadth


The wide breadth detects a 15-digit number which encodes the date of birth, department of
origin, commune of origin, and an order number. A space delimiter after the first 13 digits is
optional. The last two digits of the INSEE code encode a control key used to validate a
checksum.

Table 45-352 French INSEE Code wide-breadth patterns

Patterns

\d{13} \d{2}

\d{15}

Table 45-353 French INSEE Code wide-breadth validator

Mandatory validator Description

INSEE Control Key This validator computes the INSEE control key and compares it to the last 2 digits
of the pattern.
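
As a hedged illustration of the comparison described above, the sketch below assumes the publicly documented INSEE rule: the control key equals 97 minus the remainder of the first 13 digits divided by 97. The function name is hypothetical, and the product's validator may differ in detail.

```python
import re


def insee_control_key_ok(candidate: str) -> bool:
    """Compare the last two digits with the assumed control key of the first 13."""
    compact = candidate.replace(" ", "")  # the space after digit 13 is optional
    if not re.fullmatch(r"[0-9]{15}", compact):
        return False
    body, key = int(compact[:13]), int(compact[13:])
    return key == 97 - body % 97
```

For example, the first 13 digits 1841276451089 leave a remainder of 51 when divided by 97, so the expected control key is 97 - 51 = 46.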

French INSEE Code narrow breadth


The narrow breadth detects a 15-digit number which encodes the date of birth, department of
origin, commune of origin, and an order number. A space delimiter after the first 13 digits is
optional. The last two digits of the INSEE code encode a control key used to validate a
checksum. It also requires the presence of related keywords.

Table 45-354 French INSEE Code narrow-breadth patterns

Pattern

\d{13} \d{2}

\d{15}

Table 45-355 French INSEE Code narrow-breadth validators

Mandatory validator Description

INSEE Control Key This validator computes the INSEE control key and
compares it to the last 2 digits of the pattern.

Find keywords With this option selected, at least one of the
following keywords or key phrases must be present
for the data to be matched.

Inputs:

INSEE, numéro de sécu, code sécu

social security number, social security code



French Passport Number


The French passport is an identity document issued to French citizens. Besides enabling the
bearer to travel internationally and serving as an indication of French citizenship, the passport
helps the bearer secure assistance from French consular officials abroad, or from other
European Union member states where no French consular official is present.
The French Passport Number data identifier detects a nine-character alphanumeric pattern
that matches the French Passport Number format.
The French Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects a nine-character alphanumeric pattern without checksum
validation.
See “French Passport Number wide breadth” on page 1187.
■ The narrow breadth detects a nine-character alphanumeric pattern without checksum
validation. It also requires the presence of related keywords.
See “French Passport Number narrow breadth” on page 1187.

French Passport Number wide breadth


The wide breadth detects a nine-character alphanumeric pattern without checksum validation.

Table 45-356 French Passport Number wide-breadth pattern

Pattern

\d\d[A-Za-z][A-Za-z]\d\d\d\d\d

Table 45-357 French Passport Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

French Passport Number narrow breadth


The narrow breadth detects a nine-character alphanumeric pattern without checksum validation.
It also requires the presence of related keywords.

Table 45-358 French Passport Number narrow-breadth pattern

Pattern

\d\d[A-Za-z][A-Za-z]\d\d\d\d\d

Table 45-359 French Passport Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

passport, Passport, French Passport, french passport,
Passport Card, Passport Book, passport card, passport
book, passport number, passport no, Passport Number

Passeport français, Passeport, Passeport livre,
Passeport carte, numéro passeport

French Social Security Number


The French Social Security Number (FSSN) is a unique number assigned to each French
citizen or resident foreign national. It serves as a national identification number.
The French Social Security Number data identifier detects a 15-character alphanumeric pattern
that matches the French Social Security Number format.
The French Social Security Number system data identifier provides three breadths of detection:
■ The wide breadth detects a 15-character alphanumeric pattern without checksum validation.
See “French Social Security Number wide breadth” on page 1188.
■ The medium breadth detects a 15-character alphanumeric pattern with checksum validation.
See “French Social Security Number medium breadth” on page 1189.
■ The narrow breadth detects a 15-character alphanumeric pattern that passes checksum
validation. It also requires the presence of related keywords.
See “French Social Security Number narrow breadth” on page 1189.

French Social Security Number wide breadth


The wide breadth detects a 15-character alphanumeric pattern without checksum validation.

Table 45-360 French Social Security Number wide-breadth pattern

Pattern

[12]\d{2}[012]\d{2}[AB1234567890]\d{8}

Table 45-361 French Social Security Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

French Social Security Number medium breadth


The medium breadth detects a 15-character alphanumeric pattern with checksum validation.

Table 45-362 French Social Security Number medium-breadth pattern

Pattern

[12]\d{2}[012]\d{2}[AB1234567890]\d{8}

Table 45-363 French Social Security Number medium-breadth validator

Mandatory validator Description

French Social Security Number Validation Check Computes the checksum and validates the pattern against
it.
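
The pattern above allows the letters A and B in the seventh position, which corresponds to the Corsican department codes 2A and 2B. The guide does not say how the validator treats them. The sketch below assumes the publicly documented convention: replace 2A with 19 and 2B with 18, then require that the two-digit key equals 97 minus the remainder of the 13-digit body divided by 97. The function name and the substitution rule are assumptions.

```python
import re

# Assumed numeric stand-ins for the Corsican department codes.
CORSICA = {"2A": "19", "2B": "18"}


def french_ssn_checksum_ok(candidate: str) -> bool:
    """Check the two-digit key of a 15-character candidate under the assumed rule."""
    compact = candidate.replace(" ", "").upper()
    if len(compact) != 15:
        return False
    department = compact[5:7]  # characters 6 and 7 hold the department code
    body = compact[:5] + CORSICA.get(department, department) + compact[7:13]
    key = compact[13:]
    if not re.fullmatch(r"[0-9]{13}", body) or not re.fullmatch(r"[0-9]{2}", key):
        return False
    return int(key) == 97 - int(body) % 97
```

Candidates with a letter anywhere else, or with a department such as 1A that the pattern permits but the convention does not define, are rejected by the digit check.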

French Social Security Number narrow breadth


The narrow breadth detects a 15-character alphanumeric pattern that passes checksum
validation. It also requires the presence of related keywords.

Table 45-364 French Social Security Number narrow-breadth pattern

Pattern

[12]\d{2}[012]\d{2}[AB1234567890]\d{8}

Table 45-365 French Social Security Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

French Social Security Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

French social security number, social security number,
FSSN#, SSN#, ssn, ssn#, socialsecuritynumber,
insurance number, national ID number, nationalid#

sécurité sociale non., sécurité sociale numéro, code
sécurité sociale, numéro d'assurance

German Passport Number


The German passport number is issued to German nationals for the purpose of international
travel. A German passport is an officially recognized document that German authorities accept
as proof of identity from German citizens.
The German Passport Number data identifier detects an 11-character alphanumeric pattern
that matches the German Passport Number format.
The German Passport Number system data identifier provides three breadths of detection:
■ The wide breadth detects an 11-character alphanumeric pattern ending with the letter "D"
without checksum validation.
See “German Passport Number wide breadth” on page 1190.
■ The medium breadth detects an 11-character alphanumeric pattern ending with the letter
"D" with checksum validation.
See “German Passport Number medium breadth” on page 1191.
■ The narrow breadth detects an 11-character alphanumeric pattern ending with the letter
"D" with checksum validation. It also requires the presence of related keywords.
See “German Passport Number narrow breadth” on page 1191.

German Passport Number wide breadth


The wide breadth detects an 11-character alphanumeric pattern ending with the letter "D"
without checksum validation.

Table 45-366 German Passport Number wide-breadth patterns

Patterns

\w{9}\dD

\w{10}[dD]

Table 45-367 German Passport Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

German Passport Number medium breadth


The medium breadth detects an 11-character alphanumeric pattern ending with the letter "D"
with checksum validation.

Table 45-368 German Passport Number medium-breadth patterns

Patterns

\w{9}\dD

\w{10}[dD]

Table 45-369 German Passport Number medium-breadth validator

Mandatory validator Description

German Passport Number Validation Check Computes the checksum every German Passport Number
must pass.
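
The guide does not name the checksum. One plausible reading, offered only as an assumption, is the check digit used in machine-readable travel documents: the first nine characters are weighted 7, 3, 1 repeatedly, letters count as 10 to 35, and the sum modulo 10 must equal the tenth character. The sketch below follows that reading with hypothetical function names. Unlike the `\w` in the patterns above, it does not accept underscores.

```python
import re


def icao_check_digit(field: str) -> int:
    """Weighted 7-3-1 check digit; digits keep their value, A-Z map to 10-35."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field.upper()):
        total += int(ch, 36) * weights[i % 3]  # base-36 gives 0-9 and A=10..Z=35
    return total % 10


def german_passport_checksum_ok(candidate: str) -> bool:
    """Check the tenth character of an 11-character candidate ending in D."""
    if not re.fullmatch(r"[0-9A-Za-z]{9}[0-9][dD]", candidate):
        return False
    return icao_check_digit(candidate[:9]) == int(candidate[9])
```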

German Passport Number narrow breadth


The narrow breadth detects an 11-character alphanumeric pattern ending with the letter "D"
with checksum validation. It also requires the presence of related keywords.

Table 45-370 German Passport Number narrow-breadth patterns

Patterns

\w{9}\dD

\w{10}[dD]

Table 45-371 German Passport Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

German Passport Number Validation Check Computes the checksum every German Passport Number
must pass.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

German passport number, passport number, passport
no, passportno#, passportnumber#

Reisepass kein, Reisepass, Passnummer

German Personal ID Number


The German Personal ID Number is issued to all German citizens.
The German Personal ID Number data identifier detects an 11-character alphanumeric pattern
that matches the German Personal ID Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-character alphanumeric pattern ending with the letter "D"
without checksum validation.
See “German Personal ID Number wide breadth” on page 1192.
■ The medium breadth detects an 11-character alphanumeric pattern ending with the letter
"D" with checksum validation.
See “German Personal ID Number medium breadth” on page 1193.
■ The narrow breadth detects an 11-character alphanumeric pattern ending with the letter
"D" with checksum validation. It also requires the presence of related keywords.
See “German Personal ID Number narrow breadth” on page 1193.

German Personal ID Number wide breadth


The wide breadth detects an 11-character alphanumeric pattern ending with the letter "D"
without checksum validation.

Table 45-372 German Personal ID Number wide-breadth pattern

Pattern

\w{9}\dD

Table 45-373 German Personal ID Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

German Personal ID Number medium breadth


The medium breadth detects an 11-character alphanumeric pattern ending with the letter "D"
with checksum validation.

Table 45-374 German Personal ID Number medium-breadth pattern

Pattern

\w{9}\dD

Table 45-375 German Personal ID Number medium breadth validator

Mandatory validator Description

German ID Number Validation Check Computes the checksum and validates the pattern against
it.

German Personal ID Number narrow breadth


The narrow breadth detects an 11-character alphanumeric pattern ending with the letter "D"
with checksum validation. It also requires the presence of related keywords.

Table 45-376 German Personal ID Number narrow-breadth pattern

Pattern

\w{9}\dD

Table 45-377 German Personal ID Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

German ID Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords If you select this option, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

ID number, identification number, personal ID number,
perosnal ID, GPID, GPID#, unique personal ID number,
unique personal ID, insurance number, identity number,
German personal ID number

persönliche identifikationsnummer, ID-Nummer,
Deutsch persönliche-ID-Nummer, persönliche ID
Nummer, eindeutige ID-Nummer, persönliche Nummer,
identität nummer, Versicherungsnummer

Germany Driver's License Number


The Germany Driver's License Number is the identification number for an individual's driver's
license issued by the Driver and Vehicle Licensing Agency of Germany.
The Germany Driver's License Number data identifier detects an 11-character alphanumeric
pattern that matches the Germany Driver's License Number format.
The Germany Driver's License Number data identifier provides two breadths of detection:
■ The wide breadth detects an 11-character alphanumeric pattern without checksum validation.
See “Germany Driver's License Number wide breadth” on page 1194.
■ The narrow breadth detects an 11-character alphanumeric pattern without checksum
validation. It also requires the presence of related keywords.
See “Germany Driver's License Number narrow breadth” on page 1195.

Germany Driver's License Number wide breadth


The wide breadth detects an 11-character alphanumeric pattern without checksum validation.

Table 45-378 Germany Driver's License Number wide-breadth pattern

Pattern

\w\d{2}\w{6}\d\w

Table 45-379 Germany Driver's License Number wide-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Germany Driver's License Number narrow breadth


The narrow breadth detects an 11-character alphanumeric pattern without checksum validation.
It also requires the presence of related keywords.

Table 45-380 Germany Driver's License Number narrow-breadth patterns

Pattern

\w\d{2}\w{6}\d\w

Table 45-381 Germany Driver's License Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Führerschein, Fuhrerschein, Fuehrerschein,
Führerscheinnummer, Fuhrerscheinnummer,
Fuehrerscheinnummer, Führerscheinnummer,
Fuhrerscheinnummer, Fuehrerscheinnummer,
Führerschein- Nr, Fuhrerschein- Nr, Fuehrerschein-
Nr

Driver License, Driver License Number, driver license
number, Driver Licence, Drivers Lic., Drivers License,
Drivers Licence, Driver's License, Driver's License
Number, driver's license number, Driver's Licence
Number, Driving License number, driving license
number, DL#, dl#, DLNO#, dlno#, driving licence,
driving license

Germany Value Added Tax (VAT) Number


The Value Added Tax (VAT) is a tax levied on goods and services provided in Germany and
is collected from the final customer.
The Germany Value Added Tax (VAT) Number data identifier detects an 11-character
alphanumeric pattern that matches the Germany Value Added Tax (VAT) Number format.
The Germany Value Added Tax (VAT) Number data identifier provides three breadths of
detection:
■ The wide breadth detects an 11-character alphanumeric pattern without checksum validation.
See “Germany Value Added Tax (VAT) Number wide breadth” on page 1196.
■ The medium breadth detects an 11-character alphanumeric pattern with checksum
validation.
See “Germany Value Added Tax (VAT) Number medium breadth” on page 1196.
■ The narrow breadth detects an 11-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.
See “Germany Value Added Tax (VAT) Number narrow breadth” on page 1197.

Germany Value Added Tax (VAT) Number wide breadth


The wide breadth detects an 11-character alphanumeric pattern without checksum validation.

Table 45-382 Germany Value Added Tax (VAT) Number wide-breadth patterns

Patterns

[Dd][Ee]\d{9}

[Dd][Ee] \d{9}

[Dd][Ee]\d{3}[, ]\d{3}[, ]\d{3}

[Dd][Ee] \d{3}[, ]\d{3}[, ]\d{3}

Table 45-383 Germany Value Added Tax (VAT) Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Germany Value Added Tax (VAT) Number medium breadth


The medium breadth detects an 11-character alphanumeric pattern with checksum validation.

Table 45-384 Germany Value Added Tax (VAT) Number medium-breadth patterns

Patterns

[Dd][Ee]\d{9}

[Dd][Ee] \d{9}

[Dd][Ee]\d{3}[, ]\d{3}[, ]\d{3}

[Dd][Ee] \d{3}[, ]\d{3}[, ]\d{3}

Table 45-385 Germany Value Added Tax (VAT) Number medium-breadth validator

Mandatory validator Description

Germany VAT Number Validation Check Checksum validator for the Germany Value Added Tax
(VAT) Number.
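
The guide does not describe the checksum algorithm. A minimal sketch, assuming the commonly documented ISO 7064 MOD 11,10 scheme for German VAT numbers, in which the first eight digits are folded into a running product and the ninth digit is the check digit. The function name is hypothetical, and the product's validator may differ.

```python
import re


def germany_vat_checksum_ok(candidate: str) -> bool:
    """Check the ninth digit of a DE VAT candidate with the assumed MOD 11,10 rule."""
    compact = re.sub(r"[ ,]", "", candidate).upper()  # separators from the patterns
    m = re.fullmatch(r"DE([0-9]{9})", compact)
    if m is None:
        return False
    digits = [int(c) for c in m.group(1)]
    product = 10
    for d in digits[:8]:
        s = (d + product) % 10
        if s == 0:
            s = 10
        product = (2 * s) % 11
    check = (11 - product) % 10  # a computed value of 10 maps to check digit 0
    return check == digits[8]
```

The sketch removes the space and comma separators that the patterns above allow before it runs the check.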

Germany Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects an 11-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.

Table 45-386 Germany Value Added Tax (VAT) Number narrow-breadth patterns

Patterns

[Dd][Ee]\d{9}

[Dd][Ee] \d{9}

[Dd][Ee]\d{3}[, ]\d{3}[, ]\d{3}

[Dd][Ee] \d{3}[, ]\d{3}[, ]\d{3}

Table 45-387 Germany Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Germany VAT Number Validation Check Checksum validator for the Germany Value Added Tax
(VAT) Number.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

VAT Number, vat no, vat number, VAT#, vat#

Mehrwertsteuer, MwSt, Mehrwertsteuer Identifikationsnummer, Mehrwertsteuer nummer
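
The Find keywords requirement can be pictured with a short Python sketch. It assumes a case-insensitive substring search over the whole text, which is a simplification: this section does not describe how the product handles case or how close a keyword must be to the match. The list holds only a few of the keywords above, and has_keyword is a hypothetical helper name.

```python
# A few of the keywords listed for the narrow breadth (not the full list).
SAMPLE_KEYWORDS = ["VAT Number", "vat no", "VAT#", "MwSt", "Mehrwertsteuer"]


def has_keyword(text: str, keywords=SAMPLE_KEYWORDS) -> bool:
    """Return True if at least one keyword occurs anywhere in the text.

    Assumption: matching is case-insensitive and ignores proximity.
    """
    lowered = text.casefold()
    return any(keyword.casefold() in lowered for keyword in keywords)
```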

Germany Tax Identification Number


Germany issues an 11-digit tax identification number for persons who have obligations to
declare taxes in Germany.
The Germany Tax Identification Number data identifier detects an 11-digit number that matches
the Germany Tax Identification Number format.
The Germany Tax Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects an 11-digit number without checksum validation.
See “Germany Tax Identification Number wide breadth” on page 1198.
■ The medium breadth detects an 11-digit number with checksum validation.
See “Germany Tax Identification Number medium breadth” on page 1199.
■ The narrow breadth detects an 11-digit number with checksum validation. It also requires
the presence of related keywords.
See “Germany Tax Identification Number narrow breadth” on page 1199.

Germany Tax Identification Number wide breadth


The wide breadth detects an 11-digit number without checksum validation.

Table 45-388 Germany Tax Identification Number wide-breadth patterns

Patterns

\d{11}

\d{2} \d{3} \d{3} \d{3}

\d{2}-\d{3}-\d{3}-\d{3}

\d{2}.\d{3}.\d{3}.\d{3}

\d{2},\d{3},\d{3},\d{3}

Table 45-389 Germany Tax Identification Number wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.
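
A minimal Python sketch of the Duplicate digits idea follows. It assumes that separators are ignored and that only the digits of the match are compared. passes_duplicate_digits is a hypothetical name used for illustration.

```python
def passes_duplicate_digits(candidate: str) -> bool:
    """Return True unless every digit in the candidate is the same."""
    digits = [ch for ch in candidate if ch.isdigit()]
    # A set with more than one member means at least two different digits.
    return len(set(digits)) > 1
```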

Germany Tax Identification Number medium breadth


The medium breadth detects an 11-digit number with checksum validation.

Table 45-390 Germany Tax Identification Number medium-breadth patterns

Patterns

\d{11}

\d{2} \d{3} \d{3} \d{3}

\d{2}-\d{3}-\d{3}-\d{3}

\d{2}.\d{3}.\d{3}.\d{3}

\d{2},\d{3},\d{3},\d{3}

Table 45-391 Germany Tax Identification Number medium-breadth validator

Mandatory validator Description

Germany Tax Number Validation Check Computes the checksum and validates the pattern against
it.
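
The guide does not state which checksum the Germany Tax Number Validation Check computes. The German tax identification number is publicly documented as ending in a check digit calculated with the ISO/IEC 7064 MOD 11,10 procedure, so the Python sketch below assumes that scheme. The function names are hypothetical, and the product's validator may apply further rules that are not shown here.

```python
def steuer_id_check_digit(first_ten: str) -> int:
    """Compute the MOD 11,10 check digit for the first ten digits."""
    product = 10
    for ch in first_ten:
        total = (int(ch) + product) % 10
        if total == 0:
            total = 10
        product = (total * 2) % 11
    check = 11 - product
    return 0 if check == 10 else check


def is_valid_steuer_id(candidate: str) -> bool:
    """Check an 11-digit number; spaces, hyphens, dots, and commas are ignored."""
    digits = "".join(ch for ch in candidate if ch.isdigit())
    if len(digits) != 11:
        return False
    return steuer_id_check_digit(digits[:10]) == int(digits[10])
```

For example, is_valid_steuer_id("86095742719") returns True and is_valid_steuer_id("86095742718") returns False. The sample value is used only because it satisfies the arithmetic.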

Germany Tax Identification Number narrow breadth


The narrow breadth detects an 11-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-392 Germany Tax Identification Number narrow-breadth patterns

Patterns

\d{11}

\d{2} \d{3} \d{3} \d{3}

\d{2}-\d{3}-\d{3}-\d{3}

\d{2}.\d{3}.\d{3}.\d{3}

\d{2},\d{3},\d{3},\d{3}

Table 45-393 Germany Tax Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Germany Tax Number Validation Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

tin, tin number, tin no, tin#, german tax identification number,
germany tax identification number, tax number, tax id

Zinn, Zinnnummer, Zinn Nr, Zinn#, Steueridentifikationsnummer,
Steuer Identifikationsnummer, Steuernummer, Steuer ID, Identifikationsnummer

Greece Passport Number


Greek passports are issued to Greek citizens for the purpose of international travel. The
passport along with the national identity card allows for free rights of movement and residence
in any of the states of the European Union and European Economic Area.
The Greece Passport Number data identifier detects a nine-character alphanumeric pattern
that matches the Greece Passport Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a nine-character alphanumeric pattern that matches the Greece
Passport Number format. It checks for common test patterns.
See “Greece Passport Number wide breadth” on page 1201.
■ The narrow breadth detects a nine-character alphanumeric pattern that matches the Greece
Passport Number format. It checks for common test patterns, and also requires the presence
of related keywords.
See “Greece Passport Number narrow breadth” on page 1201.

Greece Passport Number wide breadth


The wide breadth detects a nine-character alphanumeric pattern that matches the Greece
Passport Number format. It checks for common test patterns.

Table 45-394 Greece Passport Number wide-breadth patterns

Pattern

[a-zA-Z]{2}\d{7}

Table 45-395 Greece Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:
0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999
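
The Exclude ending characters check amounts to a suffix test. The Python sketch below builds the ten excluded values from the table and rejects any candidate that ends with one of them. passes_exclude_ending is a hypothetical name, and the sketch assumes that the comparison is made on the matched text exactly as written.

```python
# "0000000", "1111111", ..., "9999999", as listed in the table above.
EXCLUDED_ENDINGS = tuple(str(digit) * 7 for digit in range(10))


def passes_exclude_ending(candidate: str) -> bool:
    """Return False when the candidate ends with a common test pattern."""
    # str.endswith accepts a tuple and checks each suffix in turn.
    return not candidate.endswith(EXCLUDED_ENDINGS)
```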

Greece Passport Number narrow breadth


The narrow breadth detects a nine-character alphanumeric pattern that matches the Greece
Passport Number format. It checks for common test patterns, and also requires the presence
of related keywords.

Table 45-396 Greece Passport Number narrow-breadth patterns

Pattern

[a-zA-Z]{2}\d{7}

Table 45-397 Greece Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

passport, passport number, passport no., passport no, passport#, Passport No., PASSPORT,
Ελλάδα pasport αριθμός, Greece passport no., Ελλάδα pasport όχι.,
Ελλάδα Αριθμός Διαβατηρίου, διαβατήριο, Διαβατήριο, ΕΛΛΑΔΑ ΔΙΑΒΑΤΗΡΙΟ,
Ελλάδα Διαβατήριο, ελλάδα διαβατήριο, Διαβατήριο Βιβλίο, βιβλίο διαβατηρίου

Greece Social Security Number (AMKA)


The Greek social security number (AMKA) is the 11-digit work and insurance identification
number of every worker, retired person, and protected family member in Greece.
The Greece Social Security Number (AMKA) data identifier detects an 11-digit number that
matches the Greece Social Security Number (AMKA) format.
The Greece Social Security Number (AMKA) data identifier provides three breadths of detection:
■ The wide breadth detects an 11-digit number without checksum validation.
See “Greece Social Security Number (AMKA) wide breadth” on page 1202.
■ The medium breadth detects an 11-digit number with checksum validation.
See “Greece Social Security Number (AMKA) medium breadth” on page 1203.
■ The narrow breadth detects an 11-digit number with checksum validation. It also requires
the presence of related keywords.
See “Greece Social Security Number (AMKA) narrow breadth” on page 1203.

Greece Social Security Number (AMKA) wide breadth


The wide breadth detects an 11-digit number without checksum validation.

Table 45-398 Greece Social Security Number (AMKA) wide-breadth pattern

Pattern

\d{11}

Table 45-399 Greece Social Security Number (AMKA) wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Greece Social Security Number (AMKA) medium breadth


The medium breadth detects an 11-digit number with checksum validation.

Table 45-400 Greece Social Security Number (AMKA) medium-breadth pattern

Pattern

\d{11}

Table 45-401 Greece Social Security Number (AMKA) medium-breadth validator

Mandatory validator Description

Greece Social Security Number (AMKA) Computes the checksum and validates the pattern against
it.
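
This section does not describe the checksum itself. Public descriptions of the AMKA state that its final digit is a Luhn check digit, so the Python sketch below assumes the Luhn algorithm. It does not check that the first six digits form a date of birth, and is_plausible_amka is a hypothetical name.

```python
def luhn_valid(number: str) -> bool:
    """Return True if a string of ASCII digits passes the Luhn check."""
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for position, ch in enumerate(reversed(number)):
        digit = int(ch)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def is_plausible_amka(candidate: str) -> bool:
    """Check length, digit-only content, and the assumed Luhn checksum."""
    if len(candidate) != 11 or not (candidate.isascii() and candidate.isdigit()):
        return False
    return luhn_valid(candidate)
```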

Greece Social Security Number (AMKA) narrow breadth


The narrow breadth detects an 11-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-402 Greece Social Security Number (AMKA) narrow-breadth pattern

Pattern

\d{11}

Table 45-403 Greece Social Security Number (AMKA) narrow-breadth validators

Mandatory validators Description

Greece Social Security Number (AMKA) Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

greece social security number, greece ssn, greece ssn#, greece social security no.,
social security no., ssn#, amka, greece amka

Αριθμού Μητρώου Κοινωνικής Ασφάλισης

Greek Tax Identification Number


The Arithmos Forologikou Mitroou (AFM) is a unique personal tax identification number assigned
to any individual resident in Greece or person who owns property in Greece.
The Greek Tax Identification Number data identifier detects a nine-digit number that matches
the Greek Tax Identification Number format.
The Greek Tax Identification Number system data identifier provides three breadths of detection:
■ The wide breadth detects a nine-digit number without checksum validation.
See “Greek Tax Identification Number wide breadth” on page 1204.
■ The medium breadth detects a nine-digit number with checksum validation.
See “Greek Tax Identification Number medium breadth” on page 1205.
■ The narrow breadth detects a nine-digit number that passes checksum validation. It also
requires the presence of related keywords.
See “Greek Tax Identification Number narrow breadth” on page 1205.

Greek Tax Identification Number wide breadth


The wide breadth detects a nine-digit number without checksum validation.

Table 45-404 Greek Tax Identification Number wide-breadth pattern

Pattern

\d{9}

Table 45-405 Greek Tax Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Greek Tax Identification Number medium breadth


The medium breadth detects a nine-digit number with checksum validation.

Table 45-406 Greek Tax Identification Number medium-breadth pattern

Pattern

\d{9}

Table 45-407 Greek Tax Identification Number medium-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Greek Tax Identification Number Validation Check Computes the checksum that every Greek Tax
Identification Number must pass.
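
The guide does not spell out the checksum. The commonly published AFM check weights the first eight digits by decreasing powers of two (256 down to 2), takes the sum modulo 11 and then modulo 10, and compares the result with the ninth digit. The Python sketch below assumes that scheme. is_valid_afm is a hypothetical name.

```python
def is_valid_afm(candidate: str) -> bool:
    """Check a nine-digit AFM against the assumed power-of-two checksum."""
    if len(candidate) != 9 or not (candidate.isascii() and candidate.isdigit()):
        return False
    digits = [int(ch) for ch in candidate]
    # Shifting left by (8 - i) multiplies by 256, 128, ..., 2 for i = 0..7.
    weighted = sum(d << (8 - i) for i, d in enumerate(digits[:8]))
    return (weighted % 11) % 10 == digits[8]
```

Under this scheme 000000000 passes the arithmetic, which may be one reason the narrow breadth pairs the checksum with the Duplicate digits validator.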

Greek Tax Identification Number narrow breadth


The narrow breadth detects a nine-digit number that passes checksum validation. It also
requires the presence of related keywords.

Table 45-408 Greek Tax Identification Number narrow-breadth pattern

Pattern

\d{9}

Table 45-409 Greek Tax Identification Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Greek Tax Identification Number Validation Check Computes the checksum that every Greek Tax
Identification Number must pass.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

AFM, TIN, tax ID No., Tax id no, tax identification number, tax id no.,
Tax Registry Number, Tax Registry No., AFM#, TIN#, Tax Identification Number,
TaxIDNo#, taxregistryno#

Αριθμός Φορολογικού Μητρώου, AΦΜ, AΦΜ αριθμός, Φορολογικού Μητρώου Νο.,
τον αριθμό φορολογικού μητρώου

Greece Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. For Greece, VAT is
administered by the VAT office for the region in which the business is established.
The Greece Value Added Tax (VAT) Number data identifier detects an 11-character
alphanumeric pattern that matches the Greece VAT Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-character alphanumeric pattern that matches the Greece
VAT Number format without checksum validation. It checks for common test patterns.
See “Greece Value Added Tax (VAT) Number wide breadth” on page 1207.
■ The medium breadth detects an 11-character alphanumeric pattern that matches the Greece
VAT Number format with checksum validation.
See “Greece Value Added Tax (VAT) Number medium breadth” on page 1207.
■ The narrow breadth detects an 11-character alphanumeric pattern that matches the Greece
VAT Number format with checksum validation. It checks for common test patterns, and
also requires the presence of related keywords.
See “Greece Value Added Tax (VAT) Number narrow breadth” on page 1208.

Greece Value Added Tax (VAT) Number wide breadth


The wide breadth detects an 11-character alphanumeric pattern that matches the Greece VAT
Number format without checksum validation. It checks for common test patterns.

Table 45-410 Greece Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Ee][Ll]\d{9}

[Ee][Ll] \d{9}

Table 45-411 Greece Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999

Greece Value Added Tax (VAT) Number medium breadth


The medium breadth detects an 11-character alphanumeric pattern that matches the Greece
VAT Number format with checksum validation.

Table 45-412 Greece Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[Ee][Ll]\d{9}

[Ee][Ll] \d{9}

Table 45-413 Greece Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Greece VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Greece Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects an 11-character alphanumeric pattern that matches the Greece
VAT Number format with checksum validation. It checks for common test patterns, and also
requires the presence of related keywords.

Table 45-414 Greece Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Ee][Ll]\d{9}

[Ee][Ll] \d{9}

Table 45-415 Greece Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999

Greece VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat number, value added tax, vat, VAT, VAT#, vat#, FPA, fpa, VATIN, vatin,
Foros Prostithemenis Axias, arithmós dexamenís, Fóros Prostithémenis Axías,
μέγας κάδος, ΦΠΑ, Φ Π Α, Φόρος Προστιθέμενης Αξίας, ΦΟΡΟΣ ΠΡΟΣΤΙΘΕΜΕΝΗΣ ΑΞΙΑΣ,
φόρος προστιθέμενης αξίας, Arithmos Forologikou Mitroou, Α.Φ.Μ, ΑΦΜ

Healthcare Common Procedure Coding System (HCPCS CPT Code)
The Healthcare Common Procedure Coding System (HCPCS) is a set of health care procedure
codes based on the American Medical Association's Current Procedural Terminology (CPT).
The Healthcare Common Procedure Coding System (HCPCS CPT Code) data identifier detects
a two- or five-character alphanumeric pattern that matches the HCPCS CPT Code format.
The Healthcare Common Procedure Coding System (HCPCS CPT Code) data identifier provides
two breadths of detection:
■ The medium breadth detects a two- or five-character alphanumeric pattern with checksum
validation.
See “Healthcare Common Procedure Coding System (HCPCS CPT Code) medium breadth”
on page 1209.
■ The narrow breadth detects a two- or five-character alphanumeric pattern with checksum
validation. It also requires the presence of related keywords.
See “Healthcare Common Procedure Coding System (HCPCS CPT Code) narrow breadth”
on page 1210.

Healthcare Common Procedure Coding System (HCPCS CPT Code) medium breadth
The medium breadth detects a two- or five-character alphanumeric pattern with checksum
validation.

Table 45-416 Healthcare Common Procedure Coding System (HCPCS CPT Code)
medium-breadth patterns

Patterns Patterns (continued)

[A][AD-KMO-Z1-9] [V][1-35-9P]

[B][ALOPRU] [X][EPSU]

[C][A-NPR-T] [Z][AB]

[D][A] [L]\d{4}

[E][1-4A-EJMPTXY] [A][04-9]\d{3}

[F][1-9A-CPX] [B][459][0-29]\d{2}

[G][1-9A-HJ-Z] [C][12589]\d{3}

[H][9A-Z] [E][0128]\d{3}

[J][1-4A-FW] [G][03689]\d{3}

[K][1-4A-Z] [H][0-2]0[0-5]\d

[Q][1-9C-HJ-NPSTW-Z] [J][0-37-9]\d{3}

[QK]0 [K][0][0-14-9]\d{2}

[L][1CDLMR-T] [M]0[013][067][01456]

[M][2S] [P][2379][06][0-7]\d

[N][BRU] [Q][0-59][01459]\d{2}

[P][1-6A-DIL-OST] [R]007[056]

[R][A-EIRT] [S][0-589]\d{3}

[S][A-HJ-NQS-Z] [T][1245][0159][0-49]\d

[T][1-9AC-HJ-NP-W] [V][25][0-7]\d{2}

[U][1-9A-HJKNP-S]

Table 45-417 Healthcare Common Procedure Coding System (HCPCS CPT Code)
medium-breadth validator

Mandatory validator Description

HCPCS CPT Code Validation Check Computes the checksum and validates the pattern against
it.

Healthcare Common Procedure Coding System (HCPCS CPT Code) narrow breadth
The narrow breadth detects a two- or five-character alphanumeric pattern with checksum
validation. It also requires the presence of related keywords.

Table 45-418 Healthcare Common Procedure Coding System (HCPCS CPT Code)
narrow-breadth patterns

Patterns Patterns (continued)

[A][AD-KMO-Z1-9] [V][1-35-9P]

[B][ALOPRU] [X][EPSU]

[C][A-NPR-T] [Z][AB]

[D][A] [L]\d{4}

[E][1-4A-EJMPTXY] [A][04-9]\d{3}

[F][1-9A-CPX] [B][459][0-29]\d{2}

[G][1-9A-HJ-Z] [C][12589]\d{3}

[H][9A-Z] [E][0128]\d{3}

[J][1-4A-FW] [G][03689]\d{3}

[K][1-4A-Z] [H][0-2]0[0-5]\d

[Q][1-9C-HJ-NPSTW-Z] [J][0-37-9]\d{3}

[QK]0 [K][0][0-14-9]\d{2}

[L][1CDLMR-T] [M]0[013][067][01456]

[M][2S] [P][2379][06][0-7]\d

[N][BRU] [Q][0-59][01459]\d{2}

[P][1-6A-DIL-OST] [R]007[056]

[R][A-EIRT] [S][0-589]\d{3}

[S][A-HJ-NQS-Z] [T][1245][0159][0-49]\d

[T][1-9AC-HJ-NP-W] [V][25][0-7]\d{2}

[U][1-9A-HJKNP-S]

Table 45-419 Healthcare Common Procedure Coding System (HCPCS CPT Code)
narrow-breadth validators

Mandatory validators Description

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

hcpcs cpt code, HCPCS, hcpcs, cpt, CPT, healthcare common procedure coding system,
current procedural terminology

Number delimiter Validates a match by checking the surrounding characters.

HCPCS CPT Code Validation Check Computes the checksum and validates the pattern against
it.

Health Insurance Claim Number


The Health Insurance Claim Number (HICN) is assigned by the United States Social Security
Administration to an individual for the purpose of identifying them as a Medicare beneficiary.
The Health Insurance Claim Number data identifier detects a 7- to 12-character alphanumeric
pattern that matches the Health Insurance Claim Number format.
The Health Insurance Claim Number data identifier provides three breadths of detection:
■ The wide breadth detects a 7- to 12-character alphanumeric pattern without checksum
validation.
See “Health Insurance Claim Number wide breadth” on page 1212.
■ The medium breadth detects a 7- to 12-character alphanumeric pattern with checksum
validation.
See “Health Insurance Claim Number medium breadth” on page 1213.
■ The narrow breadth detects a 7- to 12-character alphanumeric pattern with checksum
validation. It also requires the presence of related keywords.
See “Health Insurance Claim Number narrow breadth” on page 1214.

Health Insurance Claim Number wide breadth


The wide breadth detects a 7- to 12-character alphanumeric pattern without checksum
validation.

Table 45-420 Health Insurance Claim Number wide-breadth patterns

Patterns

[a-zA-Z]{1,3}-\d{6}

[a-zA-Z]{1,3}-[0-8]\d{2} \d{1}[1-9] \d{4}

[a-zA-Z]{1,3}-[0-8]\d{3}[1-9]\d{4}

[a-zA-Z]{1,3}-[0-8]\d{2}[1-9]\d{5}

[a-zA-Z]{1,3}-[0-8]\d{2}-\d{1}[1-9]-\d{4}

[a-zA-Z]{1,3}-[0-8]\d{2} [1-9]\d{1} \d{4}

[a-zA-Z]{1,3}-[0-8]\d{2}-[1-9]\d{1}-\d{4}

[0-8]\d{2} \d{1}[1-9] \d{4}-[a-zA-Z]{1,3}

[0-8]\d{3}[1-9]\d{4}-[a-zA-Z]{1,3}

[0-8]\d{2}[1-9]\d{5}-[a-zA-Z]{1,3}

[0-8]\d{2}-\d{1}[1-9]-\d{4}-[a-zA-Z]{1,3}

[0-8]\d{2} [1-9]\d{1} \d{4}-[a-zA-Z]{1,3}

[0-8]\d{2}-[1-9]\d{1}-\d{4}-[a-zA-Z]{1,3}

[0-8]\d{2}[1-9]\d{1}\d{4}-[a-zA-Z][0-9]

Table 45-421 Health Insurance Claim Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Health Insurance Claim Number medium breadth


The medium breadth detects a 7- to 12-character alphanumeric pattern with checksum
validation.

Table 45-422 Health Insurance Claim Number medium-breadth patterns

Patterns

[a-zA-Z]{1,3}-\d{6}

[a-zA-Z]{1,3}-[0-8]\d{2} \d{1}[1-9] \d{4}

[a-zA-Z]{1,3}-[0-8]\d{3}[1-9]\d{4}

[a-zA-Z]{1,3}-[0-8]\d{2}[1-9]\d{5}

[a-zA-Z]{1,3}-[0-8]\d{2}-\d{1}[1-9]-\d{4}

[a-zA-Z]{1,3}-[0-8]\d{2} [1-9]\d{1} \d{4}


[a-zA-Z]{1,3}-[0-8]\d{2}-[1-9]\d{1}-\d{4}

[0-8]\d{2} \d{1}[1-9] \d{4}-[a-zA-Z]{1,3}

[0-8]\d{3}[1-9]\d{4}-[a-zA-Z]{1,3}

[0-8]\d{2}[1-9]\d{5}-[a-zA-Z]{1,3}

[0-8]\d{2}-\d{1}[1-9]-\d{4}-[a-zA-Z]{1,3}

[0-8]\d{2} [1-9]\d{1} \d{4}-[a-zA-Z]{1,3}

[0-8]\d{2}-[1-9]\d{1}-\d{4}-[a-zA-Z]{1,3}

[0-8]\d{2}[1-9]\d{1}\d{4}-[a-zA-Z][0-9]

Table 45-423 Health Insurance Claim Number medium-breadth validator

Mandatory validator Description

Health Care Insurance Number Check Computes the checksum and validates the pattern against
it.

Health Insurance Claim Number narrow breadth


The narrow breadth detects a 7- to 12-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.

Table 45-424 Health Insurance Claim Number narrow-breadth patterns

Patterns

[a-zA-Z]{1,3}-\d{6}

[a-zA-Z]{1,3}-[0-8]\d{2} \d{1}[1-9] \d{4}

[a-zA-Z]{1,3}-[0-8]\d{3}[1-9]\d{4}

[a-zA-Z]{1,3}-[0-8]\d{2}[1-9]\d{5}

[a-zA-Z]{1,3}-[0-8]\d{2}-\d{1}[1-9]-\d{4}

[a-zA-Z]{1,3}-[0-8]\d{2} [1-9]\d{1} \d{4}

[a-zA-Z]{1,3}-[0-8]\d{2}-[1-9]\d{1}-\d{4}

[0-8]\d{2} \d{1}[1-9] \d{4}-[a-zA-Z]{1,3}

[0-8]\d{3}[1-9]\d{4}-[a-zA-Z]{1,3}

[0-8]\d{2}[1-9]\d{5}-[a-zA-Z]{1,3}

[0-8]\d{2}-\d{1}[1-9]-\d{4}-[a-zA-Z]{1,3}

[0-8]\d{2} [1-9]\d{1} \d{4}-[a-zA-Z]{1,3}

[0-8]\d{2}-[1-9]\d{1}-\d{4}-[a-zA-Z]{1,3}

[0-8]\d{2}[1-9]\d{1}\d{4}-[a-zA-Z][0-9]

Table 45-425 Health Insurance Claim Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Health Care Insurance Number Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

health insurance claim number, hicn, hic number, hic no, hic#, hic no., hicn#, hicno#

Hong Kong ID
The Hong Kong ID is the unique identifier for all residents of Hong Kong that appears on the
Hong Kong Identity Card.
The Hong Kong ID data identifier detects eight-character patterns that match the Hong Kong
ID format.
The Hong Kong ID data identifier provides two breadths of detection:
■ The wide breadth detects eight characters in the form LDDDDDD(D) or LDDDDDD(A). The
last character in the detected string is used to validate a checksum.
See “Hong Kong ID wide breadth” on page 1216.
■ The narrow breadth detects eight characters in the form LDDDDDD(D) or LDDDDDD(A).
The last character in the detected string is used to validate a checksum. It also requires
the presence of Hong Kong ID-related keywords.
See “Hong Kong ID narrow breadth” on page 1216.

Hong Kong ID wide breadth


The wide breadth detects eight characters in the form LDDDDDD(D) or LDDDDDD(A). The
last character in the detected string is used to validate a checksum.

Table 45-426 Hong Kong ID wide-breadth patterns

Patterns

[A-Za-z]\d{6}(\d)

[A-Za-z][A-Za-z]\d{6}(\d)

[A-Za-z]\d{6}(A)

[A-Za-z]\d{6}(a)

[A-Za-z][A-Za-z]\d{6}(A)

[A-Za-z][A-Za-z]\d{6}(a)

[A-Za-z]\d{7}

[A-Za-z][A-Za-z]\d{7}

[A-Za-z]\d{6}[Aa]

[A-Za-z][A-Za-z]\d{6}[Aa]

Table 45-427 Hong Kong ID wide-breadth validator

Mandatory validator Description

Hong Kong ID Computes the checksum and validates the pattern against it.
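
The guide says only that the last character is used to validate a checksum. The Python sketch below assumes the commonly published Hong Kong ID scheme: letters count as A=10 through Z=35, a single-letter prefix is padded with a leading space that counts as 36, the eight leading characters are weighted 9 down to 2, and the weighted sum plus the check value (with A standing for 10) must be divisible by 11. is_valid_hkid is a hypothetical name, and parentheses around the check character are treated as optional to cover both pattern styles in the table.

```python
def _char_value(ch: str) -> int:
    """Map a space to 36 and the letters A-Z to 10-35."""
    if ch == " ":
        return 36
    return ord(ch) - ord("A") + 10


def is_valid_hkid(candidate: str) -> bool:
    """Validate forms such as L123456(D), LL123456(A), or L1234567."""
    compact = candidate.replace("(", "").replace(")", "").strip().upper()
    if len(compact) not in (8, 9):
        return False
    prefix, body, check = compact[:-7], compact[-7:-1], compact[-1]
    if not (prefix.isascii() and prefix.isalpha()):
        return False
    if not (body.isascii() and body.isdigit()) or check not in "0123456789A":
        return False
    padded = prefix.rjust(2)  # one-letter prefixes get a leading space
    values = [_char_value(c) for c in padded] + [int(c) for c in body]
    total = sum(v * w for v, w in zip(values, range(9, 1, -1)))
    check_value = 10 if check == "A" else int(check)
    return (total + check_value) % 11 == 0
```

With these assumptions, is_valid_hkid("A123456(3)") returns True and is_valid_hkid("A123456(4)") returns False.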

Hong Kong ID narrow breadth


The narrow breadth detects eight characters in the form LDDDDDD(D) or LDDDDDD(A). The
last character in the detected string is used to validate a checksum. It also requires the presence
of Hong Kong ID-related keywords.

Table 45-428 Hong Kong ID narrow-breadth patterns

Patterns

[A-Za-z]\d{6}(\d)

[A-Za-z][A-Za-z]\d{6}(\d)

[A-Za-z]\d{6}(A)

[A-Za-z]\d{6}(a)

[A-Za-z][A-Za-z]\d{6}(A)

[A-Za-z][A-Za-z]\d{6}(a)

[A-Za-z]\d{7}

[A-Za-z][A-Za-z]\d{7}

[A-Za-z]\d{6}[Aa]

[A-Za-z][A-Za-z]\d{6}[Aa]

Table 45-429 Hong Kong ID narrow-breadth validators

Mandatory validators Description

Hong Kong ID Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

身份證, 三顆星, Identity card, Hong Kong permanent resident ID Card, HKID

Hungary Driver's Licence Number


A driving license in Hungary is a document issued by the Ministry of Economics and Transport,
confirming the rights of the holder to drive motor vehicles.
The Hungary Driver's Licence Number data identifier detects an eight-character alphanumeric
pattern that matches the Hungary Driver's Licence Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern that matches the Hungary
Driver's Licence Number format. It checks for common test patterns.
See “Hungary Driver's Licence Number wide breadth” on page 1218.
■ The narrow breadth detects an eight-character alphanumeric pattern that matches the
Hungary Driver's Licence Number format. It checks for common test patterns, and it requires
the presence of related keywords.
See “Hungary Driver's Licence Number narrow breadth” on page 1218.

Hungary Driver's Licence Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern that matches the Hungary
Driver's Licence Number format. It checks for common test patterns.

Table 45-430 Hungary Driver's Licence Number wide-breadth patterns

Pattern

[Cc][A-Za-z]\d{6}

Table 45-431 Hungary Driver's Licence Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:
000000, 111111, 222222, 333333, 444444, 555555,
666666, 777777, 888888, 999999

Hungary Driver's Licence Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern that matches the Hungary
Driver's Licence Number format. It checks for common test patterns, and it requires the presence
of related keywords.

Table 45-432 Hungary Driver's Licence Number narrow-breadth patterns

Pattern

[Cc][A-Za-z]\d{6}

Table 45-433 Hungary Driver's Licence Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555,
666666, 777777, 888888, 999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

DLNo#, dlno#, DL#, Drivers Lic., driver licence, driver license, drivers licence,
drivers license, driver's licence, driver's license, driving licence, driving license,
licence number, license number, driving permit

jogosítvány, Illesztőprogramok Lic, jogsi, licencszám, vezetői engedély,
VEZETŐI ENGEDÉLY, vezető engedély, VEZETŐ ENGEDÉLY

Hungary Passport Number


Hungarian passports are issued to Hungarian citizens for international travel by the Central
Data Processing, Registration, and Election Office of the Hungarian Ministry of the Interior.
The Hungary Passport Number data identifier detects an eight- or nine-character alphanumeric
pattern that matches the Hungary Passport Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight- or nine-character alphanumeric pattern that matches
the Hungary Passport Number format without checksum validation.
See “Hungary Passport Number wide breadth” on page 1220.
■ The medium breadth detects an eight- or nine-character alphanumeric pattern that matches
the Hungary Passport Number format with checksum validation.
See “Hungary Passport Number medium breadth” on page 1220.
■ The narrow breadth detects an eight- or nine-character alphanumeric pattern that matches
the Hungary Passport Number format with checksum validation. It also requires the presence
of related keywords.
See “Hungary Passport Number narrow breadth” on page 1220.

Hungary Passport Number wide breadth


The wide breadth detects an eight- or nine-character alphanumeric pattern that matches the
Hungary Passport Number format without checksum validation.

Table 45-434 Hungary Passport Number wide-breadth patterns

Pattern

[A-Za-z]{2}[0-9]{6}

[A-Za-z]{2}[0-9]{7}

Table 45-435 Hungary Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Hungary Passport Number medium breadth


The medium breadth detects an eight- or nine-character alphanumeric pattern that matches
the Hungary Passport Number format with checksum validation.

Table 45-436 Hungary Passport Number medium-breadth patterns

Pattern

[A-Za-z]{2}[0-9]{6}

[A-Za-z]{2}[0-9]{7}

Table 45-437 Hungary Passport Number medium-breadth validators

Mandatory validator Description

Hungary Passport Number Validation Check Computes the checksum and validates the pattern against
it.

Hungary Passport Number narrow breadth


The narrow breadth detects an eight- or nine-character alphanumeric pattern that matches
the Hungary Passport Number format with checksum validation. It also requires the presence
of related keywords.

Table 45-438 Hungary Passport Number narrow-breadth patterns

Pattern

[A-Za-z]{2}[0-9]{6}

[A-Za-z]{2}[0-9]{7}

Table 45-439 Hungary Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Hungary Passport Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

passport, útlevél, hungarian passport number,
Magyar útlevélszám, hungarianpassportnumber,
passport book, útlevél könyv, passeport, nombre,
numéro de passeport, hongrois, numéro de passeport hongrois

Hungarian Social Security Number


The Hungarian Social Security Number (TAJ) is a unique identifier issued by the Hungarian
government.
The Hungarian Social Security Number data identifier detects a nine-digit number that matches
the Hungarian Social Security Number format.
The Hungarian Social Security Number system data identifier provides three breadths of
detection:
■ The wide breadth detects a nine-digit number without checksum validation.
See “Hungarian Social Security Number wide breadth” on page 1222.
■ The medium breadth detects a nine-digit number with checksum validation.
See “Hungarian Social Security Number medium breadth” on page 1222.
■ The narrow breadth detects a nine-digit number that passes checksum validation. It also
requires related keywords.
See “Hungarian Social Security Number narrow breadth” on page 1222.

Hungarian Social Security Number wide breadth


The wide breadth detects a nine-digit number without checksum validation.

Table 45-440 Hungarian Social Security Number wide-breadth pattern

Pattern

\d{9}

Table 45-441 Hungarian Social Security Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Hungarian Social Security Number medium breadth


The medium breadth detects a nine-digit number with checksum validation.

Table 45-442 Hungarian Social Security Number medium-breadth pattern

Pattern

\d{9}

Table 45-443 Hungarian Social Security Number medium-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Hungarian Social Security Validation Check Computes the checksum and validates the pattern against
it.
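
This guide does not describe the checksum that the Hungarian Social Security Validation Check computes. As a hedged illustration, the following Python sketch implements the publicly documented TAJ check-digit rule: the first eight digits are weighted alternately by 3 and 7, and the sum modulo 10 must equal the ninth digit. That the product validator uses exactly this rule is an assumption, and the function name is hypothetical.

```python
def taj_checksum_ok(number: str) -> bool:
    """Return True if a nine-digit string passes the assumed TAJ check."""
    if len(number) != 9 or any(c not in "0123456789" for c in number):
        return False
    digits = [int(c) for c in number]
    # Positions 1, 3, 5, 7 are weighted by 3; positions 2, 4, 6, 8 by 7.
    total = sum(d * (3 if i % 2 == 0 else 7) for i, d in enumerate(digits[:8]))
    # The ninth digit is the check digit.
    return total % 10 == digits[8]
```

With the synthetic value 123456788, the weighted sum is 188, so the expected check digit is 8 and the value passes; 123456789 fails.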

Hungarian Social Security Number narrow breadth


The narrow breadth detects a nine-digit number that passes checksum validation. It also
requires related keywords.

Table 45-444 Hungarian Social Security Number narrow-breadth pattern

Pattern

\d{9}

Table 45-445 Hungarian Social Security Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Hungarian Social Security Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Hungarian social security number, social security number,
socialsecuritynumber#, hssn#, HSSN#,
socialsecuritynno, HSSN, TAJ, TAJ#, SSN, SSN#,
social security no

ÁFA, Közösségi adószám, Általános forgalmi adó szám,
hozzáadottérték adó, ÁFA szám, magyar ÁFA szám

Hungarian Tax Identification Number


The Hungarian Tax Identification Number is a 10-digit number that always begins with the digit
"8."
The Hungarian Tax Identification Number data identifier detects a 10-digit number that matches
the Hungarian Tax Identification Number format.
The Hungarian Tax Identification Number system data identifier provides three breadths of
detection:
■ The wide breadth detects a 10-digit number beginning with the digit "8" without checksum
validation.
See “Hungarian Tax Identification Number wide breadth” on page 1224.
■ The medium breadth detects a 10-digit number beginning with the digit "8" with checksum
validation.
See “Hungarian Tax Identification Number medium breadth” on page 1224.
■ The narrow breadth detects a 10-digit number beginning with the digit "8" that passes
checksum validation. It also requires the presence of related keywords.
See “Hungarian Tax Identification Number narrow breadth” on page 1224.

Hungarian Tax Identification Number wide breadth


The wide breadth detects a 10-digit number beginning with the digit "8" without checksum
validation.

Table 45-446 Hungarian Tax Identification Number wide-breadth pattern

Pattern

[8]\d{9}

Table 45-447 Hungarian Tax Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Hungarian Tax Identification Number medium breadth


The medium breadth detects a 10-digit number beginning with the digit "8" with checksum
validation.

Table 45-448 Hungarian Tax Identification Number medium-breadth pattern

Pattern

[8]\d{9}

Table 45-449 Hungarian Tax Identification Number medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Hungarian Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.
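
This guide does not spell out the checksum behind the Hungarian Tax Identification Number Validation Check. The following Python sketch shows the publicly documented rule for the personal tax identifier as an illustration: each of the first nine digits is multiplied by its position (1 through 9), and the sum modulo 11 must equal the tenth digit. Treat the match with the product validator as an assumption; the function name is hypothetical.

```python
def hungarian_tin_checksum_ok(number: str) -> bool:
    """Return True if a 10-digit string passes the assumed check."""
    if len(number) != 10 or any(c not in "0123456789" for c in number):
        return False
    # The pattern table requires a leading 8.
    if number[0] != "8":
        return False
    # Weight each of the first nine digits by its 1-based position.
    total = sum(int(c) * position for position, c in enumerate(number[:9], start=1))
    # A remainder of 10 can never equal a single digit, so it is rejected.
    return total % 11 == int(number[9])
```

For the synthetic value 8123456786, the weighted sum is 248 and 248 mod 11 is 6, so the value passes; 8123456789 fails.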

Hungarian Tax Identification Number narrow breadth


The narrow breadth detects a 10-digit number beginning with the digit "8" that passes checksum
validation. It also requires the presence of related keywords.

Table 45-450 Hungarian Tax Identification Number narrow-breadth pattern

Pattern

[8]\d{9}

Table 45-451 Hungarian Tax Identification Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Hungarian Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Hungarian tax identification number, Hungarian TIN,
tax ID number, VAT number, tax authority no, tax ID,
tax identity number, taxidnumber#, tin#, TIN#,
Hungatiantin#, tax identification no, taxIDno#,
adóazonosító szám, adószám, adóhatóság szám

Hungarian VAT Number


All Hungarian businesses, including non-profit organizations, are granted a value-added tax
(VAT) number upon registration at the Court of Registry.
The Hungarian VAT Number data identifier detects a 10-character alphanumeric pattern (the
letters "HU" or "hu" followed by eight digits) that matches the Hungarian VAT Number format.
The Hungarian VAT Number system data identifier provides three breadths of detection:
■ The wide breadth detects a 10-character alphanumeric pattern (the letters "HU" or "hu"
followed by eight digits) without checksum validation.
See “Hungarian VAT Number wide breadth” on page 1226.
■ The medium breadth detects a 10-character alphanumeric pattern (the letters "HU" or "hu"
followed by eight digits) with checksum validation.
See “Hungarian VAT Number medium breadth” on page 1226.
■ The narrow breadth detects a 10-character alphanumeric pattern (the letters "HU" or "hu"
followed by eight digits) that passes checksum validation. It also requires the presence of
related keywords.
See “Hungarian VAT Number narrow breadth” on page 1226.

Hungarian VAT Number wide breadth


The wide breadth detects a 10-character alphanumeric pattern (the letters "HU" or "hu"
followed by eight digits) without checksum validation.

Table 45-452 Hungarian VAT Number wide-breadth patterns

Patterns

HU\d{8}

hu\d{8}

Table 45-453 Hungarian VAT Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Hungarian VAT Number medium breadth


The medium breadth detects a 10-character alphanumeric pattern (the letters "HU" or "hu"
followed by eight digits) with checksum validation.

Table 45-454 Hungarian VAT Number medium-breadth patterns

Patterns

HU\d{8}

hu\d{8}

Table 45-455 Hungarian VAT Number medium-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Hungarian VAT Number Validation Check Computes the checksum and validates the pattern against
it.
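
The checksum used by the Hungarian VAT Number Validation Check is not documented in this guide. As an illustration, the following Python sketch applies the commonly documented rule for the eight-digit Hungarian VAT number: the digits are weighted by 9, 7, 3, 1, 9, 7, 3, 1 and the weighted sum must be divisible by 10. That the product validator works this way is an assumption, and the function name is hypothetical.

```python
import re

# Assumed weights for the eight digits that follow the HU or hu prefix.
WEIGHTS = (9, 7, 3, 1, 9, 7, 3, 1)


def hungarian_vat_checksum_ok(candidate: str) -> bool:
    """Return True if HU or hu plus eight digits passes the assumed check."""
    # Mirrors the two table patterns: HU\d{8} and hu\d{8}.
    match = re.fullmatch(r"(?:HU|hu)([0-9]{8})", candidate)
    if match is None:
        return False
    total = sum(w * int(d) for w, d in zip(WEIGHTS, match.group(1)))
    return total % 10 == 0
```

For the synthetic value HU12345676, the weighted sum is 150, so the value passes; HU12345678 gives 152 and fails.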

Hungarian VAT Number narrow breadth


The narrow breadth detects a 10-character alphanumeric pattern (the letters "HU" or "hu"
followed by eight digits) that passes checksum validation. It also requires the presence of
related keywords.

Table 45-456 Hungarian VAT Number narrow-breadth patterns

Patterns

HU\d{8}

hu\d{8}

Table 45-457 Hungarian VAT Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Hungarian VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

VAT, VAT No., Value Added Tax Number, vat#, vatno#,
hungarianvatno#, tax no., VAT number, value added tax

ÁFA, Közösségi adószám, Általános forgalmi adó szám,
hozzáadottérték adó, ÁFA szám, magyar ÁFA szám

IBAN Central
The International Bank Account Number (IBAN) is an international standard for identifying
bank accounts across national borders.
The IBAN Central data identifier detects IBAN numbers for Andorra, Austria, Belgium, Germany,
Italy, Liechtenstein, Luxembourg, Malta, Monaco, San Marino, and Switzerland.
The IBAN Central data identifier provides two breadths of detection:
■ The wide breadth detects a country-specific IBAN number with checksum validation.
See “IBAN Central wide breadth” on page 1228.
■ The narrow breadth detects a country-specific IBAN number with checksum validation. It
also requires the presence of related keywords.
See “IBAN Central narrow breadth” on page 1229.

Note: Do not add the NIB validation to any IBAN data identifiers that apply to DLP Agents. The
NIB validator is only for use with server-side detection.

IBAN Central wide breadth


The wide breadth detects a country-specific IBAN number with checksum validation. IBAN
numbers can include space delimiters, dash delimiters, or no delimiters.

Table 45-458 IBAN Central wide-breadth patterns

Patterns Description

AD\d{2}\d{4}\d{4}\w{4}\w{4}\w{4} Andorra patterns

AD\d{2} \d{4} \d{4} \w{4} \w{4} \w{4}

AD\d{2}-\d{4}-\d{4}-\w{4}-\w{4}-\w{4}

AT\d{2}\d{4}\d{4}\d{4}\d{4} Austria patterns

AT\d{2} \d{4} \d{4} \d{4} \d{4}

AT\d{2}-\d{4}-\d{4}-\d{4}-\d{4}

BE\d{2}\d{4}\d{4}\d{4} Belgium patterns

BE\d{2} \d{4} \d{4} \d{4}

BE\d{2}-\d{4}-\d{4}-\d{4}

CH\d{2}\d{4}\d\w{3}\w{4}\w{4}\w Switzerland patterns

CH\d{2} \d{4} \d\w{3} \w{4} \w{4} \w

CH\d{2}-\d{4}-\d\w{3}-\w{4}-\w{4}-\w

DE\d{2}\d{4}\d{4}\d{4}\d{4}\d{2} Germany patterns

DE\d{2} \d{4} \d{4} \d{4} \d{4} \d{2}

DE\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{2}

IT\d{2}[A-Z]\d{3}\d{4}\d{3}\w\w{4}\w{4}\w{3} Italy patterns

IT\d{2} [A-Z]\d{3} \d{4} \d{3}\w \w{4} \w{4} \w{3}

IT\d{2}-[A-Z]\d{3}-\d{4}-\d{3}\w-\w{4}-\w{4}-\w{3}

LI\d{2}\d{4}\d\w{3}\w{4}\w{4}\w Liechtenstein patterns

LI\d{2} \d{4} \d\w{3} \w{4} \w{4} \w

LI\d{2}-\d{4}-\d\w{3}-\w{4}-\w{4}-\w

LU\d{2}\d{3}\w\w{4}\w{4}\w{4} Luxembourg patterns

LU\d{2} \d{3}\w \w{4} \w{4} \w{4}

LU\d{2}-\d{3}\w-\w{4}-\w{4}-\w{4}

MC\d{2}\d{4}\d{4}\d{2}\w{2}\w{4}\w{4}\w\d{2} Monaco patterns

MC\d{2} \d{4} \d{4} \d{2}\w{2} \w{4} \w{4} \w\d{2}

MC\d{2}-\d{4}-\d{4}-\d{2}\w{2}-\w{4}-\w{4}-\w\d{2}

MT\d{2}[A-Z]{4}\d{4}\d\w{3}\w{4}\w{4}\w{4}\w{3} Malta patterns

MT\d{2} [A-Z]{4} \d{4} \d\w{3} \w{4} \w{4} \w{4} \w{3}

MT\d{2}-[A-Z]{4}-\d{4}-\d\w{3}-\w{4}-\w{4}-\w{4}-\w{3}

SM\d{2}[A-Z]\d{3}\d{4}\d{3}\w\w{4}\w{4}\w{3} San Marino patterns

SM\d{2} [A-Z]\d{3} \d{4} \d{3}\w \w{4} \w{4} \w{3}

SM\d{2}-[A-Z]\d{3}-\d{4}-\d{3}\w-\w{4}-\w{4}-\w{3}

Table 45-459 IBAN Central wide-breadth validator

Validator Description

Mod 97 Validator Computes the ISO 7064 Mod 97-10 checksum of the
complete match.
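
The ISO 7064 Mod 97-10 check named above is the standard IBAN checksum. The following Python sketch shows the standard algorithm applied to a matched string: delimiters are removed, the first four characters are moved to the end, letters are converted to numbers (A = 10 through Z = 35), and the resulting integer must leave a remainder of 1 when divided by 97. The function name is hypothetical, and the sketch does not show how the product implements the validator.

```python
import re


def iban_mod97_ok(candidate: str) -> bool:
    """Return True if the candidate passes the ISO 7064 Mod 97-10 check."""
    # Remove the space or dash delimiters allowed by the patterns.
    iban = re.sub(r"[ -]", "", candidate).upper()
    if not re.fullmatch(r"[A-Z]{2}[0-9]{2}[A-Z0-9]+", iban):
        return False
    # Move the country code and check digits to the end.
    rearranged = iban[4:] + iban[:4]
    # Base-36 conversion maps 0-9 to 0-9 and A-Z to 10-35.
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

The widely published sample IBANs DE89 3704 0044 0532 0130 00 and AT61 1904 3002 3457 3201 pass this check with or without delimiters. Changing the final digit of either makes it fail.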

IBAN Central narrow breadth


The narrow breadth detects a country-specific IBAN number with checksum validation. It also
requires the presence of related keywords.

Table 45-460 IBAN Central narrow-breadth patterns

Patterns Description

AD\d{2}\d{4}\d{4}\w{4}\w{4}\w{4} Andorra patterns

AD\d{2} \d{4} \d{4} \w{4} \w{4} \w{4}

AD\d{2}-\d{4}-\d{4}-\w{4}-\w{4}-\w{4}

AT\d{2}\d{4}\d{4}\d{4}\d{4} Austria patterns

AT\d{2} \d{4} \d{4} \d{4} \d{4}

AT\d{2}-\d{4}-\d{4}-\d{4}-\d{4}

BE\d{2}\d{4}\d{4}\d{4} Belgium patterns

BE\d{2} \d{4} \d{4} \d{4}

BE\d{2}-\d{4}-\d{4}-\d{4}

CH\d{2}\d{4}\d\w{3}\w{4}\w{4}\w Switzerland patterns

CH\d{2} \d{4} \d\w{3} \w{4} \w{4} \w

CH\d{2}-\d{4}-\d\w{3}-\w{4}-\w{4}-\w

DE\d{2}\d{4}\d{4}\d{4}\d{4}\d{2} Germany patterns

DE\d{2} \d{4} \d{4} \d{4} \d{4} \d{2}

DE\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{2}

IT\d{2}[A-Z]\d{3}\d{4}\d{3}\w\w{4}\w{4}\w{3} Italy patterns

IT\d{2} [A-Z]\d{3} \d{4} \d{3}\w \w{4} \w{4} \w{3}

IT\d{2}-[A-Z]\d{3}-\d{4}-\d{3}\w-\w{4}-\w{4}-\w{3}

LI\d{2}\d{4}\d\w{3}\w{4}\w{4}\w Liechtenstein patterns

LI\d{2} \d{4} \d\w{3} \w{4} \w{4} \w

LI\d{2}-\d{4}-\d\w{3}-\w{4}-\w{4}-\w

LU\d{2}\d{3}\w\w{4}\w{4}\w{4} Luxembourg patterns

LU\d{2} \d{3}\w \w{4} \w{4} \w{4}

LU\d{2}-\d{3}\w-\w{4}-\w{4}-\w{4}

MC\d{2}\d{4}\d{4}\d{2}\w{2}\w{4}\w{4}\w\d{2} Monaco patterns

MC\d{2} \d{4} \d{4} \d{2}\w{2} \w{4} \w{4} \w\d{2}

MC\d{2}-\d{4}-\d{4}-\d{2}\w{2}-\w{4}-\w{4}-\w\d{2}

MT\d{2}[A-Z]{4}\d{4}\d\w{3}\w{4}\w{4}\w{4}\w{3} Malta patterns

MT\d{2} [A-Z]{4} \d{4} \d\w{3} \w{4} \w{4} \w{4} \w{3}

MT\d{2}-[A-Z]{4}-\d{4}-\d\w{3}-\w{4}-\w{4}-\w{4}-\w{3}

SM\d{2}[A-Z]\d{3}\d{4}\d{3}\w\w{4}\w{4}\w{3} San Marino patterns

SM\d{2} [A-Z]\d{3} \d{4} \d{3}\w \w{4} \w{4} \w{3}

SM\d{2}-[A-Z]\d{3}-\d{4}-\d{3}\w-\w{4}-\w{4}-\w{3}

Table 45-461 IBAN Central narrow-breadth validators

Validators Description

Mod 97 Validator Computes the ISO 7064 Mod 97-10 checksum of the
complete match.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

Code IBAN, numéro IBAN, IBAN Code, IBAN number

IBAN East
The International Bank Account Number (IBAN) is an international standard for identifying
bank accounts across national borders.
The IBAN East data identifier detects IBAN numbers for Bosnia, Bulgaria, Croatia, Cyprus,
Czech Republic, Estonia, Greece, Hungary, Israel, Latvia, Lithuania, Macedonia, Montenegro,
Poland, Romania, Serbia, Slovakia, Slovenia, Turkey, and Tunisia.
The IBAN East data identifier provides two breadths of detection:
■ The wide breadth detects a country-specific IBAN number with checksum validation.
See “IBAN East wide breadth” on page 1232.
■ The narrow breadth detects a country-specific IBAN number with checksum validation. It
also requires the presence of related keywords.
See “IBAN East narrow-breadth” on page 1234.

Note: Do not add the NIB validation to any IBAN data identifiers that apply to DLP Agents. The
NIB validator is only for use with server-side detection.

IBAN East wide breadth


The wide breadth detects a country-specific IBAN number with checksum validation. IBAN
numbers can include space delimiters, dash delimiters, or no delimiters.

Table 45-462 IBAN East wide-breadth patterns

Patterns Description

BA\d{2}\d{4}\d{4}\d{4}\d{4} Bosnia patterns

BA\d{2} \d{4} \d{4} \d{4} \d{4}

BA\d{2}-\d{4}-\d{4}-\d{4}-\d{4}

BG\d{2}[A-Z]{4}\d{4}\d{2}\w{2}\w{4}\w{2} Bulgaria patterns

BG\d{2} [A-Z]{4} \d{4} \d{2}\w{2} \w{4} \w{2}

BG\d{2}-[A-Z]{4}-\d{4}-\d{2}\w{2}-\w{4}-\w{2}

CY\d{2}\d{4}\d{4}\w{4}\w{4}\w{4}\w{4} Cyprus patterns

CY\d{2} \d{4} \d{4} \w{4} \w{4} \w{4} \w{4}

CY\d{2}-\d{4}-\d{4}-\w{4}-\w{4}-\w{4}-\w{4}

CZ\d{2}\d{4}\d{4}\d{4}\d{4}\d{4} Czech Republic patterns

CZ\d{2} \d{4} \d{4} \d{4} \d{4} \d{4}

CZ\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

EE\d{2}\d{4}\d{4}\d{4}\d{4} Estonia patterns

EE\d{2} \d{4} \d{4} \d{4} \d{4}

EE\d{2}-\d{4}-\d{4}-\d{4}-\d{4}

GR\d{2}\d{4}\d{3}\w\w{4}\w{4}\w{4}\w{3} Greece patterns

GR\d{2} \d{4} \d{3}\w \w{4} \w{4} \w{4} \w{3}

GR\d{2}-\d{4}-\d{3}\w-\w{4}-\w{4}-\w{4}-\w{3}

HR\d{2}\d{4}\d{4}\d{4}\d{4}\d Croatia patterns

HR\d{2} \d{4} \d{4} \d{4} \d{4} \d

HR\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d

HU\d{2}\d{4}\d{4}\d{4}\d{4}\d{4}\d{4} Hungary patterns

HU\d{2} \d{4} \d{4} \d{4} \d{4} \d{4} \d{4}

HU\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

IL\d{2}\d{4}\d{4}\d{4}\d{4}\d{3} Israel patterns

IL\d{2} \d{4} \d{4} \d{4} \d{4} \d{3}

IL\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{3}

LT\d{2}\d{4}\d{4}\d{4}\d{4} Lithuania patterns

LT\d{2} \d{4} \d{4} \d{4} \d{4}

LT\d{2}-\d{4}-\d{4}-\d{4}-\d{4}

LV\d{2}[A-Z]{4}\w{4}\w{4}\w{4}\w Latvia patterns

LV\d{2} [A-Z]{4} \w{4} \w{4} \w{4} \w

LV\d{2}-[A-Z]{4}-\w{4}-\w{4}-\w{4}-\w

ME\d{2}\d{4}\d{4}\d{4}\d{4}\d{2} Montenegro patterns

ME\d{2} \d{4} \d{4} \d{4} \d{4} \d{2}

ME\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{2}

MK\d{2}\d{3}\w\w{4}\w{4}\w\d{2} Macedonia patterns

MK\d{2} \d{3}\w \w{4} \w{4} \w\d{2}

MK\d{2}-\d{3}\w-\w{4}-\w{4}-\w\d{2}

PL\d{2}\d{4}\d{4}\d{4}\d{4}\d{4}\d{4} Poland patterns

PL\d{2} \d{4} \d{4} \d{4} \d{4} \d{4} \d{4}

PL\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

RO\d{2}[A-Z]{4}\w{4}\w{4}\w{4}\w{4} Romania patterns

RO\d{2} [A-Z]{4} \w{4} \w{4} \w{4} \w{4}

RO\d{2}-[A-Z]{4}-\w{4}-\w{4}-\w{4}-\w{4}

RS\d{2}\d{4}\d{4}\d{4}\d{4}\d{2} Serbia patterns

RS\d{2} \d{4} \d{4} \d{4} \d{4} \d{2}

RS\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{2}

SI\d{2}\d{4}\d{4}\d{4}\d{3} Slovenia patterns

SI\d{2} \d{4} \d{4} \d{4} \d{3}

SI\d{2}-\d{4}-\d{4}-\d{4}-\d{3}

SK\d{2}\d{4}\d{4}\d{4}\d{4}\d{4} Slovak Republic patterns

SK\d{2} \d{4} \d{4} \d{4} \d{4} \d{4}

SK\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

TN59\d{4}\d{4}\d{4}\d{4}\d{4} Tunisia patterns

TN59 \d{4} \d{4} \d{4} \d{4} \d{4}

TN59-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

TR\d{2}\d{4}\d\w{3}\w{4}\w{4}\w{4}\w{2} Turkey patterns

TR\d{2} \d{4} \d\w{3} \w{4} \w{4} \w{4} \w{2}

TR\d{2}-\d{4}-\d\w{3}-\w{4}-\w{4}-\w{4}-\w{2}

Table 45-463 IBAN East wide-breadth validator

Validator Description

Mod 97 Validator Computes the ISO 7064 Mod 97-10 checksum of the
complete match.

IBAN East narrow-breadth


The narrow breadth detects a country-specific IBAN number with checksum validation. It also
requires the presence of related keywords.

Table 45-464 IBAN East narrow-breadth patterns

Patterns Description

BA\d{2}\d{4}\d{4}\d{4}\d{4} Bosnia patterns

BA\d{2} \d{4} \d{4} \d{4} \d{4}

BA\d{2}-\d{4}-\d{4}-\d{4}-\d{4}

BG\d{2}[A-Z]{4}\d{4}\d{2}\w{2}\w{4}\w{2} Bulgaria patterns

BG\d{2} [A-Z]{4} \d{4} \d{2}\w{2} \w{4} \w{2}

BG\d{2}-[A-Z]{4}-\d{4}-\d{2}\w{2}-\w{4}-\w{2}

CY\d{2}\d{4}\d{4}\w{4}\w{4}\w{4}\w{4} Cyprus patterns

CY\d{2} \d{4} \d{4} \w{4} \w{4} \w{4} \w{4}

CY\d{2}-\d{4}-\d{4}-\w{4}-\w{4}-\w{4}-\w{4}

CZ\d{2}\d{4}\d{4}\d{4}\d{4}\d{4} Czech Republic patterns

CZ\d{2} \d{4} \d{4} \d{4} \d{4} \d{4}

CZ\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

EE\d{2}\d{4}\d{4}\d{4}\d{4} Estonia patterns

EE\d{2} \d{4} \d{4} \d{4} \d{4}

EE\d{2}-\d{4}-\d{4}-\d{4}-\d{4}

GR\d{2}\d{4}\d{3}\w\w{4}\w{4}\w{4}\w{3} Greece patterns

GR\d{2} \d{4} \d{3}\w \w{4} \w{4} \w{4} \w{3}

GR\d{2}-\d{4}-\d{3}\w-\w{4}-\w{4}-\w{4}-\w{3}

HR\d{2}\d{4}\d{4}\d{4}\d{4}\d Croatia patterns

HR\d{2} \d{4} \d{4} \d{4} \d{4} \d

HR\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d

HU\d{2}\d{4}\d{4}\d{4}\d{4}\d{4}\d{4} Hungary patterns

HU\d{2} \d{4} \d{4} \d{4} \d{4} \d{4} \d{4}

HU\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

IL\d{2}\d{4}\d{4}\d{4}\d{4}\d{3} Israel patterns

IL\d{2} \d{4} \d{4} \d{4} \d{4} \d{3}

IL\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{3}

LT\d{2}\d{4}\d{4}\d{4}\d{4} Lithuania patterns

LT\d{2} \d{4} \d{4} \d{4} \d{4}

LT\d{2}-\d{4}-\d{4}-\d{4}-\d{4}

LV\d{2}[A-Z]{4}\w{4}\w{4}\w{4}\w Latvia patterns

LV\d{2} [A-Z]{4} \w{4} \w{4} \w{4} \w

LV\d{2}-[A-Z]{4}-\w{4}-\w{4}-\w{4}-\w

ME\d{2}\d{4}\d{4}\d{4}\d{4}\d{2} Montenegro patterns

ME\d{2} \d{4} \d{4} \d{4} \d{4} \d{2}

ME\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{2}

MK\d{2}\d{3}\w\w{4}\w{4}\w\d{2} Macedonia patterns

MK\d{2} \d{3}\w \w{4} \w{4} \w\d{2}

MK\d{2}-\d{3}\w-\w{4}-\w{4}-\w\d{2}

PL\d{2}\d{4}\d{4}\d{4}\d{4}\d{4}\d{4} Poland patterns

PL\d{2} \d{4} \d{4} \d{4} \d{4} \d{4} \d{4}

PL\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

RO\d{2}[A-Z]{4}\w{4}\w{4}\w{4}\w{4} Romania patterns

RO\d{2} [A-Z]{4} \w{4} \w{4} \w{4} \w{4}

RO\d{2}-[A-Z]{4}-\w{4}-\w{4}-\w{4}-\w{4}

RS\d{2}\d{4}\d{4}\d{4}\d{4}\d{2} Serbia patterns

RS\d{2} \d{4} \d{4} \d{4} \d{4} \d{2}

RS\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{2}

SI\d{2}\d{4}\d{4}\d{4}\d{3} Slovenia patterns

SI\d{2} \d{4} \d{4} \d{4} \d{3}

SI\d{2}-\d{4}-\d{4}-\d{4}-\d{3}

SK\d{2}\d{4}\d{4}\d{4}\d{4}\d{4} Slovak Republic patterns

SK\d{2} \d{4} \d{4} \d{4} \d{4} \d{4}

SK\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

TN59\d{4}\d{4}\d{4}\d{4}\d{4} Tunisia patterns

TN59 \d{4} \d{4} \d{4} \d{4} \d{4}

TN59-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

TR\d{2}\d{4}\d\w{3}\w{4}\w{4}\w{4}\w{2} Turkey patterns

TR\d{2} \d{4} \d\w{3} \w{4} \w{4} \w{4} \w{2}

TR\d{2}-\d{4}-\d\w{3}-\w{4}-\w{4}-\w{4}-\w{2}

Table 45-465 IBAN East narrow-breadth validators

Validators Description

Mod 97 Validator Computes the ISO 7064 Mod 97-10 checksum of the
complete match.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

Code IBAN, numéro IBAN, IBAN Code, IBAN number

IBAN West
The International Bank Account Number (IBAN) is an international standard for identifying
bank accounts across national borders.
The IBAN West data identifier detects IBAN numbers for Denmark, Faroe Islands, Finland,
France, Gibraltar, Greenland, Iceland, Ireland, Netherlands, Norway, Portugal, Spain, Sweden,
and the United Kingdom.
The IBAN West data identifier provides two breadths of detection:
■ The wide breadth detects a country-specific IBAN number with checksum validation.
See “IBAN West wide breadth” on page 1237.
■ The narrow breadth detects a country-specific IBAN number with checksum validation. It
also requires the presence of related keywords.
See “IBAN West narrow-breadth” on page 1239.

Note: Do not add the NIB validation to any IBAN data identifiers that apply to DLP Agents. The
NIB validator is only for use with server-side detection.

IBAN West wide breadth


The wide breadth detects a country-specific IBAN number that passes a checksum. IBAN
numbers can include space delimiters, dash delimiters, or no delimiters.

Table 45-466 IBAN West wide-breadth patterns

Patterns Description

DK\d{2}\d{4}\d{4}\d{4}\d{2} Denmark patterns

DK\d{2} \d{4} \d{4} \d{4} \d{2}

DK\d{2}-\d{4}-\d{4}-\d{4}-\d{2}

ES\d{2}\d{4}\d{4}\d{4}\d{4}\d{4} Spain patterns

ES\d{2} \d{4} \d{4} \d{4} \d{4} \d{4}

ES\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

FI\d{2}\d{4}\d{4}\d{4}\d{2} Finland patterns

FI\d{2} \d{4} \d{4} \d{4} \d{2}

FI\d{2}-\d{4}-\d{4}-\d{4}-\d{2}

FO\d{2}\d{4}\d{4}\d{4}\d{2} Faroe Islands patterns

FO\d{2} \d{4} \d{4} \d{4} \d{2}

FO\d{2}-\d{4}-\d{4}-\d{4}-\d{2}

FR\d{2}\d{4}\d{4}\d{2}\w{2}\w{4}\w{4}\w\d{2} France patterns

FR\d{2} \d{4} \d{4} \d{2}\w{2} \w{4} \w{4} \w\d{2}

FR\d{2}-\d{4}-\d{4}-\d{2}\w{2}-\w{4}-\w{4}-\w\d{2}

GB\d{2}[A-Z]{4}\d{4}\d{4}\d{4}\d{2} United Kingdom patterns

GB\d{2} [A-Z]{4} \d{4} \d{4} \d{4} \d{2}

GB\d{2}-[A-Z]{4}-\d{4}-\d{4}-\d{4}-\d{2}

GI\d{2}[A-Z]{4}\w{4}\w{4}\w{4}\w{3} Gibraltar patterns

GI\d{2} [A-Z]{4} \w{4} \w{4} \w{4} \w{3}

GI\d{2}-[A-Z]{4}-\w{4}-\w{4}-\w{4}-\w{3}

GL\d{2}\d{4}\d{4}\d{4}\d{2} Greenland patterns

GL\d{2} \d{4} \d{4} \d{4} \d{2}

GL\d{2}-\d{4}-\d{4}-\d{4}-\d{2}

IE\d{2}[A-Z]{4}\d{4}\d{4}\d{4}\d{2} Ireland patterns

IE\d{2} [A-Z]{4} \d{4} \d{4} \d{4} \d{2}

IE\d{2}-[A-Z]{4}-\d{4}-\d{4}-\d{4}-\d{2}

IS\d{2}\d{4}\d{4}\d{4}\d{4}\d{4}\d{2} Iceland patterns

IS\d{2} \d{4} \d{4} \d{4} \d{4} \d{4} \d{2}

IS\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}-\d{2}

NL\d{2}[A-Z]{4}\d{4}\d{4}\d{2} Netherlands patterns

NL\d{2} [A-Z]{4} \d{4} \d{4} \d{2}

NL\d{2}-[A-Z]{4}-\d{4}-\d{4}-\d{2}

NO\d{2}\d{4}\d{4}\d{3} Norway patterns

NO\d{2} \d{4} \d{4} \d{3}

NO\d{2}-\d{4}-\d{4}-\d{3}

PT\d{2}\d{4}\d{4}\d{4}\d{4}\d{4}\d Portugal patterns

PT\d{2} \d{4} \d{4} \d{4} \d{4} \d{4} \d

PT\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}-\d

SE\d{2}\d{4}\d{4}\d{4}\d{4}\d{4} Sweden patterns

SE\d{2} \d{4} \d{4} \d{4} \d{4} \d{4}

SE\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

Table 45-467 IBAN West wide-breadth validator

Validator Description

Mod 97 Validator Computes the ISO 7064 Mod 97-10 checksum of the
complete match.

IBAN West narrow-breadth


The narrow breadth detects a country-specific IBAN number that passes a checksum. It also
requires the presence of IBAN-related keywords.

Table 45-468 IBAN West narrow-breadth patterns

Patterns Description

DK\d{2}\d{4}\d{4}\d{4}\d{2} Denmark patterns

DK\d{2} \d{4} \d{4} \d{4} \d{2}

DK\d{2}-\d{4}-\d{4}-\d{4}-\d{2}

ES\d{2}\d{4}\d{4}\d{4}\d{4}\d{4} Spain patterns

ES\d{2} \d{4} \d{4} \d{4} \d{4} \d{4}

ES\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

FI\d{2}\d{4}\d{4}\d{4}\d{2} Finland patterns

FI\d{2} \d{4} \d{4} \d{4} \d{2}

FI\d{2}-\d{4}-\d{4}-\d{4}-\d{2}

FO\d{2}\d{4}\d{4}\d{4}\d{2} Faroe Islands patterns

FO\d{2} \d{4} \d{4} \d{4} \d{2}

FO\d{2}-\d{4}-\d{4}-\d{4}-\d{2}

FR\d{2}\d{4}\d{4}\d{2}\w{2}\w{4}\w{4}\w\d{2} France patterns

FR\d{2} \d{4} \d{4} \d{2}\w{2} \w{4} \w{4} \w\d{2}

FR\d{2}-\d{4}-\d{4}-\d{2}\w{2}-\w{4}-\w{4}-\w\d{2}

GB\d{2}[A-Z]{4}\d{4}\d{4}\d{4}\d{2} United Kingdom patterns

GB\d{2} [A-Z]{4} \d{4} \d{4} \d{4} \d{2}

GB\d{2}-[A-Z]{4}-\d{4}-\d{4}-\d{4}-\d{2}

GI\d{2}[A-Z]{4}\w{4}\w{4}\w{4}\w{3} Gibraltar patterns

GI\d{2} [A-Z]{4} \w{4} \w{4} \w{4} \w{3}

GI\d{2}-[A-Z]{4}-\w{4}-\w{4}-\w{4}-\w{3}

GL\d{2}\d{4}\d{4}\d{4}\d{2} Greenland patterns

GL\d{2} \d{4} \d{4} \d{4} \d{2}

GL\d{2}-\d{4}-\d{4}-\d{4}-\d{2}

IE\d{2}[A-Z]{4}\d{4}\d{4}\d{4}\d{2} Ireland patterns

IE\d{2} [A-Z]{4} \d{4} \d{4} \d{4} \d{2}

IE\d{2}-[A-Z]{4}-\d{4}-\d{4}-\d{4}-\d{2}

IS\d{2}\d{4}\d{4}\d{4}\d{4}\d{4}\d{2} Iceland patterns

IS\d{2} \d{4} \d{4} \d{4} \d{4} \d{4} \d{2}

IS\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}-\d{2}

NL\d{2}[A-Z]{4}\d{4}\d{4}\d{2} Netherlands patterns

NL\d{2} [A-Z]{4} \d{4} \d{4} \d{2}

NL\d{2}-[A-Z]{4}-\d{4}-\d{4}-\d{2}

NO\d{2}\d{4}\d{4}\d{3} Norway patterns

NO\d{2} \d{4} \d{4} \d{3}

NO\d{2}-\d{4}-\d{4}-\d{3}

PT\d{2}\d{4}\d{4}\d{4}\d{4}\d{4}\d Portugal patterns

PT\d{2} \d{4} \d{4} \d{4} \d{4} \d{4} \d

PT\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}-\d

SE\d{2}\d{4}\d{4}\d{4}\d{4}\d{4} Sweden patterns

SE\d{2} \d{4} \d{4} \d{4} \d{4} \d{4}

SE\d{2}-\d{4}-\d{4}-\d{4}-\d{4}-\d{4}

Table 45-469 IBAN West narrow-breadth validators

Validators Description

Mod 97 Validator Computes the ISO 7064 Mod 97-10 checksum of the
complete match.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

Code IBAN, numéro IBAN, IBAN Code, IBAN number

Iceland National Identification Number


The Iceland National Identification Number is a unique national identifier used by the Icelandic
government to identify individuals and organizations. It is administered by Registers Iceland.
Icelandic national identification numbers are issued to Icelandic citizens at birth and to foreign
nationals resident in Iceland upon registration. They are also issued to corporations and
institutions.
The Iceland National Identification Number data identifier detects a 10-digit number that
matches the Iceland National Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 10-digit number that matches the Iceland National Identification
Number format without checksum validation. It checks for common test numbers.
See “Iceland National Identification Number wide breadth” on page 1242.
■ The medium breadth detects a 10-digit number that matches the Iceland National
Identification Number format with checksum validation.
See “Iceland National Identification Number medium breadth” on page 1243.
■ The narrow breadth detects a 10-digit number that matches the Iceland National
Identification Number format with checksum validation. It checks for common test numbers,
and also requires the presence of related keywords.
See “Iceland National Identification Number narrow breadth” on page 1244.

Iceland National Identification Number wide breadth


The wide breadth detects a 10-digit number that matches the Iceland National Identification
Number format without checksum validation. It checks for common test numbers.

Table 45-470 Iceland National Identification Number wide-breadth patterns

Pattern

[04][1-9]0[1-9]\d{2}-\d{3}[09]

[1256][0-9]0[1-9]\d{2}-\d{3}[09]

[37][01]0[1-9]\d{2}-\d{3}[09]

[04][1-9]1[012]\d{2}-\d{3}[09]

[1256][0-9]1[012]\d{2}-\d{3}[09]

[37][01]1[012]\d{2}-\d{3}[09]

[04][1-9]0[1-9]\d{5}[09]

[1256][0-9]0[1-9]\d{5}[09]

[37][01]0[1-9]\d{5}[09]

[04][1-9]1[012]\d{5}[09]

[1256][0-9]1[012]\d{5}[09]

[37][01]1[012]\d{5}[09]

Table 45-471 Iceland National Identification Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of numbers is not all the same.

Iceland National Identification Number medium breadth


The medium breadth detects a 10-digit number that matches the Iceland National Identification
Number format with checksum validation.

Table 45-472 Iceland National Identification Number medium-breadth patterns

Pattern

[04][1-9]0[1-9]\d{2}-\d{3}[09]

[1256][0-9]0[1-9]\d{2}-\d{3}[09]

[37][01]0[1-9]\d{2}-\d{3}[09]

[04][1-9]1[012]\d{2}-\d{3}[09]

[1256][0-9]1[012]\d{2}-\d{3}[09]

[37][01]1[012]\d{2}-\d{3}[09]

[04][1-9]0[1-9]\d{5}[09]

[1256][0-9]0[1-9]\d{5}[09]

[37][01]0[1-9]\d{5}[09]

[04][1-9]1[012]\d{5}[09]

[1256][0-9]1[012]\d{5}[09]

[37][01]1[012]\d{5}[09]

Table 45-473 Iceland National Identification Number medium-breadth validators

Mandatory validator Description

Iceland National Identification Number Validation Check Computes the checksum and validates the pattern
against it.
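The guide does not spell out how the checksum is computed. As an illustration only, the following Python sketch uses the publicly documented kennitala rule: the first eight digits are weighted by 3, 2, 7, 6, 5, 4, 3, 2, and the ninth digit is the check digit. The function name is hypothetical, and the product's validator may differ in detail.

```python
import re

# Publicly documented weights for the first eight digits (an assumption
# about what the validator checks, not taken from this guide).
WEIGHTS = (3, 2, 7, 6, 5, 4, 3, 2)
SHAPE = re.compile(r"[0-9]{6}-?[0-9]{4}")


def kennitala_checksum_valid(value: str) -> bool:
    """Check the ninth digit of a 10-digit Icelandic identification number."""
    if not SHAPE.fullmatch(value):
        return False
    digits = value.replace("-", "")
    total = sum(w * int(d) for w, d in zip(WEIGHTS, digits))
    check = (11 - total % 11) % 11
    # A computed value of 10 cannot be written as one digit, so it is invalid.
    return check != 10 and check == int(digits[8])


print(kennitala_checksum_valid("120174-3399"))  # widely used sample number
```

The tenth digit indicates the century of birth (9 or 0), which is why every pattern in the table above ends in [09].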

Iceland National Identification Number narrow breadth


The narrow breadth detects a 10-digit number that matches the Iceland National Identification
Number format with checksum validation. It checks for common test numbers, and also requires
the presence of related keywords.

Table 45-474 Iceland National Identification Number narrow-breadth patterns

Pattern

[04][1-9]0[1-9]\d{2}-\d{3}[09]

[1256][0-9]0[1-9]\d{2}-\d{3}[09]

[37][01]0[1-9]\d{2}-\d{3}[09]

[04][1-9]1[012]\d{2}-\d{3}[09]

[1256][0-9]1[012]\d{2}-\d{3}[09]

[37][01]1[012]\d{2}-\d{3}[09]

[04][1-9]0[1-9]\d{5}[09]

[1256][0-9]0[1-9]\d{5}[09]

[37][01]0[1-9]\d{5}[09]

[04][1-9]1[012]\d{5}[09]

[1256][0-9]1[012]\d{5}[09]

[37][01]1[012]\d{5}[09]

Table 45-475 Iceland National Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of numbers is not all the same.

Iceland National Identification Number Validation Check Computes the checksum and validates the pattern
against it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

personal code, national ID, national identification
number, personal ID, personal identification number,
personal identification code, nationalid#, personalid#,
KH, KH#, magic number, magicnumber#, magicno#,
magic no., social security number, ssn, ssn#, social
security no., kennitala, kennitala#, tin, tax identification
number, tin#, tax id, tin no, tin number, tax number,
tax code, taxpayer id, taxpayer identification number,
persónuleg kennitala, galdur númer, skattanúmer,
skattgreiðenda kóða, kennitala skattgreiðenda

Iceland Passport Number


Icelandic passports are issued to citizens of Iceland for the purpose of international travel and
may also serve as a proof of Iceland citizenship.
The Iceland Passport Number data identifier detects an eight-character alphanumeric pattern
that matches the Iceland Passport Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern that matches the Iceland
Passport Number format. It checks for common test patterns.
See “Iceland Passport Number wide breadth” on page 1245.
■ The narrow breadth detects an eight-character alphanumeric pattern that matches the Iceland
Passport Number format. It checks for common test patterns, and also requires the presence
of related keywords.
See “Iceland Passport Number narrow breadth” on page 1246.

Iceland Passport Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern that matches the Iceland
Passport Number format. It checks for common test patterns.

Table 45-476 Iceland Passport Number wide-breadth patterns

Pattern

[A-Za-z]\d{7}

Table 45-477 Iceland Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999
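As a rough illustration of how the wide-breadth pattern and its two validators combine, the following Python sketch applies the pattern, a delimiter check, and the ending exclusions. The function name is hypothetical. Treating "surrounding characters" as "not immediately next to another letter or digit" is an assumption, not the product's documented behavior.

```python
import re

# Pattern from the table above, wrapped in lookarounds that stand in for
# the Number delimiter validator (assumed behavior).
PASSPORT = re.compile(r"(?<![A-Za-z0-9])[A-Za-z]\d{7}(?![A-Za-z0-9])")

# Exclude ending characters: seven repeated digits, 0000000 through 9999999.
EXCLUDED_ENDINGS = tuple(str(d) * 7 for d in range(10))


def find_passport_candidates(text: str) -> list:
    """Return matches that survive the delimiter and ending checks."""
    return [
        m.group()
        for m in PASSPORT.finditer(text)
        if not m.group().endswith(EXCLUDED_ENDINGS)
    ]


print(find_passport_candidates("Passport A1234567 issued; sample A1111111."))
```

Only A1234567 is returned here, because A1111111 ends with an excluded test value.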

Iceland Passport Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern that matches the Iceland Passport
Number format. It checks for common test patterns, and also requires the presence of related
keywords.

Table 45-478 Iceland Passport Number narrow-breadth patterns

Pattern

[A-Za-z]\d{7}

Table 45-479 Iceland Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

passport, passport number, passport no, passportno,
passport no., passport#, passportno#

vegabréf, vegabréfs númer, Vegabréf Nei, vegabréf#

Iceland Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. For Iceland, VAT is
administered by the VAT office for the region in which the business is established.
The Iceland Value Added Tax (VAT) Number data identifier detects a seven- or eight-character
alphanumeric pattern that matches the Iceland VAT Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a seven- or eight-character alphanumeric pattern that matches
the Iceland VAT Number format. It checks for common test patterns.
See “Iceland Value Added Tax (VAT) Number wide breadth” on page 1247.
■ The narrow breadth detects a seven- or eight-character alphanumeric pattern that matches
the Iceland VAT Number format. It checks for common test patterns, and also requires the
presence of related keywords.
See “Iceland Value Added Tax (VAT) Number narrow breadth” on page 1248.

Iceland Value Added Tax (VAT) Number wide breadth


The wide breadth detects a seven- or eight-character alphanumeric pattern that matches the
Iceland VAT Number format. It checks for common test patterns.

Table 45-480 Iceland Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Ii][Ss] \d\d\d\d\d

[Ii][Ss] \d\d\d\d\d\d

Table 45-481 Iceland Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000, 11111, 22222, 33333, 44444, 55555, 66666,
77777, 88888, 99999, 000000, 111111, 222222, 333333,
444444, 555555, 666666, 777777, 888888, 999999

Iceland Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a seven- or eight-character alphanumeric pattern that matches
the Iceland VAT Number format. It checks for common test patterns, and also requires the
presence of related keywords.

Table 45-482 Iceland Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Ii][Ss] \d\d\d\d\d

[Ii][Ss] \d\d\d\d\d\d

Table 45-483 Iceland Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000, 11111, 22222, 33333, 44444, 55555, 66666,
77777, 88888, 99999, 000000, 111111, 222222, 333333,
444444, 555555, 666666, 777777, 888888, 999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat, vat number, value added tax number

virðisaukaskattsnúmer, vsk númer


Indian Aadhaar Card Number


The Unique Identification Authority of India (UIDAI) is mandated to assign a 12-digit UID number, termed Aadhaar, to all the residents
of India. The Aadhaar number is robust enough to eliminate duplicate and fake identities and
can be verified and authenticated in a cost-effective way online.
The Indian Aadhaar Card Number data identifier detects a 12-digit number that matches the
Indian Aadhaar Card Number format.
The Indian Aadhaar Card Number data identifier provides three breadths of detection:
■ The wide breadth detects a 12-digit number without checksum validation.
See “Indian Aadhaar Card Number wide breadth” on page 1249.
■ The medium breadth detects a 12-digit number with checksum validation.
See “Indian Aadhaar Card Number medium breadth” on page 1249.
■ The narrow breadth detects a 12-digit number with checksum validation. It also requires
the presence of related keywords.
See “Indian Aadhaar Card Number narrow breadth” on page 1250.

Indian Aadhaar Card Number wide breadth


The wide breadth detects a 12-digit number without checksum validation.

Table 45-484 Indian Aadhaar Card Number wide-breadth patterns

Patterns

[2-9]\d{11}

[2-9]\d{3} \d{4} \d{4}

Table 45-485 Indian Aadhaar Card Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Indian Aadhaar Card Number medium breadth


The medium breadth detects a 12-digit number with checksum validation.

Table 45-486 Indian Aadhaar Card Number medium-breadth patterns

Patterns

[2-9]\d{11}

[2-9]\d{3} \d{4} \d{4}

Table 45-487 Indian Aadhaar Card Number medium-breadth validators

Mandatory validators Description

Exclude ending characters Data ending with any of the following list of values is not
matched:

333333333333, 666666666666, 999999999999

Number delimiter Validates a match by checking the surrounding numbers.

Verhoeff validation check Computes the checksum and validates the pattern against
it.
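The checksum validator in the table above refers to the Verhoeff algorithm. The following Python sketch shows the standard published algorithm with its three lookup tables. The function names are hypothetical, and the sketch is offered as an illustration rather than as the product's implementation. It assumes that spaces in the grouped pattern are ignored.

```python
# Standard Verhoeff tables: multiplication (D), permutation (P), inverse (INV).
D = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
    [2, 3, 4, 0, 1, 7, 8, 9, 5, 6], [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
    [4, 0, 1, 2, 3, 9, 5, 6, 7, 8], [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
    [6, 5, 9, 8, 7, 1, 0, 4, 3, 2], [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
    [8, 7, 6, 5, 9, 3, 2, 1, 0, 4], [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
]
P = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 5, 7, 6, 2, 8, 3, 0, 9, 4],
    [5, 8, 0, 3, 7, 9, 6, 1, 4, 2], [8, 9, 1, 6, 0, 4, 3, 5, 2, 7],
    [9, 4, 5, 3, 1, 2, 6, 8, 7, 0], [4, 2, 8, 6, 5, 7, 3, 9, 0, 1],
    [2, 7, 9, 3, 8, 0, 6, 4, 1, 5], [7, 0, 4, 6, 9, 1, 3, 2, 5, 8],
]
INV = [0, 4, 3, 2, 1, 5, 6, 7, 8, 9]


def verhoeff_check_digit(payload: str) -> int:
    """Compute the check digit to append to a string of digits."""
    c = 0
    for i, ch in enumerate(reversed(payload)):
        c = D[c][P[(i + 1) % 8][int(ch)]]
    return INV[c]


def verhoeff_valid(number: str) -> bool:
    """Return True if the last digit is a correct Verhoeff check digit."""
    digits = number.replace(" ", "")  # assumption: grouping spaces are ignored
    if not (digits.isascii() and digits.isdigit()):
        return False
    c = 0
    for i, ch in enumerate(reversed(digits)):
        c = D[c][P[i % 8][int(ch)]]
    return c == 0


print(verhoeff_check_digit("236"))  # textbook example: check digit 3
print(verhoeff_valid("2363"))
```

A 12-digit candidate that matches either pattern above would be passed to verhoeff_valid in the same way.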

Indian Aadhaar Card Number narrow breadth


The narrow breadth detects a 12-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-488 Indian Aadhaar Card Number narrow-breadth patterns

Patterns

[2-9]\d{11}

[2-9]\d{3} \d{4} \d{4}

Table 45-489 Indian Aadhaar Card Number narrow-breadth validators

Mandatory validators Description

Exclude ending characters Data ending with any of the following list of values is not
matched:

333333333333, 666666666666, 999999999999

Number delimiter Validates a match by checking the surrounding numbers.

Verhoeff validation check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

aadhar card no., uidai, aadhar no., Aadhar
Number, Aadhar#, Aadhar Card#

Indian Permanent Account Number


The Indian Permanent Account Number (PAN) is a unique 10-character alphanumeric identifier
issued by the Indian Income Tax Department to an individual.
The Indian Permanent Account Number data identifier detects a 10-character alphanumeric pattern that
matches the Indian Permanent Account Number format.
This data identifier provides two breadths of detection:
■ The wide breadth detects a 10-character alphanumeric pattern without checksum validation.
See “Indian Permanent Account Number wide breadth” on page 1251.
■ The narrow breadth detects a 10-character alphanumeric pattern without checksum
validation. It requires the presence of related keywords.
See “Indian Permanent Account Number narrow breadth” on page 1252.

Indian Permanent Account Number wide breadth


The wide breadth detects a 10-character alphanumeric pattern without checksum validation.

Table 45-490 Indian Permanent Account Number wide-breadth pattern

Pattern

[A-Za-z]{3}[CPHFATBLJGcphfatbljg][A-Za-z]\d{4}[A-Za-z]

Table 45-491 Indian Permanent Account Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Indian Permanent Account Number narrow breadth


The narrow breadth detects a 10-character alphanumeric pattern without checksum validation.
It requires the presence of related keywords.

Table 45-492 Indian Permanent Account Number narrow-breadth pattern

Pattern

[A-Za-z]{3}[CPHFATBLJGcphfatbljg][A-Za-z]\d{4}[A-Za-z]

Table 45-493 Indian Permanent Account Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

PAN, permanent account number, pan, pan#, PAN#,
PAN Card Number, pan card no, pancardno#, PAN
card no, pan#, PANID#

India RuPay Card Number


The India RuPay Card is a card payment system similar to MasterCard and Visa created by
the National Payments Corporation of India.
The India RuPay Card Number data identifier detects a 16-digit number that matches the
RuPay Card Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 16-digit number that matches the RuPay Card Number format
without checksum validation. It checks for common test numbers.
See “India RuPay Card Number wide breadth” on page 1253.
■ The medium breadth detects a 16-digit number that matches the RuPay Card Number
format with checksum validation.
See “India RuPay Card Number medium breadth” on page 1253.
■ The narrow breadth detects a 16-digit number that matches the RuPay Card Number format
with checksum validation. It checks for common test numbers, and also requires the
presence of related keywords.
See “India RuPay Card Number narrow breadth” on page 1254.

India RuPay Card Number wide breadth


The wide breadth detects a 16-digit number that matches the RuPay Card Number format
without checksum validation. It checks for common test numbers.

Table 45-494 India RuPay Card Number wide-breadth patterns

Pattern

508[5-9]\d\d\d\d\d\d\d\d\d\d\d\d

607[0-8]\d\d\d\d\d\d\d\d\d\d\d\d

6079[0-8]\d\d\d\d\d\d\d\d\d\d\d

6069[89]\d\d\d\d\d\d\d\d\d\d\d

6521[5-9]\d\d\d\d\d\d\d\d\d\d\d

652[2345]\d\d\d\d\d\d\d\d\d\d\d\d

6531[0-4]\d\d\d\d\d\d\d\d\d\d\d

6530\d\d\d\d\d\d\d\d\d\d\d\d

608[0123]\d\d\d\d\d\d\d\d\d\d\d\d

6950\d\d\d\d\d\d\d\d\d\d\d\d

Table 45-495 India RuPay Card Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of numbers is not all the same.

India RuPay Card Number medium breadth


The medium breadth detects a 16-digit number that matches the RuPay Card Number format
with checksum validation.

Table 45-496 India RuPay Card Number medium-breadth patterns

Pattern

508[5-9]\d\d\d\d\d\d\d\d\d\d\d\d

607[0-8]\d\d\d\d\d\d\d\d\d\d\d\d

6079[0-8]\d\d\d\d\d\d\d\d\d\d\d

6069[89]\d\d\d\d\d\d\d\d\d\d\d

6521[5-9]\d\d\d\d\d\d\d\d\d\d\d

652[2345]\d\d\d\d\d\d\d\d\d\d\d\d

6531[0-4]\d\d\d\d\d\d\d\d\d\d\d

6530\d\d\d\d\d\d\d\d\d\d\d\d

608[0123]\d\d\d\d\d\d\d\d\d\d\d\d

6950\d\d\d\d\d\d\d\d\d\d\d\d

Table 45-497 India RuPay Card Number medium-breadth validators

Mandatory validator Description

Luhn Check Computes the checksum and validates the pattern against
it.
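For readers unfamiliar with the Luhn Check, the following Python sketch shows the standard Luhn algorithm applied to a 16-digit candidate. The function name is hypothetical, and the sample card number is constructed for illustration only. The sketch is not the product's implementation.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the standard Luhn check."""
    if not (number.isascii() and number.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, ch in enumerate(reversed(number)):
        n = int(ch)
        if i % 2 == 1:
            n *= 2
            if n > 9:
                n -= 9  # same as adding the two digits of the product
        total += n
    return total % 10 == 0


# Constructed sample that matches the 607[0-8] pattern above.
print(luhn_valid("6070000000000002"))
print(luhn_valid("6070000000000003"))
```

The first sample passes because its digit sum after doubling is 10. Changing the final digit breaks the check.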

India RuPay Card Number narrow breadth


The narrow breadth detects a 16-digit number that matches the RuPay Card Number format
with checksum validation. It checks for common test numbers, and also requires the presence
of related keywords.

Table 45-498 India RuPay Card Number narrow-breadth patterns

Pattern

508[5-9]\d\d\d\d\d\d\d\d\d\d\d\d

607[0-8]\d\d\d\d\d\d\d\d\d\d\d\d

6079[0-8]\d\d\d\d\d\d\d\d\d\d\d

6069[89]\d\d\d\d\d\d\d\d\d\d\d

6521[5-9]\d\d\d\d\d\d\d\d\d\d\d

652[2345]\d\d\d\d\d\d\d\d\d\d\d\d

6531[0-4]\d\d\d\d\d\d\d\d\d\d\d

6530\d\d\d\d\d\d\d\d\d\d\d\d

608[0123]\d\d\d\d\d\d\d\d\d\d\d\d

6950\d\d\d\d\d\d\d\d\d\d\d\d

Table 45-499 India RuPay Card Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of numbers is not all the same.

Luhn Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

bank card, bankcard, card number, cc#, ccn, check
card, checkcard, credit card, credit card number,
creditcard#, debit card, debitcard, debit card number,
rupay card, rupay, rupaycard#, ccn#, debitcard#,
rupay#

Indonesian Identity Card Number


The Indonesian identity card (Kartu Tanda Penduduk, or KTP) number is used as the basis
for issuance of passport, driving license, taxpayer identification number, insurance policy,
certificate of land rights, and identity documents.
The Indonesian Identity Card Number data identifier detects a 16-digit number that matches
the Indonesian Identity Card Number format.
The Indonesian Identity Card Number system data identifier provides three breadths of
detection:
■ The wide breadth detects a 16-digit number without checksum validation.
See “Indonesian Identity Card Number wide breadth” on page 1256.
■ The medium breadth detects a 16-digit number with checksum validation.
See “Indonesian Identity Card Number medium breadth” on page 1256.
■ The narrow breadth detects a 16-digit number that passes checksum validation. It also
requires the presence of related keywords.
See “Indonesian Identity Card Number narrow breadth” on page 1256.

Indonesian Identity Card Number wide breadth


The wide breadth detects a 16-digit number without checksum validation.

Table 45-500 Indonesian Identity Card Number wide-breadth pattern

Pattern

\d{2}[01237]\d{3}[01234567]\d[01]\d{7}

Table 45-501 Indonesian Identity Card Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Indonesian Identity Card Number medium breadth


The medium breadth detects a 16-digit number with checksum validation.

Table 45-502 Indonesian Identity Card Number medium-breadth pattern

Pattern

\d{2}[01237]\d{3}[01234567]\d[01]\d{7}

Table 45-503 Indonesian Identity Card Number medium-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Indonesian Kartu Tanda Penduduk Validation Check Validator computes the checksum that every Indonesian
Kartu Tanda Penduduk must pass.

Indonesian Identity Card Number narrow breadth


The narrow breadth detects a 16-digit number that passes checksum validation. It also requires
the presence of related keywords.

Table 45-504 Indonesian Identity Card Number narrow-breadth pattern

Pattern

\d{2}[01237]\d{3}[01234567]\d[01]\d{7}

Table 45-505 Indonesian Identity Card Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Indonesian Kartu Tanda Penduduk Validation Check Validator computes the checksum that every Indonesian
Kartu Tanda Penduduk must pass.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

identity card number, Indonesian identity card no,
Indonesian identity card number, NIK, KTP, unique ID,
unique identity number, national identification number,
national identity no, identity number

kartu tanda penduduk nomor, nomor Induk
Kependudukan, tanda penduduk nomor, kartu identitas
Indonesia no, kartu identitas Indonesia nomor, nomor
identitas unik

International Mobile Equipment Identity Number


The International Mobile Station Equipment Identity (IMEI) is a unique identifier for 3GPP
(GSM, UMTS, and LTE) and iDEN mobile phones and some satellite phones.
The International Mobile Equipment Identity Number data identifier detects a 15-digit number that matches
the International Mobile Equipment Identity Number format.
The International Mobile Equipment Identity Number data identifier provides three breadths
of detection:
■ The wide breadth detects a 15-digit number with duplicate digit validation.
See “International Mobile Equipment Identity Number wide breadth” on page 1258.
■ The medium breadth detects a 15-digit number with Luhn check validation and beginning
character exclusion.
See “International Mobile Equipment Identity Number medium breadth” on page 1258.
■ The narrow breadth detects a 15-digit number with duplicate digit and Luhn check validation.
It also requires the presence of related keywords.
See “International Mobile Equipment Identity Number narrow breadth” on page 1259.

International Mobile Equipment Identity Number wide breadth


The wide breadth detects a 15-digit number with duplicate digit validation.

Table 45-506 International Mobile Equipment Identity Number wide-breadth patterns

Patterns

\d{15}

\d{2}-\d{6}-\d{6}-\d

Table 45-507 International Mobile Equipment Identity Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

International Mobile Equipment Identity Number medium breadth


The medium breadth detects a 15-digit number with Luhn check validation and beginning
character exclusion.

Table 45-508 International Mobile Equipment Identity Number medium-breadth patterns

Patterns

\d{15}

\d{2}-\d{6}-\d{6}-\d

Table 45-509 International Mobile Equipment Identity Number medium-breadth validators

Mandatory validators Description

Luhn Check Computes the Luhn checksum and validates the pattern
against it.

Number delimiter Validates a match by checking the surrounding numbers.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

000000000000000

International Mobile Equipment Identity Number narrow breadth


The narrow breadth detects a 15-digit number with duplicate digit and Luhn check validation.
It also requires the presence of related keywords.

Table 45-510 International Mobile Equipment Identity Number narrow-breadth patterns

Patterns

\d{15}

\d{2}-\d{6}-\d{6}-\d

Table 45-511 International Mobile Equipment Identity Number narrow-breadth validators

Mandatory validators Description

Luhn Check Computes the Luhn checksum and validates the pattern
against it.

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

imei, IMEI, imei no, IMEI No, IMEI Number, imei number,
International Mobile Station Equipment Identity
Number, International Mobile Station Equipment
Identity

International Securities Identification Number


An International Securities Identification Number (ISIN) is a 12-character alphanumeric pattern
that uniquely identifies a security. Securities for which ISINs are issued include bonds,
commercial paper, stocks and warrants.
The International Securities Identification Number data identifier detects a 12-character
alphanumeric pattern that matches the International Securities Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 12-character alphanumeric pattern without validation.
See “International Securities Identification Number wide breadth” on page 1260.
■ The medium breadth detects a 12-character alphanumeric pattern with checksum validation.
See “International Securities Identification Number medium breadth” on page 1260.
■ The narrow breadth detects a 12-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.
See “International Securities Identification Number narrow breadth” on page 1260.

International Securities Identification Number wide breadth


The wide breadth detects a 12-character alphanumeric pattern without validation.

Table 45-512 International Securities Identification Number wide-breadth pattern

Pattern

\l{2}\w{9}\d

The wide breadth of the International Securities Identification Number includes no validators.

International Securities Identification Number medium breadth


The medium breadth detects a 12-character alphanumeric pattern with checksum validation.

Table 45-513 International Securities Identification Number medium-breadth pattern

Pattern

\l{2}\w{9}\d

Table 45-514 International Securities Identification Number medium-breadth validator

Mandatory validator Description

International Securities Identification Number Validation Check Computes the checksum and validates the pattern
against it.
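The guide does not describe the ISIN checksum itself. The following Python sketch shows the publicly documented method: letters are expanded to numbers (A=10 through Z=35), and the Luhn algorithm is applied to the resulting digit string. The function name is hypothetical, and the sketch assumes uppercase letters and digits only in the middle nine characters.

```python
import re

# Assumed shape: two letters, nine letters or digits, one check digit.
ISIN_SHAPE = re.compile(r"[A-Z]{2}[A-Z0-9]{9}[0-9]")


def isin_checksum_valid(isin: str) -> bool:
    """Return True if the 12-character ISIN has a correct check digit."""
    isin = isin.upper()
    if not ISIN_SHAPE.fullmatch(isin):
        return False
    # Expand letters to two-digit numbers; digits are kept unchanged.
    expanded = "".join(str(int(ch, 36)) for ch in isin)
    total = 0
    for i, ch in enumerate(reversed(expanded)):
        n = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0


print(isin_checksum_valid("US0378331005"))  # widely published sample ISIN
print(isin_checksum_valid("US0378331006"))  # check digit altered
```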

International Securities Identification Number narrow breadth


The narrow breadth detects a 12-character alphanumeric pattern with checksum validation. It
also requires the presence of related keywords.

Table 45-515 International Securities Identification Number narrow-breadth pattern

Pattern

\l{2}\w{9}\d

Table 45-516 International Securities Identification Number narrow-breadth validators

Mandatory validators Description

International Securities Identification Number Validation Check Computes the checksum and validates the pattern
against it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

isin, i.s.i.n, International Securities Identification
Number, Standard & Poor's, S&P, National Numbering
Association, NNA ID, ID number, identification number,
Id no., international securities ID no., International
securities ID number

IP Address
An IP address is the computer networking code that is used to identify devices and facilitate
communications.
The IP Address data identifier detects IPv4 addresses.
This data identifier offers three breadths of detection:
■ The wide breadth detects IP addresses and validates their format.
See “IP Address wide breadth” on page 1261.
■ The medium breadth detects IP addresses, validates their format, and eliminates fictitious
addresses.
See “IP Address medium breadth” on page 1262.
■ The narrow breadth detects IP addresses, validates their format, and eliminates fictitious
and unassigned addresses.
See “IP Address narrow breadth” on page 1263.

IP Address wide breadth


The wide breadth of the IP Address data identifier detects numbers in format
DDD.DDD.DDD.DDD with an optional /DD. Each three-digit group must be between 0 and
255 inclusive and the /DD must be between 0 and 32. Additionally, 0.0.0.0 is not allowed.

Table 45-517 IP Address wide-breadth patterns

Patterns

\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}

\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}/[0-9]

\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}/[1-2][0-9]?

\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}/[3][0-2]?

Table 45-518 IP Address wide-breadth validator

Mandatory validator Description

IP Basic Check Every IP address must match the format x.x.x.x and every
number must be less than 256.

IP Address medium breadth


The medium breadth of the IP Address data identifier detects numbers in format
DDD.DDD.DDD.DDD with an optional /DD. Each three-digit group must be between 0 and
255 inclusive and the /DD must be between 0 and 32. Additionally, 0.0.0.0 is not allowed. This breadth
also eliminates, as common fictitious examples, addresses in which every group is a single digit, such as 1.1.1.2.

Table 45-519 IP Address medium-breadth patterns

Patterns

\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}

\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}/[0-9]

\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}/[1-2][0-9]?

\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}/[3][0-2]?

Table 45-520 IP Address medium-breadth validator

Mandatory Validator Description

IP Octet Check Every IP address must match the format x.x.x.x, every number must be less than 256,
and no IP address can contain only single-digit numbers (1.1.1.2).

IP Address narrow breadth


The narrow breadth of the IP Address data identifier detects numbers in format
DDD.DDD.DDD.DDD with an optional /DD. Each three-digit group must be between 0 and
255 inclusive and the /DD must be between 0 and 32. Additionally, 0.0.0.0 is not allowed. This breadth
also eliminates, as common fictitious examples, addresses in which every group is a single digit, such as
1.1.1.2, as well as unassigned IP addresses ("bogons").

Table 45-521 IP Address narrow-breadth patterns

Patterns

\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}

\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}/[0-9]

\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}/[1-2][0-9]?

\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}/[3][0-2]?

Table 45-522 IP Address narrow-breadth validators

Mandatory Validators Description

IP Octet Check Every IP address must match the format x.x.x.x, every number must be less than 256,
and no IP address can contain only single-digit numbers (1.1.1.2).

IP Reserved Range Check Checks whether the IP address falls into any of the "Bogons" ranges. If so, the match
is invalid.
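
The guide does not list the ranges that the IP Reserved Range Check treats as bogons. The following Python sketch approximates the idea with the standard `ipaddress` module, treating any address that is not globally routable as a bogon. Both that approximation and the function name are assumptions; the product's range list may differ.

```python
import ipaddress


def ip_reserved_range_check(text: str) -> bool:
    """Return True if the address is outside the ranges treated as bogons here."""
    try:
        # Drop an optional /DD suffix before parsing the address itself.
        addr = ipaddress.IPv4Address(text.split("/")[0])
    except ipaddress.AddressValueError:
        return False
    # Assumption: "not globally routable" stands in for the bogon list.
    return addr.is_global
```

With this stand-in, private and loopback addresses such as 10.0.0.1 and 127.0.0.1 are rejected, and a public address such as 8.8.8.8 is accepted.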

IPv6 Address
Internet Protocol version 6 (IPv6) is the latest version of the Internet Protocol (IP), the
communications protocol that provides an identification and location system for computers on
networks and routes traffic across the Internet.
The IPv6 Address data identifier detects IPv6 addresses.
This data identifier offers three breadths of detection:
■ The wide breadth detects IPv6 addresses and validates their format.
See “IPv6 Address wide breadth” on page 1264.
■ The medium breadth detects IPv6 addresses and validates their format. It also validates
that they do not begin with the numeral 0.
See “IPv6 Address medium breadth” on page 1264.
■ The narrow breadth detects IPv6 addresses and validates their format. It also validates
that they do not begin with the numeral 0 and that address strings are fully compressed.
Matches are not normalized.
See “IPv6 Address narrow breadth” on page 1265.

IPv6 Address wide breadth


The wide breadth detects IPv6 addresses and validates that they match the format
xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx.

Table 45-523 IPv6 Address wide-breadth patterns

Patterns

[0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

Pattern continues to 44 repetitions.

Table 45-524 IPv6 Address wide-breadth validator

Validator Description

IPv6 Address Basic Validation Check Checks every IPv6 address and verifies that it matches
the xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx format.

IPv6 Address medium breadth


The medium breadth detects IPv6 addresses and validates that they match the format
xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx. It also validates that they do not begin with the
numeral 0.

Table 45-525 IPv6 Address medium-breadth patterns

Patterns

[0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

Pattern continues to 44 repetitions.

Table 45-526 IPv6 Address medium-breadth validator

Mandatory Validator Description

IPv6 Address Medium Validation Check Checks every IPv6 address and verifies that it matches the
xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx format and does not start with the numeral 0.

IPv6 Address narrow breadth


The narrow breadth detects IPv6 addresses and validates that they match the format
xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx. It also validates that they do not begin with the
numeral 0 and that address strings are fully compressed. Matches are not normalized.

Table 45-527 IPv6 Address narrow-breadth patterns

Patterns

[0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

[0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%][0-9A-Fa-f:./%]

Pattern continues to 44 repetitions.



Table 45-528 IPv6 Address narrow-breadth validator

Mandatory Validator Description

IPv6 Address Reserved Validation Check Checks every IPv6 address and verifies that it matches the
xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx format, does not start with the numeral 0, and
is fully compressed.

Table 45-529 IPv6 Address narrow-breadth normalizer

Normalizer Description

Noop (No operation) String is passed as it is without normalizing.
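
The following Python sketch shows one way to approximate the three IPv6 breadths with the standard `ipaddress` module. It is an illustration under stated assumptions, not the product's validators: the parser also accepts shortened forms such as `::1`, "fully compressed" is interpreted here as "identical to the canonical compressed spelling", and prefix lengths or zone suffixes are not handled. The function name is hypothetical.

```python
import ipaddress


def ipv6_breadth_checks(text: str) -> dict:
    """Report which of the three breadth-style checks a candidate passes."""
    try:
        addr = ipaddress.IPv6Address(text)
    except ipaddress.AddressValueError:
        return {"wide": False, "medium": False, "narrow": False}
    wide = True                                  # parses as an IPv6 address
    medium = wide and not text.startswith("0")   # must not begin with the numeral 0
    # Assumption: fully compressed means equal to the canonical compressed form.
    narrow = medium and text.lower() == addr.compressed
    return {"wide": wide, "medium": medium, "narrow": narrow}
```

For example, `2001:db8::1` passes all three checks in this sketch, while `2001:0db8:0000:0000:0000:0000:0000:0001` passes only the wide and medium checks because it is not compressed.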

Ireland Passport Number


An Irish passport is the passport issued to citizens of Ireland. An Irish passport enables the
bearer to travel internationally and serves as evidence of Irish citizenship and citizenship of
the European Union. It also facilitates access to consular assistance from both Irish
embassies and any embassy of another European Union member state while abroad.
The Ireland Passport Number data identifier detects a seven- or nine-character alphanumeric
pattern that matches the Ireland Passport Number format.
The Ireland Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects a seven- or nine-character alphanumeric pattern without checksum
validation.
See “Ireland Passport Number wide breadth” on page 1266.
■ The narrow breadth detects a seven- or nine-character alphanumeric pattern without
checksum validation. It requires the presence of related keywords.
See “Ireland Passport Number narrow breadth” on page 1267.

Ireland Passport Number wide breadth


The wide breadth detects a seven- or nine-character alphanumeric pattern without checksum
validation.

Table 45-530 Ireland Passport Number wide-breadth patterns

Patterns

[a-zA-Z]{2}\d{7}

[a-zA-Z]\d{6}

[a-zA-Z]\d{8}

Table 45-531 Ireland Passport Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Ireland Passport Number narrow breadth


The narrow breadth detects a seven- or nine-character alphanumeric pattern without checksum
validation. It requires the presence of related keywords.

Table 45-532 Ireland Passport Number narrow-breadth patterns

Patterns

[a-zA-Z]{2}\d{7}

[a-zA-Z]\d{6}

[a-zA-Z]\d{8}

Table 45-533 Ireland Passport Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

passport number, passport, passport no, pas, passeport,
ireland passport, irelande passeport, Éire pas, no de
passeport, pas uimh, uimhir pas, numéro de passeport
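
The following Python sketch shows how the narrow breadth combines a pattern match, a delimiter check, and a keyword requirement. It is an illustration only. The lookarounds that stand in for the Number delimiter validator, the function name, and the choice of keyword subset are assumptions, and keyword proximity is not modeled.

```python
import re

# The three listed patterns, wrapped in lookarounds that stand in for the
# Number delimiter validator (assumption: no letter or digit may touch the match).
PASSPORT = re.compile(
    r"(?<![A-Za-z0-9])(?:[a-zA-Z]{2}\d{7}|[a-zA-Z]\d{6}|[a-zA-Z]\d{8})(?![A-Za-z0-9])"
)

# A subset of the keyword inputs listed above.
KEYWORDS = ("passport number", "passport no", "passport", "passeport", "uimhir pas")


def find_passport_numbers(text: str, require_keyword: bool = True) -> list:
    """Return candidate matches; with require_keyword, mimic the narrow breadth."""
    if require_keyword and not any(k in text.lower() for k in KEYWORDS):
        return []
    return PASSPORT.findall(text)
```

Calling the function with `require_keyword=False` corresponds roughly to the wide breadth, which applies the patterns and the delimiter check without keywords.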

Ireland Tax Identification Number


The Ireland Tax Identification Number is issued by the Department of Social Protection for
natural persons and by the Revenue Commissioners for non-natural persons. Non-natural persons
can be companies, partnerships, trusts, and unincorporated bodies.
The Ireland Tax Identification Number data identifier detects a six- to nine-character
alphanumeric pattern that matches the Ireland Tax Identification Number format.
The Ireland Tax Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects a six- to nine-character alphanumeric pattern without checksum
validation.
See “Ireland Tax Identification Number wide breadth” on page 1268.
■ The medium breadth detects a six- to nine-character alphanumeric pattern with checksum
validation.
See “Ireland Tax Identification Number medium breadth” on page 1269.
■ The narrow breadth detects a six- to nine-character alphanumeric pattern with checksum
validation. It also requires the presence of related keywords.
See “Ireland Tax Identification Number narrow breadth” on page 1270.

Ireland Tax Identification Number wide breadth


The wide breadth detects a six- to nine-character alphanumeric pattern without checksum
validation.

Table 45-534 Ireland Tax Identification Number wide-breadth patterns

Patterns

\d{7}[A-Wa-w]

\d{7} [A-Wa-w]

\d{3} \d{2} \d{2}[A-Wa-w]

\d{3} \d{2} \d{2} [A-Wa-w]

\d{7}[A-Wa-w][A-Ia-iWw]

\d{7} [A-Wa-w][A-Ia-iWw]

\d{3} \d{2} \d{2}[A-Wa-w][A-Ia-iWw]

\d{3} \d{2} \d{2} [A-Wa-w][A-Ia-iWw]

\d{3} \d{2} \d{2} [A-Wa-w] [A-Ia-iWw]



[Cc][Hh][Yy]\d{3}

[Cc][Hh][Yy] \d{3}

[Cc][Hh][Yy]\d{4}

[Cc][Hh][Yy] \d{4}

[Cc][Hh][Yy]\d{5}

[Cc][Hh][Yy] \d{5}

Table 45-535 Ireland Tax Identification Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Ireland Tax Identification Number medium breadth


The medium breadth detects a six- to nine-character alphanumeric pattern with checksum
validation.

Table 45-536 Ireland Tax Identification Number medium-breadth patterns

Patterns

\d{7}[A-Wa-w]

\d{7} [A-Wa-w]

\d{3} \d{2} \d{2}[A-Wa-w]

\d{3} \d{2} \d{2} [A-Wa-w]

\d{7}[A-Wa-w][A-Ia-iWw]

\d{7} [A-Wa-w][A-Ia-iWw]

\d{3} \d{2} \d{2}[A-Wa-w][A-Ia-iWw]

\d{3} \d{2} \d{2} [A-Wa-w][A-Ia-iWw]

\d{3} \d{2} \d{2} [A-Wa-w] [A-Ia-iWw]

[Cc][Hh][Yy]\d{3}

[Cc][Hh][Yy] \d{3}

[Cc][Hh][Yy]\d{4}

[Cc][Hh][Yy] \d{4}

[Cc][Hh][Yy]\d{5}

[Cc][Hh][Yy] \d{5}

Table 45-537 Ireland Tax Identification Number medium-breadth validator

Mandatory validator Description

Ireland Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.
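
The guide does not describe how the checksum is computed. The following Python sketch applies the publicly documented modulus-23 check character calculation used for Irish PPS-style numbers (seven digits followed by one or two letters). It is an assumption that the product's validator follows the same scheme. The CHY forms are not covered by this sketch, and the function name is hypothetical.

```python
import re

TIN = re.compile(r"^(\d{7})([A-Wa-w])([A-Ia-iWw])?$")
CHECK_LETTERS = "WABCDEFGHIJKLMNOPQRSTUV"  # index = weighted sum modulo 23


def irish_tin_checksum_ok(candidate: str) -> bool:
    """Apply the modulus-23 check to a 7-digit number with one or two letters."""
    m = TIN.match(candidate.replace(" ", ""))
    if m is None:
        return False
    digits, check, second = m.groups()
    # Weights 8, 7, 6, 5, 4, 3, 2 for the seven digits, left to right.
    total = sum(int(d) * w for d, w in zip(digits, range(8, 1, -1)))
    if second and second.upper() != "W":
        # A=1 ... I=9, weighted by 9; W counts as 0.
        total += 9 * (ord(second.upper()) - ord("A") + 1)
    return CHECK_LETTERS[total % 23] == check.upper()
```

Spaces are removed before the check, so the spaced variants in the pattern tables are handled the same way as the unspaced ones.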

Ireland Tax Identification Number narrow breadth


The narrow breadth detects a six- to nine-character alphanumeric pattern with checksum
validation. It also requires the presence of related keywords.

Table 45-538 Ireland Tax Identification Number narrow-breadth patterns

Patterns

\d{7}[A-Wa-w]

\d{7} [A-Wa-w]

\d{3} \d{2} \d{2}[A-Wa-w]

\d{3} \d{2} \d{2} [A-Wa-w]

\d{7}[A-Wa-w][A-Ia-iWw]

\d{7} [A-Wa-w][A-Ia-iWw]

\d{3} \d{2} \d{2}[A-Wa-w][A-Ia-iWw]

\d{3} \d{2} \d{2} [A-Wa-w][A-Ia-iWw]

\d{3} \d{2} \d{2} [A-Wa-w] [A-Ia-iWw]

[Cc][Hh][Yy]\d{3}

[Cc][Hh][Yy] \d{3}

[Cc][Hh][Yy]\d{4}

[Cc][Hh][Yy] \d{4}

[Cc][Hh][Yy]\d{5}

[Cc][Hh][Yy] \d{5}

Table 45-539 Ireland Tax Identification Number narrow-breadth validators

Mandatory validators Description

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

CHY, charity number, charity registration number, CHY
number, CHY#, CHY no., CHY no, TRN, TRN#, tax
reference number, ireland tax identification number,
irish tax identification, tax identification number, tax
id, taxid, taxid#, tax number, tax no, taxno#, tax#, TIN,
TIN#, ireland tin, tax id no, tax id no.

uimhir carthanachta, Uimhir chláraithe charthanais,
uimhir CHY, CHY uimh., uimhir thagartha cánach,
uimhir aitheantais cánach ireland, aitheantais cánach
irish, uimhir aitheantais cánach, id cánach, uimhir
chánach, cáin #, STÁIN, cáin id uimh.

Number delimiter Validates a match by checking the surrounding characters.

Ireland Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Ireland Value Added Tax (VAT) Number


VAT is a consumption tax that is borne by the end consumer. VAT is paid for each transaction
in the manufacturing and distribution process. For Ireland, the VAT number is issued by the
Irish tax authority.

The Ireland Value Added Tax (VAT) Number data identifier detects a 9- to 11-character
alphanumeric pattern that matches the Ireland Value Added Tax (VAT) Number format.
The Ireland Value Added Tax (VAT) Number data identifier provides three breadths of detection:
■ The wide breadth detects a 9- to 11-character alphanumeric pattern without checksum
validation.
See “Ireland Value Added Tax (VAT) Number wide breadth” on page 1272.
■ The medium breadth detects a 9- to 11-character alphanumeric pattern with checksum
validation.
See “Ireland Value Added Tax (VAT) Number medium breadth” on page 1273.
■ The narrow breadth detects a 9- to 11-character alphanumeric pattern with checksum
validation. It also requires the presence of related keywords.
See “Ireland Value Added Tax (VAT) Number narrow breadth” on page 1273.

Ireland Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 9- to 11-character alphanumeric pattern without checksum
validation.

Table 45-540 Ireland Value Added Tax (VAT) Number wide-breadth patterns

Patterns

[Ii][Ee]\d{7}[A-Wa-w]

[Ii][Ee] \d{7}[A-Wa-w]

[Ii][Ee] \d{7} [A-Wa-w]

[Ii][Ee]\d{7}[A-Wa-w][HhAa]

[Ii][Ee] \d{7}[A-Wa-w][HhAa]

[Ii][Ee] \d{7} [A-Wa-w][HhAa]

[Ii][Ee][0-9][A-Za-z+*]\d{5}[A-Wa-w]

[Ii][Ee] [0-9][A-Za-z+*]\d{5}[A-Wa-w]

[Ii][Ee] [0-9] [A-Za-z+*]\d{5}[A-Wa-w]

Table 45-541 Ireland Value Added Tax (VAT) Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.



Ireland Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 9- to 11-character alphanumeric pattern with checksum
validation.

Table 45-542 Ireland Value Added Tax (VAT) Number medium-breadth patterns

Patterns

[Ii][Ee]\d{7}[A-Wa-w]

[Ii][Ee] \d{7}[A-Wa-w]

[Ii][Ee] \d{7} [A-Wa-w]

[Ii][Ee]\d{7}[A-Wa-w][HhAa]

[Ii][Ee] \d{7}[A-Wa-w][HhAa]

[Ii][Ee] \d{7} [A-Wa-w][HhAa]

[Ii][Ee][0-9][A-Za-z+*]\d{5}[A-Wa-w]

[Ii][Ee] [0-9][A-Za-z+*]\d{5}[A-Wa-w]

[Ii][Ee] [0-9] [A-Za-z+*]\d{5}[A-Wa-w]

Table 45-543 Ireland Value Added Tax (VAT) Number medium-breadth validator

Mandatory validator Description

Ireland VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Ireland Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 9- to 11-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.

Table 45-544 Ireland Value Added Tax (VAT) Number narrow-breadth patterns

Patterns

[Ii][Ee]\d{7}[A-Wa-w]

[Ii][Ee] \d{7}[A-Wa-w]

[Ii][Ee] \d{7} [A-Wa-w]



[Ii][Ee]\d{7}[A-Wa-w][HhAa]

[Ii][Ee] \d{7}[A-Wa-w][HhAa]

[Ii][Ee] \d{7} [A-Wa-w][HhAa]

[Ii][Ee][0-9][A-Za-z+*]\d{5}[A-Wa-w]

[Ii][Ee] [0-9][A-Za-z+*]\d{5}[A-Wa-w]

[Ii][Ee] [0-9] [A-Za-z+*]\d{5}[A-Wa-w]

Table 45-545 Ireland Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validators Description

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

ireland vat number, vat number, vat no, VAT#, VAT,
value added tax number, value added tax, irish vat

cáin bhreisluacha, CBL, CBL aon, Uimhir CBL, Uimhir
CBL hÉireann, bhreisluacha uimhir chánach

Number delimiter Validates a match by checking the surrounding characters.

Ireland VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Irish Personal Public Service Number


The Irish Personal Public Service Number is a unique eight-character alphanumeric identifier
ending with a letter, such as 8765432A. The number is assigned when a child's birth is
registered, is issued on a Public Services Card, and is unique to every person.
The Irish Personal Public Service Number data identifier detects an eight-character
alphanumeric pattern that matches the Irish Personal Public Service Number format.
The Irish Personal Public Service Number system data identifier provides three breadths of
detection:
■ The wide breadth detects an eight-character alphanumeric pattern ending with a letter
without checksum validation.
See “Irish Personal Public Service Number wide breadth” on page 1275.
■ The medium breadth detects an eight-character alphanumeric pattern ending with a letter
with checksum validation.
See “Irish Personal Public Service Number medium breadth” on page 1275.
■ The narrow breadth detects an eight-character alphanumeric pattern ending with a letter
that passes checksum validation. It also requires the presence of related keywords.
See “Irish Personal Public Service Number narrow breadth” on page 1276.

Irish Personal Public Service Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern ending with a letter without
checksum validation.

Table 45-546 Irish Personal Public Service Number wide-breadth pattern

Pattern

\d{7}[a-wA-W]

Table 45-547 Irish Personal Public Service Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.
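
The following Python sketch illustrates the Duplicate digits validator applied after the wide-breadth pattern. It is a minimal illustration, not the product's code, and the function names are hypothetical.

```python
import re

# The wide-breadth pattern exactly as listed in the table above.
PPSN = re.compile(r"\d{7}[a-wA-W]")


def passes_duplicate_digits(match: str) -> bool:
    """Reject a match whose digits are all the same, such as 1111111A."""
    digits = [c for c in match if c.isdigit()]
    return len(set(digits)) > 1


def wide_breadth_matches(text: str) -> list:
    """Pattern match followed by the duplicate-digits filter."""
    return [m for m in PPSN.findall(text) if passes_duplicate_digits(m)]
```

In this sketch, 8765432A is kept because its digits differ, and 1111111A is discarded.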

Irish Personal Public Service Number medium breadth


The medium breadth detects an eight-character alphanumeric pattern ending with a letter with
checksum validation.

Table 45-548 Irish Personal Public Service Number medium-breadth pattern

Pattern

\d{7}[a-wA-W]

Table 45-549 Irish Personal Public Service Number medium-breadth validator

Mandatory validator Description

Irish Personal Public Service Number Validation Check Computes the checksum and validates the pattern against
it.

Irish Personal Public Service Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern ending with a letter with
checksum validation. It also requires the presence of related keywords.

Table 45-550 Irish Personal Public Service Number narrow-breadth pattern

Pattern

\d{7}[a-wA-W]

Table 45-551 Irish Personal Public Service Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Irish Personal Public Service Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

public service no, personal public service no, pps no,
PPS No, personal service no, PPS service no, ppsno#,
Irish PPS No, Irish pps no, PPSNO#, publicserviceno#,
personal public service number

uimhir phearsanta seirbhíse poiblí, pps uimh, Uimhir
aitheantais phearsanta

Israel Personal Identification Number


The Israel Personal Identification Number is a nine-digit number issued to all Israeli citizens
at birth by the Ministry of the Interior. Personal identification numbers are also issued to all
residents over 16 years old who have legal temporary or permanent residence status.
The Israel Personal Identification Number data identifier detects a nine-digit number that
matches the Israel Personal Identification Number format.
The Israel Personal Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects a nine-digit number without checksum validation.
See “Israel Personal Identification Number wide breadth” on page 1277.
■ The medium breadth detects a nine-digit number with checksum validation.
See “Israel Personal Identification Number medium breadth” on page 1277.
■ The narrow breadth detects a nine-digit number with checksum validation. It also requires
the presence of related keywords.
See “Israel Personal Identification Number narrow breadth” on page 1277.

Israel Personal Identification Number wide breadth


The wide breadth detects a nine-digit number without checksum validation.

Table 45-552 Israel Personal Identification Number wide-breadth pattern

Pattern

\d{9}

Table 45-553 Israel Personal Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Israel Personal Identification Number medium breadth


The medium breadth detects a nine-digit number with checksum validation.

Table 45-554 Israel Personal Identification Number medium-breadth pattern

Pattern

\d{9}

Table 45-555 Israel Personal Identification Number medium-breadth validators

Mandatory validator Description

Israeli Identity Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding numbers.
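
The guide does not spell out the checksum. The following Python sketch uses the widely documented Luhn-style calculation for Israeli identity numbers: digits are weighted 1, 2, 1, 2, and so on from the left, two-digit products are reduced by summing their digits, and the total must be divisible by 10. It is an assumption that the product's validator uses this calculation. The function name is hypothetical.

```python
def israeli_id_checksum_ok(candidate: str) -> bool:
    """Luhn-style check over nine digits, weights 1,2,1,2,... from the left."""
    if len(candidate) != 9 or not (candidate.isascii() and candidate.isdigit()):
        return False
    total = 0
    for index, char in enumerate(candidate):
        product = int(char) * (1 if index % 2 == 0 else 2)
        # A two-digit product contributes the sum of its digits (product - 9).
        total += product if product < 10 else product - 9
    return total % 10 == 0
```

A string of nine zeros satisfies this arithmetic, which suggests why a separate Duplicate digits validator is useful alongside a checksum.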

Israel Personal Identification Number narrow breadth


The narrow breadth detects a nine-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-556 Israel Personal Identification Number narrow-breadth pattern

Pattern

\d{9}

Table 45-557 Israel Personal Identification Number narrow-breadth validators

Mandatory validators Description

Israel Personal Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

identity number, IDnumber#, israeliidentitynumber,
identitynumber#, identity no, Israeli identity number,
unique personal ID, personal ID, unique ID, unique
identity number

‫هو ية اسرائيل ية‬, ‫זהותישר אלית‬,‫מספר זיהוי ישר אלי‬,‫מספר זיה וי‬
‫عدد هوية فريدة من نوعها‬,‫رقم الهوية‬,‫هوية إسرائ يلية‬,‫عدد‬

Italy Driver's Licence Number


The Italy Driver's Licence Number is the identifier for an individual driver's license issued by
the Driver and Vehicle Licensing Agency of Italy.
The Italy Driver's Licence Number data identifier detects a 10-character alphanumeric pattern
that matches the Italy Driver's Licence Number format.
The Italy Driver's Licence Number data identifier provides two breadths of detection:
■ The wide breadth detects a 10-character alphanumeric pattern without checksum validation.
See “Italy Driver's Licence Number wide breadth” on page 1279.
■ The narrow breadth detects a 10-character alphanumeric pattern without checksum
validation. It also requires the presence of related keywords.
See “Italy Driver's Licence Number narrow breadth” on page 1279.

Italy Driver's Licence Number wide breadth


The wide breadth detects a 10-character alphanumeric pattern without checksum validation.

Table 45-558 Italy Driver's Licence Number wide-breadth pattern

Pattern

[A-Za-z][A-Za-z]\d{7}[A-Za-z]

Table 45-559 Italy Driver's Licence Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Italy Driver's Licence Number narrow breadth


The narrow breadth detects a 10-character alphanumeric pattern without checksum validation.
It also requires the presence of related keywords.

Table 45-560 Italy Driver's Licence Number narrow-breadth patterns

Pattern

[A-Za-z][A-Za-z]\d{7}[A-Za-z]

Table 45-561 Italy Driver's Licence Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

drivers licence number, drivers license number, driving
licence number, driving license number, drivers
license, driving licence, driving license

patente guida numero, patente di guida numero,
patente di guida, patente guida

Driver's License, Driver's License Number, driver's
license number, Driver's Licence Number

Italy Health Insurance Number


The Italian Health Insurance Card is issued to every Italian citizen by the Italian Ministry of
Economy and Finance in cooperation with the Italian Agency of Revenue. The objective of the
card is to improve social security services through expenditure control and performance,
and to optimize citizens' use of health services.
The Italy Health Insurance Number data identifier detects a 16-character alphanumeric pattern
that matches the Italy Health Insurance Number format.
The Italy Health Insurance Number data identifier provides two breadths of detection:
■ The wide breadth detects a 16-character alphanumeric pattern without checksum validation.
It also requires the presence of related keywords.
See “Italy Health Insurance Number wide breadth” on page 1280.
■ The narrow breadth detects a 16-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.
See “Italy Health Insurance Number narrow breadth” on page 1281.

Italy Health Insurance Number wide breadth


The wide breadth detects a 16-character alphanumeric pattern without checksum validation.
It also requires the presence of related keywords.

Table 45-562 Italy Health Insurance Number wide-breadth pattern

Pattern

[A-Z]{6}[0-9LMNPQRSTUV]{2}[ABCDEHLMPRST][0-9LMNPQRSTUV]{2}[A-Z][0-9LMNPQRSTUV]{3}[A-Z]

[A-Z]{3} [A-Z]{3} [0-9LMNPQRSTUV]{2}[ABCDEHLMPRST][0-9LMNPQRSTUV]{2} [A-Z][0-9LMNPQRSTUV]{3}[A-Z]

Table 45-563 Italy Health Insurance Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding numbers.



Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

TESSERA SANITARIA, tessera sanitaria, tessera
sanitaria italiana, Health Insurance Card, Italian health
insurance card, health insurance card, EHIC, health
card, ehic, Health Card

Italy Health Insurance Number narrow breadth


The narrow breadth detects a 16-character alphanumeric pattern with checksum validation. It
also requires the presence of related keywords.

Table 45-564 Italy Health Insurance Number narrow-breadth patterns

Pattern

[A-Z]{6}[0-9LMNPQRSTUV]{2}[ABCDEHLMPRST][0-9LMNPQRSTUV]{2}[A-Z][0-9LMNPQRSTUV]{3}[A-Z]

[A-Z]{3} [A-Z]{3} [0-9LMNPQRSTUV]{2}[ABCDEHLMPRST][0-9LMNPQRSTUV]{2} [A-Z][0-9LMNPQRSTUV]{3}[A-Z]

Table 45-565 Italy Health Insurance Number narrow-breadth validators

Mandatory validator Description

Codice Fiscale Control Key Check Computes the control key and checks if it is valid.

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

TESSERA SANITARIA, tessera sanitaria, tessera
sanitaria italiana, Health Insurance Card, Italian health
insurance card, health insurance card, EHIC, health
card, ehic, Health Card
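
The following Python sketch computes the control character by using the publicly documented Codice Fiscale scheme: characters in odd positions are converted through a lookup table, characters in even positions take their plain value (0 to 9 for digits, A=0 to Z=25 for letters), and the sum modulo 26 selects the final letter. It is an assumption that the Codice Fiscale Control Key Check validator follows this scheme. The function and variable names are hypothetical.

```python
import string

# Values for characters in odd positions (1st, 3rd, ...), indexed A..Z.
# A digit d in an odd position takes the value listed at index d.
ODD_VALUES = [1, 0, 5, 7, 9, 13, 15, 17, 19, 21, 2, 4, 18, 20,
              11, 3, 6, 8, 12, 14, 16, 10, 22, 25, 24, 23]
ALLOWED = string.ascii_uppercase + string.digits


def _index(char: str) -> int:
    """Map 0-9 to 0..9 and A-Z to 0..25."""
    return int(char) if char.isdigit() else ord(char) - ord("A")


def control_key_ok(code: str) -> bool:
    """Check the 16th character of a code against the first fifteen."""
    code = code.replace(" ", "").upper()
    if len(code) != 16 or not all(c in ALLOWED for c in code):
        return False
    total = 0
    for position, char in enumerate(code[:15], start=1):
        idx = _index(char)
        total += ODD_VALUES[idx] if position % 2 == 1 else idx
    return code[15] == string.ascii_uppercase[total % 26]
```

Spaces are stripped first, so a code written in the spaced form of the second pattern is checked in the same way as the unspaced form.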

Italy Passport Number


Italian passports are issued to Italian citizens for the purpose of international travel.
The Italy Passport Number data identifier detects a nine-character alphanumeric pattern that
matches the Italy Passport Number format.
The Italy Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects a nine-character alphanumeric pattern without checksum
validation.
See “Italy Passport Number wide breadth” on page 1282.
■ The narrow breadth detects a nine-character alphanumeric pattern without checksum
validation. It also requires the presence of related keywords.
See “Italy Passport Number narrow breadth” on page 1282.

Italy Passport Number wide breadth


The wide breadth detects a nine-character alphanumeric pattern without checksum validation.

Table 45-566 Italy Passport Number wide-breadth pattern

Pattern

\l{2}\d{7}

Table 45-567 Italy Passport Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding numbers.

Italy Passport Number narrow breadth


The narrow breadth detects a nine-character alphanumeric pattern without checksum validation.
It also requires the presence of related keywords.

Table 45-568 Italy Passport Number narrow-breadth patterns

Pattern

\l{2}\d{7}

Table 45-569 Italy Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Repubblica Italiana Passaporto, Passaporto,
Passaporto Italiana, passport number, Italiana
Passaporto numero, Passaporto numero, Numéro
passeport italien, numéro passeport, Italian passport
number

Italy Value Added Tax (VAT) Number


Value-Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. For Italy, the VAT
number is issued by the VAT office for the region in which the business is established.
The Italy Value Added Tax (VAT) Number data identifier detects a 13-character alphanumeric
pattern that matches the Italy Value Added Tax (VAT) Number format.
The Italy Value Added Tax (VAT) Number data identifier provides three breadths of detection:
■ The wide breadth detects a 13-character alphanumeric pattern preceded by IT, without
checksum validation.
See “Italy Value Added Tax (VAT) Number wide breadth” on page 1283.
■ The medium breadth detects a 13-character alphanumeric pattern preceded by IT, with
checksum validation.
See “Italy Value Added Tax (VAT) Number medium breadth” on page 1284.
■ The narrow breadth detects a 13-character alphanumeric pattern preceded by IT, with
checksum validation. It also requires the presence of related keywords.
See “Italy Value Added Tax (VAT) Number narrow breadth” on page 1285.

Italy Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 13-character alphanumeric pattern preceded by IT, without
checksum validation.

Table 45-570 Italy Value Added Tax (VAT) Number wide-breadth pattern

Pattern

[Ii][Tt]\d{11}

[Ii][Tt] \d{11}

[Ii][Tt].\d{11}

[Ii][Tt]-\d{11}

[Ii][Tt],\d{11}

Table 45-571 Italy Value Added Tax (VAT) Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding numbers.

Italy Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 13-character alphanumeric pattern preceded by IT, with
checksum validation.

Table 45-572 Italy Value Added Tax (VAT) Number medium-breadth patterns

Patterns

[Ii][Tt]\d{11}

[Ii][Tt] \d{11}

[Ii][Tt].\d{11}

[Ii][Tt]-\d{11}

[Ii][Tt],\d{11}

Table 45-573 Italy Value Added Tax (VAT) Number medium-breadth validator

Mandatory validator Description

Italy VAT Number Validation Check Checksum validator for the Italy Value Added Tax
(VAT) Number.
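
The guide does not describe the checksum. The following Python sketch applies the Luhn calculation that is publicly documented for the 11-digit Italian VAT number (Partita IVA): every digit in an even position is doubled, 9 is subtracted from any result above 9, and the total must be divisible by 10. It is an assumption that the product's validator uses this calculation. The sketch also treats the delimiter after IT as a literal space, period, hyphen, or comma, and the function name is hypothetical.

```python
import re

# IT, an optional single delimiter from the listed patterns, then 11 digits.
IT_VAT = re.compile(r"^[Ii][Tt][ .,-]?(\d{11})$")


def italy_vat_checksum_ok(candidate: str) -> bool:
    """Luhn check over the 11 digits that follow the IT prefix."""
    m = IT_VAT.match(candidate)
    if m is None:
        return False
    total = 0
    for position, char in enumerate(m.group(1), start=1):
        value = int(char)
        if position % 2 == 0:      # double every even-position digit
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0
```

The IT prefix is required in this sketch because every listed pattern begins with it.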

Italy Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 13-character alphanumeric pattern preceded by IT, with
checksum validation. It also requires the presence of related keywords.

Table 45-574 Italy Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Ii][Tt]\d{11}

[Ii][Tt] \d{11}

[Ii][Tt].\d{11}

[Ii][Tt]-\d{11}

[Ii][Tt],\d{11}

Table 45-575 Italy Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding numbers.

Italy VAT Number Validation Check Checksum validator for the Italy Value Added Tax (VAT)
Number.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

VAT Number, vat no, VAT#, IVA, numero partita IVA, IVA#, numero IVA
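
The Find keywords validator can be pictured as a second test applied to the text that surrounds a match. The Python sketch below is a rough approximation, not the product's logic. It assumes case-insensitive matching on whole keywords anywhere in the supplied text; this guide does not state the product's case or proximity rules here. The names KEYWORDS and has_related_keyword are hypothetical.

```python
import re

# Keyword inputs listed for the narrow breadth.
KEYWORDS = [
    "VAT Number", "vat no", "VAT#", "IVA",
    "numero partita IVA", "IVA#", "numero IVA",
]

# Longest keywords first so that "numero partita IVA" is tried before "IVA".
# Assumption: a keyword must not be embedded inside a longer word.
_ALTERNATIVES = "|".join(
    re.escape(k) for k in sorted(KEYWORDS, key=len, reverse=True)
)
KEYWORD_RE = re.compile(r"(?<!\w)(?:" + _ALTERNATIVES + r")(?!\w)", re.IGNORECASE)


def has_related_keyword(text: str) -> bool:
    """Return True if at least one keyword or key phrase occurs in the text."""
    return KEYWORD_RE.search(text) is not None
```

With this approximation, "Partita IVA: IT12345678903" satisfies the keyword requirement, while "private note IT12345678903" does not, because "iva" inside "private" is not a whole keyword.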

Japan Driver's License Number


In Japan, a driving license is required when operating a car, motorcycle or moped on public
roads. Driving licenses are issued by the prefectural governments' public safety commissions
and are overseen on a nationwide basis by the National Police Agency.
The Japan Driver's License Number data identifier detects a 12-digit number that matches the
Japan Driver's License Number format.
The Japan Driver's License Number data identifier provides three breadths of detection:
■ The wide breadth detects a 12-digit number without checksum validation.
See “Japan Driver's License Number wide breadth” on page 1286.
■ The medium breadth detects a 12-digit number with checksum validation.
See “Japan Driver's License Number medium breadth” on page 1286.
■ The narrow breadth detects a 12-digit number with checksum validation. It also requires
the presence of related keywords.
See “Japan Driver's License Number narrow breadth” on page 1286.

Japan Driver's License Number wide breadth


The wide breadth detects a 12-digit number without checksum validation.

Table 45-576 Japan Driver's License Number wide-breadth pattern

Pattern

\d{12}

Table 45-577 Japan Driver's License Number wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.
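
The two wide-breadth validators can be approximated in a few lines of Python. This is a sketch under assumptions rather than the product's code: Number delimiter is taken to mean that the 12 digits may not sit inside a longer run of digits, and Duplicate digits is taken to mean that at least two different digits must occur. The function names are hypothetical.

```python
import re

# Assumption: "Number delimiter" rejects candidates that are part of a
# longer run of digits.
CANDIDATE_RE = re.compile(r"(?<!\d)\d{12}(?!\d)")


def not_all_same_digit(digits: str) -> bool:
    """Duplicate digits check: reject strings such as 111111111111."""
    return len(set(digits)) > 1


def find_japan_dl_wide(text: str) -> list[str]:
    """Return 12-digit candidates that survive both validators."""
    return [m for m in CANDIDATE_RE.findall(text) if not_all_same_digit(m)]
```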

Japan Driver's License Number medium breadth


The medium breadth detects a 12-digit number with checksum validation.

Table 45-578 Japan Driver's License Number medium-breadth pattern

Pattern

\d{12}

Table 45-579 Japan Driver's License Number medium-breadth validator

Mandatory validator Description

Japan Driver's License Number Validation Check Computes the checksum and validates the pattern against
it.

Japan Driver's License Number narrow breadth


The narrow breadth detects a 12-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-580 Japan Driver's License Number narrow-breadth pattern

Pattern

\d{12}

Table 45-581 Japan Driver's License Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Japan Driver's License Number Validation Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

公安委員会,番号,免許,交付,運転免許,運転免許証,ドライバライセンス,
ドライバーズライセンス,ライセンス,運転免許証番号

driver's license,driving license,driver license,driver's license number,
driving license number,driver license number,license

Japan Passport Number


Japan Passport Numbers are issued to Japanese citizens for international travel.
The Japan Passport Number data identifier detects a valid Japanese passport number pattern.
The Japan Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects a valid Japanese passport number pattern.
See “Japan Passport Number wide breadth” on page 1287.
■ The narrow breadth detects a valid Japanese passport number pattern. It also requires the
presence of related keywords.
See “Japan Passport Number narrow breadth” on page 1288.

Japan Passport Number wide breadth


The wide breadth detects a valid Japanese passport number pattern.

Table 45-582 Japan Passport Number wide-breadth patterns

Patterns

\l{2}\d{3}\l\d{2}\l\d

\l{2}\d{4}\l\d\l\d

\l\d{4}\l\d{2}\l\d

\l\d{4}\l\d{2}\l{2}\d

\l{2}\d{3}\l\d{2}\l{2}\d

\l{2}\d{8}

\l{2}\d{7}

\l\d{8}

Table 45-583 Japan Passport Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.
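
The patterns in Table 45-582 use the token \l, which Python's re module does not accept. To experiment with these patterns outside the product, the following sketch translates \l to [A-Za-z]. Reading \l as "one letter" is an assumption inferred from context, and the helper names are hypothetical.

```python
import re

# Wide-breadth patterns copied from Table 45-582.
DLP_PATTERNS = [
    r"\l{2}\d{3}\l\d{2}\l\d",
    r"\l{2}\d{4}\l\d\l\d",
    r"\l\d{4}\l\d{2}\l\d",
    r"\l\d{4}\l\d{2}\l{2}\d",
    r"\l{2}\d{3}\l\d{2}\l{2}\d",
    r"\l{2}\d{8}",
    r"\l{2}\d{7}",
    r"\l\d{8}",
]


def to_python_regex(dlp_pattern: str) -> str:
    """Translate the \\l token. Assumption: \\l stands for a single letter."""
    return dlp_pattern.replace(r"\l", "[A-Za-z]")


PYTHON_PATTERNS = [re.compile(to_python_regex(p)) for p in DLP_PATTERNS]


def matches_japan_passport_pattern(candidate: str) -> bool:
    """Return True if the whole candidate fits any listed pattern."""
    return any(p.fullmatch(candidate) for p in PYTHON_PATTERNS)
```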

Japan Passport Number narrow breadth


The narrow breadth detects a valid Japanese passport number pattern. It also requires the
presence of related keywords.

Table 45-584 Japan Passport Number narrow-breadth patterns

Patterns

\l{2}\d{3}\l\d{2}\l\d

\l{2}\d{4}\l\d\l\d

\l\d{4}\l\d{2}\l\d

\l\d{4}\l\d{2}\l{2}\d

\l{2}\d{3}\l\d{2}\l{2}\d

\l{2}\d{8}

\l{2}\d{7}

\l\d{8}

Table 45-585 Japan Passport Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

日本国旅券, パスポート, パスポート数, passport, Passport, JAPAN PASSPORT, Japan Passport,
japan passport, Passport Book, passport book

Japanese Juki-Net Identification Number


The Juki-Net Identification Number is a unique number assigned to both Japanese and foreign
residents for confirming their personal identification.
The Japanese Juki-Net Identification Number data identifier detects an 11-digit number that
matches the Japanese Juki-Net Identification Number format.
The Japanese Juki-Net Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects an 11-digit number without checksum validation.
See “Japanese Juki-Net Identification Number wide breadth” on page 1289.
■ The medium breadth detects an 11-digit number with checksum validation.
See “Japanese Juki-Net Identification Number medium breadth” on page 1290.
■ The narrow breadth detects an 11-digit number that passes checksum validation. It also
requires the presence of related keywords.
See “Japanese Juki-Net Identification Number narrow breadth” on page 1290.

Japanese Juki-Net Identification Number wide breadth


The wide breadth detects an 11-digit number without checksum validation.

Table 45-586 Japanese Juki-Net Identification Number wide-breadth pattern

Pattern

\d{11}

Table 45-587 Japanese Juki-Net Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Japanese Juki-Net Identification Number medium breadth


The medium breadth detects an 11-digit number with checksum validation.

Table 45-588 Japanese Juki-Net Identification Number medium-breadth pattern

Pattern

\d{11}

Table 45-589 Japanese Juki-Net Identification Number medium-breadth validator

Mandatory validator Description

Japanese Juki-Net Id Validation Check Computes the checksum that every Japanese
Juki-Net card number must pass.

Number delimiter Validates a match by checking the surrounding characters.

Japanese Juki-Net Identification Number narrow breadth


The narrow breadth detects an 11-digit number that passes checksum validation. It also
requires the presence of related keywords.

Table 45-590 Japanese Juki-Net Identification Number narrow-breadth pattern

Pattern

\d{11}

Table 45-591 Japanese Juki-Net Identification Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Japanese Juki-Net Id Validation Check Computes the checksum that every Japanese
Juki-Net card number must pass.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

juki net identity number, juki net number, identification number, Juki Net No,
jukinetno# personal identification number, juki net no, jukinetnumber#, unique jukinet ID

住基ネット識別番号, 住基ネット番号, 識別番号, 個人識別番号, ID番号, ユニークID番号

Japanese My Number - Corporate


The Japanese My Number - Corporate is a unique identifier for Japanese corporations used
for tax administration, social security administration, and disaster response.
The Japanese My Number - Corporate data identifier detects a 13-digit number that matches
the My Number - Corporate format.
The Japanese My Number - Corporate data identifier provides two breadths of detection:
■ The wide breadth detects a 13-digit number with checksum validation.
See “Japanese My Number - Corporate wide breadth” on page 1291.
■ The narrow breadth detects a 13-digit number with checksum validation. It also requires
the presence of related keywords.
See “Japanese My Number - Corporate narrow breadth” on page 1292.

Japanese My Number - Corporate wide breadth


The wide breadth detects a 13-digit number with checksum validation.

Table 45-592 Japanese My Number - Corporate wide-breadth pattern

Pattern

\d{13}

Table 45-593 Japanese My Number - Corporate wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Japanese My Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding numbers.
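
This guide does not spell out the checksum behind the Japanese My Number Validation Check. For Corporate Numbers, the publicly documented rule places the check digit first and derives it from the remaining 12 digits with alternating weights of 1 and 2, modulo 9. The Python sketch below assumes that the validator follows this rule. The function names are hypothetical, and the sample number in the usage note is synthetic.

```python
def corporate_check_digit(base12: str) -> int:
    """Check digit for the 12-digit base of a Corporate Number.

    Assumption: the product's validator follows the published rule.
    """
    total = 0
    # Position n counts from the rightmost digit of the base, starting at 1.
    for n, char in enumerate(reversed(base12), start=1):
        weight = 1 if n % 2 == 1 else 2
        total += int(char) * weight
    return 9 - (total % 9)


def corporate_number_valid(number: str) -> bool:
    """Return True if the leading digit equals the computed check digit."""
    if len(number) != 13 or not (number.isascii() and number.isdigit()):
        return False
    return int(number[0]) == corporate_check_digit(number[1:])
```

For the base 123456789012 the weighted sum is 74, the remainder modulo 9 is 2, and the check digit is 7, so 7123456789012 passes while 1123456789012 does not.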

Japanese My Number - Corporate narrow breadth


The narrow breadth detects a 13-digit number with checksum validation. It also requires the
presence of a Japanese My Number-related keyword.

Table 45-594 Japanese My Number - Corporate narrow-breadth pattern

Pattern

\d{13}

Table 45-595 Japanese My Number - Corporate narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Japanese My Number Validation Check Computes the checksum and validates the pattern against
it.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

000000000000

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

マイナンバー, 共通番号

Japanese My Number - Personal


The Japanese My Number - Personal is a unique identifier for Japanese citizens and residents
used for tax administration, social security administration, and disaster response.
The Japanese My Number - Personal data identifier detects a 12-digit number that matches
the My Number - Personal format.
The Japanese My Number - Personal data identifier provides three breadths of detection:
■ The wide breadth detects a 12-digit number with checksum validation.
See “Japanese My Number - Personal wide breadth” on page 1293.
■ The medium breadth detects a 12-digit number with checksum validation.
See “Japanese My Number - Personal medium breadth” on page 1293.
■ The narrow breadth detects a 12-digit number with checksum validation. It also requires
the presence of related keywords.
See “Japanese My Number - Personal narrow breadth” on page 1294.

Japanese My Number - Personal wide breadth


The wide breadth detects a 12-digit number with checksum validation.

Table 45-596 Japanese My Number - Personal wide-breadth pattern

Pattern

\d{12}

Table 45-597 Japanese My Number - Personal wide-breadth validators

Mandatory validator Description

Japanese My Number Validation Check Computes the checksum and validates the pattern against
it.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

000000000000

Japanese My Number - Personal medium breadth


The medium breadth detects a 12-digit number with checksum validation.

Table 45-598 Japanese My Number - Personal medium-breadth patterns

Pattern

\d{12}

\d{4} \d{4} \d{4}

\d{4}-\d{4}-\d{4}

\d{4}.\d{4}.\d{4}

Table 45-599 Japanese My Number - Personal medium-breadth validators

Mandatory validator Description

Japanese My Number Validation Check Computes the checksum and validates the pattern against
it.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

000000000000
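
The medium-breadth patterns accept the 12 digits either unbroken or in groups of four. As a sketch only, the Python code below removes the separators and then applies the publicly documented check-digit rule for personal My Numbers (weights 2 to 7 and then 2 to 6, counted from the right, modulo 11). It assumes that the Japanese My Number Validation Check follows that rule and that the separators are a space, a hyphen, or a literal dot. The function names are hypothetical.

```python
import re


def my_number_check_digit(base11: str) -> int:
    """Check digit for the first 11 digits of a personal My Number.

    Assumption: the product's validator follows the published rule.
    """
    total = 0
    # Position n counts from the rightmost of the 11 digits, starting at 1.
    for n, char in enumerate(reversed(base11), start=1):
        weight = n + 1 if n <= 6 else n - 5
        total += int(char) * weight
    remainder = total % 11
    return 0 if remainder <= 1 else 11 - remainder


def my_number_valid(candidate: str) -> bool:
    """Strip group separators, then compare the 12th digit to the check digit."""
    digits = re.sub(r"[ .-]", "", candidate)
    if len(digits) != 12 or not (digits.isascii() and digits.isdigit()):
        return False
    return int(digits[11]) == my_number_check_digit(digits[:11])
```

Under this rule a string of 12 zeros also satisfies the arithmetic, which may be why the table lists 000000000000 under Exclude beginning characters.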

Japanese My Number - Personal narrow breadth


The narrow breadth detects a 12-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-600 Japanese My Number - Personal narrow-breadth patterns

Pattern

\d{12}

\d{4} \d{4} \d{4}

\d{4}-\d{4}-\d{4}

\d{4}.\d{4}.\d{4}

Table 45-601 Japanese My Number - Personal narrow-breadth validators

Mandatory validator Description

Japanese My Number Validation Check Computes the checksum and validates the pattern against
it.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

000000000000

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

マイナンバー, 個人番号, 共通番号

Kazakhstan Passport Number


Kazakhstani passports are issued to citizens of the Republic of Kazakhstan to facilitate
international travel.
The Kazakhstan Passport Number data identifier detects an eight-character alphanumeric
pattern that matches the Kazakhstan Passport Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern that matches the
Kazakhstan Passport Number format. It checks for common test patterns.
See “Kazakhstan Passport Number wide breadth” on page 1295.
■ The narrow breadth detects an eight-character alphanumeric pattern that matches the
Kazakhstan Passport Number format. It checks for common test patterns, and also requires
the presence of related keywords.
See “Kazakhstan Passport Number narrow breadth” on page 1296.

Kazakhstan Passport Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern that matches the Kazakhstan
Passport Number format. It checks for common test patterns.

Table 45-602 Kazakhstan Passport Number wide-breadth patterns

Pattern

[A-Za-z]\d\d\d\d\d\d\d

Table 45-603 Kazakhstan Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.


Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444, 5555555, 6666666, 7777777, 8888888, 9999999
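
The Exclude ending characters validator is how this data identifier filters common test patterns. A minimal Python sketch of the idea follows. It treats the validator as a plain suffix test, which is an assumption, and the function names are hypothetical.

```python
import re

# The ten excluded endings from the validator table: 0000000 through 9999999.
EXCLUDED_ENDINGS = tuple(str(d) * 7 for d in range(10))

PATTERN_RE = re.compile(r"[A-Za-z]\d\d\d\d\d\d\d")


def passes_exclude_ending(candidate: str) -> bool:
    """Reject candidates that end with one of the excluded values."""
    return not candidate.endswith(EXCLUDED_ENDINGS)


def kazakhstan_passport_wide(candidate: str) -> bool:
    """Pattern match plus the exclusion test. Delimiter checks are omitted."""
    return PATTERN_RE.fullmatch(candidate) is not None and passes_exclude_ending(candidate)
```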

Kazakhstan Passport Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern that matches the
Kazakhstan Passport Number format. It checks for common test patterns, and also requires
the presence of related keywords.

Table 45-604 Kazakhstan Passport Number narrow-breadth patterns

Pattern

[A-Za-z]\d\d\d\d\d\d\d

Table 45-605 Kazakhstan Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444, 5555555, 6666666, 7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

passport, passport number, passport no, passportno, passport no., passport#, passportno#

төлқұжат, төлқұжат нөмірі, номер паспорта, заграничный пасспорт, национальный паспорт

Korea Passport Number


Korean Passports are issued to Korean citizens to facilitate international travel.
The Korea Passport Number data identifier detects a valid Korean passport number.
The Korea Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects a valid Korean Passport Number pattern.
See “Korea Passport Number wide breadth” on page 1297.
■ The narrow breadth detects a valid Korean Passport Number pattern. It also requires the
presence of related keywords.
See “Korea Passport Number narrow breadth” on page 1297.

Korea Passport Number wide breadth


The wide breadth detects a valid Korean Passport Number pattern.

Table 45-606 Korea Passport Number wide-breadth patterns

Patterns

\l{2}\d{7}

\l\d{8}

\d{9}

Table 45-607 Korea Passport Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Korea Passport Number narrow breadth


The narrow breadth detects a valid Korean Passport Number pattern. It also requires the
presence of related keywords.

Table 45-608 Korea Passport Number narrow-breadth patterns

Patterns

\l{2}\d{7}

\l\d{8}

\d{9}

Table 45-609 Korea Passport Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

한국어 여권, 여권, 여권 번호, 조선 민주주의 인민 공화국, 대한민국

passport, Passport, KOREA PASSPORT, Korea Passport, korea passport, Book, passport book,
South Korea, Republic of Korea

Korea Residence Registration Number for Foreigners


A foreign resident registration number is a 13-digit number issued to all foreign residents of
the Republic of Korea. It is used to identify people in various private transactions such as in
banking and employment and for online identification purposes.
The Korea Residence Registration Number for Foreigners data identifier detects a 13-digit
number that matches the Korea Residence Registration Number for Foreigners format.
The Korea Residence Registration Number for Foreigners data identifier provides three breadths
of detection:
■ The wide breadth detects a 13-digit number without checksum validation.
See “Korea Residence Registration Number for Foreigners wide breadth” on page 1298.
■ The medium breadth detects a 13-digit number with checksum validation.
See “Korea Residence Registration Number for Foreigners medium breadth” on page 1299.
■ The narrow breadth detects a 13-digit number with checksum validation. It also requires
the presence of related keywords.
See “Korea Residence Registration Number for Foreigners narrow breadth” on page 1299.

Korea Residence Registration Number for Foreigners wide breadth


The wide breadth detects a 13-digit number without checksum validation.

Table 45-610 Korea Residence Registration Number for Foreigners wide-breadth patterns

Patterns

\d{2}[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d{8}

\d\d[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d[ ]\d{7}

Table 45-611 Korea Residence Registration Number for Foreigners wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.
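
The wide-breadth patterns in Table 45-610 are already valid Python regular expressions, so their reach can be explored directly. The sketch below is illustrative only, and the helper name is hypothetical. It shows that the patterns restrict the first digit of the month to 0 or 1 and the first digit of the day to 0 through 3, but nothing more.

```python
import re

# Wide-breadth patterns copied from Table 45-610.
PATTERNS = [
    re.compile(r"\d{2}[01]\d[0123]\d-\d{7}"),
    re.compile(r"\d{2}[01]\d[0123]\d{8}"),
    re.compile(r"\d\d[01]\d[0123]\d-\d{7}"),
    re.compile(r"\d{2}[01]\d[0123]\d[ ]\d{7}"),
]


def matches_krrn_pattern(candidate: str) -> bool:
    """Return True if the whole candidate fits any listed pattern."""
    return any(p.fullmatch(candidate) for p in PATTERNS)
```

A candidate such as 901901-5234567 still fits, because month 19 has a permitted first digit. Only the tens digit of the month and day is constrained at this breadth.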

Korea Residence Registration Number for Foreigners medium breadth


The medium breadth detects a 13-digit number with checksum validation.

Table 45-612 Korea Residence Registration Number for Foreigners medium-breadth patterns

Patterns

\d{2}[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d{8}

\d\d[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d[ ]\d{7}

Table 45-613 Korea Residence Registration Number for Foreigners medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

KRRN Foreign Validation Check Computes the checksum and validates the pattern against
it.

Korea Residence Registration Number for Foreigners narrow breadth


The narrow breadth detects a 13-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-614 Korea Residence Registration Number for Foreigners narrow-breadth patterns

Patterns

\d{2}[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d{8}

\d\d[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d[ ]\d{7}

Table 45-615 Korea Residence Registration Number for Foreigners narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

KRRN Foreign Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

외국인 등록 번호, 주민번호, Foreign Registration Number, Foreign Resident Number

Korea Residence Registration Number for Korean


A resident registration number is a 13-digit number issued to all residents of the Republic of
Korea. Similar to national identification numbers in other countries, it is used to identify people
in various private transactions such as in banking and employment. It is also used extensively
for online identification purposes.
The Korea Residence Registration Number for Korean data identifier detects a 13-digit number
that matches the residence registration number format.
The Korea Residence Registration Number for Korean data identifier provides three breadths
of detection:
■ The wide breadth detects a 13-digit number without checksum validation.
See “Korea Residence Registration Number for Korean wide breadth” on page 1301.
■ The medium breadth detects a 13-digit number with checksum validation.
See “Korea Residence Registration Number for Korean medium breadth” on page 1301.
■ The narrow breadth detects a 13-digit number with checksum validation. It also requires
the presence of related keywords.
See “Korea Residence Registration Number for Korean narrow breadth” on page 1302.

Korea Residence Registration Number for Korean wide breadth


The wide breadth detects a 13-digit number without checksum validation.

Table 45-616 Korea Residence Registration Number for Korean wide-breadth patterns

Patterns

\d{2}[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d{8}

\d\d[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d[ ]\d{7}

Table 45-617 Korea Residence Registration Number for Korean wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Korea Residence Registration Number for Korean medium breadth


The medium breadth detects a 13-digit number with checksum validation.

Table 45-618 Korea Residence Registration Number for Korean medium-breadth patterns

Patterns

\d{2}[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d{8}

\d\d[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d[ ]\d{7}

Table 45-619 Korea Residence Registration Number for Korean medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.


Advanced KRRN Validation Validates that the third and fourth digits represent a valid
month, and that the fifth and sixth digits represent a valid
day. Validates the checksum of the pattern.
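
The three checks described for Advanced KRRN Validation can be sketched in Python as follows. The month and day tests follow the description in the table. The checksum uses the commonly documented resident registration number rule (weights 2 through 9 and then 2 through 5, modulo 11), which is an assumption about the validator. The function name is hypothetical, and the sample number in the usage note is synthetic.

```python
# Weights applied to the first 12 digits.
# Assumption: this is the commonly documented RRN rule.
WEIGHTS = (2, 3, 4, 5, 6, 7, 8, 9, 2, 3, 4, 5)


def krrn_valid(candidate: str) -> bool:
    """Check month, day, and check digit of a 13-digit candidate."""
    digits = candidate.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    month = int(digits[2:4])  # third and fourth digits
    day = int(digits[4:6])    # fifth and sixth digits
    # Simplification: the day is not checked against the length of the month.
    if not (1 <= month <= 12 and 1 <= day <= 31):
        return False
    total = sum(int(d) * w for d, w in zip(digits[:12], WEIGHTS))
    return (11 - total % 11) % 10 == int(digits[12])
```

For 900101-1234568 the weighted sum is 124, the remainder modulo 11 is 3, and the expected check digit is 8, so the number passes. Changing the last digit or using month 13 causes it to fail.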

Korea Residence Registration Number for Korean narrow breadth


The narrow breadth detects a 13-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-620 Korea Residence Registration Number for Korean narrow-breadth patterns

Patterns

\d{2}[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d{8}

\d\d[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d[ ]\d{7}

Table 45-621 Korea Residence Registration Number for Korean narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Advanced KRRN Validation Validates that the third and fourth digits represent a valid
month, and that the fifth and sixth digits represent a valid
day. Validates the checksum of the pattern.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

주민등록번호, 주민번호

Resident Registration Number, Resident Number



Latvia Driver's Licence Number


A driver's license in Latvia is a document issued by the Road Traffic Safety Directorate,
confirming the rights of the holder to drive motor vehicles.
The Latvia Driver's Licence Number data identifier detects an eight- or nine-character
alphanumeric pattern that matches the Latvia Driver's Licence Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight- or nine-character alphanumeric pattern that matches
the Latvia Driver's Licence Number format. It checks for common test numbers.
See “Latvia Driver's Licence Number wide breadth” on page 1303.
■ The narrow breadth detects an eight- or nine-character alphanumeric pattern that matches
the Latvia Driver's Licence Number format. It checks for common test numbers, and also
requires the presence of related keywords.
See “Latvia Driver's Licence Number narrow breadth” on page 1304.

Latvia Driver's Licence Number wide breadth


The wide breadth detects an eight- or nine-character alphanumeric pattern that matches the
Latvia Driver's Licence Number format. It checks for common test numbers.

Table 45-622 Latvia Driver's Licence Number wide-breadth patterns

Pattern

[a-zA-Z]{2}\d{6}

[a-zA-Z]{2}\d{7}

[a-zA-Z]{3}\d{6}

Table 45-623 Latvia Driver's Licence Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555, 666666, 777777, 888888, 999999,
0000000, 1111111, 2222222, 3333333, 4444444, 5555555, 6666666, 7777777, 8888888, 9999999

Latvia Driver's Licence Number narrow breadth


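The narrow breadth detects an eight- or nine-character alphanumeric pattern that matches
the Latvia Driver's Licence Number format. It checks for common test numbers, and also
requires the presence of related keywords.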

Table 45-624 Latvia Driver's Licence Number narrow-breadth patterns

Pattern

[a-zA-Z]{2}\d{6}

[a-zA-Z]{2}\d{7}

[a-zA-Z]{3}\d{6}

Table 45-625 Latvia Driver's Licence Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555, 666666, 777777, 888888, 999999,
0000000, 1111111, 2222222, 3333333, 4444444, 5555555, 6666666, 7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

licence number, driver license, driver licence, drivers license, drivers licence,
driving license, driving licence, driver license number, driver licence number,
drivers license number, drivers licence number, driving license number,
driving licence number, driver's license, driver's licence, Driver's License,
Driver's Licence, driver's license number, driver's licence number,
Driver's License Number, Driver's Licence Number, DLNo#, dlno#, drivers lic.,
driver permit, drivers permit, driving permit, license number

licences numurs, vadītāja apliecība, autovadītāja apliecība, vadītāja apliecības numurs,
Vadītāja licences numurs, vadītāji lic., vadītāja atļauja

Latvia Passport Number


Latvian passports are issued to citizens of Latvia for identity and international travel purposes.
The territorial section of The Office of Citizenship and Migration Affairs issues passports.
The Latvia Passport Number data identifier detects a nine-character alphanumeric pattern that
matches the Latvia Passport Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a nine-character alphanumeric pattern that matches the Latvia
Passport Number format. It checks for common test patterns.
See “Latvia Passport Number wide breadth” on page 1305.
■ The narrow breadth detects a nine-character alphanumeric pattern that matches the Latvia
Passport Number format. It checks for common test patterns, and also requires the presence
of related keywords.
See “Latvia Passport Number narrow breadth” on page 1305.

Latvia Passport Number wide breadth


The wide breadth detects a nine-character alphanumeric pattern that matches the Latvia
Passport Number format. It checks for common test patterns.

Table 45-626 Latvia Passport Number wide-breadth patterns

Pattern

[Ll][A-Za-z]\d{7}

Table 45-627 Latvia Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444,


5555555, 6666666, 7777777, 8888888, 9999999

Latvia Passport Number narrow breadth


The narrow breadth detects a nine-character alphanumeric pattern that matches the Latvia
Passport Number format. It checks for common test patterns, and also requires the presence
of related keywords.

Table 45-628 Latvia Passport Number narrow-breadth patterns

Pattern

[Ll][A-Za-z]\d{7}

Table 45-629 Latvia Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444,


5555555, 6666666, 7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

Passport, passport number, passport, passport no, passport book, passport#, passportno,
passport card, LATVIJA, LETTONIE, Pases Nr., Pases Nr, Passport No., Passport No,
Passeport No., Passeport No, Pase, pase, PASSPORT, PASSEPORT, pases numurs, Pases Nr,
pases grāmata, pase#, pases karte

Latvia Personal Identification Number


The Latvian personal identification number is used for national identity and as a tax identification
number for financial purposes. It is issued by the office of citizenship and migration affairs of
the Ministry of Interior.
The Latvia Personal Identification Number data identifier detects an 11-digit number that
matches the Latvia Personal Identification Number format.
The Latvia Personal Identification Number provides three breadths of detection:
■ The wide breadth detects an 11-digit number without checksum validation.
See “Latvia Personal Identification Number wide breadth” on page 1307.
■ The medium breadth detects an 11-digit number with checksum validation.
See “Latvia Personal Identification Number medium breadth” on page 1307.
■ The narrow breadth detects an 11-digit number with checksum validation. It also requires
the presence of related keywords.
See “Latvia Personal Identification Number narrow breadth” on page 1307.

Latvia Personal Identification Number wide breadth


The wide breadth detects an 11-digit number without checksum validation.

Table 45-630 Latvia Personal Identification Number wide-breadth patterns

Patterns

\d{2}[01]\d{3}-[012]\d{4}

\d{2}[01]\d{3}[012]\d{4}

32\d{9}

Table 45-631 Latvia Personal Identification Number wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.
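
As an illustration, the three wide-breadth patterns and the Duplicate digits validator can be sketched in Python as follows. This is a sketch under stated assumptions, not the product's implementation: the function names are hypothetical, the Number delimiter validator is left out, and the candidate is assumed to be already isolated from the surrounding text.

```python
import re

# The three patterns from Table 45-630.
PATTERNS = [
    re.compile(r"\d{2}[01]\d{3}-[012]\d{4}"),  # hyphenated form
    re.compile(r"\d{2}[01]\d{3}[012]\d{4}"),   # same form without the hyphen
    re.compile(r"32\d{9}"),                    # codes that begin with 32
]


def has_distinct_digits(candidate):
    """Duplicate digits check: reject a value whose digits are all the same."""
    digits = [c for c in candidate if c.isdigit()]
    return len(set(digits)) > 1


def is_wide_match(candidate):
    """True if the whole candidate fits a pattern and is not one repeated digit."""
    fits = any(p.fullmatch(candidate) for p in PATTERNS)
    return fits and has_distinct_digits(candidate)
```

Because the wide breadth performs no checksum validation, any 11-digit value that fits one of the patterns is accepted by this sketch unless every digit is the same.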

Latvia Personal Identification Number medium breadth


The medium breadth detects an 11-digit number with checksum validation.

Table 45-632 Latvia Personal Identification Number medium-breadth patterns

Patterns

\d{2}[01]\d{3}-[012]\d{4}

\d{2}[01]\d{3}[012]\d{4}

32\d{9}

Table 45-633 Latvia Personal Identification Number medium-breadth validator

Mandatory validator Description

Latvia Personal Code Check Computes the checksum and validates the pattern against
it.

Latvia Personal Identification Number narrow breadth


The narrow breadth detects an 11-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-634 Latvia Personal Identification Number narrow-breadth patterns

Patterns

\d{2}[01]\d{3}-[012]\d{4}

\d{2}[01]\d{3}[012]\d{4}

32\d{9}

Table 45-635 Latvia Personal Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Latvia Personal Code Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

latvia personal code, personal code, national
identification number, identification number, national
id, id#, latvia tin, tin, tax identification number, tin#,
tax id, tin no, tin number, tax number

Personas kods, personas kods, latvijas personas kods,
Valsts identifikācijas numurs, valsts identifikācijas
numurs, identifikācijas numurs, nacionālais id, latvija
alva, alva, nodokļu identifikācijas numurs, nodokļu id,
alvas nē, nodokļa numurs

Latvia Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. In Latvia, VAT is administered
by the State Revenue Service.
The Latvia Value Added Tax (VAT) Number data identifier detects a 13-character alphanumeric
pattern beginning with LV that matches the Latvia VAT Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 13-character alphanumeric pattern beginning with LV that
matches the Latvia VAT Number format without checksum validation. It checks for common
test patterns.
See “Latvia Value Added Tax (VAT) Number wide breadth” on page 1309.
■ The medium breadth detects a 13-character alphanumeric pattern beginning with LV that
matches the Latvia VAT Number format with checksum validation.
See “Latvia Value Added Tax (VAT) Number medium breadth” on page 1309.
■ The narrow breadth detects a 13-character alphanumeric pattern beginning with LV that
matches the Latvia VAT Number format with checksum validation. It checks for common
test patterns, and also requires the presence of related keywords.
See “Latvia Value Added Tax (VAT) Number narrow breadth” on page 1310.

Latvia Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 13-character alphanumeric pattern beginning with LV that matches
the Latvia VAT Number format without checksum validation. It checks for common test patterns.

Table 45-636 Latvia Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Ll][Vv]\d{11}

[Ll][Vv] \d{11}

Table 45-637 Latvia Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000000, 11111111111, 22222222222,
33333333333, 44444444444, 55555555555,
66666666666, 77777777777, 88888888888, 99999999999

Latvia Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 13-character alphanumeric pattern beginning with LV that
matches the Latvia VAT Number format with checksum validation.

Table 45-638 Latvia Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[Ll][Vv]\d{11}

[Ll][Vv] \d{11}

Table 45-639 Latvia Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Latvia Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against it.

Latvia Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 13-character alphanumeric pattern beginning with LV that matches
the Latvia VAT Number format with checksum validation. It checks for common test patterns,
and also requires the presence of related keywords.

Table 45-640 Latvia Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Ll][Vv]\d{11}

[Ll][Vv] \d{11}

Table 45-641 Latvia Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000000, 11111111111, 22222222222,
33333333333, 44444444444, 55555555555,
66666666666, 77777777777, 88888888888, 99999999999

Latvia Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat, vat number, value added tax, value added tax
number, vat identification number, vat#, vat no, VAT,
VAT#, vatin, VATIN

PVN Nr, PVN maksātāja numurs, PVN numurs, Vat Nr,
PVN#, pievienotās vērtības nodoklis, pievienotās
vērtības nodokļa numurs

Liechtenstein Passport Number


Liechtenstein passports are issued to nationals of Liechtenstein for the purpose of international
travel. The passport may also serve as proof of Liechtensteiner citizenship.
The Liechtenstein Passport Number data identifier detects a six-character alphanumeric pattern
that matches the Liechtenstein Passport Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a six-character alphanumeric pattern that matches the
Liechtenstein Passport Number format. It checks for common test patterns.
See “Liechtenstein Passport Number wide breadth” on page 1311.
■ The narrow breadth detects a six-character alphanumeric pattern that matches the
Liechtenstein Passport Number format. It checks for common test patterns, and also requires
the presence of related keywords.
See “Liechtenstein Passport Number narrow breadth” on page 1312.

Liechtenstein Passport Number wide breadth


The wide breadth detects a six-character alphanumeric pattern that matches the Liechtenstein
Passport Number format. It checks for common test patterns.

Table 45-642 Liechtenstein Passport Number wide-breadth patterns

Pattern

[a-zA-Z]\d\d\d\d\d

Table 45-643 Liechtenstein Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000, 11111, 22222, 33333, 44444, 55555, 66666,
77777, 88888, 99999

Liechtenstein Passport Number narrow breadth


The narrow breadth detects a six-character alphanumeric pattern that matches the Liechtenstein
Passport Number format. It checks for common test patterns, and also requires the presence
of related keywords.

Table 45-644 Liechtenstein Passport Number narrow-breadth patterns

Pattern

[a-zA-Z]\d\d\d\d\d

Table 45-645 Liechtenstein Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000, 11111, 22222, 33333, 44444, 55555, 66666,
77777, 88888, 99999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

passport, passport number, passport no, passportno,
passport no., passport#, passportno#, Reisepass, Pass
Nr, Pass Nr., Reisepass#, Pass Nr#
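
The narrow breadth combines the pattern, both exclusion-style validators, and the keyword requirement. The Python sketch below illustrates that combination. It rests on several assumptions that the guide does not confirm: keywords are compared case-insensitively and may appear anywhere in the inspected text, the Number delimiter validator is approximated by requiring no adjacent letter or digit, and the name find_liechtenstein_passport_narrow is hypothetical.

```python
import re

# Pattern from Table 45-644, with assumed delimiter lookarounds.
PATTERN = re.compile(r"(?<![0-9A-Za-z])[a-zA-Z]\d\d\d\d\d(?![0-9A-Za-z])")

# "Exclude ending characters" values from Table 45-645.
EXCLUDED_ENDINGS = tuple(str(d) * 5 for d in range(10))

# "Find keywords" inputs from Table 45-645.
KEYWORDS = [
    "passport", "passport number", "passport no", "passportno",
    "passport no.", "passport#", "passportno#", "Reisepass",
    "Pass Nr", "Pass Nr.", "Reisepass#", "Pass Nr#",
]


def find_liechtenstein_passport_narrow(text):
    """Return pattern matches only when at least one keyword is present."""
    lowered = text.lower()
    # Assumption: case-insensitive substring search over the whole text.
    if not any(keyword.lower() in lowered for keyword in KEYWORDS):
        return []
    return [
        match.group(0)
        for match in PATTERN.finditer(text)
        if not match.group(0).endswith(EXCLUDED_ENDINGS)
    ]
```

With these assumptions, the same six-character value is reported when a keyword such as Reisepass accompanies it and ignored when no keyword is present.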

Lithuania Personal Identification Number


In Lithuania, the personal identification code is a number based on the sex and birth date of
a person. This code is used as a unique personal identifier by governmental and other systems
where identification is required, as well as for digital signatures using the national identity card
and its associated certificates.
The Lithuania Personal Identification Number data identifier detects an 11-digit number that
matches the Lithuania Personal Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-digit number that matches the Lithuania Personal
Identification Number format without checksum validation. It checks for common test
numbers.
See “Lithuania Personal Identification Number wide breadth” on page 1313.
■ The medium breadth detects an 11-digit number that matches the Lithuania Personal
Identification Number format with checksum validation.
See “Lithuania Personal Identification Number medium breadth” on page 1314.
■ The narrow breadth detects an 11-digit number that matches the Lithuania Personal
Identification Number format with checksum validation. It checks for common test numbers,
and also requires the presence of related keywords.
See “Lithuania Personal Identification Number narrow breadth” on page 1314.

Lithuania Personal Identification Number wide breadth


The wide breadth detects an 11-digit number that matches the Lithuania Personal Identification
Number format without checksum validation. It checks for common test numbers.

Table 45-646 Lithuania Personal Identification Number wide-breadth patterns

Pattern

\d{3}[01]\d[0123]\d{5}

\d \d{2}[01]\d[0123]\d \d{4}

\d{3}[01]\d[0123]\d{4} \d

Table 45-647 Lithuania Personal Identification Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Lithuania Personal Identification Number medium breadth


The medium breadth detects an 11-digit number that matches the Lithuania Personal
Identification Number format with checksum validation.

Table 45-648 Lithuania Personal Identification Number medium-breadth patterns

Pattern

\d{3}[01]\d[0123]\d{5}

\d \d{2}[01]\d[0123]\d \d{4}

\d{3}[01]\d[0123]\d{4} \d

Table 45-649 Lithuania Personal Identification Number medium-breadth validators

Mandatory validator Description

Estonia Personal Identification Number Check Computes the checksum and validates the pattern against
it.
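
This guide names the checksum validator but does not describe its arithmetic. The Python sketch below assumes the publicly documented two-pass modulo-11 scheme that is commonly described for Estonian and Lithuanian personal codes. Whether the product computes the checksum in exactly this way is not confirmed by this guide, and the function names are hypothetical.

```python
def personal_code_check_digit(first_ten):
    """Compute the check digit for the first ten digits of a personal code.

    Assumed scheme: weighted sum modulo 11, with a second set of weights
    when the first remainder is 10, and 0 if the second remainder is also 10.
    """
    digits = [int(c) for c in first_ten]
    first_pass = (1, 2, 3, 4, 5, 6, 7, 8, 9, 1)
    second_pass = (3, 4, 5, 6, 7, 8, 9, 1, 2, 3)
    remainder = sum(w * d for w, d in zip(first_pass, digits)) % 11
    if remainder == 10:
        remainder = sum(w * d for w, d in zip(second_pass, digits)) % 11
        if remainder == 10:
            remainder = 0
    return remainder


def passes_checksum(candidate):
    """True if an 11-digit candidate (spaces allowed) ends in its check digit."""
    digits = candidate.replace(" ", "")
    if len(digits) != 11 or not digits.isdigit():
        return False
    return personal_code_check_digit(digits[:10]) == int(digits[10])
```

The spaces are removed first because two of the three patterns allow a space after the first digit or before the last digit.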

Lithuania Personal Identification Number narrow breadth


The narrow breadth detects an 11-digit number that matches the Lithuania Personal
Identification Number format with checksum validation. It checks for common test numbers,
and also requires the presence of related keywords.

Table 45-650 Lithuania Personal Identification Number narrow-breadth patterns

Pattern

\d{3}[01]\d[0123]\d{5}

\d \d{2}[01]\d[0123]\d \d{4}

\d{3}[01]\d[0123]\d{4} \d

Table 45-651 Lithuania Personal Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Estonia Personal Identification Number Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

national ID, national identification number, personal
ID, personal identification number, nationalid#,
personalid#, personal identification code, PID#

Nacionalinis ID, Nacionalinis identifikavimo numeris,
asmens kodas

Lithuania Tax Identification Number


The Lithuanian Taxpayer Identification Number is used to identify taxpayers and facilitate the
administration of their national tax affairs.
The Lithuania Tax Identification Number data identifier detects an 11-digit number that matches
the Lithuania Tax Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-digit number that matches the Lithuania Tax Identification
Number format without checksum validation. It checks for common test numbers.
See “Lithuania Tax Identification Number wide breadth” on page 1315.
■ The medium breadth detects an 11-digit number that matches the Lithuania Tax
Identification Number format with checksum validation.
See “Lithuania Tax Identification Number medium breadth” on page 1316.
■ The narrow breadth detects an 11-digit number that matches the Lithuania Tax Identification
Number format with checksum validation. It checks for common test numbers, and also
requires the presence of related keywords.
See “Lithuania Tax Identification Number narrow breadth” on page 1316.

Lithuania Tax Identification Number wide breadth


The wide breadth detects an 11-digit number that matches the Lithuania Tax Identification
Number format without checksum validation. It checks for common test numbers.

Table 45-652 Lithuania Tax Identification Number wide breadth pattern

Pattern

[1-6]\d{2}[01]\d[0123]\d{5}

Table 45-653 Lithuania Tax Identification Number wide breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Lithuania Tax Identification Number medium breadth


The medium breadth detects an 11-digit number that matches the Lithuania Tax Identification
Number format with checksum validation.

Table 45-654 Lithuania Tax Identification Number medium breadth pattern

Pattern

[1-6]\d{2}[01]\d[0123]\d{5}

Table 45-655 Lithuania Tax Identification Number medium breadth validator

Mandatory validator Description

Lithuania Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Lithuania Tax Identification Number narrow breadth


The narrow breadth detects an 11-digit number that matches the Lithuania Tax Identification
Number format with checksum validation. It checks for common test numbers, and also requires
the presence of related keywords.

Table 45-656 Lithuania Tax Identification Number narrow breadth pattern

Pattern

[1-6]\d{2}[01]\d[0123]\d{5}

Table 45-657 Lithuania Tax Identification Number narrow breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Lithuania Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

tax identification no., TIN, tin, TIN#, tin#, tin no., tax
identification number, tin no, tax id, tax id no, tax id
no., taxid, taxid#, tax number, tax no, tax#, Tax
Identification Number

mokesčių identifikavimo Nr., mokesčių identifikavimo
numeris, mokesčių ID, mokesčių id nr, mokesčių id
nr., mokesčių ID#, mokesčių numeris, mokestis Nr,
mokestis #, Mokesčių identifikavimo numeris

Lithuania Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. In Lithuania, VAT is
administered by the State Tax Inspectorate.
The Lithuania Value Added Tax (VAT) Number data identifier detects an 11- or 14-character
alphanumeric pattern beginning with LT.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11- or 14-character alphanumeric pattern beginning with LT
without checksum validation. It checks for common test patterns.
See “Lithuania Value Added Tax (VAT) Number wide breadth” on page 1318.
■ The medium breadth detects an 11- or 14-character alphanumeric pattern beginning with LT
with checksum validation.
See “Lithuania Value Added Tax (VAT) Number medium breadth” on page 1318.
■ The narrow breadth detects an 11- or 14-character alphanumeric pattern beginning with
LT with checksum validation. It checks for common test patterns, and also requires the
presence of related keywords.
See “Lithuania Value Added Tax (VAT) Number narrow breadth” on page 1319.

Lithuania Value Added Tax (VAT) Number wide breadth


The wide breadth detects an 11- or 14-character alphanumeric pattern beginning with LT
without checksum validation. It checks for common test patterns.

Table 45-658 Lithuania Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Ll][Tt]\d{7}[1]\d

[Ll][Tt] \d{7}[1]\d

[Ll][Tt]\d{10}[1]\d

[Ll][Tt] \d{10}[1]\d

Table 45-659 Lithuania Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000010, 111111111, 222222212, 333333313,
444444414, 555555515, 666666616, 777777717,
888888818, 999999919, 000000000000, 111111111111,
222222222212, 333333333313, 444444444414,
555555555515, 666666666616, 777777777717,
888888888818, 999999999919
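
The four wide-breadth patterns differ only in the number of digits (9 or 12) and in an optional space after the LT prefix, which gives the 11- or 14-character totals when no space is present. The Python sketch below folds them into a single expression and applies the exclusion list. It is an illustration under assumptions: the delimiter lookarounds stand in for the Number delimiter validator, whose exact behavior the guide does not define, and the name find_lithuania_vat_wide is hypothetical.

```python
import re

# Combined form of the four patterns in Table 45-658: LT, an optional space,
# then 7 or 10 digits followed by the digit 1 and one final digit.
VAT = re.compile(
    r"(?<![0-9A-Za-z])[Ll][Tt] ?((?:\d{7}|\d{10})1\d)(?![0-9A-Za-z])"
)

# "Exclude ending characters" values copied from Table 45-659.
EXCLUDED_ENDINGS = (
    "000000010", "111111111", "222222212", "333333313", "444444414",
    "555555515", "666666616", "777777717", "888888818", "999999919",
    "000000000000", "111111111111", "222222222212", "333333333313",
    "444444444414", "555555555515", "666666666616", "777777777717",
    "888888888818", "999999999919",
)


def find_lithuania_vat_wide(text):
    """Return matches whose digit part does not end in an excluded value."""
    return [
        match.group(0)
        for match in VAT.finditer(text)
        if not match.group(1).endswith(EXCLUDED_ENDINGS)
    ]
```

Note that every pattern requires the second-to-last digit to be 1, so a value such as LT123456725 is not a candidate at any breadth.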

Lithuania Value Added Tax (VAT) Number medium breadth


The medium breadth detects an 11- or 14-character alphanumeric pattern beginning with LT with
checksum validation.

Table 45-660 Lithuania Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[Ll][Tt]\d{7}[1]\d

[Ll][Tt] \d{7}[1]\d

[Ll][Tt]\d{10}[1]\d

[Ll][Tt] \d{10}[1]\d

Table 45-661 Lithuania Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Lithuania Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against it.

Lithuania Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects an 11- or 14-character alphanumeric pattern beginning with LT
with checksum validation. It checks for common test patterns, and also requires the presence
of related keywords.

Table 45-662 Lithuania Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Ll][Tt]\d{7}[1]\d

[Ll][Tt] \d{7}[1]\d

[Ll][Tt]\d{10}[1]\d

[Ll][Tt] \d{10}[1]\d

Table 45-663 Lithuania Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000010, 111111111, 222222212, 333333313,
444444414, 555555515, 666666616, 777777717,
888888818, 999999919, 000000000000, 111111111111,
222222222212, 333333333313, 444444444414,
555555555515, 666666666616, 777777777717,
888888888818, 999999999919

Lithuania Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat, vat number, vat#, value added tax number, VAT,
VAT#

pridėtinės vertės mokesčio numeris, PVM, PVM#,
pridėtinės vertės mokestis, PVM numeris, PVM
registracijos numeris

Luxembourg National Register of Individuals Number


The Luxembourg National Register of Individuals Number is an 11-digit identification number
issued to all Luxembourg citizens at age 15.
The Luxembourg National Register of Individuals Number data identifier detects an 11-digit
number that matches the Luxembourg National Register of Individuals Number format.
The Luxembourg National Register of Individuals Number system data identifier provides three
breadths of detection:
■ The wide breadth detects an 11-digit number without checksum validation.
See “Luxembourg National Register of Individuals Number wide breadth” on page 1320.
■ The medium breadth detects an 11-digit number with checksum validation.
See “Luxembourg National Register of Individuals Number medium breadth” on page 1321.
■ The narrow breadth detects an 11-digit number that passes checksum validation. It also
requires the presence of related keywords.
See “Luxembourg National Register of Individuals Number narrow breadth” on page 1321.

Luxembourg National Register of Individuals Number wide breadth


The wide breadth detects an 11-digit number without checksum validation.

Table 45-664 Luxembourg National Register of Individuals Number wide-breadth pattern

Pattern

\d{11}

Table 45-665 Luxembourg National Register of Individuals Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Luxembourg National Register of Individuals Number medium breadth


The medium breadth detects an 11-digit number with checksum validation.

Table 45-666 Luxembourg National Register of Individuals Number medium breadth patterns

Pattern

\d{11}

Table 45-667 Luxembourg National Register of Individuals Number medium breadth validator

Mandatory validator Description

Luxembourg National Register of Individuals Number Validation Check Computes the checksum and validates the pattern against it.

Number delimiter Validates a match by checking the surrounding characters.

Luxembourg National Register of Individuals Number narrow breadth


The narrow breadth detects an 11-digit number that passes checksum validation. It also
requires the presence of related keywords.

Table 45-668 Luxembourg National Register of Individuals Number narrow breadth patterns

Pattern

\d{11}

Table 45-669 Luxembourg National Register of Individuals Number narrow breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Luxembourg National Register of Individuals Number Validation Check Computes the checksum and validates the pattern against it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Personal ID, personal ID number, personalidno#,
unique ID number, personalidnumber#, unique ID key,
Personal ID Code, uniqueidkey#, individual code,
individual ID

Eindeutige ID-Nummer, Eindeutige ID, ID personnelle,
Numéro d'identification personnel, IDpersonnelle#,
Persönliche Identifikationsnummer, EindeutigeID#

Luxembourg Passport Number


A Luxembourg passport is an international travel document issued to nationals of the Grand
Duchy of Luxembourg, and may also serve as proof of Luxembourgish citizenship.
The Luxembourg Passport Number data identifier detects a seven- or eight-character
alphanumeric pattern that matches the Luxembourg Passport Number format.
The Luxembourg Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects a seven- or eight-character alphanumeric pattern without
checksum validation.
See “Luxembourg Passport Number wide breadth” on page 1322.
■ The narrow breadth detects a seven- or eight-character alphanumeric pattern without
checksum validation. It requires the presence of related keywords.
See “Luxembourg Passport Number narrow breadth” on page 1323.

Luxembourg Passport Number wide breadth


The wide breadth detects a seven- or eight-character alphanumeric pattern without checksum
validation.

Table 45-670 Luxembourg Passport Number wide-breadth patterns

Patterns

\l\w{5}[0-9]

\l\w{5}[0-9][0-9A-Za-z]

Table 45-671 Luxembourg Passport Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Luxembourg Passport Number narrow breadth


The narrow breadth detects a seven- or eight-character alphanumeric pattern without checksum
validation. It requires the presence of related keywords.

Table 45-672 Luxembourg Passport Number narrow-breadth patterns

Patterns

\l\w{5}[0-9]

\l\w{5}[0-9][0-9A-Za-z]

Table 45-673 Luxembourg Passport Number narrow-breadth validators

Mandatory validators Description

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

passport number, passport, passport no, luxembourg
pass, luxembourg passeport, luxembourg passport

passnummer, ausweisnummer, passeport, reisepass,
pass, pass net, pass nr, no de passeport, passeport
nombre, numéro de passeport

Number delimiter Validates a match by checking the surrounding characters.



Luxembourg Tax Identification Number


This number is issued by the Luxembourg internal revenue department (Administration des
contributions directes, ACD) and is used for the tax-related purposes of natural and non-natural
persons.
The Luxembourg Tax Identification Number data identifier detects an 11- or 13-digit number
that matches the Luxembourg Tax Identification Number format.
The Luxembourg Tax Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects an 11- or 13-digit number without checksum validation.
See “Luxembourg Tax Identification Number wide breadth” on page 1324.
■ The medium breadth detects an 11- or 13-digit number with checksum validation.
See “Luxembourg Tax Identification Number medium breadth” on page 1325.
■ The narrow breadth detects an 11- or 13-digit number with checksum validation. It also
requires the presence of related keywords.
See “Luxembourg Tax Identification Number narrow breadth” on page 1326.

Luxembourg Tax Identification Number wide breadth


The wide breadth detects an 11- or 13-digit number without checksum validation.

Table 45-674 Luxembourg Tax Identification Number wide-breadth patterns

Patterns

[1][89]\d{2}[01]\d[0123]\d\d{5}

[1][89]\d{2}[01]\d[0123]\d \d{5}

[1][89]\d{2}[01]\d[0123]\d-\d{5}

[1][89]\d{2}[01]\d[0123]\d,\d{5}

[1][89]\d{2}[01]\d[0123]\d.\d{5}

[2][0]\d{2}[01]\d[0123]\d\d{5}

[2][0]\d{2}[01]\d[0123]\d \d{5}

[2][0]\d{2}[01]\d[0123]\d-\d{5}

[2][0]\d{2}[01]\d[0123]\d,\d{5}

[2][0]\d{2}[01]\d[0123]\d.\d{5}

\d{11}

\d{2} \d{3} \d{3} \d{3}

\d{2}-\d{3}-\d{3}-\d{3}

\d{2}.\d{3}.\d{3}.\d{3}

\d{2},\d{3},\d{3},\d{3}

Table 45-675 Luxembourg Tax Identification Number wide-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.
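
The wide-breadth patterns fall into two families: a 13-digit form that starts with a four-digit year beginning with 18, 19, or 20, and an 11-digit form that is either unbroken or grouped 2-3-3-3. The Python sketch below condenses both families and adds the Duplicate digits check. It makes assumptions that the guide does not confirm: the period in the listed patterns is treated as a literal period (in standard regular expression syntax an unescaped period matches any character), a grouped number is assumed to use the same separator throughout, and the name is_luxembourg_tin_wide is hypothetical.

```python
import re

# 13-digit family: year (18xx, 19xx, 20xx), month, day, then five digits,
# with an optional space, hyphen, comma, or period before the last five.
LONG_FORM = re.compile(r"(?:1[89]|20)\d{2}[01]\d[0123]\d[ \-,.]?\d{5}")

# 11-digit family: unbroken, or grouped 2-3-3-3 with one repeated separator.
SHORT_FORM = re.compile(r"\d{11}|\d{2}([ \-.,])\d{3}\1\d{3}\1\d{3}")


def is_luxembourg_tin_wide(candidate):
    """True if the candidate fits either family and is not one repeated digit."""
    if not (LONG_FORM.fullmatch(candidate) or SHORT_FORM.fullmatch(candidate)):
        return False
    digits = re.sub(r"\D", "", candidate)
    # Duplicate digits check: the digits must not all be the same.
    return len(set(digits)) > 1
```

The Number delimiter validator is not modeled here. The sketch expects a candidate that is already isolated from the surrounding text.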

Luxembourg Tax Identification Number medium breadth


The medium breadth detects an 11- or 13-digit number with checksum validation.

Table 45-676 Luxembourg Tax Identification Number medium-breadth patterns

Patterns

[1][89]\d{2}[01]\d[0123]\d\d{5}

[1][89]\d{2}[01]\d[0123]\d \d{5}

[1][89]\d{2}[01]\d[0123]\d-\d{5}

[1][89]\d{2}[01]\d[0123]\d,\d{5}

[1][89]\d{2}[01]\d[0123]\d.\d{5}

[2][0]\d{2}[01]\d[0123]\d\d{5}

[2][0]\d{2}[01]\d[0123]\d \d{5}

[2][0]\d{2}[01]\d[0123]\d-\d{5}

[2][0]\d{2}[01]\d[0123]\d,\d{5}

[2][0]\d{2}[01]\d[0123]\d.\d{5}

\d{11}

\d{2} \d{3} \d{3} \d{3}

\d{2}-\d{3}-\d{3}-\d{3}

\d{2}.\d{3}.\d{3}.\d{3}

\d{2},\d{3},\d{3},\d{3}

Table 45-677 Luxembourg Tax Identification Number medium-breadth validator

Mandatory validator Description

Luxembourg Tax Identification Number Validation Check Computes the checksum and validates the pattern against it.

Luxembourg Tax Identification Number narrow breadth


The narrow breadth detects an 11- or 13-digit number with checksum validation. It also requires
the presence of related keywords.

Table 45-678 Luxembourg Tax Identification Number narrow-breadth patterns

Patterns

[1][89]\d{2}[01]\d[0123]\d\d{5}

[1][89]\d{2}[01]\d[0123]\d \d{5}

[1][89]\d{2}[01]\d[0123]\d-\d{5}

[1][89]\d{2}[01]\d[0123]\d,\d{5}

[1][89]\d{2}[01]\d[0123]\d.\d{5}

[2][0]\d{2}[01]\d[0123]\d\d{5}

[2][0]\d{2}[01]\d[0123]\d \d{5}

[2][0]\d{2}[01]\d[0123]\d-\d{5}

[2][0]\d{2}[01]\d[0123]\d,\d{5}

[2][0]\d{2}[01]\d[0123]\d.\d{5}

\d{11}

\d{2} \d{3} \d{3} \d{3}

\d{2}-\d{3}-\d{3}-\d{3}

\d{2}.\d{3}.\d{3}.\d{3}

\d{2},\d{3},\d{3},\d{3}

Table 45-679 Luxembourg Tax Identification Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Luxembourg Tax Identification Number Validation Check Computes the checksum and validates the
pattern against it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

social security, tin, tin number, tin no, tin#, luxembourg
tax identification number, tax number, tax id

Zinn, Zinn Nummer, Luxembourg Tax
Identifikatiounsnummer, Steier Nummer, Steier ID,
Sozialversicherungsausweis, Zinnzahl, Zinn nein,
Zinn#, luxemburgische steueridentifikationsnummer,
Steuernummer, Steuer ID

sécurité sociale, carte de sécurité sociale, étain,
numéro d'étain, étain non, étain#, Numéro
d'identification fiscal luxembourgeois, numéro
d'identification fiscale, identifiant d'impôt,

Sozialunterstützung, Sozialversécherung

Number delimiter Validates a match by checking the surrounding characters.

Luxembourg Value Added Tax (VAT) Number


VAT is a consumption tax that is borne by the end consumer. VAT is paid for each transaction
in the manufacturing and distribution process.
The Luxembourg Value Added Tax (VAT) Number data identifier detects an eight-character
alphanumeric pattern that matches the Luxembourg Value Added Tax (VAT) Number format.
The Luxembourg Value Added Tax (VAT) Number data identifier provides three breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern beginning with LU without
checksum validation.
See “Luxembourg Value Added Tax (VAT) Number wide breadth” on page 1328.
■ The medium breadth detects an eight-character alphanumeric pattern beginning with LU
with checksum validation.
See “Luxembourg Value Added Tax (VAT) Number medium breadth” on page 1329.
■ The narrow breadth detects an eight-character alphanumeric pattern beginning with LU
with checksum validation. It also requires the presence of related keywords.
See “Luxembourg Value Added Tax (VAT) Number narrow breadth” on page 1329.

Luxembourg Value Added Tax (VAT) Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern beginning with LU without
checksum validation.

Table 45-680 Luxembourg Value Added Tax (VAT) Number wide-breadth patterns

Patterns

[Ll][Uu]\d{8}

[Ll][Uu] \d{8}

[Ll][Uu]-\d{8}

[Ll][Uu] \d{3} \d{3} \d{2}

[Ll][Uu] \d{4} \d{4}

[Ll][Uu] \d{4}-\d{4}

[Ll][Uu] \d{4}.\d{4}

[Ll][Uu] \d{4},\d{4}

Table 45-681 Luxembourg Value Added Tax (VAT) Number wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.



Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999
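
As an illustration of the Exclude ending characters validator, the following Python sketch rejects any candidate whose text ends with one of the listed values. The function name is hypothetical, and the sketch compares the raw matched text. How the product treats matches that contain separators, such as spaces or hyphens, is not described here.

```python
# The ten excluded endings from the table: 00000000 through 99999999.
EXCLUDED_ENDINGS = tuple(str(digit) * 8 for digit in range(10))


def ends_with_excluded(candidate: str) -> bool:
    """Return True when the candidate ends with an excluded value."""
    # str.endswith accepts a tuple and returns True if any entry matches.
    return candidate.endswith(EXCLUDED_ENDINGS)
```

For example, LU11111111 would be excluded under this sketch, while LU15027442 would not.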

Luxembourg Value Added Tax (VAT) Number medium breadth


The medium breadth detects an eight-character alphanumeric pattern beginning with LU with
checksum validation.

Table 45-682 Luxembourg Value Added Tax (VAT) Number medium-breadth patterns

Patterns

[Ll][Uu]\d{8}

[Ll][Uu] \d{8}

[Ll][Uu]-\d{8}

[Ll][Uu] \d{3} \d{3} \d{2}

[Ll][Uu] \d{4} \d{4}

[Ll][Uu] \d{4}-\d{4}

[Ll][Uu] \d{4}.\d{4}

[Ll][Uu] \d{4},\d{4}

Table 45-683 Luxembourg Value Added Tax (VAT) Number medium-breadth validator

Mandatory validator Description

Luxembourg VAT Number Validation Check Computes the checksum and validates the pattern against
it.
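
This guide does not describe the algorithm behind the Luxembourg VAT Number Validation Check. A commonly published rule for Luxembourg VAT numbers is that the last two of the eight digits equal the first six digits modulo 89. The Python sketch below implements that rule. The function name is hypothetical, and it is an assumption that the product's validator follows the same rule.

```python
import re


def luxembourg_vat_checksum_ok(candidate: str) -> bool:
    """Check the mod-89 rule on the eight digits of a candidate."""
    digits = re.sub(r"\D", "", candidate)  # drop the LU prefix and separators
    if len(digits) != 8:
        return False
    # Assumed rule: check digits (positions 7-8) == first six digits mod 89.
    return int(digits[:6]) % 89 == int(digits[6:])
```

For the sample value LU15027442, 150274 modulo 89 is 42, which equals the last two digits, so the check passes.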

Luxembourg Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern beginning with LU with
checksum validation. It also requires the presence of related keywords.

Table 45-684 Luxembourg Value Added Tax (VAT) Number narrow-breadth patterns

Patterns

[Ll][Uu]\d{8}

[Ll][Uu] \d{8}

[Ll][Uu]-\d{8}

[Ll][Uu] \d{3} \d{3} \d{2}

[Ll][Uu] \d{4} \d{4}

[Ll][Uu] \d{4}-\d{4}

[Ll][Uu] \d{4}.\d{4}

[Ll][Uu] \d{4},\d{4}

Table 45-685 Luxembourg Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validators Description

Luxembourg VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:
00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

luxembourg vat number, Luxembourg vat no, vat
number, vat no, vat, VAT#, value added tax number,
vat id, vat registration number, value added tax

TVA kee, TVA#, TVA Aschreiwung kee, T.V.A,
stammnummer, bleiwen, geheescht, gitt id,
mehrwertsteuer, vat registrierungsnummer,
umsatzsteuer-id, wat, umsatzsteuernummer,
umsatzsteuer-identifikationsnummer

id de la batterie, lëtzebuerg vat nee, registréierung
nummer, numéro de TVA, numéro de enregistrement
vat
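
The Find keywords validator requires at least one listed keyword or key phrase to be present. The Python sketch below shows the general idea with a case-insensitive search over the whole text. The function name, the short sample keyword list, and the whole-text scope are assumptions for illustration. The guide does not state here how close a keyword must be to the match or how case is handled.

```python
from typing import Iterable


def has_keyword(text: str, keywords: Iterable[str]) -> bool:
    """Return True when any keyword occurs in the text, ignoring case."""
    lowered = text.casefold()
    return any(keyword.casefold() in lowered for keyword in keywords)


# A few entries taken from the keyword list above.
SAMPLE_KEYWORDS = ["vat number", "VAT#", "numéro de TVA"]
```

With these sample keywords, the text "Our VAT# is LU15027442" satisfies the check, while "Invoice LU15027442" does not.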

Macau National Identification Number


The Macau resident identification card is for permanent and non-permanent residents of Macau.
The identity card is issued by the Identification Services Directorate of Macau.
The Macau National Identification Number data identifier detects an eight-digit number that
matches the Macau National Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight-digit number that matches the Macau National
Identification Number format. It checks for common test numbers.
See “Macau National Identification Number wide breadth” on page 1331.
■ The narrow breadth detects an eight-digit number that matches the Macau National
Identification Number format. It checks for common test numbers, and also requires the
presence of related keywords.
See “Macau National Identification Number narrow breadth” on page 1332.

Macau National Identification Number wide breadth


The wide breadth detects an eight-digit number that matches the Macau National Identification
Number format. It checks for common test numbers.

Table 45-686 Macau National Identification Number wide-breadth patterns

Pattern

1\d\d\d\d\d\d(\d)

5\d\d\d\d\d\d(\d)

7\d\d\d\d\d\d(\d)
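
In standard regular expression syntax, the parentheses in these patterns form a capturing group rather than literal characters, so each pattern matches eight consecutive digits. The Python sketch below demonstrates this with the re module. It is an assumption that the product's pattern engine interprets the parentheses the same way, and the function name is hypothetical.

```python
import re
from typing import Optional

# The three patterns exactly as printed in the table.
MACAU_PATTERNS = [
    re.compile(r"1\d\d\d\d\d\d(\d)"),
    re.compile(r"5\d\d\d\d\d\d(\d)"),
    re.compile(r"7\d\d\d\d\d\d(\d)"),
]


def captured_last_digit(candidate: str) -> Optional[str]:
    """Return the digit captured by (\\d), or None when nothing matches."""
    for pattern in MACAU_PATTERNS:
        match = pattern.fullmatch(candidate)
        if match:
            return match.group(1)
    return None
```

In Python, 12345678 matches and the group captures the final digit, while a value written with literal parentheses, such as 1234567(8), does not match.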

Table 45-687 Macau National Identification Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Macau National Identification Number narrow breadth


The narrow breadth detects an eight-digit number that matches the Macau National Identification
Number format. It checks for common test numbers, and also requires the presence of related
keywords.

Table 45-688 Macau National Identification Number narrow-breadth patterns

Pattern

1\d\d\d\d\d\d(\d)

5\d\d\d\d\d\d(\d)

7\d\d\d\d\d\d(\d)

Table 45-689 Macau National Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

identification number, id number, identity card number,
identity card no, national identity card number, national
identity card no, national identification number,
personal identification number, personal ID no, unique
identification number, unique id no, nationalid#,
perosonalid#, uniqueid#

身份证号码, 唯一的识别号码

número de identificação, número cartão identidade,
número cartão identidade nacional, número
identificação pessoal, número identificação único, id
único não, ID único#

Malaysia Passport Number


The Malaysian passport is issued to citizens of Malaysia by the Immigration Department of
Malaysia.
The Malaysia Passport Number data identifier detects a nine-character alphanumeric pattern
that matches the Malaysia Passport Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a nine-character alphanumeric pattern that matches the Malaysia
Passport Number format. It checks for common test patterns.
See “Malaysia Passport Number wide breadth” on page 1333.
■ The narrow breadth detects a nine-character alphanumeric pattern that matches the
Malaysia Passport Number format. It checks for common test patterns, and also requires
the presence of related keywords.
See “Malaysia Passport Number narrow breadth” on page 1334.

Malaysia Passport Number wide breadth


The wide breadth detects a nine-character alphanumeric pattern that matches the Malaysia
Passport Number format. It checks for common test patterns.

Table 45-690 Malaysia Passport Number wide-breadth patterns

Pattern

[AaHhKk]\d\d\d\d\d\d\d\d

Table 45-691 Malaysia Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

Malaysia Passport Number narrow breadth


The narrow breadth detects a nine-character alphanumeric pattern that matches the Malaysia
Passport Number format. It checks for common test patterns, and also requires the presence
of related keywords.

Table 45-692 Malaysia Passport Number narrow-breadth patterns

Pattern

[AaHhKk]\d\d\d\d\d\d\d\d

Table 45-693 Malaysia Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

passport, passport number, passport no, passportno,
passport no., passport#, passportno#

pasport, nombor pasport, pasport#

Malaysian MyKad Number (MyKad)


The Malaysian National Registration Identity Card Number (NRIC No.) is a unique 12-digit
number issued to Malaysian citizens and permanent residents for identification, indexing, and
tracking purposes.
The Malaysian MyKad Number (MyKad) data identifier detects a 12-digit number that matches
the MyKad format.
The Malaysian MyKad Number (MyKad) system data identifier provides three breadths of
detection:
■ The wide breadth detects a 12-digit number without checksum validation.
See “Malaysian MyKad Number (MyKad) wide breadth” on page 1335.
■ The medium breadth detects a 12-digit number with checksum validation.
See “Malaysian MyKad Number (MyKad) medium breadth” on page 1336.
■ The narrow breadth detects a 12-digit number that passes checksum validation. It also
requires the presence of MyKad-related keywords.
See “Malaysian MyKad Number (MyKad) narrow breadth” on page 1336.

Malaysian MyKad Number (MyKad) wide breadth


The wide breadth detects a 12-digit number without checksum validation.

Table 45-694 Malaysian MyKad Number (MyKad) wide-breadth patterns

Patterns

\d{12}

\d{6}-\d{2}-\d{4}

Table 45-695 Malaysian MyKad Number (MyKad) wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.
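
The wide breadth combines two layouts of the same 12 digits with the Duplicate digits validator. The Python sketch below is one possible way to express that combination: it accepts either layout, strips the hyphens, and rejects values whose digits are all identical. The function name and the normalization step are assumptions for illustration, not documented product behavior.

```python
import re
from typing import Optional

# Both wide-breadth layouts from the table: plain and hyphenated.
MYKAD_LAYOUTS = re.compile(r"\d{12}|\d{6}-\d{2}-\d{4}")


def normalize_mykad(candidate: str) -> Optional[str]:
    """Return the 12 digits of a wide-breadth candidate, or None if rejected."""
    if MYKAD_LAYOUTS.fullmatch(candidate) is None:
        return None
    digits = candidate.replace("-", "")
    if len(set(digits)) == 1:  # duplicate-digits check: all digits the same
        return None
    return digits
```

Under this sketch, 900101-14-5678 and 900101145678 both normalize to the same 12 digits, and 111111-11-1111 is rejected.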

Malaysian MyKad Number (MyKad) medium breadth


The medium breadth detects a 12-digit number with checksum validation.

Table 45-696 Malaysian MyKad Number (MyKad) medium-breadth patterns

Patterns

\d{12}

\d{6}-\d{2}-\d{4}

Table 45-697 Malaysian MyKad Number (MyKad) medium-breadth validators

Mandatory validators Description

Malaysian MyKad Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.

Malaysian MyKad Number (MyKad) narrow breadth


The narrow breadth detects a 12-digit number that passes checksum validation. It also requires
the presence of MyKad-related keywords.

Table 45-698 Malaysian MyKad Number (MyKad) narrow-breadth patterns

Patterns

\d{12}

\d{6}-\d{2}-\d{4}

Table 45-699 Malaysian MyKad Number (MyKad) narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Malaysian MyKad Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

NRIC No, nricno#, MyKad Number, mykad no,
mykadnumber#, identity card no, MyKadno#, mykad,
mykad#, identity card number, nric no

nombor kad pengenalan, kad pengenalan no, kad
pengenalan Malaysia, bilangan identiti unik, nombor
peribadi, nomborperibadi#, kadpengenalanno#

Malta National Identification Number


Every resident of Malta is assigned a national number. National numbers for foreigners who are
authorized to reside in Malta end with the letter A. National numbers for Maltese citizens end
with M, G, L, H, or P.
The Malta National Identification Number data identifier detects an eight-character alphanumeric
pattern that matches the Malta National Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern that matches the Malta
National Identification Number format. It checks for common test patterns.
See “Malta National Identification Number wide breadth” on page 1338.
■ The narrow breadth detects an eight-character alphanumeric pattern that matches the
Malta National Identification Number format. It checks for common test patterns, and also
requires the presence of related keywords.
See “Malta National Identification Number narrow breadth” on page 1338.

Malta National Identification Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern that matches the Malta
National Identification Number format. It checks for common test patterns.

Table 45-700 Malta National Identification Number wide-breadth patterns

Pattern

\d{6}[1-9][APap]

[012]\d{6}[MGLHBZmglhbz]

[3][01]\d{5}[MGLHBZmglhbz]

32000\d{2}[MGLHBZmglhbz]

Table 45-701 Malta National Identification Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999
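
The four wide-breadth patterns and the Exclude beginning characters validator can be exercised together. The Python sketch below joins the patterns with alternation and then rejects candidates that begin with one of the listed values. The function name is hypothetical, and the sketch omits the Number delimiter check.

```python
import re

# The four patterns from the table, joined with alternation.
MALTA_ID = re.compile(
    r"\d{6}[1-9][APap]"
    r"|[012]\d{6}[MGLHBZmglhbz]"
    r"|[3][01]\d{5}[MGLHBZmglhbz]"
    r"|32000\d{2}[MGLHBZmglhbz]"
)

# Seven repeated digits, 0000000 through 9999999, as listed in the table.
EXCLUDED_BEGINNINGS = tuple(str(digit) * 7 for digit in range(10))


def is_candidate(value: str) -> bool:
    """Return True when the value matches a pattern and is not excluded."""
    if MALTA_ID.fullmatch(value) is None:
        return False
    return not value.startswith(EXCLUDED_BEGINNINGS)
```

Under this sketch, 1234567A and 3200012M are candidates, while 1111111A is excluded and 3299999M matches none of the patterns.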

Malta National Identification Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern that matches the Malta
National Identification Number format. It checks for common test patterns, and also requires
the presence of related keywords.

Table 45-702 Malta National Identification Number narrow-breadth patterns

Pattern

\d{6}[1-9][APap]

[012]\d{6}[MGLHBZmglhbz]

[3][01]\d{5}[MGLHBZmglhbz]

32000\d{2}[MGLHBZmglhbz]

Table 45-703 Malta National Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

national ID, national identification number, personal
ID, personal identification number, nationalid#,
personalid#

numru identifikazzjoni nazzjonali, ID nazzjonali, numru
identifikazzjoni personali, ID personali, IDnazzjonali#,
IDpersonali#

Malta Tax Identification Number


The Malta Tax Identification Number is assigned by the Inland Revenue Department as a
means of identification for income tax purposes.
The Malta Tax Identification Number data identifier detects an eight- or nine-character
alphanumeric pattern that matches the Malta Tax Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight- or nine-character alphanumeric pattern that matches
the Malta Tax Identification Number format. It checks for common test patterns.
See “Malta Tax Identification Number wide breadth” on page 1339.
■ The narrow breadth detects an eight- or nine-character alphanumeric pattern that matches
the Malta Tax Identification Number format. It checks for common test patterns, and also
requires the presence of related keywords.
See “Malta Tax Identification Number narrow breadth” on page 1340.

Malta Tax Identification Number wide breadth


The wide breadth detects an eight- or nine-character alphanumeric pattern that matches the
Malta Tax Identification Number format. It checks for common test patterns.

Table 45-704 Malta Tax Identification Number wide-breadth patterns

Pattern

\d{6}[1-9][APap]

[012]\d{6}[MGLHBZmglhbz]

[3][01]\d{5}[MGLHBZmglhbz]

32000\d{2}[MGLHBZmglhbz]

[1]{2}\d{7}

[2]{2}\d{7}

[3]{2}\d{7}

[4]{2}\d{7}

[5]{2}\d{7}

[6]{2}\d{7}

[7]{2}\d{7}

[8]{2}\d{7}
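
The last eight patterns in this table each require a doubled leading digit from 1 to 8 followed by seven more digits. For readers who want to test sample values, the Python sketch below builds the eight printed patterns and compares them with a single backreference expression that accepts the same strings. The compact expression and the function names are illustrative only and are not part of the product.

```python
import re

# The eight doubled-digit patterns as printed: [1]{2}\d{7} through [8]{2}\d{7}.
PRINTED = [re.compile(rf"[{n}]{{2}}\d{{7}}") for n in range(1, 9)]

# Illustrative single-expression equivalent using a backreference.
COMPACT = re.compile(r"([1-8])\1\d{7}")


def matches_printed(candidate: str) -> bool:
    """Return True when any of the eight printed patterns matches."""
    return any(p.fullmatch(candidate) for p in PRINTED)


def matches_compact(candidate: str) -> bool:
    """Return True when the backreference form matches."""
    return COMPACT.fullmatch(candidate) is not None
```

Both forms accept values such as 221234567 and reject values such as 121234567 or 991234567.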

Table 45-705 Malta Tax Identification Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999

Malta Tax Identification Number narrow breadth


The narrow breadth detects an eight- or nine-character alphanumeric pattern that matches
the Malta Tax Identification Number format. It checks for common test patterns, and also
requires the presence of related keywords.

Table 45-706 Malta Tax Identification Number narrow-breadth patterns

Pattern

\d{6}[1-9][APap]

[012]\d{6}[MGLHBZmglhbz]

[3][01]\d{5}[MGLHBZmglhbz]

32000\d{2}[MGLHBZmglhbz]

[1]{2}\d{7}

[2]{2}\d{7}

[3]{2}\d{7}

[4]{2}\d{7}

[5]{2}\d{7}

[6]{2}\d{7}

[7]{2}\d{7}

[8]{2}\d{7}

Table 45-707 Malta Tax Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

tax code, tax number, tax identification number, kodiċi
tat-taxxa, numru tat-taxxa, numru identifikazzjoni
tat-taxxa, tax id, taxxaid#, taxid#, numru identifikazzjoni
kontribwent, taxpayer identification number, kodiċi
kontribwent, taxpayer code, tin, TIN, tin#, TIN#, tin no,
landa, landa nru

Malta Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. In Malta, VAT is administered
by the tax office for the region in which the business is established.
The Malta Value Added Tax (VAT) Number data identifier detects an 8- or 10-character
alphanumeric pattern that matches the Malta Value Added Tax (VAT) Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 8- or 10-character alphanumeric pattern that matches the
Malta Value Added Tax (VAT) Number format without checksum validation. It checks for
common test patterns.
See “Malta Value Added Tax (VAT) Number wide breadth” on page 1342.
■ The medium breadth detects an 8- or 10-character alphanumeric pattern that matches the
Malta Value Added Tax (VAT) Number format with checksum validation.
See “Malta Value Added Tax (VAT) Number medium breadth” on page 1343.
■ The narrow breadth detects an 8- or 10-character alphanumeric pattern that matches the
Malta Value Added Tax (VAT) Number format with checksum validation. It checks for
common test patterns, and also requires the presence of related keywords.
See “Malta Value Added Tax (VAT) Number narrow breadth” on page 1343.

Malta Value Added Tax (VAT) Number wide breadth


The wide breadth detects an 8- or 10-character alphanumeric pattern that matches the Malta
Value Added Tax (VAT) Number format without checksum validation. It checks for common
test patterns.

Table 45-708 Malta Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Mm][Tt]\d{8}

\d{4}[-]\d{4}

Table 45-709 Malta Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

Malta Value Added Tax (VAT) Number medium breadth


The medium breadth detects an 8- or 10-character alphanumeric pattern that matches the
Malta Value Added Tax (VAT) Number format with checksum validation.

Table 45-710 Malta Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[Mm][Tt]\d{8}

\d{4}[-]\d{4}

Table 45-711 Malta Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Malta Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against
it.
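
This guide does not spell out the checksum used by the Malta Value Added Tax (VAT) Number Validation Check. A commonly published rule for Maltese VAT numbers multiplies the eight digits by the weights 3, 4, 6, 7, 8, 9, 10, and 1 and requires the sum to be divisible by 37. The Python sketch below implements that rule. The function name is hypothetical, and it is an assumption that the product applies the same rule.

```python
import re

# Assumed weights for the eight digits; not taken from this guide.
WEIGHTS = (3, 4, 6, 7, 8, 9, 10, 1)


def malta_vat_checksum_ok(candidate: str) -> bool:
    """Check the weighted mod-37 rule on the eight digits of a candidate."""
    digits = re.sub(r"\D", "", candidate)  # drop the MT prefix and hyphen
    if len(digits) != 8:
        return False
    total = sum(weight * int(digit) for weight, digit in zip(WEIGHTS, digits))
    return total % 37 == 0
```

For the sample value MT11679112 the weighted sum is 185, which is 5 × 37, so the check passes under this rule.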

Malta Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects an 8- or 10-character alphanumeric pattern that matches the Malta
Value Added Tax (VAT) Number format with checksum validation. It checks for common test
patterns, and also requires the presence of related keywords.

Table 45-712 Malta Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Mm][Tt]\d{8}

\d{4}[-]\d{4}

Table 45-713 Malta Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

Malta Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat number, VAT number, vat, vat#, malta vat number,
vatno#, value added tax number, malta vat, vat
identification number

Numru tal-VAT, numru tal-VAT, bettija, valur miżjud
taxxa in-numru, bettija identifikazzjoni in-numru

Medicare Beneficiary Identifier


The Medicare Beneficiary Identifier (MBI) is assigned to an individual for the purpose of
identifying them as a Medicare beneficiary. The MBI will replace the Health Insurance
Claim Number (HICN) on all Medicare cards by April 2019.
The Medicare Beneficiary Identifier data identifier detects an 11-character alphanumeric pattern
that matches the Medicare Beneficiary Identifier format.
The Medicare Beneficiary Identifier data identifier provides three breadths of detection:
■ The wide breadth detects an 11-character alphanumeric pattern without checksum validation.
See “Medicare Beneficiary Identifier wide breadth” on page 1345.
■ The medium breadth detects an 11-character alphanumeric pattern with checksum
validation.
See “Medicare Beneficiary Identifier medium breadth” on page 1345.
■ The narrow breadth detects an 11-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.
See “Medicare Beneficiary Identifier narrow breadth” on page 1345.

Medicare Beneficiary Identifier wide breadth


The wide breadth detects an 11-character alphanumeric pattern without checksum validation.

Table 45-714 Medicare Beneficiary Identifier wide-breadth pattern

Pattern

[1-9][A-Za-z][0-9A-Za-z][0-9][A-Za-z][0-9A-Za-z][0-9][A-Za-z]{2}[0-9]{2}

Table 45-715 Medicare Beneficiary Identifier wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.
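
To see which strings the wide-breadth pattern accepts, the Python sketch below compiles the pattern from the table and applies it to whole candidate strings. The function name and sample values are illustrative only. Note that the pattern as printed has no separators, so a value displayed with hyphens would not match in this sketch.

```python
import re

# The wide-breadth pattern exactly as printed in the table.
MBI_PATTERN = re.compile(
    r"[1-9][A-Za-z][0-9A-Za-z][0-9][A-Za-z][0-9A-Za-z][0-9][A-Za-z]{2}[0-9]{2}"
)


def looks_like_mbi(candidate: str) -> bool:
    """Return True when the whole candidate matches the printed pattern."""
    return MBI_PATTERN.fullmatch(candidate) is not None
```

For example, 1EG4TE5MK73 matches, while 1EG4-TE5-MK73 and 0EG4TE5MK73 do not.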

Medicare Beneficiary Identifier medium breadth


The medium breadth detects an 11-character alphanumeric pattern with checksum validation.

Table 45-716 Medicare Beneficiary Identifier medium-breadth pattern

Pattern

[1-9][A-Za-z][0-9A-Za-z][0-9][A-Za-z][0-9A-Za-z][0-9][A-Za-z]{2}[0-9]{2}

Table 45-717 Medicare Beneficiary Identifier medium-breadth validator

Mandatory validator Description

Medicare Beneficiary Identifier Number Validation Check Computes the checksum and validates the
pattern against it.

Medicare Beneficiary Identifier narrow breadth


The narrow breadth detects an 11-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.

Table 45-718 Medicare Beneficiary Identifier narrow-breadth pattern

Pattern

[1-9][A-Za-z][0-9A-Za-z][0-9][A-Za-z][0-9A-Za-z][0-9][A-Za-z]{2}[0-9]{2}

Table 45-719 Medicare Beneficiary Identifier narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Medicare Beneficiary Identifier Number Validation Check Computes the checksum and validates the
pattern against it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Medicare Beneficiary Identifier, medicare beneficiary
identifier, mbi number, mbi no, mbi number#, mbi
number#, mbi no#, medicare beneficiary number,
medicare beneficiary no, medicare beneficiary#

Mexican Personal Registration and Identification Number

The Mexican Personal Registration and Identification Number is a number used in Mexican
states (with the exception of Mexico City) as a personal identification code.
The Mexican Personal Registration and Identification Number data identifier detects a 15-character
alphanumeric pattern that matches the Mexican Personal Registration and Identification Number
format.
The Mexican Personal Registration and Identification Number data identifier provides three
breadths of detection:
■ The wide breadth detects a 15-character alphanumeric pattern without checksum validation.
See “Mexican Personal Registration and Identification Number wide breadth” on page 1347.
■ The medium breadth detects a 15-character alphanumeric pattern with checksum validation.
See “Mexican Personal Registration and Identification Number medium breadth” on page 1347.
■ The narrow breadth detects a 15-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.
See “Mexican Personal Registration and Identification Number narrow breadth” on page 1348.

Mexican Personal Registration and Identification Number wide breadth

The wide breadth detects a 15-character alphanumeric pattern without checksum validation.

Table 45-720 Mexican Personal Registration and Identification Number wide-breadth pattern

Pattern

\d{2}-\d{3}-\d{2}-\d{7}-\w

Table 45-721 Mexican Personal Registration and Identification Number wide-breadth validator

Mandatory validator Description

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000000000, 11111111111111, 22222222222222,
33333333333333, 44444444444444, 55555555555555,
66666666666666, 77777777777777, 88888888888888,
99999999999999
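
The pattern in Table 45-720 includes four hyphens, so a matching string is 19 characters long as written and 15 characters once the hyphens are removed. The Python sketch below makes that relationship explicit. The function name and the hyphen-stripping step are assumptions for illustration. The guide does not state whether the product compares the excluded endings against the hyphenated or the stripped form.

```python
import re
from typing import Optional

# The wide-breadth pattern exactly as printed in the table.
CRIP_PATTERN = re.compile(r"\d{2}-\d{3}-\d{2}-\d{7}-\w")


def crip_characters(candidate: str) -> Optional[str]:
    """Return the 15 characters of a match without hyphens, or None."""
    if CRIP_PATTERN.fullmatch(candidate) is None:
        return None
    return candidate.replace("-", "")
```

For example, 12-345-67-8901234-A yields the 15 characters 12345678901234A, while an unhyphenated string does not match the printed pattern.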

Mexican Personal Registration and Identification Number medium breadth

The medium breadth detects a 15-character alphanumeric pattern with checksum validation.

Table 45-722 Mexican Personal Registration and Identification Number medium-breadth pattern

Pattern

\d{2}-\d{3}-\d{2}-\d{7}-\w

Table 45-723 Mexican Personal Registration and Identification Number medium-breadth validator

Mandatory validator Description

Exclude ending characters Data ending with any of the following list of values is not
matched:

00000000000000, 11111111111111, 22222222222222,
33333333333333, 44444444444444, 55555555555555,
66666666666666, 77777777777777, 88888888888888,
99999999999999

Mexican CRIP Validation Check Computes the checksum for the match and validates the
pattern against it.

Mexican Personal Registration and Identification Number narrow breadth

The narrow breadth detects a 15-character alphanumeric pattern with checksum validation. It
also requires the presence of related keywords.

Table 45-724 Mexican Personal Registration and Identification Number narrow-breadth pattern

Pattern

\d{2}-\d{3}-\d{2}-\d{7}-\w

Table 45-725 Mexican Personal Registration and Identification Number narrow-breadth validator

Mandatory validator Description

Exclude ending characters Data ending with any of the following list of values is not
matched:
00000000000000, 11111111111111, 22222222222222,
33333333333333, 44444444444444, 55555555555555,
66666666666666, 77777777777777, 88888888888888,
99999999999999

Mexican CRIP Validation Check Computes the checksum for every number matched and
validates the pattern against it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Personal Registration and Identification Code, CRIP,
crip, CRIP#, crip#, Mexican Personal ID Code, Mexican
personal identification number

Clave de Registro de Identidad Personal, Código de
Identificación Personal mexicana, número de
identificación personal mexicana

Mexican Tax Identification Number


In Mexico, a legal entity, such as a company or a person, is assigned a tax identification
number. A tax identification number for a company is 12 characters, while a tax identification
number for a person is 13 characters.
The Mexican Tax Identification Number data identifier detects a 12- or 13-character
alphanumeric pattern that matches the Mexican Tax Identification Number format.
The Mexican Tax Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects a 12- or 13-character alphanumeric pattern without validation.
See “Mexican Tax Identification Number wide breadth” on page 1349.
■ The medium breadth detects a 12- or 13-character alphanumeric pattern with checksum
validation.
See “Mexican Tax Identification Number medium breadth” on page 1350.
■ The narrow breadth detects a 12- or 13-character alphanumeric pattern with checksum
validation. It also requires the presence of related keywords.
See “Mexican Tax Identification Number narrow breadth” on page 1350.

Mexican Tax Identification Number wide breadth


The wide breadth detects a 12- or 13-character alphanumeric pattern without validation.

Table 45-726 Mexican Tax Identification Number wide-breadth patterns

Patterns

[a-zA-Z][a-zA-Z][a-zA-Z][a-zA-Z]\d\d[01]\d[0-3]\d\w\w\w

[a-zA-Z][a-zA-Z][a-zA-Z][a-zA-Z][- ]\d\d[01]\d[0-3]\d\w\w\w

[a-zA-Z][a-zA-Z][a-zA-Z]\d\d[01]\d[0-3]\d\w\w\w

[a-zA-Z][a-zA-Z][a-zA-Z][- ]\d\d[01]\d[0-3]\d\w\w\w
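As an illustration only, the four wide-breadth patterns above can be exercised with Python's re module. The helper name matches_wide_rfc and the sample values are hypothetical, and Python's handling of \w may differ from the product's own pattern engine, so treat this as a sketch rather than a reproduction of the detection logic.

```python
import re

# The four wide-breadth patterns from Table 45-726, copied verbatim.
WIDE_RFC_PATTERNS = [
    r"[a-zA-Z][a-zA-Z][a-zA-Z][a-zA-Z]\d\d[01]\d[0-3]\d\w\w\w",
    r"[a-zA-Z][a-zA-Z][a-zA-Z][a-zA-Z][- ]\d\d[01]\d[0-3]\d\w\w\w",
    r"[a-zA-Z][a-zA-Z][a-zA-Z]\d\d[01]\d[0-3]\d\w\w\w",
    r"[a-zA-Z][a-zA-Z][a-zA-Z][- ]\d\d[01]\d[0-3]\d\w\w\w",
]
_COMPILED = [re.compile(p) for p in WIDE_RFC_PATTERNS]


def matches_wide_rfc(candidate: str) -> bool:
    """Return True if the whole string matches any wide-breadth pattern.

    Hypothetical helper: the wide breadth applies no checksum, so this is
    a pure format test.
    """
    return any(p.fullmatch(candidate) for p in _COMPILED)


if __name__ == "__main__":
    # Made-up sample values, not real tax identification numbers.
    for sample in ("ABCD800101XY1", "ABC-800101XY1", "ABCD809901XY1"):
        print(sample, matches_wide_rfc(sample))
```

The third sample fails because the [01]\d segment rejects "99". The pattern itself still accepts values such as "19" in that position.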

Mexican Tax Identification Number medium breadth


The medium breadth detects a 12- or 13-character alphanumeric pattern with checksum
validation.

Table 45-727 Mexican Tax Identification Number medium-breadth patterns

Patterns

[a-zA-Z][a-zA-Z][a-zA-Z][a-zA-Z]\d\d[01]\d[0-3]\d\w\w\w

[a-zA-Z][a-zA-Z][a-zA-Z][a-zA-Z][- ]\d\d[01]\d[0-3]\d\w\w\w

[a-zA-Z][a-zA-Z][a-zA-Z]\d\d[01]\d[0-3]\d\w\w\w

[a-zA-Z][a-zA-Z][a-zA-Z][- ]\d\d[01]\d[0-3]\d\w\w\w

Table 45-728 Mexican Tax Identification Number medium-breadth validator

Mandatory validator Description

Mexican TAX ID Validation Check Computes the checksum and validates the pattern against
it.

Mexican Tax Identification Number narrow breadth


The narrow breadth detects a 12- or 13-character alphanumeric pattern with checksum
validation. It also requires the presence of related keywords.

Table 45-729 Mexican Tax Identification Number narrow-breadth patterns

Patterns

[a-zA-Z][a-zA-Z][a-zA-Z][a-zA-Z]\d\d[01]\d[0-3]\d\w\w\w

[a-zA-Z][a-zA-Z][a-zA-Z][a-zA-Z][- ]\d\d[01]\d[0-3]\d\w\w\w

[a-zA-Z][a-zA-Z][a-zA-Z]\d\d[01]\d[0-3]\d\w\w\w

[a-zA-Z][a-zA-Z][a-zA-Z][- ]\d\d[01]\d[0-3]\d\w\w\w

Table 45-730 Mexican Tax Identification Number narrow-breadth validators

Mandatory validator Description

Mexican TAX ID Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Tax Identification Number, Tax ID, Tax ID No., RFC
Number, TIN, TIN#, Federal Taxpayer Registry Code

Registro Federal de Contribuyentes, número de
identificación de impuestos, Código del Registro
Federal de Contribuyentes, Número RFC, Clave del
RFC

Mexican Unique Population Registry Code


The Mexican Unique Population Registry Code (Clave Única de Registro de Población, or
CURP) is the unique alphanumeric identifier assigned to each person living in Mexico, either
nationals or foreigners, as well as Mexican nationals who live in other countries.
The Mexican Unique Population Registry Code data identifier detects an 18-character
alphanumeric pattern that matches the CURP format.
The Mexican Unique Population Registry Code system data identifier provides three breadths
of detection:
■ The wide breadth detects an 18-character alphanumeric pattern without validation.
See “Mexican Unique Population Registry Code wide breadth” on page 1352.
■ The medium breadth detects an 18-character alphanumeric pattern with checksum
validation.
See “Mexican Unique Population Registry Code medium breadth” on page 1352.
■ The narrow breadth detects an 18-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.
See “Mexican Unique Population Registry Code narrow breadth” on page 1352.

Mexican Unique Population Registry Code wide breadth


The wide breadth detects an 18-character alphanumeric pattern without validation.

Table 45-731 Mexican Unique Population Registry Code wide-breadth pattern

Pattern

\w[AEIOUaeiou]\w{2}\d{2}[0-1]\d[0-3]\d[HMhm]\w{7}
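The sketch below splits the wide-breadth pattern into named groups to show which part of an 18-character candidate each segment consumes. The group names reflect the commonly documented CURP layout (name-derived letters, date of birth as YYMMDD, sex marker) and are not taken from this guide. The function name split_curp and the sample value are hypothetical.

```python
import re
from typing import Optional

# Same pattern as Table 45-731, broken into named groups for readability.
# Group names are an interpretation, not product terminology.
CURP_WIDE = re.compile(
    r"(?P<name_letters>\w[AEIOUaeiou]\w{2})"          # 4 characters, 2nd is a vowel
    r"(?P<yy>\d{2})(?P<mm>[0-1]\d)(?P<dd>[0-3]\d)"    # date-like segment
    r"(?P<sex>[HMhm])"                                # H or M
    r"(?P<rest>\w{7})"                                # remaining 7 characters
)


def split_curp(candidate: str) -> Optional[dict]:
    """Return the matched segments, or None if the format does not match."""
    match = CURP_WIDE.fullmatch(candidate)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Illustrative sample value only.
    print(split_curp("HEGG560427MVZRRL04"))
    print(split_curp("HXGG560427MVZRRL04"))  # second character is not a vowel
```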

Mexican Unique Population Registry Code medium breadth


The medium breadth detects an 18-character alphanumeric pattern with checksum validation.

Table 45-732 Mexican Unique Population Registry Code medium-breadth pattern

Pattern

\w[AEIOUaeiou]\w{2}\d{2}[0-1]\d[0-3]\d[HMhm]\w{7}

Table 45-733 Mexican Unique Population Registry Code medium-breadth validator

Mandatory validator Description

Mexican Personal ID Code Number Validation Check Computes the checksum and validates the pattern against
it.

Mexican Unique Population Registry Code narrow breadth


The narrow breadth detects an 18-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.

Table 45-734 Mexican Unique Population Registry Code narrow-breadth pattern

Pattern

\w[AEIOUaeiou]\w{2}\d{2}[0-1]\d[0-3]\d[HMhm]\w{7}

Table 45-735 Mexican Unique Population Registry Code narrow-breadth validators

Mandatory validator Description

Mexican Personal ID Code Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Personal ID, personal ID number, personal ID, unique
ID number, unique ID key, personal ID code, unique
population registry code, unique population code,
personalid#, personalidnumber#, uniqueidkey#

CURP, curp#, clave Única de registro de Población,
clave única, clave única de identidad, clave personal
Identidad, personal Identidad Clave, ClaveÚnica#,
clavepersonalIdentidad#

Mexico CLABE Number


The Mexico CLABE (Clave Bancaria Estandarizada) Number is an 18-digit number used as
a banking standard for the numbering of bank accounts in Mexico.
The Mexico CLABE Number data identifier detects an 18-digit number that matches the CLABE
Number format.
The Mexico CLABE Number data identifier provides three breadths of detection:
■ The wide breadth detects an 18-digit number without checksum validation.
See “Mexico CLABE Number wide breadth” on page 1353.
■ The medium breadth detects an 18-digit number with checksum validation.
See “Mexico CLABE Number medium breadth” on page 1354.
■ The narrow breadth detects an 18-digit number with checksum validation. It also requires
the presence of related keywords.
See “Mexico CLABE Number narrow breadth” on page 1354.

Mexico CLABE Number wide breadth


The wide breadth detects an 18-digit number without checksum validation.

Table 45-736 Mexico CLABE Number wide-breadth patterns

Pattern

\d{18}

Table 45-737 Mexico CLABE Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.
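A minimal sketch of what a duplicate-digits check could look like is given here. The guide states only that the validator ensures the digits are not all the same, so the function name passes_duplicate_digits and the treatment of non-digit characters are assumptions.

```python
def passes_duplicate_digits(candidate: str) -> bool:
    """Return True when the digits in the candidate are not all identical.

    Assumption: non-digit characters (such as separators) are ignored, and a
    candidate without any digits is rejected.
    """
    digits = [ch for ch in candidate if ch.isdigit()]
    return len(set(digits)) > 1


if __name__ == "__main__":
    print(passes_duplicate_digits("111111111111111111"))
    print(passes_duplicate_digits("032180000118359719"))
```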

Mexico CLABE Number medium breadth


The medium breadth detects an 18-digit number with checksum validation.

Table 45-738 Mexico CLABE Number medium-breadth patterns

Pattern

\d{18}

Table 45-739 Mexico CLABE Number medium-breadth validators

Mandatory validator Description

Mexico CLABE Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding numbers.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

555555555555555555
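The guide does not describe how the CLABE checksum is computed. The sketch below uses the commonly published CLABE control-digit scheme (weights 3, 7, 1 repeated over the first 17 digits); whether the product's validator follows the same steps is an assumption. The function names are hypothetical, and the Number delimiter and Exclude beginning characters validators are not modeled.

```python
import re

CLABE_PATTERN = re.compile(r"\d{18}")   # pattern from Table 45-738
WEIGHTS = (3, 7, 1)                     # repeated across the first 17 digits


def clabe_check_digit(first17: str) -> int:
    """Compute the control digit for the first 17 digits of a CLABE."""
    total = sum((int(d) * WEIGHTS[i % 3]) % 10 for i, d in enumerate(first17))
    return (10 - total % 10) % 10


def is_valid_clabe(candidate: str) -> bool:
    """Format test followed by the control-digit comparison."""
    if not CLABE_PATTERN.fullmatch(candidate):
        return False
    return clabe_check_digit(candidate[:17]) == int(candidate[17])


if __name__ == "__main__":
    print(is_valid_clabe("032180000118359719"))  # widely cited sample value
    print(is_valid_clabe("032180000118359710"))  # last digit altered
```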

Mexico CLABE Number narrow breadth


The narrow breadth detects an 18-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-740 Mexico CLABE Number narrow-breadth patterns

Pattern

\d{18}

Table 45-741 Mexico CLABE Number narrow-breadth validators

Mandatory validator Description

Mexico CLABE Number Validation Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Mexico CLABE Number, mexico clabe number, clabe
number, clabe no., Mexico CLABE No., mexico clabe
no., CLABE No#, clabe no#

Clave Bancaria Estandarizada, Estandarizado Banco
número de clave, número de clave, clave número,
clave#

National Drug Code (NDC)


The National Drug Code (NDC) is an identifier issued by the Food and Drug Administration
(FDA) for an individual drug in the United States. An alternate format is defined by HIPAA
regulations.
The National Drug Code data identifier detects the existence of an NDC as well as the HIPAA
version.
This data identifier provides three breadths of detection:
■ The wide breadth checks for the existence of an NDC number or its HIPAA version.
See “National Drug Code (NDC) wide breadth” on page 1355.
■ The medium breadth restricts the patterns for detecting the numbers.
See “National Drug Code (NDC) medium breadth” on page 1356.
■ The narrow breadth requires a keyword match.
See “National Drug Code (NDC) narrow breadth” on page 1356.

National Drug Code (NDC) wide breadth


The wide breadth detects the standard FDA format, which is a 10-digit number in the format
4-4-2, 5-4-1 or 5-3-2, with the numbers separated by dashes or spaces.
This data identifier also detects the HIPAA format, an 11-digit number in the format 5-4-2. The
HIPAA format may include a single asterisk to represent a missing digit.

Table 45-742 National Drug Code (NDC) wide breadth patterns

Patterns

*?\d{4} \d{4} \d{2}

*?\d{4}-\d{4}-\d{2}

\d{5} *?\d{3} \d{2}

\d{5}-*?\d{3}-\d{2}

\d{5} \d{4} *?\d

\d{5}-\d{4}-*?\d

\d{5} \d{4} \d{2}

\d{5}-\d{4}-\d{2}
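To try these patterns outside the product, the sketch below translates them to Python regular expressions. It assumes that the leading *? in the guide's notation stands for an optional literal asterisk (the HIPAA placeholder for a missing digit), which Python requires to be written as \*?. The helper names and sample values are hypothetical.

```python
import re

# Wide-breadth patterns from Table 45-742, in the guide's own notation.
DOC_PATTERNS = [
    r"*?\d{4} \d{4} \d{2}",
    r"*?\d{4}-\d{4}-\d{2}",
    r"\d{5} *?\d{3} \d{2}",
    r"\d{5}-*?\d{3}-\d{2}",
    r"\d{5} \d{4} *?\d",
    r"\d{5}-\d{4}-*?\d",
    r"\d{5} \d{4} \d{2}",
    r"\d{5}-\d{4}-\d{2}",
]


def to_python_regex(doc_pattern: str) -> str:
    """Assumption: '*?' in the guide means an optional literal asterisk."""
    return doc_pattern.replace("*?", r"\*?")


_COMPILED = [re.compile(to_python_regex(p)) for p in DOC_PATTERNS]


def matches_wide_ndc(candidate: str) -> bool:
    """Return True if the whole string matches any translated pattern."""
    return any(p.fullmatch(candidate) for p in _COMPILED)


if __name__ == "__main__":
    # Made-up values that only illustrate the formats.
    for sample in ("1234-5678-90", "*1234-5678-90", "12345 678 90", "12345 6789-01"):
        print(sample, matches_wide_ndc(sample))
```

The last sample mixes a space and a dash, which none of the listed patterns allow.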

National Drug Code (NDC) medium breadth


The medium breadth detects the standard FDA format, which is a 10-digit number in the format
4-4-2, 5-4-1 or 5-3-2, with the numbers separated by dashes.
This data identifier also detects the HIPAA format, an 11-digit number in the format 5-4-2. The
HIPAA format may include a single asterisk to represent a missing digit.

Note: The medium breadth of this data identifier does not include any validators.

Table 45-743 National Drug Code (NDC) medium breadth patterns

Pattern

*?\d{4}-\d{4}-\d{2}

\d{5}-*?\d{3}-\d{2}

\d{5}-\d{4}-*?\d

\d{5}-\d{4}-\d{2}

National Drug Code (NDC) narrow breadth


The narrow breadth detects the standard FDA format, which is a 10-digit number in the format
4-4-2, 5-4-1 or 5-3-2, with the numbers separated by dashes.
This data identifier also detects the HIPAA format, an 11-digit number in the format 5-4-2. The
HIPAA format may include a single asterisk to represent a missing digit. This data identifier
also requires the presence of an NDC-related keyword.

Table 45-744 National Drug Code (NDC) narrow breadth patterns

Pattern

*?\d{4}-\d{4}-\d{2}

\d{5}-*?\d{3}-\d{2}

\d{5}-\d{4}-*?\d

\d{5}-\d{4}-\d{2}

Table 45-745 National Drug Code (NDC) narrow breadth validators

Mandatory validator Description

Find keywords With this option selected, at least one of the following keywords or key phrases
must be present for the data to be matched.

Find keywords input ndc, national drug code

National Provider Identifier Number


National Provider Identifier (NPI) is a unique 10-digit identification number issued to health
care providers in the United States by the Centers for Medicare and Medicaid Services.
The National Provider Identifier Number data identifier detects a 10-digit number that matches
the National Provider Identifier Number format.
The National Provider Identifier Number data identifier provides three breadths of detection:
■ The wide breadth detects a 10-digit number without checksum validation.
See “National Provider Identifier Number wide breadth” on page 1357.
■ The medium breadth detects a 10-digit number with checksum validation.
See “National Provider Identifier Number medium breadth” on page 1358.
■ The narrow breadth detects a 10-digit number with checksum validation. It also requires
the presence of related keywords.
See “National Provider Identifier Number narrow breadth” on page 1358.

National Provider Identifier Number wide breadth


The wide breadth detects a 10-digit number without checksum validation.

Table 45-746 National Provider Identifier Number wide-breadth patterns

Pattern

\d{10}

80840\d{10}

Table 45-747 National Provider Identifier Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

National Provider Identifier Number medium breadth


The medium breadth detects a 10-digit number with checksum validation.

Table 45-748 National Provider Identifier Number medium-breadth patterns

Pattern

\d{10}

80840\d{10}

Table 45-749 National Provider Identifier Number medium-breadth validators

Mandatory validator Description

National Provider Identifier Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding numbers.
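The guide does not spell out the NPI checksum. The publicly documented NPI check digit uses the Luhn algorithm computed over the number with the 80840 prefix, which is consistent with the second pattern above. The sketch below follows that scheme; the function names are hypothetical, and the Number delimiter validator is not modeled.

```python
def luhn_is_valid(digits: str) -> bool:
    """Standard Luhn test: double every second digit from the right."""
    total = 0
    for index, ch in enumerate(reversed(digits)):
        value = int(ch)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def is_valid_npi(candidate: str) -> bool:
    """Accept a 10-digit NPI or its 15-digit form with the 80840 prefix."""
    if not (candidate.isascii() and candidate.isdigit()):
        return False
    if len(candidate) == 10:
        full = "80840" + candidate      # prefix is implied for bare NPIs
    elif len(candidate) == 15 and candidate.startswith("80840"):
        full = candidate
    else:
        return False
    return luhn_is_valid(full)


if __name__ == "__main__":
    print(is_valid_npi("1234567893"))       # commonly cited sample value
    print(is_valid_npi("808401234567893"))  # same value with prefix
    print(is_valid_npi("1234567890"))       # check digit altered
```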

National Provider Identifier Number narrow breadth


The narrow breadth detects a 10-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-750 National Provider Identifier Number narrow-breadth patterns

Pattern

\d{10}

80840\d{10}

Table 45-751 National Provider Identifier Number narrow-breadth validators

Mandatory validator Description

National Provider Identifier Number Validation Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

National Provider Identifier, NPI, npi, n.p.i, hipaa,
National Provider ID, npiid, national provider ID
number, NPI ID

Netherlands Bank Account Number


The Netherlands bank account number is the standard bank account number used across the
Netherlands.
The Netherlands Bank Account Number data identifier detects an 8-, 9-, or 10-character
alphanumeric pattern that matches the Netherlands Bank Account Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 8-, 9-, or 10-character alphanumeric pattern that matches the
Netherlands Bank Account Number format without checksum validation. It checks for
common test patterns.
See “Netherlands Bank Account Number wide breadth” on page 1360.
■ The medium breadth detects an 8-, 9-, or 10-character alphanumeric pattern that matches
the Netherlands Bank Account Number format with checksum validation.
See “Netherlands Bank Account Number medium breadth” on page 1360.
■ The narrow breadth detects an 8-, 9-, or 10-character alphanumeric pattern that matches
the Netherlands Bank Account Number format with checksum validation. It checks for
common test patterns, and also requires the presence of related keywords.
See “Netherlands Bank Account Number narrow breadth” on page 1361.

Netherlands Bank Account Number wide breadth


The wide breadth detects an 8-, 9-, or 10-character alphanumeric pattern that matches the
Netherlands Bank Account Number format without checksum validation. It checks for common
test patterns.

Table 45-752 Netherlands Bank Account Number wide-breadth patterns

Pattern

[PpGg]\d\d\d\d\d\d\d

\d\d\d\d\d\d\d\d\d\d

\d\d\d\d\d\d\d\d\d

Table 45-753 Netherlands Bank Account Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999

Netherlands Bank Account Number medium breadth


The medium breadth detects an 8-, 9-, or 10-character alphanumeric pattern that matches the
Netherlands Bank Account Number format with checksum validation.

Table 45-754 Netherlands Bank Account Number medium-breadth patterns

Pattern

[PpGg]\d\d\d\d\d\d\d

\d\d\d\d\d\d\d\d\d\d

\d\d\d\d\d\d\d\d\d

Table 45-755 Netherlands Bank Account Number medium-breadth validators

Mandatory validator Description

Netherlands Bank Account Number Validation Check Computes the checksum and validates the pattern against
it.

Netherlands Bank Account Number narrow breadth


The narrow breadth detects an 8-, 9-, or 10-character alphanumeric pattern that matches the
Netherlands Bank Account Number format with checksum validation. It checks for common
test patterns, and also requires the presence of related keywords.

Table 45-756 Netherlands Bank Account Number narrow-breadth patterns

Pattern

[PpGg]\d\d\d\d\d\d\d

\d\d\d\d\d\d\d\d\d\d

\d\d\d\d\d\d\d\d\d

Table 45-757 Netherlands Bank Account Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Exclude ending characters Data ending with any of the following list of values is not
matched:

0000000, 1111111, 2222222, 3333333, 4444444,
5555555, 6666666, 7777777, 8888888, 9999999

Netherlands Bank Account Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

bank account number, account number, bancu
aklarashon number, aklarashon number

bancu aklarashon number, aklarashon number,
bankrekeningnummer, rekeningnummer

Netherlands Driver's License Number


The Netherlands driver's license number is the identification number for an individual driver's
license issued by the Netherlands' RDW agency.
The Netherlands Driver's License Number data identifier detects a 10-digit number that matches
the Netherlands Driver's License Number format.
The Netherlands Driver's License Number data identifier provides two breadths of detection:
■ The wide breadth detects a 10-digit number without checksum validation.
See “Netherlands Driver's License Number wide breadth” on page 1362.
■ The narrow breadth detects a 10-digit number without checksum validation. It also requires
the presence of related keywords.
See “Netherlands Driver's License Number narrow breadth” on page 1362.

Netherlands Driver's License Number wide breadth


The wide breadth detects a 10-digit number without checksum validation.

Table 45-758 Netherlands Driver's License Number wide-breadth pattern

Pattern

\d{10}

Table 45-759 Netherlands Driver's License Number wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Netherlands Driver's License Number narrow breadth


The narrow breadth detects a 10-digit number without checksum validation. It also requires
the presence of related keywords.

Table 45-760 Netherlands Driver's License Number narrow-breadth pattern

Pattern

\d{10}

Table 45-761 Netherlands Driver's License Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

RIJMEWIJS, Driver License, Driver License Number,
driver license number, Driver Licence, Drivers Lic.,
Drivers License, Drivers Licence, Driver's License,
Driver's License Number, driver's license number,
Driver's Licence Number, Driving License number,
driving license number, DLNo#, dlno#

permis de conduire, rijbewijs, Rijbewijsnummer, DL#,
RIJBEWIJSNUMMER

Netherlands Passport Number


Dutch passports are issued to citizens of the Netherlands for international travel.
The Netherlands Passport Number data identifier detects a nine-digit number that matches
the Netherlands Passport Number format.
The Netherlands Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects a nine-digit number without checksum validation.
See “Netherlands Passport Number wide breadth” on page 1363.
■ The narrow breadth detects a nine-digit number. It also requires the presence of related
keywords.
See “Netherlands Passport Number narrow breadth” on page 1364.

Netherlands Passport Number wide breadth


The wide breadth detects a nine-digit number without checksum validation.

Table 45-762 Netherlands Passport Number wide-breadth pattern

Pattern

\w{9}

Table 45-763 Netherlands Passport Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding numbers.

Netherlands Passport Number narrow breadth


The narrow breadth detects a nine-digit number. It also requires the presence of related
keywords.

Table 45-764 Netherlands Passport Number narrow-breadth pattern

Pattern

\w{9}

Table 45-765 Netherlands Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Dutch Passport Number, Dutch passport number,
passport number, Netherlands passport number

Nederlanden paspoort nummer, Paspoort, paspoort,
Nederlanden paspoortnummer, paspoortnummer

Netherlands Tax Identification Number


The Netherlands issues a tax identification number at birth or upon registration with the municipality.
The Netherlands Tax Identification Number data identifier detects a nine-digit number that
matches the Netherlands Tax Identification Number format.
The Netherlands Tax Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects a nine-digit number without checksum validation.
See “Netherlands Tax Identification Number wide breadth” on page 1365.
■ The medium breadth detects a nine-digit number with checksum validation.
See “Netherlands Tax Identification Number medium breadth” on page 1365.
■ The narrow breadth detects a nine-digit number with checksum validation. It also requires
the presence of related keywords.
See “Netherlands Tax Identification Number narrow breadth” on page 1366.

Netherlands Tax Identification Number wide breadth


The wide breadth detects a nine-digit number without checksum validation.

Table 45-766 Netherlands Tax Identification Number wide-breadth patterns

Pattern

\d{9}

\d{3}-\d{3}-\d{3}

\d{3}.\d{3}.\d{3}

\d{3} \d{3} \d{3}

\d{3} \d{3} \d{3}

Table 45-767 Netherlands Tax Identification Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding numbers.

Duplicate digits Ensures that a string of digits is not all the same.

Netherlands Tax Identification Number medium breadth


The medium breadth detects a nine-digit number with checksum validation.

Table 45-768 Netherlands Tax Identification Number medium-breadth patterns

Pattern

\d{9}

\d{3}-\d{3}-\d{3}

\d{3}.\d{3}.\d{3}

\d{3} \d{3} \d{3}

\d{3} \d{3} \d{3}



Table 45-769 Netherlands Tax Identification Number medium-breadth validator

Mandatory validator Description

Dutch Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.
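The guide does not state which checksum the validator applies. Dutch tax identification numbers for individuals are commonly validated with the so-called 11-test (weights 9 down to 2 for the first eight digits and -1 for the last). The sketch below implements that test as an assumption about what the checksum could be. The function name passes_eleven_test and the separator handling are hypothetical.

```python
import re

# Separators seen in the patterns above: dash, dot, and space.
_SEPARATORS = re.compile(r"[-. ]")
_WEIGHTS = (9, 8, 7, 6, 5, 4, 3, 2, -1)


def passes_eleven_test(candidate: str) -> bool:
    """Return True if the nine digits satisfy the 11-test.

    Assumption: separators are stripped before the checksum is computed.
    """
    digits = _SEPARATORS.sub("", candidate)
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        return False
    total = sum(int(d) * w for d, w in zip(digits, _WEIGHTS))
    return total % 11 == 0


if __name__ == "__main__":
    # Made-up values chosen to pass or fail the arithmetic.
    for sample in ("111222333", "111-222-333", "123456789"):
        print(sample, passes_eleven_test(sample))
```

A string of nine zeros also satisfies this arithmetic, which is one reason a separate Duplicate digits validator is useful alongside a checksum.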

Netherlands Tax Identification Number narrow breadth


The narrow breadth detects a nine-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-770 Netherlands Tax Identification Number narrow-breadth patterns

Pattern

\d{9}

\d{3}-\d{3}-\d{3}

\d{3}.\d{3}.\d{3}

\d{3} \d{3} \d{3}

\d{3} \d{3} \d{3}

Table 45-771 Netherlands Tax Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding numbers.

Duplicate digits Ensures that a string of digits is not all the same.

Dutch Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

netherlands tax identification number, netherlands tax
identification, netherland's tax identification number,
netherland's tax identification, tax identification
number, dutch tax id, dutch tax identification number,
tax id, tax id#, tax number, tax no#, tax#, TIN, TIN#,
tin#, tin, netherlands tin, netherland's tin

Nederlands belasting identificatienummer,
identificatienummer van belasting, identificatienummer
belasting, Nederlands belasting identificatie,
Nederlands belasting id nummer, Nederlands
belastingnummer, btw nummer, Nederlandse belasting
identificatie, Nederlands belastingnummer

netherlands tax identification tal, netherland's tax
identification tal, tax identification tal, tax tal,
Nederlânske tax identification tal, Hollânske tax
identification, Nederlânsk tax tal, Hollânske tax id tal

netherlands impuesto identification number,
netherland's impuesto identification number, impuesto
identification number, impuesto number, hulandes
impuesto identification number, hulandes impuesto
identification, hulandes impuesto number, hulandes
impuesto id number

Netherlands Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. In the Netherlands, the VAT
number is issued by the VAT office for the region in which the business is established.
The Netherlands Value Added Tax (VAT) Number data identifier detects a 14-character
alphanumeric pattern that matches the Netherlands VAT Number format.
The Netherlands Value Added Tax (VAT) Number data identifier provides three breadths of
detection:
■ The wide breadth detects a 14-character alphanumeric pattern beginning with NL, without
checksum validation.
See “Netherlands Value Added Tax (VAT) Number wide breadth” on page 1368.
■ The medium breadth detects a 14-character alphanumeric pattern beginning with NL, with
checksum validation.
See “Netherlands Value Added Tax (VAT) Number medium breadth” on page 1368.
■ The narrow breadth detects a 14-character alphanumeric pattern beginning with NL, with
checksum validation. It also requires the presence of related keywords.
See “Netherlands Value Added Tax (VAT) Number narrow breadth” on page 1369.

Netherlands Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 14-character alphanumeric pattern beginning with NL, without
checksum validation.

Table 45-772 Netherlands Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Nn][Ll]\d{9}[Bb]\d{2}

[Nn][Ll]-\d{9}-[Bb]\d{2}

[Nn][Ll] \d{9} [Bb]\d{2}

[Nn][Ll].\d{9}.[Bb]\d{2}

Table 45-773 Netherlands Value Added Tax (VAT) Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding numbers.

Netherlands Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 14-character alphanumeric pattern beginning with NL, with
checksum validation.

Table 45-774 Netherlands Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[Nn][Ll]\d{9}[Bb]\d{2}

[Nn][Ll]-\d{9}-[Bb]\d{2}

[Nn][Ll] \d{9} [Bb]\d{2}


[Nn][Ll].\d{9}.[Bb]\d{2}

Table 45-775 Netherlands Value Added Tax (VAT) Number medium breadth validator

Mandatory validator Description

Netherlands VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Netherlands Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 14-character alphanumeric pattern beginning with NL, with
checksum validation. It also requires the presence of related keywords.

Table 45-776 Netherlands Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Nn][Ll]\d{9}[Bb]\d{2}

[Nn][Ll]-\d{9}-[Bb]\d{2}

[Nn][Ll] \d{9} [Bb]\d{2}

[Nn][Ll].\d{9}.[Bb]\d{2}

Table 45-777 Netherlands Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding numbers.

Netherlands VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

VAT Number, vat no, vat number, VAT#, vat#

BTW, wearde tafoege tax getal, BTW nûmer, BTW-nummer

New Zealand Driver's Licence Number


The New Zealand driver's licence allows the holder to drive specified vehicles on public roads,
with or without restrictions. New Zealand driver's licences are issued by the NZ Transport
Agency.
The New Zealand Driver's Licence data identifier detects an eight-character alphanumeric
pattern that matches the New Zealand Driver's Licence format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern that matches the New
Zealand Driver's Licence format. It checks for common test patterns.
See “New Zealand Driver's Licence Number wide breadth” on page 1370.
■ The narrow breadth detects an eight-character alphanumeric pattern that matches the New
Zealand Driver's Licence format. It checks for common test patterns, and also requires the
presence of related keywords.
See “New Zealand Driver's Licence Number narrow breadth” on page 1370.

New Zealand Driver's Licence Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern that matches the New
Zealand Driver's Licence format. It checks for common test patterns.

Table 45-778 New Zealand Driver's Licence Number wide-breadth patterns

Pattern

[a-zA-Z][a-zA-Z]\d\d\d\d\d\d

Table 45-779 New Zealand Driver's Licence Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555,
666666, 777777, 888888, 999999
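The effect of the Exclude ending characters validator can be illustrated as follows. This is a minimal sketch, assuming the pattern behaves like a standard regular expression; the names are hypothetical and the Number delimiter validator is not modeled.

```python
import re

# Pattern copied from Table 45-778.
NZ_DL_PATTERN = re.compile(r"[a-zA-Z][a-zA-Z]\d\d\d\d\d\d")

# "000000" through "999999": the common test endings listed in Table 45-779.
EXCLUDED_ENDINGS = tuple(str(d) * 6 for d in range(10))


def is_candidate_nz_licence(value: str) -> bool:
    """Match the pattern, then reject values ending in an excluded run."""
    if not NZ_DL_PATTERN.fullmatch(value):
        return False
    return not value.endswith(EXCLUDED_ENDINGS)
```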

New Zealand Driver's Licence Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern that matches the New
Zealand Driver's Licence format. It checks for common test patterns, and also requires the
presence of related keywords.

Table 45-780 New Zealand Driver's Licence Number narrow-breadth patterns

Pattern

[a-zA-Z][a-zA-Z]\d\d\d\d\d\d

Table 45-781 New Zealand Driver's Licence Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555,
666666, 777777, 888888, 999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

driver license, driver license number, drivers license
number, dlno#, driver's license number, driver permit,
drivers permit, driving permit, license number, licence
number, drivers permit number, dl#

raihana taraiwa
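The way a narrow breadth combines a pattern with the Find keywords validator can be sketched as follows. The guide does not state how keywords are matched, so case-insensitive substring matching over the whole text is an assumption here, as are the function names. The Number delimiter and Exclude ending characters validators are left out to keep the example short.

```python
import re

# Pattern copied from Table 45-780.
NZ_DL_PATTERN = re.compile(r"[a-zA-Z][a-zA-Z]\d\d\d\d\d\d")

# Keyword inputs copied from Table 45-781, lower-cased for comparison.
KEYWORDS = (
    "driver license", "driver license number", "drivers license number",
    "dlno#", "driver's license number", "driver permit", "drivers permit",
    "driving permit", "license number", "licence number",
    "drivers permit number", "dl#", "raihana taraiwa",
)


def contains_keyword(text: str) -> bool:
    """Assumed behavior: any keyword appears anywhere in the text."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def narrow_breadth_match(text: str) -> bool:
    """Require both a pattern match and at least one keyword."""
    return bool(NZ_DL_PATTERN.search(text)) and contains_keyword(text)
```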

New Zealand National Health Index Number


The National Health Index number (NHI number) is a unique seven-character alphanumeric
identifier that is assigned to every person who uses health and disability support services in
New Zealand.
The New Zealand National Health Index Number data identifier detects a seven-character
alphanumeric pattern that matches the NHI number format.
The New Zealand National Health Index Number data identifier provides three breadths of
detection:
■ The wide breadth detects a seven-character alphanumeric pattern with no validation.
See “New Zealand National Health Index Number wide breadth” on page 1372.
■ The medium breadth detects a seven-character alphanumeric pattern with checksum
validation.
See “New Zealand National Health Index Number medium breadth” on page 1372.
■ The narrow breadth detects a seven-character alphanumeric pattern with checksum
validation. It also requires the presence of related keywords.
See “New Zealand National Health Index Number narrow breadth” on page 1372.

New Zealand National Health Index Number wide breadth


The wide breadth detects a seven-character alphanumeric pattern with no validation.

Table 45-782 New Zealand National Health Index Number wide-breadth pattern

Pattern

\l{3}\d{4}

The wide breadth does not include any validators.

New Zealand National Health Index Number medium breadth


The medium breadth detects a seven-character alphanumeric pattern with checksum validation.

Table 45-783 New Zealand National Health Index Number medium-breadth pattern

Pattern

\l{3}\d{4}

Table 45-784 New Zealand National Health Index Number medium-breadth validators

Mandatory validator Description

New Zealand National Health Index Number Validation Check Computes the checksum and
validates the pattern against it.

Number delimiter Validates a match by checking the surrounding numbers.
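This guide does not describe how the validation check computes the checksum. The sketch below uses the publicly documented modulus-11 scheme for seven-character NHI numbers (three letters, three digits, one check digit). Treating it as equivalent to the product's check is an assumption, and the function name is hypothetical.

```python
import re

# Letters I and O are not used in NHI numbers; the rest map to 1..24.
NHI_ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ"


def nhi_checksum_ok(nhi: str) -> bool:
    """Validate the check digit of a seven-character NHI number."""
    nhi = nhi.upper()
    if not re.fullmatch(r"[A-HJ-NP-Z]{3}[0-9]{4}", nhi):
        return False
    # Numeric values of the first six characters.
    values = [NHI_ALPHABET.index(c) + 1 for c in nhi[:3]]
    values += [int(d) for d in nhi[3:6]]
    # Weights 7, 6, 5, 4, 3, 2 applied left to right.
    total = sum(v * w for v, w in zip(values, range(7, 1, -1)))
    remainder = total % 11
    if remainder == 0:
        return False  # no valid check digit exists for this prefix
    check = (11 - remainder) % 10  # a computed 10 is written as 0
    return check == int(nhi[6])
```

Note that the \l{3}\d{4} pattern accepts any three letters, so a checksum step of this kind is what separates the medium breadth from the wide breadth.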

New Zealand National Health Index Number narrow breadth


The narrow breadth detects a seven-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.

Table 45-785 New Zealand National Health Index Number narrow-breadth patterns

Pattern

\l{3}\d{4}

Table 45-786 New Zealand National Health Index Number narrow-breadth validators

Mandatory validator Description

New Zealand National Health Index Number Validation Check Computes the checksum and
validates the pattern against it.

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

National Health Index Number, nhi number, NHI Number,
nhi no., NHI number, National Health Index No.,
National Health Index Id

New Zealand Passport Number


New Zealand passports are issued to New Zealand citizens for the purpose of international
travel by the Department of Internal Affairs.
The New Zealand Passport Number data identifier detects a seven- or eight-character
alphanumeric pattern that matches the New Zealand Passport Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a seven- or eight-character alphanumeric pattern that matches
the New Zealand Passport Number format. It checks for common test numbers.
See “New Zealand Passport Number wide breadth” on page 1373.
■ The narrow breadth detects a seven- or eight-character alphanumeric pattern that matches
the New Zealand Passport Number format. It checks for common test numbers, and also
requires the presence of related keywords.
See “New Zealand Passport Number narrow breadth” on page 1374.

New Zealand Passport Number wide breadth


The wide breadth detects a seven- or eight-character alphanumeric pattern that matches the
New Zealand Passport Number format. It checks for common test numbers.

Table 45-787 New Zealand Passport Number wide-breadth patterns

Pattern

[Ll][Aa]\d\d\d\d\d\d

[Ll][Dd]\d\d\d\d\d\d

[Ll][Ff]\d\d\d\d\d\d

[Nn]\d\d\d\d\d\d

[Ee][Aa]\d\d\d\d\d\d

[Ll][Hh]\d\d\d\d\d\d

[Ee][Pp]\d\d\d\d\d\d

Table 45-788 New Zealand Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555,
666666, 777777, 888888, 999999

New Zealand Passport Number narrow breadth


The narrow breadth detects a seven- or eight-character alphanumeric pattern that matches
the New Zealand Passport Number format. It checks for common test numbers, and also
requires the presence of related keywords.

Table 45-789 New Zealand Passport Number narrow-breadth patterns

Pattern

[Ll][Aa]\d\d\d\d\d\d

[Ll][Dd]\d\d\d\d\d\d

[Ll][Ff]\d\d\d\d\d\d

[Nn]\d\d\d\d\d\d

[Ee][Aa]\d\d\d\d\d\d

[Ll][Hh]\d\d\d\d\d\d

[Ee][Pp]\d\d\d\d\d\d

Table 45-790 New Zealand Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555,
666666, 777777, 888888, 999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

passport, passport number, passport no, passportno,
passport no., passport#, passportno#

uruwhenua, tau uruwhenua, uruwhenua no, uruwhenua
no.

Norway Driver's Licence Number


A driver's license is required in Norway before a person is permitted to drive a motor vehicle
of any description on a road in Norway.
The Norway Driver's Licence Number data identifier detects an 11-digit number that matches
the Norway Driver's Licence Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-digit number that matches the Norway Driver's Licence
Number format. It checks for common test numbers.
See “Norway Driver's Licence Number wide breadth” on page 1376.
■ The narrow breadth detects an 11-digit number that matches the Norway Driver's Licence
Number format. It checks for common test numbers, and also requires the presence of
related keywords.
See “Norway Driver's Licence Number narrow breadth” on page 1376.

Norway Driver's Licence Number wide breadth


The wide breadth detects an 11-digit number that matches the Norway Driver's Licence Number
format. It checks for common test numbers.

Table 45-791 Norway Driver's Licence Number wide-breadth patterns

Pattern

\d\d \d\d \d\d\d\d\d\d \d

Table 45-792 Norway Driver's Licence Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.
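The Duplicate digits validator can be illustrated with a short check applied after the pattern match. This is a minimal sketch, assuming the validator ignores the spaces in the pattern and looks only at the digits; the function names are hypothetical.

```python
import re

# Pattern copied from Table 45-791; note that it requires the spaces.
NO_DL_PATTERN = re.compile(r"\d\d \d\d \d\d\d\d\d\d \d")


def all_digits_identical(value: str) -> bool:
    """Return True if every digit in the value is the same digit."""
    digits = [c for c in value if c.isdigit()]
    return len(digits) > 0 and len(set(digits)) == 1


def is_candidate_no_licence(value: str) -> bool:
    """Match the pattern, then reject values made of one repeated digit."""
    if not NO_DL_PATTERN.fullmatch(value):
        return False
    return not all_digits_identical(value)
```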

Norway Driver's Licence Number narrow breadth


The narrow breadth detects an 11-digit number that matches the Norway Driver's Licence
Number format. It checks for common test numbers, and also requires the presence of related
keywords.

Table 45-793 Norway Driver's Licence Number narrow-breadth patterns

Pattern

\d\d \d\d \d\d\d\d\d\d \d

Table 45-794 Norway Driver's Licence Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

driver license, drivers license, driving license, driver
license number, drivers license number, driving license
number, dlno#, drivers lic., driver's license number,
driver licence, drivers licence, driving licence, driver
permit, drivers permit, driving permit, license number,
licence number

førerkort, førerkortnummer

Norway National Identification Number


The Norway National identification number is assigned by the Norwegian state to all citizens
of the country. It is administered by the Tax Administration.
The Norway National Identification Number data identifier detects a 9- or 11-digit number that
matches the Norway National Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 9- or 11-digit number that matches the Norway National
Identification Number format without checksum validation. It checks for common test
numbers.
See “Norway National Identification Number wide breadth” on page 1377.
■ The medium breadth detects a 9- or 11-digit number that matches the Norway National
Identification Number format with checksum validation.
See “Norway National Identification Number medium breadth” on page 1378.
■ The narrow breadth detects a 9- or 11-digit number that matches the Norway National
Identification Number format. It checks for common test numbers, and also requires the
presence of related keywords.
See “Norway National Identification Number narrow breadth” on page 1379.

Norway National Identification Number wide breadth


The wide breadth detects a 9- or 11-digit number that matches the Norway National Identification
Number format without checksum validation. It checks for common test numbers.

Table 45-795 Norway National Identification Number wide-breadth patterns

Pattern

[0123]\d[01]\d\d\d\d\d\d\d\d

[89]\d\d\d\d\d\d\d\d\d\d

[89]\d\d\d\d\d\d\d\d

[89]\d\d \d\d\d \d\d\d

Table 45-796 Norway National Identification Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Norway National Identification Number medium breadth


The medium breadth detects a 9- or 11-digit number that matches the Norway National
Identification Number format with checksum validation.

Table 45-797 Norway National Identification Number medium-breadth patterns

Pattern

[0123]\d[01]\d\d\d\d\d\d\d\d

[89]\d\d\d\d\d\d\d\d\d\d

[89]\d\d\d\d\d\d\d\d

[89]\d\d \d\d\d \d\d\d

Table 45-798 Norway National Identification Number medium-breadth validators

Mandatory validator Description

Norway National Identification Number Validation Check Computes the checksum and
validates the pattern against it.

Norway National Identification Number narrow breadth


The narrow breadth detects a 9- or 11-digit number that matches the Norway National
Identification Number format. It checks for common test numbers, and also requires the
presence of related keywords.

Table 45-799 Norway National Identification Number narrow-breadth patterns

Pattern

[0123]\d[01]\d\d\d\d\d\d\d\d

[89]\d\d\d\d\d\d\d\d\d\d

[89]\d\d\d\d\d\d\d\d

[89]\d\d \d\d\d \d\d\d

Table 45-800 Norway National Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Norway National Identification Number Validation Check Computes the checksum and
validates the pattern against it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

national ID, national identification number, personal
ID, personal identification number, nationalid#,
personalid#, Nasjonalt ID, personlig ID, Nasjonalt ID#,
personlig ID#, tax id, tax number, tax identification
number, tax code, taxpayer id, taxpayer identification
number, skatt id, skattenummer, skattekode,
skattebetalers id, skattebetalers identifikasjonsnummer

Norway Value Added Tax Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. For Norway, VAT is
administered by the VAT office for the region in which the business is established.
The Norway Value Added Tax Number data identifier detects an 11- or 14-character
alphanumeric pattern that matches the Norway Value Added Tax Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11- or 14-character alphanumeric pattern that matches the
Norway Value Added Tax Number format without checksum validation. It checks for common
test patterns.
See “Norway Value Added Tax Number wide breadth” on page 1380.
■ The medium breadth detects an 11- or 14-character alphanumeric pattern that matches
the Norway Value Added Tax Number format with checksum validation.
See “Norway Value Added Tax Number medium breadth” on page 1381.
■ The narrow breadth detects an 11- or 14-character alphanumeric pattern that matches the
Norway Value Added Tax Number format with checksum validation. It checks for common
test patterns, and also requires the presence of related keywords.
See “Norway Value Added Tax Number narrow breadth” on page 1381.

Norway Value Added Tax Number wide breadth


The wide breadth detects an 11- or 14-character alphanumeric pattern that matches the Norway
Value Added Tax Number format without checksum validation. It checks for common test
patterns.

Table 45-801 Norway Value Added Tax Number wide-breadth patterns

Pattern

[Nn][Oo]\d\d\d-\d\d\d-\d\d\d

[Nn][Oo]\d\d\d\d\d\d\d\d\d[Mm][Vv][Aa]

Table 45-802 Norway Value Added Tax Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999

Norway Value Added Tax Number medium breadth


The medium breadth detects an 11- or 14-character alphanumeric pattern that matches the
Norway Value Added Tax Number format with checksum validation.

Table 45-803 Norway Value Added Tax Number medium-breadth patterns

Pattern

[Nn][Oo]\d\d\d-\d\d\d-\d\d\d

[Nn][Oo]\d\d\d\d\d\d\d\d\d[Mm][Vv][Aa]

Table 45-804 Norway Value Added Tax Number medium-breadth validators

Mandatory validator Description

Norway Value Added Tax (VAT) Number Check Computes the checksum and validates the pattern against
it.
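This guide does not describe the checksum itself. The sketch below assumes the nine digits form a Norwegian organisation number and applies the commonly published modulus-11 check with weights 3, 2, 7, 6, 5, 4, 3, 2. Treating it as equivalent to the product's check is an assumption, and the function name is hypothetical.

```python
import re

# Weights for the first eight digits of the nine-digit number (assumed scheme).
ORG_WEIGHTS = (3, 2, 7, 6, 5, 4, 3, 2)


def norway_vat_checksum_ok(candidate: str) -> bool:
    """Check the ninth digit of the number embedded in a matched candidate."""
    # Drop the NO prefix, hyphens, and the MVA suffix; keep only the digits.
    digits = re.sub(r"[^0-9]", "", candidate)
    if len(digits) != 9:
        return False
    total = sum(int(d) * w for d, w in zip(digits[:8], ORG_WEIGHTS))
    control = (11 - total % 11) % 11  # a computed 11 becomes 0
    # A computed control value of 10 means the number cannot be valid.
    return control != 10 and control == int(digits[8])
```

The same function accepts both pattern shapes from Table 45-803, because it only looks at the digits.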

Norway Value Added Tax Number narrow breadth


The narrow breadth detects an 11- or 14-character alphanumeric pattern that matches the
Norway Value Added Tax Number format with checksum validation. It checks for common test
patterns, and also requires the presence of related keywords.

Table 45-805 Norway Value Added Tax Number narrow-breadth patterns

Pattern

[Nn][Oo]\d\d\d-\d\d\d-\d\d\d

[Nn][Oo]\d\d\d\d\d\d\d\d\d[Mm][Vv][Aa]

Table 45-806 Norway Value Added Tax Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999

Norway Value Added Tax (VAT) Number Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat, vat number, vat#, value added tax number, VAT,
VAT#, vat registration number, VAT Number

mva, MVA, momsnummer, Momsnummer,
momsregistreringsnummer

Norwegian Birth Number


The Norwegian Birth Number is assigned at birth or registration with the National Population
Register. The birth number is written on identity documents, making it possible to match a
bank account or authority document to a person.
The Norwegian Birth Number data identifier detects an 11-digit number that matches the
Norwegian Birth Number format.
The Norwegian Birth Number system data identifier provides three breadths of detection:
■ The wide breadth detects an 11-digit number without checksum validation.
See “Norwegian Birth Number wide breadth” on page 1382.
■ The medium breadth detects an 11-digit number with checksum validation.
See “Norwegian Birth Number medium breadth” on page 1383.
■ The narrow breadth detects an 11-digit number that passes checksum validation. It also
requires the presence of related keywords.
See “Norwegian Birth Number narrow breadth” on page 1383.

Norwegian Birth Number wide breadth


The wide breadth detects an 11-digit number without checksum validation.

Table 45-807 Norwegian Birth Number wide breadth patterns

Pattern

[01234567]\d[012345]\d[56789]\d[567]\d{4}

[01234567]\d[012345]\d\d\d[01234]\d{4}

[01234567]\d[012345]\d[456789]\d[9]\d{4}

[01234567]\d[012345]\d[0123]\d[56789]\d{4}

Table 45-808 Norwegian Birth Number wide breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Norwegian Birth Number medium breadth


The medium breadth detects an 11-digit number with checksum validation.

Table 45-809 Norwegian Birth Number medium breadth patterns

Pattern

[01234567]\d[012345]\d[56789]\d[567]\d{4}

[01234567]\d[012345]\d\d\d[01234]\d{4}

[01234567]\d[012345]\d[456789]\d[9]\d{4}

[01234567]\d[012345]\d[0123]\d[56789]\d{4}

Table 45-810 Norwegian Birth Number medium breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Norwegian Birth Number Validation Check Computes the checksum and validates the pattern against
it.

Norwegian Birth Number narrow breadth


The narrow breadth detects an 11-digit number that passes checksum validation. It also
requires the presence of Norwegian Birth Number-related keywords.

Table 45-811 Norwegian Birth Number narrow breadth patterns

Pattern

[01234567]\d[012345]\d[56789]\d[567]\d{4}

[01234567]\d[012345]\d\d\d[01234]\d{4}

[01234567]\d[012345]\d[456789]\d[9]\d{4}

[01234567]\d[012345]\d[0123]\d[56789]\d{4}

Table 45-812 Norwegian Birth Number narrow breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Norwegian Birth Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Norwegian birth number, birth number, birth no,
birthnumber#, birthbo#

fødselsnummer#, fødsel nummer, Fødsel nr, fødsel
nei, fødselnei#

People's Republic of China ID


The People's Republic of China ID is used for residential registration, army enrollment
registration, registration of marriage/divorce, traveling abroad, taking part in various national
exams, and other social or civil matters in China.
The People's Republic of China ID data identifier detects an 18-digit number that matches the
People's Republic of China ID format.
The People's Republic of China ID data identifier provides two breadths of detection:
■ The wide breadth detects an 18-digit number with checksum validation.
See “People's Republic of China ID wide breadth” on page 1385.
■ The narrow breadth detects an 18-digit number with checksum validation. It also requires
the presence of People's Republic of China ID-related keywords.
See “People's Republic of China ID narrow breadth” on page 1385.

People's Republic of China ID wide breadth


The wide breadth detects an 18-digit number with checksum validation.

Table 45-813 People's Republic of China ID wide-breadth pattern

Pattern

\d{18}

\d{17}[Xx]

Table 45-814 People's Republic of China ID wide-breadth validator

Mandatory validator Description

China ID checksum validator Computes the checksum and validates the pattern against
it.
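This guide does not describe the checksum computation. The sketch below uses the published check-character scheme for 18-character resident identity numbers (ISO 7064 MOD 11-2), which also explains why the second pattern allows a trailing X. Treating it as equivalent to the product's validator is an assumption, and the function name is hypothetical.

```python
import re

# Weights for the first 17 digits, and the check character for each remainder.
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CHARS = "10X98765432"


def china_id_checksum_ok(id_number: str) -> bool:
    """Validate the 18th character against the first 17 digits."""
    if not re.fullmatch(r"[0-9]{17}[0-9Xx]", id_number):
        return False
    total = sum(int(d) * w for d, w in zip(id_number[:17], WEIGHTS))
    return CHECK_CHARS[total % 11] == id_number[17].upper()
```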

People's Republic of China ID narrow breadth


The narrow breadth detects an 18-digit number with checksum validation. It also requires
the presence of People's Republic of China ID-related keywords.

Table 45-815 People's Republic of China ID narrow-breadth pattern

Pattern

\d{18}

\d{17}[Xx]

Table 45-816 People's Republic of China ID narrow-breadth validators

Mandatory validator Description

China ID checksum validator Computes the checksum and validates the pattern
against it.

Find keywords At least one of the following keywords or key
phrases must be present for the data to be matched
when you use this option.

Inputs:

身份证,居民信息,居民身份信息

Identity Card, Information of resident,
Information of resident identification

Poland Driver's Licence Number


Poland issues driving licenses confirming the rights of the holder to drive motor vehicles.
The Poland Driver's Licence Number data identifier detects an 11-digit number that matches
the Poland Driver's Licence Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-digit number that matches the Poland Driver's Licence
Number format. It checks for common test numbers.
See “Poland Driver's Licence Number wide breadth” on page 1386.
■ The narrow breadth detects an 11-digit number that matches the Poland Driver's Licence
Number format. It checks for common test numbers, and also requires the presence of
related keywords.
See “Poland Driver's Licence Number narrow breadth” on page 1387.

Poland Driver's Licence Number wide breadth


The wide breadth detects an 11-digit number that matches the Poland Driver's Licence Number
format. It checks for common test numbers.

Table 45-817 Poland Driver's Licence Number wide-breadth patterns

Pattern

\d{5}\/\d{2}\/\d{4}

Table 45-818 Poland Driver's Licence Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of numbers is not all the same.

Poland Driver's Licence Number narrow breadth


The narrow breadth detects an 11-digit number that matches the Poland Driver's Licence
Number format. It checks for common test numbers, and also requires the presence of related
keywords.

Table 45-819 Poland Driver's Licence Number narrow-breadth patterns

Pattern

\d{5}\/\d{2}\/\d{4}

Table 45-820 Poland Driver's Licence Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of numbers is not all the same.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

DLNo#, dlno#, DL#, Drivers Lic., driver licence, driver
license, drivers licence, drivers license, driver's
licence, driver's license, driving licence, driving
license, licence number, license number, driving permit

Kierowcy Lic., prawo jazdy, numer licencyjny,
zezwolenie na prowadzenie, PRAWO JAZDY

Poland European Health Insurance Number


The Polish European Health Insurance Number is a unique 20-digit identifier assigned to each
person using Polish health services.
The Poland European Health Insurance Number data identifier detects a 20-digit number that
matches the Polish European Health Insurance Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 20-digit number that matches the Polish European Health
Insurance Number format. It checks for common test numbers.
See “Poland European Health Insurance Number wide breadth” on page 1388.
■ The narrow breadth detects a 20-digit number that matches the Polish European Health
Insurance Number format. It checks for common test numbers, and also requires the
presence of related keywords.
See “Poland European Health Insurance Number narrow breadth” on page 1388.

Poland European Health Insurance Number wide breadth


The wide breadth detects a 20-digit number that matches the Polish European Health Insurance
Number format. It checks for common test numbers.

Table 45-821 Poland European Health Insurance Number wide-breadth pattern

Pattern

80616000\d{2}\d{10}

Table 45-822 Poland European Health Insurance Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

80616000000000000000, 80616000111111111111,
80616000222222222222, 80616000333333333333,
80616000444444444444, 80616000555555555555,
80616000666666666666, 80616000777777777777,
80616000888888888888, 80616000999999999999
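The Exclude beginning characters validator can be illustrated in the same way as an ending-characters check. This is a minimal sketch, assuming the pattern behaves like a standard regular expression; the names are hypothetical and the Number delimiter validator is not modeled.

```python
import re

# Pattern copied from Table 45-821: a fixed 8-digit prefix plus 12 more digits.
PL_EHIC_PATTERN = re.compile(r"80616000\d{2}\d{10}")

# The ten test values from Table 45-822: the prefix followed by one digit
# repeated twelve times.
EXCLUDED_PREFIXES = tuple("80616000" + str(d) * 12 for d in range(10))


def is_candidate_pl_ehic(value: str) -> bool:
    """Match the pattern, then reject values that begin with a test value."""
    if not PL_EHIC_PATTERN.fullmatch(value):
        return False
    return not value.startswith(EXCLUDED_PREFIXES)
```

Because each excluded value is itself 20 digits long, the check here amounts to rejecting those ten exact numbers.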

Poland European Health Insurance Number narrow breadth


The narrow breadth detects a 20-digit number that matches the Polish European Health
Insurance Number format. It checks for common test numbers, and also requires the presence
of related keywords.

Table 45-823 Poland European Health Insurance Number narrow-breadth pattern

Pattern

80616000\d{2}\d{10}

Table 45-824 Poland European Health Insurance Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

80616000000000000000, 80616000111111111111,
80616000222222222222, 80616000333333333333,
80616000444444444444, 80616000555555555555,
80616000666666666666, 80616000777777777777,
80616000888888888888, 80616000999999999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

poland medical account number, health insurance
number, health insurance card, EHIC number, Numer
EHIC, Karta Ubezpieczenia Zdrowotnego, Europejska
Karta Ubezpieczenia Zdrowotnego, numer
ubezpieczenia zdrowotnego, numer rachunku
medycznego, ehic, ehic#, EHIC, EHIC#, medical
account number, medical account no, numer rachunku
medycznego, medical account#, health insurance no,
health insurance#

Poland Passport Number


A Polish passport is an international travel document issued to nationals of Poland. It may also
serve as proof of Polish citizenship.
The Poland Passport Number data identifier detects a nine-character alphanumeric pattern
that matches the Poland Passport Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a nine-character alphanumeric pattern that matches the Poland
Passport Number format. It checks for common test patterns.
See “Poland Passport Number wide breadth” on page 1390.
■ The narrow breadth detects a nine-character alphanumeric pattern that matches the Poland
Passport Number format. It checks for common test patterns, and also requires the presence
of related keywords.
See “Poland Passport Number narrow breadth” on page 1390.

Poland Passport Number wide breadth


The wide breadth detects a nine-character alphanumeric pattern that matches the Poland
Passport Number format. It checks for common test patterns.

Table 45-825 Poland Passport Number wide-breadth patterns

Pattern

[a-zA-Z]{2}\d{7}

Table 45-826 Poland Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111, 2222222, 3333333, 4444444, 5555555,
6666666, 7777777, 8888888, 9999999
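
As a rough illustration of the wide-breadth behavior, the following Python sketch applies the pattern and then drops matches that end with one of the excluded values. The function name is hypothetical, and the lookarounds are only an assumed stand-in for the Number delimiter validator, whose exact rules are not described here.

```python
import re

# Pattern from Table 45-825, wrapped in lookarounds that approximate the
# "Number delimiter" validator (assumption: no letter or digit may touch the match).
PASSPORT_PATTERN = re.compile(r"(?<![A-Za-z0-9])[a-zA-Z]{2}\d{7}(?![A-Za-z0-9])")

# Values listed under "Exclude ending characters": 1111111 through 9999999.
EXCLUDED_ENDINGS = tuple(str(d) * 7 for d in range(1, 10))


def find_passport_candidates(text):
    """Return pattern matches that do not end with an excluded test value."""
    return [
        m.group(0)
        for m in PASSPORT_PATTERN.finditer(text)
        if not m.group(0).endswith(EXCLUDED_ENDINGS)
    ]
```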

Poland Passport Number narrow breadth


The narrow breadth detects a nine-character alphanumeric pattern that matches the Poland
Passport Number format. It checks for common test patterns, and also requires the presence
of related keywords.

Table 45-827 Poland Passport Number narrow-breadth patterns

Pattern

[a-zA-Z]{2}\d{7}

Table 45-828 Poland Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.



Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111, 2222222, 3333333, 4444444, 5555555,
6666666, 7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

passport, passport number, passportnumber,
passport#, passport no, passport book

paszport#, numer paszportu, Nr paszportu, paszport,
książka paszportowa, passeport, nombre, numéro de
passeport, passeport#, No de passeport

Poland Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. For Poland, VAT is
administered by the VAT office for the region in which the business is established.
The Poland Value Added Tax (VAT) Number data identifier detects a 12-character alphanumeric
pattern that matches the Poland VAT Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 12-character alphanumeric pattern that matches the Poland
VAT Number format without checksum validation. It checks for common test patterns.
See “Poland Value Added Tax (VAT) Number wide breadth” on page 1392.
■ The medium breadth detects a 12-character alphanumeric pattern that matches the Poland
VAT Number format with checksum validation.
See “Poland Value Added Tax (VAT) Number medium breadth” on page 1392.
■ The narrow breadth detects a 12-character alphanumeric pattern that matches the Poland
VAT Number format with checksum validation. It checks for common test patterns, and
also requires the presence of related keywords.
See “Poland Value Added Tax (VAT) Number narrow breadth” on page 1393.

Poland Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 12-character alphanumeric pattern that matches the Poland VAT
Number format without checksum validation. It checks for common test patterns.

Table 45-829 Poland Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Pp][Ll]\d{10}

[Pp][Ll] \d{10}

[Pp][Ll]\d{3}-\d{3}-\d{2}-\d{2}

[Pp][Ll]\d{3}-\d{2}-\d{2}-\d{3}

[Pp][Ll] \d{3}-\d{3}-\d{2}-\d{2}

[Pp][Ll] \d{3}-\d{2}-\d{2}-\d{3}

Table 45-830 Poland Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111111, 2222222222, 3333333333, 4444444444,
5555555555, 6666666666, 7777777777, 8888888888,
9999999999
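
The six wide-breadth patterns differ only in the optional space after the PL prefix and in how the ten digits are grouped. The following Python sketch combines them into one expression and applies the excluded endings. It is an illustration under stated assumptions: the function name is hypothetical, the lookarounds only approximate the Number delimiter validator, and comparing the excluded values after stripping separators is an assumption, not documented behavior.

```python
import re

# The six patterns from Table 45-829, copied as listed.
VAT_PATTERNS = [
    r"[Pp][Ll]\d{10}",
    r"[Pp][Ll] \d{10}",
    r"[Pp][Ll]\d{3}-\d{3}-\d{2}-\d{2}",
    r"[Pp][Ll]\d{3}-\d{2}-\d{2}-\d{3}",
    r"[Pp][Ll] \d{3}-\d{3}-\d{2}-\d{2}",
    r"[Pp][Ll] \d{3}-\d{2}-\d{2}-\d{3}",
]

# Assumed approximation of "Number delimiter": no letter or digit may touch the match.
VAT_REGEX = re.compile(
    r"(?<![A-Za-z0-9])(?:" + "|".join(VAT_PATTERNS) + r")(?![A-Za-z0-9])"
)

# Values listed under "Exclude ending characters": 1111111111 through 9999999999.
EXCLUDED_ENDINGS = tuple(str(d) * 10 for d in range(1, 10))


def find_vat_candidates(text):
    """Return matches whose digits do not form an excluded test value."""
    results = []
    for m in VAT_REGEX.finditer(text):
        # Assumption: separators are ignored when comparing excluded endings.
        digits = re.sub(r"\D", "", m.group(0))
        if not digits.endswith(EXCLUDED_ENDINGS):
            results.append(m.group(0))
    return results
```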

Poland Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 12-character alphanumeric pattern that matches the Poland
VAT Number format with checksum validation.

Table 45-831 Poland Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[Pp][Ll]\d{10}

[Pp][Ll] \d{10}

[Pp][Ll]\d{3}-\d{3}-\d{2}-\d{2}

[Pp][Ll]\d{3}-\d{2}-\d{2}-\d{3}

[Pp][Ll] \d{3}-\d{3}-\d{2}-\d{2}

[Pp][Ll] \d{3}-\d{2}-\d{2}-\d{3}

Table 45-832 Poland Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Poland VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Poland Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 12-character alphanumeric pattern that matches the Poland
VAT Number format with checksum validation. It checks for common test patterns, and also
requires the presence of related keywords.

Table 45-833 Poland Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Pp][Ll]\d{10}

[Pp][Ll] \d{10}

[Pp][Ll]\d{3}-\d{3}-\d{2}-\d{2}

[Pp][Ll]\d{3}-\d{2}-\d{2}-\d{3}

[Pp][Ll] \d{3}-\d{3}-\d{2}-\d{2}

[Pp][Ll] \d{3}-\d{2}-\d{2}-\d{3}

Table 45-834 Poland Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111111, 2222222222, 3333333333, 4444444444,
5555555555, 6666666666, 7777777777, 8888888888,
9999999999

Poland VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat number, value added tax, vat, VAT, VAT#, vat#,
VATIN, vatin

Numer Identyfikacji Podatkowej, NIP, nip, Liczba VAT,
podatek od wartosci dodanej, faktura VAT, faktura
VAT#

Polish Identification Number


Every Polish citizen 18 years of age or older residing permanently in Poland must have an
Identity Card, with a unique personal number. The number is used as identification for almost
all purposes.
The Polish Identification Number data identifier detects a nine-character alphanumeric pattern
that matches the Polish Identification Number format.
The Polish ID Number system data identifier provides three breadths of detection:
■ The wide breadth detects a nine-character alphanumeric pattern without checksum validation.
See “Polish Identification Number wide breadth” on page 1394.
■ The medium breadth detects a nine-character alphanumeric pattern with checksum validation.
See “Polish Identification Number medium breadth” on page 1395.
■ The narrow breadth detects a nine-character alphanumeric pattern with checksum validation. It
also requires the presence of related keywords.
See “Polish Identification Number narrow breadth” on page 1395.

Polish Identification Number wide breadth


The wide breadth detects a nine-character alphanumeric pattern without checksum validation.

Table 45-835 Polish Identification Number wide-breadth pattern

Pattern

[A-Z]{3}\d{6}

Table 45-836 Polish Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.
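
The Duplicate digits validator appears in many of the tables in this section. A minimal Python sketch of such a check follows. The function name is hypothetical, and ignoring letters and separators when comparing digits is an assumption, because the description above does not say how non-digit characters are treated.

```python
def has_distinct_digits(candidate):
    """Return True when the digits in the candidate are not all the same.

    Assumption: letters and separators are ignored, so only the digit
    characters are compared.
    """
    digits = [c for c in candidate if c in "0123456789"]
    return len(set(digits)) > 1
```

For example, under this assumption ABC111111 is rejected, while ABC123456 passes.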

Polish Identification Number medium breadth


The medium breadth detects a nine-character alphanumeric pattern with checksum validation.

Table 45-837 Polish Identification Number medium-breadth pattern

Pattern

[A-Z]{3}\d{6}

Table 45-838 Polish Identification Number medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Polish ID Number Validation Check Computes the checksum and validates the pattern against
it.
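
This guide does not describe how the Polish ID Number Validation Check computes the checksum. As a sketch only, the following Python code uses the publicly documented scheme for Polish identity card numbers: letters map to the values 10 through 35, each position is multiplied by a weight, and the weighted sum must be divisible by 10. The function names are hypothetical, and the assumption that the product uses this scheme is not confirmed by this guide.

```python
import re

# Pattern from Table 45-837: three uppercase letters followed by six digits.
ID_PATTERN = re.compile(r"[A-Z]{3}[0-9]{6}")

# Publicly documented weights. The fourth position (weight 9) holds the check digit.
WEIGHTS = (7, 3, 1, 9, 7, 3, 1, 7, 3)


def char_value(c):
    """Digits keep their value; letters map to A=10, B=11, ... Z=35."""
    return int(c) if c.isdigit() else ord(c) - ord("A") + 10


def is_valid_polish_id(candidate):
    """Return True when the weighted sum of all nine positions is divisible by 10."""
    if not ID_PATTERN.fullmatch(candidate):
        return False
    total = sum(w * char_value(c) for w, c in zip(WEIGHTS, candidate))
    return total % 10 == 0
```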

Polish Identification Number narrow breadth


The narrow breadth detects a nine-character alphanumeric pattern with checksum validation. It also
requires the presence of related keywords.

Table 45-839 Polish Identification Number narrow-breadth pattern

Pattern

[A-Z]{3}\d{6}

Table 45-840 Polish Identification Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Polish ID Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

national identification number, personal identification
number, personal identity no, unique identity number,
nationalidno#, personal ID, personal identity,
personalidentityno#, uniqueid#, nationalid#,
natioanlidentity#

Dowód osobisty, Tożsamości narodowej, osobisty
numer identyfikacyjny, niepowtarzalny numer, numer
identyfikacyjny, Dowódosobisty#,
niepowtarzalnynumer#

Polish REGON Number


Each national economy entity in Poland is obligated to register in REGON, the register of
business entities. It is the only integrated register in Poland that covers all national
economy entities. Each company has a unique REGON number.
The Polish REGON Number data identifier detects a 14-digit number that matches the REGON
Number format.
The Polish REGON Number system data identifier provides three breadths of detection:
■ The wide breadth detects a 14-digit number without checksum validation.
See “Polish REGON Number wide breadth” on page 1396.
■ The medium breadth detects a 14-digit number with checksum validation.
See “Polish REGON Number medium breadth” on page 1397.
■ The narrow breadth detects a 14-digit number with checksum validation. It also requires
the presence of related keywords.
See “Polish REGON Number narrow breadth” on page 1397.

Polish REGON Number wide breadth


The wide breadth detects a 14-digit number without checksum validation.

Table 45-841 Polish REGON Number wide-breadth patterns

Patterns

\d{14}

\d{9}-\d{5}

Table 45-842 Polish REGON Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Polish REGON Number medium breadth


The medium breadth detects a 14-digit number with checksum validation.

Table 45-843 Polish REGON Number medium-breadth patterns

Patterns

\d{14}

\d{9}-\d{5}

Table 45-844 Polish REGON Number medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Polish REGON Number Validation Check Computes the checksum and validates the pattern against
it.
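
The checksum algorithm behind the Polish REGON Number Validation Check is not given in this guide. The sketch below assumes the publicly documented REGON scheme, in which a 14-digit number embeds a 9-digit base number and both carry a modulo-11 check digit (a remainder of 10 becomes 0). The function and constant names are hypothetical.

```python
import re

# Accepts both documented formats: \d{14} and \d{9}-\d{5}.
REGON14_FORMAT = re.compile(r"([0-9]{9})-?([0-9]{5})")

# Publicly documented weights (assumed to apply here).
WEIGHTS_9 = (8, 9, 2, 3, 4, 5, 6, 7)
WEIGHTS_14 = (2, 4, 8, 5, 0, 9, 7, 3, 6, 1, 2, 4, 8)


def _check_digit(digits, weights):
    """Weighted sum modulo 11; a result of 10 is mapped to 0."""
    return sum(w * d for w, d in zip(weights, digits)) % 11 % 10


def is_valid_regon14(candidate):
    """Validate the embedded 9-digit base number and the full 14-digit number."""
    match = REGON14_FORMAT.fullmatch(candidate)
    if not match:
        return False
    digits = [int(c) for c in match.group(1) + match.group(2)]
    return (
        _check_digit(digits[:8], WEIGHTS_9) == digits[8]
        and _check_digit(digits[:13], WEIGHTS_14) == digits[13]
    )
```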

Polish REGON Number narrow breadth


The narrow breadth detects a 14-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-845 Polish REGON Number narrow-breadth patterns

Patterns

\d{14}

\d{9}-\d{5}

Table 45-846 Polish REGON Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Polish REGON Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

REGON ID, statistical number, statistical ID, statistical
no, REGON number, regonid#, REGONID#, regonno#,
company ID, companyID#, company ID no, company
ID number, companyIDno#

numer statystyczny, REGON, numeru REGON,
numerstatystyczny#, numeruREGON#

Polish Social Security Number (PESEL)


The Polish Social Security Number (PESEL) is the national identification number used in
Poland. The PESEL number is mandatory for all permanent residents of Poland and for
temporary residents living in Poland. It uniquely identifies a person and cannot be transferred
to another.
The Polish Social Security Number (PESEL) data identifier detects an 11-digit number that
matches the PESEL format.
The Polish Social Security Number (PESEL) system data identifier provides three breadths of
detection:
■ The wide breadth detects an 11-digit number without checksum validation.
See “Polish Social Security Number (PESEL) wide breadth” on page 1399.
■ The medium breadth detects an 11-digit number with checksum validation.
See “Polish Social Security Number (PESEL) medium breadth” on page 1399.
■ The narrow breadth detects an 11-digit number with checksum validation. It also requires
the presence of related keywords.
See “Polish Social Security Number (PESEL) narrow breadth” on page 1399.

Polish Social Security Number (PESEL) wide breadth


The wide breadth detects an 11-digit number without checksum validation.

Table 45-847 Polish Social Security Number (PESEL) wide-breadth pattern

Pattern

\d{2}[012389]\d[0-3]\d{6}

Table 45-848 Polish Social Security Number (PESEL) wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Polish Social Security Number (PESEL) medium breadth


The medium breadth detects an 11-digit number with checksum validation.

Table 45-849 Polish Social Security Number (PESEL) medium-breadth pattern

Pattern

\d{2}[012389]\d[0-3]\d{6}

Table 45-850 Polish Social Security Number (PESEL) medium-breadth validators

Mandatory validator Description

Polish Social Security Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.
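
The checksum computation is not spelled out in this guide. The following Python sketch assumes the publicly documented PESEL scheme: the first ten digits are weighted with 1, 3, 7, 9 repeating, and the eleventh digit must equal 10 minus the last digit of the weighted sum, modulo 10. The function name is hypothetical.

```python
import re

# Pattern from Table 45-849, restricted to ASCII digits.
PESEL_PATTERN = re.compile(r"\d{2}[012389]\d[0-3]\d{6}", re.ASCII)

# Publicly documented PESEL weights for the first ten digits (assumed to apply here).
WEIGHTS = (1, 3, 7, 9, 1, 3, 7, 9, 1, 3)


def is_valid_pesel(candidate):
    """Return True when the candidate matches the pattern and its check digit."""
    if not PESEL_PATTERN.fullmatch(candidate):
        return False
    digits = [int(c) for c in candidate]
    total = sum(w * d for w, d in zip(WEIGHTS, digits))
    return (10 - total % 10) % 10 == digits[10]
```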

Polish Social Security Number (PESEL) narrow breadth


The narrow breadth detects an 11-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-851 Polish Social Security Number (PESEL) narrow-breadth pattern

Pattern

\d{2}[012389]\d[0-3]\d{6}

Table 45-852 Polish Social Security Number (PESEL) narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Polish Social Security Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

PESEL ID, polish SSN, social security number, social
security no, SSN#, PESELID#, peselno#, pesel number,
social security code

PESEL Liczba, społeczny bezpieczeństwo liczba,
społeczny bezpieczeństwo ID, społeczny
bezpieczeństwo kod, PESELliczba#,
społecznybezpieczeństwoliczba#

Polish Tax Identification Number


The Polish Tax Identification Number (NIP) is a number that the government assigns to every Polish
citizen who works or does business in Poland. All taxpayers have a tax identification number
called NIP.
The Polish Tax Identification Number data identifier detects a 10-digit number that matches
the NIP format.
The Polish Tax Identification Number (NIP) system data identifier provides three breadths of
detection:
■ The wide breadth detects a 10-digit number without checksum validation.
See “Polish Tax Identification Number wide breadth” on page 1401.
■ The medium breadth detects a 10-digit number with checksum validation.
See “Polish Tax Identification Number medium breadth” on page 1401.
■ The narrow breadth detects a 10-digit number with checksum validation. It also requires
the presence of related keywords.
See “Polish Tax Identification Number narrow breadth” on page 1401.

Polish Tax Identification Number wide breadth


The wide breadth detects a 10-digit number without checksum validation.

Table 45-853 Polish Tax Identification Number wide-breadth patterns

Pattern

\d{10}

\d{3}[ -]\d{3}[ -]\d{2}[ -]\d{2}

\d{3}[ -]\d{2}[ -]\d{2}[ -]\d{3}

Table 45-854 Polish Tax Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Polish Tax Identification Number medium breadth


The medium breadth detects a 10-digit number with checksum validation.

Table 45-855 Polish Tax Identification Number medium-breadth patterns

Pattern

\d{10}

\d{3}[ -]\d{3}[ -]\d{2}[ -]\d{2}

\d{3}[ -]\d{2}[ -]\d{2}[ -]\d{3}

Table 45-856 Polish Tax Identification Number medium-breadth validators

Mandatory validator Description

Polish Tax ID Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.
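
This guide does not state the algorithm used by the validation check. As a sketch, the following Python code assumes the publicly documented NIP scheme: the first nine digits are weighted, the sum is reduced modulo 11, and the result must equal the tenth digit. The function name is hypothetical, and separators are stripped before the computation as an assumption.

```python
import re

# The three formats from Table 45-855, restricted to ASCII digits.
NIP_FORMATS = re.compile(
    r"\d{10}"
    r"|\d{3}[ -]\d{3}[ -]\d{2}[ -]\d{2}"
    r"|\d{3}[ -]\d{2}[ -]\d{2}[ -]\d{3}",
    re.ASCII,
)

# Publicly documented NIP weights for the first nine digits (assumed to apply here).
WEIGHTS = (6, 5, 7, 2, 3, 4, 5, 6, 7)


def is_valid_nip(candidate):
    """Return True when the candidate has a listed format and a matching check digit."""
    if not NIP_FORMATS.fullmatch(candidate):
        return False
    digits = [int(c) for c in candidate if c.isdigit()]
    check = sum(w * d for w, d in zip(WEIGHTS, digits)) % 11
    # A remainder of 10 can never equal a single digit, so such numbers are rejected.
    return check == digits[9]
```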

Polish Tax Identification Number narrow breadth


The narrow breadth detects a 10-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-857 Polish Tax Identification Number narrow-breadth patterns

Pattern

\d{10}

\d{3}[ -]\d{3}[ -]\d{2}[ -]\d{2}

\d{3}[ -]\d{2}[ -]\d{2}[ -]\d{3}

Table 45-858 Polish Tax Identification Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Polish Tax ID Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Tax Number, tax number, tax no., taxno#, taxnumber#,
taxnumber, NIP, NIP#, Tax ID, taxid#, TAXID#, NIP ID,
NIPID#, nip#, tax identification number, tax
identification no., VAT Number, VAT No., vatno#, VAT
ID, VATID#

Numer Identyfikacji Podatkowej, Polski numer
identyfikacji podatkowej,
NumerIdentyfikacjiPodatkowej#, NIP

Portugal Driver's Licence Number


The Institute for Mobility and Land Transport (IMTT) issues driver's licenses in Portugal.
The Portugal Driver's Licence Number data identifier detects an 8- to 10-character alphanumeric
pattern that matches the Portugal Driver's Licence Number format.
The Portugal Driver's Licence Number data identifier provides two breadths of detection:
■ The wide breadth detects an 8- to 10-character alphanumeric pattern without checksum
validation.
See “Portugal Driver's Licence Number wide breadth” on page 1403.
■ The narrow breadth detects an 8- to 10-character alphanumeric pattern without checksum
validation. It requires the presence of related keywords.
See “Portugal Driver's Licence Number narrow breadth” on page 1403.

Portugal Driver's Licence Number wide breadth


The wide breadth detects an 8- to 10-character alphanumeric pattern without checksum
validation.

Table 45-859 Portugal Driver's Licence Number wide-breadth patterns

Patterns

[A-Za-z]{2}-\d{5,6} \d

[A-Za-z]-\d{6,8} \d

Table 45-860 Portugal Driver's Licence Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Portugal Driver's Licence Number narrow breadth


The narrow breadth detects an 8- to 10-character alphanumeric pattern without checksum
validation. It requires the presence of related keywords.

Table 45-861 Portugal Driver's Licence Number narrow-breadth patterns

Patterns

[A-Za-z]{2}-\d{5,6} \d

[A-Za-z]-\d{6,8} \d

Table 45-862 Portugal Driver's Licence Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.



Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

DLNo#, dlno#, DL#, Drivers Lic., driver licence, driver
license, drivers licence, drivers license, driver's
licence, driver's license, driving licence, driving
license, licence number, license number, driving
permit, portugal driving license

carteira de motorista, carteira motorista, carteira de
habilitação, carteira habilitação, número de licença,
número licença, permissão de condução, permissão
condução, Licença condução Portugal, carta de
condução

Portugal National Identification Number


The national identification number is a unique identification number usually present on
documents like citizen cards that are issued by the Portuguese government to its citizens. It
can be used as a travel document within the EU and some other European countries.
The Portugal National Identification Number data identifier detects a seven- to nine-character
alphanumeric pattern that matches the Portugal National Identification Number format.
The Portugal National Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects a seven- to nine-character alphanumeric pattern without checksum
validation.
See “Portugal National Identification Number wide breadth” on page 1405.
■ The medium breadth detects a seven- to nine-character alphanumeric pattern with checksum
validation.
See “Portugal National Identification Number medium breadth” on page 1405.
■ The narrow breadth detects a seven- to nine-character alphanumeric pattern with checksum
validation. It also requires the presence of related keywords.
See “Portugal National Identification Number narrow breadth” on page 1406.

Portugal National Identification Number wide breadth


The wide breadth detects a seven- to nine-character alphanumeric pattern without checksum
validation.

Table 45-863 Portugal National Identification Number wide-breadth patterns

Patterns

\d{8}

\d{7} \d

\d{7}-\d

\d{9}

\d{9}\l{2}\d

\d{8} \d

\d{8}-\d

\d{8} \d \l{2}\d

\d{8}-\d-\l{2}\d

Table 45-864 Portugal National Identification Number wide-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Portugal National Identification Number medium breadth


The medium breadth detects a seven- to nine-character alphanumeric pattern with checksum
validation.

Table 45-865 Portugal National Identification Number medium-breadth patterns

Patterns

\d{8}

\d{7} \d

\d{7}-\d

\d{9}

\d{9}\l{2}\d

\d{8} \d

\d{8}-\d

\d{8} \d \l{2}\d

\d{8}-\d-\l{2}\d

Table 45-866 Portugal National Identification Number medium-breadth validator

Mandatory validator Description

Portugal National Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Portugal National Identification Number narrow breadth


The narrow breadth detects a seven- to nine-character alphanumeric pattern with checksum
validation. It also requires the presence of related keywords.

Table 45-867 Portugal National Identification Number narrow-breadth patterns

Patterns

\d{8}

\d{7} \d

\d{7}-\d

\d{9}

\d{9}\l{2}\d

\d{8} \d

\d{8}-\d

\d{8} \d \l{2}\d

\d{8}-\d-\l{2}\d

Table 45-868 Portugal National Identification Number narrow-breadth validators

Mandatory validators Description

Portugal National Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

id number, portugal bi number, NIC, nic, document
number, citizen card, identity card number, identity
card no, national identity card number, national identity
card no, national identification number, national
identification no, identification number, identification
no

bilhete de identidade, número de identificação civil,
número de cartão de cidadão, documento de
identificação, cartão de cidadão, número bi de
portugal, número do documento

Number delimiter Validates a match by checking the surrounding characters.

Portugal Passport Number


Portuguese passports are issued to citizens of Portugal for the purpose of international travel.
The passport, along with the national identity card, allows for free rights of movement and
residence in any of the states of the European Union and the European Economic Area.
The Portugal Passport Number data identifier detects a seven-character alphanumeric pattern
that matches the Portugal Passport Number format.
The Portugal Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects a seven-character alphanumeric pattern without validation.
See “Portugal Passport Number wide breadth” on page 1408.
■ The narrow breadth detects a seven-character alphanumeric pattern without checksum
validation. It requires the presence of related keywords.
See “Portugal Passport Number narrow breadth” on page 1408.

Portugal Passport Number wide breadth


The wide breadth detects a seven-character alphanumeric pattern without validation.

Table 45-869 Portugal Passport Number wide-breadth pattern

Pattern

[a-zA-Z]\d{6}

Portugal Passport Number narrow breadth


The narrow breadth detects a seven-character alphanumeric pattern without checksum
validation. It requires the presence of related keywords.

Table 45-870 Portugal Passport Number narrow-breadth pattern

Pattern

[a-zA-Z]\d{6}

Table 45-871 Portugal Passport Number narrow-breadth validators

Mandatory validator Description

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

passport number, passport, passport no, passaporte,
passeport, portuguese passport, portuguese
passeport, portuguese passaporte, passaporte nº,
passeport nº

Number delimiter Validates a match by checking the surrounding characters.

Portugal Tax Identification Number


A fiscal number is a tax identification number that is issued in Portugal to anyone who wishes
to undertake any official matters in Portugal.
The Portugal Tax Identification Number data identifier detects a nine-digit number in the
Portugal Tax Identification Number format.
The Portugal Tax Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects a nine-digit number without checksum validation.
See “Portugal Tax Identification Number wide breadth” on page 1409.
■ The medium breadth detects a nine-digit number with checksum validation.
See “Portugal Tax Identification Number medium breadth” on page 1409.
■ The narrow breadth detects a nine-digit number with checksum validation. It also requires
the presence of related keywords.
See “Portugal Tax Identification Number narrow breadth” on page 1410.

Portugal Tax Identification Number wide breadth


The wide breadth detects a nine-digit number without checksum validation.

Table 45-872 Portugal Tax Identification Number wide-breadth patterns

Patterns

\d{9}

\d{3}-\d{3}-\d{3}

\d{3} \d{3} \d{3}

\d{3}.\d{3}.\d{3}

\d{3}.\d{3}.\d{3}

Table 45-873 Portugal Tax Identification Number wide-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Portugal Tax Identification Number medium breadth


The medium breadth detects a nine-digit number with checksum validation.

Table 45-874 Portugal Tax Identification Number medium-breadth patterns

Patterns

\d{9}

\d{3}-\d{3}-\d{3}

\d{3} \d{3} \d{3}

\d{3}.\d{3}.\d{3}

\d{3}.\d{3}.\d{3}

Table 45-875 Portugal Tax Identification Number medium-breadth validator

Mandatory validator Description

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999

Portugal Tax and VAT Identification Number Validation Check Computes the checksum and validates the match against
it.

Number delimiter Validates a match by checking the surrounding characters.
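
The checksum used by the Portugal Tax and VAT Identification Number Validation Check is not described in this guide. The sketch below assumes the publicly documented NIF scheme: the first eight digits are weighted from 9 down to 2, the sum is reduced modulo 11, and the check digit is 0 when the remainder is 0 or 1, otherwise 11 minus the remainder. The function name is hypothetical, and treating the separator as a literal space, period, or hyphen is an assumption.

```python
import re

# Nine digits, optionally grouped in threes with one consistent separator.
# Assumption: the separator in the listed patterns is a space, period, or hyphen.
NIF_FORMAT = re.compile(r"([0-9]{3})([ .-]?)([0-9]{3})\2([0-9]{3})")

# Publicly documented NIF weights for the first eight digits (assumed to apply here).
WEIGHTS = (9, 8, 7, 6, 5, 4, 3, 2)


def is_valid_nif(candidate):
    """Return True when the ninth digit matches the modulo-11 check digit."""
    match = NIF_FORMAT.fullmatch(candidate)
    if not match:
        return False
    digits = [int(c) for c in match.group(1) + match.group(3) + match.group(4)]
    remainder = sum(w * d for w, d in zip(WEIGHTS, digits)) % 11
    check = 0 if remainder < 2 else 11 - remainder
    return check == digits[8]
```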

Portugal Tax Identification Number narrow breadth


The narrow breadth detects a nine-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-876 Portugal Tax Identification Number narrow-breadth patterns

Patterns

\d{9}

\d{3}-\d{3}-\d{3}

\d{3} \d{3} \d{3}

\d{3}.\d{3}.\d{3}

\d{3}.\d{3}.\d{3}

Table 45-877 Portugal Tax Identification Number narrow-breadth validators

Mandatory validator Description

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333,
444444444, 555555555, 666666666, 777777777,
888888888, 999999999

Portugal Tax and VAT Identification Number Validation Check Computes the checksum and validates the match against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

TIN#, NIF#, tax identification number, taxpayer
identification number, tax id number, tax id no, tax id

CPF, CPF#, NIF, número identificação fiscal

Number delimiter Validates a match by checking the surrounding characters.

Portugal Value Added Tax (VAT) Number


VAT is a consumption tax that is borne by the end consumer. VAT is paid for each transaction
in the manufacturing and distribution process.
The Portugal Value Added Tax (VAT) Number data identifier detects an 11-character
alphanumeric pattern that matches the Portugal Value Added Tax (VAT) Number format.
The Portugal Value Added Tax (VAT) Number data identifier provides three breadths of
detection:
■ The wide breadth detects an 11-character alphanumeric pattern starting with PT and followed
by nine digits without checksum validation.
See “Portugal Value Added Tax (VAT) Number wide breadth” on page 1412.
■ The medium breadth detects an 11-character alphanumeric pattern starting with PT and
followed by nine digits with checksum validation.
See “Portugal Value Added Tax (VAT) Number medium breadth” on page 1412.
■ The narrow breadth detects an 11-character alphanumeric pattern starting with PT and
followed by nine digits with checksum validation. It also requires the presence of related
keywords.
See “Portugal Value Added Tax (VAT) Number narrow breadth” on page 1413.

Portugal Value Added Tax (VAT) Number wide breadth


The wide breadth detects an 11-character alphanumeric pattern starting with PT and followed
by nine digits without checksum validation.

Table 45-878 Portugal Value Added Tax (VAT) Number wide-breadth patterns

Patterns

[Pp][Tt]\d{9}

[Pp][Tt] \d{9}

[Pp][Tt]-\d{9}

[Pp][Tt] \d{3} \d{4} \d{2}

[Pp][Tt] \d{3}-\d{3}-\d{3}

[Pp][Tt] \d{3} \d{3} \d{3}

Table 45-879 Portugal Value Added Tax (VAT) Number wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following values is not
matched:

000000000, 111111111, 222222222, 333333333, 444444444,
555555555, 666666666, 777777777, 888888888, 999999999
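
The wide-breadth patterns and validators above can be tried outside the product to see which strings would qualify. The following Python sketch is illustrative only. The function name `match_portugal_vat_wide` is hypothetical, and the Number delimiter validator is approximated with lookarounds that reject an adjacent letter or digit, which is an assumption about its behavior and not a description of the product's implementation.

```python
import re

# The six wide-breadth patterns from Table 45-878, copied as written.
PATTERNS = [
    r"[Pp][Tt]\d{9}",
    r"[Pp][Tt] \d{9}",
    r"[Pp][Tt]-\d{9}",
    r"[Pp][Tt] \d{3} \d{4} \d{2}",
    r"[Pp][Tt] \d{3}-\d{3}-\d{3}",
    r"[Pp][Tt] \d{3} \d{3} \d{3}",
]

# Assumption: "Number delimiter" is approximated as "no letter or digit
# immediately before or after the candidate".
CANDIDATE = re.compile(
    r"(?<![A-Za-z0-9])(?:" + "|".join(PATTERNS) + r")(?![A-Za-z0-9])"
)

# Exclude ending characters: 000000000 through 999999999.
# Every pattern carries exactly nine digits, so "ending with" one of these
# values reduces to equality on the extracted digit string.
EXCLUDED_ENDINGS = {str(d) * 9 for d in range(10)}


def match_portugal_vat_wide(text):
    """Return candidates that pass the approximated wide-breadth checks."""
    hits = []
    for m in CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if digits not in EXCLUDED_ENDINGS:
            hits.append(m.group())
    return hits
```

Because the wide breadth has no checksum validator, any nine digits after the PT prefix qualify unless they are one of the excluded repeated-digit values.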

Portugal Value Added Tax (VAT) Number medium breadth


The medium breadth detects an 11-character alphanumeric pattern starting with PT and followed
by nine digits with checksum validation.

Table 45-880 Portugal Value Added Tax (VAT) Number medium-breadth patterns

Patterns

[Pp][Tt]\d{9}

[Pp][Tt] \d{9}

[Pp][Tt]-\d{9}

[Pp][Tt] \d{3} \d{4} \d{2}

[Pp][Tt] \d{3}-\d{3}-\d{3}

[Pp][Tt] \d{3} \d{3} \d{3}

Table 45-881 Portugal Value Added Tax (VAT) Number medium-breadth validator

Mandatory validator Description

Portugal Tax and VAT Identification Number Validation Check Computes the checksum and
validates the pattern against it.
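
This guide does not spell out how the validation check computes the checksum. The Python sketch below assumes the commonly published modulus-11 rule for Portuguese tax numbers (weights 9 down to 2 over the first eight digits, with the ninth digit as the check digit). Whether the product's validator applies exactly this rule is an assumption, and the function name `portugal_nif_checksum_ok` is hypothetical.

```python
import re


def portugal_nif_checksum_ok(candidate):
    """Check a PT-prefixed or bare nine-digit number with a mod-11 rule."""
    digits = [int(c) for c in re.sub(r"\D", "", candidate)]
    if len(digits) != 9:
        return False
    # Weighted sum of the first eight digits, weights 9, 8, ..., 2.
    total = sum(d * w for d, w in zip(digits[:8], range(9, 1, -1)))
    remainder = total % 11
    # Remainders 0 and 1 map to check digit 0; otherwise 11 - remainder.
    expected = 0 if remainder < 2 else 11 - remainder
    return digits[8] == expected
```

Under this rule a candidate such as PT501964843 passes, while changing only its last digit makes it fail, which is how the medium breadth can discard random nine-digit strings that the wide breadth would report.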

Portugal Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects an 11-character alphanumeric pattern starting with PT and followed
by nine digits with checksum validation. It also requires the presence of related keywords.

Table 45-882 Portugal Value Added Tax (VAT) Number narrow-breadth patterns

Patterns

[Pp][Tt]\d{9}

[Pp][Tt] \d{9}

[Pp][Tt]-\d{9}

[Pp][Tt] \d{3} \d{4} \d{2}

[Pp][Tt] \d{3}-\d{3}-\d{3}

[Pp][Tt] \d{3} \d{3} \d{3}



Table 45-883 Portugal Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validators Description

Portugal Tax and VAT Identification Number Validation Check Computes the checksum and
validates the pattern against it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

portugal vat number, portugal vat no, vat number, NUPC, vat no, vat, VAT#, vat code,
value added tax number, vat id, vat registration number, value added tax, vat reg no

imposto sobre valor acrescentado, VAT nº, número iva, vat não, cuba, código iva

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following values is not
matched:

000000000, 111111111, 222222222, 333333333, 444444444,
555555555, 666666666, 777777777, 888888888, 999999999
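
The Find keywords validator is what separates the narrow breadth from the medium breadth. The sketch below shows the idea in Python under a loud assumption: it treats the validator as a case-insensitive substring search over the same text that contains the candidate number. This section does not describe the product's actual proximity or matching rules, and the helper name `has_vat_keyword` is hypothetical.

```python
# A subset of the keyword inputs listed in Table 45-883.
KEYWORDS = [
    "portugal vat number", "vat number", "vat no", "VAT#",
    "imposto sobre valor acrescentado", "número iva", "código iva",
]


def has_vat_keyword(text, keywords=KEYWORDS):
    """Return True if any keyword occurs in the text, ignoring case."""
    lowered = text.casefold()
    return any(k.casefold() in lowered for k in keywords)
```

With this approximation, "Vat No: PT501964843" would satisfy the keyword requirement, while "Order PT501964843 shipped" would not, even though both contain the same candidate number.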

Randomized US Social Security Number (SSN)


The Randomized US Social Security Number (SSN) data identifier detects 9-digit numbers
with the pattern DDD-DD-DDDD, separated with dashes or spaces or without separators. The
number must be in valid assigned number ranges. Pattern validators eliminate common test
numbers, such as 123456789 or all the same digit. The narrow breadth also requires the
presence of a Social Security-related keyword.
See “Updating policies to use the Randomized US SSN data identifier” on page 810.
See “Use the Randomized US SSN data identifier to detect SSNs” on page 836.
The Randomized US SSN data identifier provides two breadths of detection:
■ The medium breadth detects a 9-digit number in the format DDD-DD-DDDD. The digits
must be in assigned number ranges.
See “Randomized US Social Security Number (SSN) medium breadth” on page 1415.
■ The narrow breadth detects a 9-digit number in the format DDD-DD-DDDD. The digits must
be in assigned number ranges. It also requires the presence of SSN-related keywords.
See “Randomized US Social Security Number (SSN) narrow breadth” on page 1415.

Randomized US Social Security Number (SSN) medium breadth


The medium breadth detects a 9-digit number in the format DDD-DD-DDDD. The digits must
be in assigned number ranges.

Table 45-884 Randomized US SSN medium-breadth patterns and normalizer

Component Value Description

Patterns
[0-8]\d{2} \d{1}[1-9] \d{4}
[0-8]\d{3}[1-9]\d{4}
[0-8]\d{2}[1-9]\d{5}
[0-8]\d{2}-\d{1}[1-9]-\d{4}
[0-8]\d{2} [1-9]\d{1} \d{4}
[0-8]\d{2}-[1-9]\d{1}-\d{4}
Description: Detects 9-digit numbers with the pattern DDD-DD-DDDD, separated with dashes,
spaces, or no separators. The number must be in valid assigned number ranges.

Data normalizer
Digits
Description: See “About data normalizers” on page 733.

Table 45-885 Randomized US SSN medium breadth validators and input

Active Validators Input (if any) Description

Exclude beginning characters
Input: 666, 000, 123456789, 111111111, 222222222, 333333333, 444444444, 555555555,
666666666, 77777777, 888888888
Description: See “Using pattern validators” on page 818.

Number delimiter

Exclude ending characters
Input: 0000

Randomized US Social Security Number Validation Check
Description: Computes the checksum and validates the pattern against it.
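
The patterns, the Digits normalizer, and the two exclusion validators can be combined into a simple pre-filter. The Python sketch below is a rough approximation: `passes_ssn_prefilters` is a hypothetical name, the normalizer is assumed to strip every non-digit character, and the assigned-range validation check is not reproduced because this section does not describe it. The exclusion values are copied from the table as written.

```python
import re

# Patterns from Table 45-884, copied as written.
SSN_PATTERNS = [
    r"[0-8]\d{2} \d{1}[1-9] \d{4}",
    r"[0-8]\d{3}[1-9]\d{4}",
    r"[0-8]\d{2}[1-9]\d{5}",
    r"[0-8]\d{2}-\d{1}[1-9]-\d{4}",
    r"[0-8]\d{2} [1-9]\d{1} \d{4}",
    r"[0-8]\d{2}-[1-9]\d{1}-\d{4}",
]
SSN_RE = re.compile("|".join(f"(?:{p})" for p in SSN_PATTERNS))

# Inputs from Table 45-885, copied as written.
EXCLUDED_BEGINNINGS = (
    "666", "000", "123456789", "111111111", "222222222", "333333333",
    "444444444", "555555555", "666666666", "77777777", "888888888",
)
EXCLUDED_ENDINGS = ("0000",)


def passes_ssn_prefilters(candidate):
    """Apply the pattern, normalizer, and exclusion checks to one string."""
    if not SSN_RE.fullmatch(candidate):
        return False
    # Assumption: the "Digits" normalizer keeps only the digit characters.
    digits = re.sub(r"\D", "", candidate)
    if digits.startswith(EXCLUDED_BEGINNINGS):
        return False
    if digits.endswith(EXCLUDED_ENDINGS):
        return False
    return True
```

Note that the patterns themselves already reject a first digit of 9 and a middle group of 00, so those cases never reach the exclusion validators.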

Randomized US Social Security Number (SSN) narrow breadth


The narrow breadth detects a 9-digit number in the format DDD-DD-DDDD. The digits must
be in assigned number ranges. It also requires the presence of SSN-related keywords.

Table 45-886 Randomized US Social Security Number (SSN) narrow-breadth patterns

Pattern

[0-8]\d{2} \d{1}[1-9] \d{4}

[0-8]\d{3}[1-9]\d{4}

[0-8]\d{2}[1-9]\d{5}

[0-8]\d{2}-\d{1}[1-9]-\d{4}

[0-8]\d{2} [1-9]\d{1} \d{4}

[0-8]\d{2}-[1-9]\d{1}-\d{4}

Table 45-887 Randomized US Social Security Number (SSN) narrow-breadth validators

Validator Description

Number Delimiter Validates a match by checking the surrounding characters.

Exclude beginning characters Data beginning with any of the following values is
not matched:

666, 000, 123456789, 111111111, 222222222, 333333333, 444444444,
555555555, 666666666, 77777777, 888888888

Exclude ending characters Data ending with any of the following values is not
matched:

0000

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

social security number, ssn, ss#

Randomized US Social Security Number Validation Check Computes the checksum and
validates the pattern against it.

Romania Driver's Licence Number


A driving license in Romania is a document confirming the rights of the holder to drive motor
vehicles.

The Romania Driver's Licence Number data identifier detects a 9-, 10-, or 11-character
alphanumeric pattern that matches the Romania Driver's Licence Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 9-, 10-, or 11-character alphanumeric pattern that matches the
Romania Driver's Licence Number format with checksum validation. It checks for common
test patterns.
See “Romania Driver's Licence Number wide breadth” on page 1417.
■ The narrow breadth detects a 9-, 10-, or 11-character alphanumeric pattern that matches
the Romania Driver's Licence Number format with checksum validation. It checks for
common test patterns, and also requires the presence of related keywords.
See “Romania Driver's Licence Number narrow breadth” on page 1418.

Romania Driver's Licence Number wide breadth


The wide breadth detects a 9-, 10-, or 11-character alphanumeric pattern that matches the
Romania Driver's Licence Number format with checksum validation. It checks for common test
patterns.

Table 45-888 Romania Driver's Licence Number wide-breadth patterns

Pattern

[Ii][Gg][Pp]\d{8}

[A-Za-z]\d{8}

[A-Za-z]\d{8}[A-Za-z]

Table 45-889 Romania Driver's Licence Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

Romania Driver's Licence Number Validation Check Computes the checksum and validates the pattern against
it.

Romania Driver's Licence Number narrow breadth


The narrow breadth detects a 9-, 10-, or 11-character alphanumeric pattern that matches the
Romania Driver's Licence Number format with checksum validation. It checks for common test
patterns, and also requires the presence of related keywords.

Table 45-890 Romania Driver's Licence Number narrow-breadth patterns

Pattern

[Ii][Gg][Pp]\d{8}

[A-Za-z]\d{8}

[A-Za-z]\d{8}[A-Za-z]

Table 45-891 Romania Driver's Licence Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following values is not
matched:

00000000, 11111111, 22222222, 33333333, 44444444,
55555555, 66666666, 77777777, 88888888, 99999999

Romania Driver's Licence Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

driver license, drivers license, driving license, driver license number, drivers license
number, driving license number, DLNo#, dlno#, drivers lic., Driver's License, Driver's
License Number, driver's license number, Driver's Licence Number, driver licence, drivers
licence, driving licence, Driver's Licence, driver permit, drivers permit, driving permit,
license number, licence number

permis de conducere, PERMIS DE CONDUCERE, Permis de conducere, numărul permisului de
conducere, Numărul permisului de conducere

Romania National Identification Number


In Romania, each citizen has a personal numerical code as a unique national identification
number. This number is also used as a tax identification number for financial purposes.
The Romania National Identification Number data identifier detects a 13-digit number that
matches the CNP format.
The Romania National Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects a 13-digit number without checksum validation.
See “Romania National Identification Number wide breadth” on page 1419.
■ The medium breadth detects a 13-digit number with checksum validation.
See “Romania National Identification Number medium breadth” on page 1419.
■ The narrow breadth detects a 13-digit number with checksum validation. It also requires
the presence of related keywords.
See “Romania National Identification Number narrow breadth” on page 1420.

Romania National Identification Number wide breadth


The wide breadth detects a 13-digit number without checksum validation.

Table 45-892 Romania National Identification Number wide-breadth pattern

Pattern

\d{13}

Table 45-893 Romania National Identification Number wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Romania National Identification Number medium breadth


The medium breadth detects a 13-digit number with checksum validation.

Table 45-894 Romania National Identification Number medium-breadth pattern

Pattern

\d{13}

Table 45-895 Romania National Identification Number medium-breadth validator

Mandatory validator Description

Romania National Identification Number Check Computes the checksum and validates the pattern against
it.
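
The guide does not state the checksum rule that this validator applies. The Python sketch below assumes the widely published CNP control-digit scheme: the first 12 digits are multiplied by the constant 279146358279, the products are summed, and the sum modulo 11 gives the 13th digit (a result of 10 is written as 1). The function name `cnp_checksum_ok` is hypothetical, and the match with the product's validator is an assumption.

```python
CNP_WEIGHTS = (2, 7, 9, 1, 4, 6, 3, 5, 8, 2, 7, 9)


def cnp_checksum_ok(number):
    """Validate the 13th digit of a CNP-style number with a mod-11 rule."""
    if len(number) != 13 or not (number.isascii() and number.isdigit()):
        return False
    # zip stops after the 12 weights, so the control digit is left out.
    total = sum(int(d) * w for d, w in zip(number, CNP_WEIGHTS))
    check = total % 11
    if check == 10:
        check = 1
    return check == int(number[12])
```

A 13-digit string that the wide breadth accepts, such as 1800101221145, fails this check, while 1800101221144 passes.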

Romania National Identification Number narrow breadth


The narrow breadth detects a 13-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-896 Romania National Identification Number narrow-breadth pattern

Pattern

\d{13}

Table 45-897 Romania National Identification Number narrow-breadth validators

Mandatory validators Description

Romania National Identification Number Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

romania tax identification number, tax identification number, tin, tin#, tin number,
tin no, numărul de identificare fiscală, identificarea fiscală nr #, codul fiscal nr.

national ID, national ID#, ID#, national identification number, Cod Numeric Personal,
cnp, CNP

Romania Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. In Romania, it is also called
TVA or CIF.

The Romania Value Added Tax (VAT) Number data identifier detects a 4- to 12-character
alphanumeric pattern that matches the Romania VAT Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 4- to 12-character alphanumeric pattern that matches the
Romania VAT Number format without checksum validation. It checks for common test
patterns.
See “Romania Value Added Tax (VAT) Number wide breadth” on page 1421.
■ The medium breadth detects a 4- to 12-character alphanumeric pattern that matches the
Romania VAT Number format with checksum validation.
See “Romania Value Added Tax (VAT) Number medium breadth” on page 1422.
■ The narrow breadth detects a 4- to 12-character alphanumeric pattern that matches the
Romania VAT Number format with checksum validation. It checks for common test patterns,
and also requires the presence of related keywords.
See “Romania Value Added Tax (VAT) Number narrow breadth” on page 1423.

Romania Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 4- to 12-character alphanumeric pattern that matches the Romania
VAT Number format without checksum validation. It checks for common test patterns.

Table 45-898 Romania Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Rr][Oo][1-9]\d{1,9}

[Rr][Oo] [1-9]\d{1,9}

Table 45-899 Romania Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.



Exclude ending characters Data ending with any of the following values is not
matched:

00, 11, 22, 33, 44, 55, 66, 77, 88, 99

000, 111, 222, 333, 444, 555, 666, 777, 888, 999

0000, 1111, 2222, 3333, 4444, 5555, 6666, 7777, 8888, 9999

00000, 11111, 22222, 33333, 44444, 55555, 66666, 77777, 88888, 99999

000000, 111111, 222222, 333333, 444444, 555555, 666666, 777777, 888888, 999999

0000000, 1111111, 2222222, 3333333, 4444444, 5555555, 6666666, 7777777, 8888888, 9999999

00000000, 11111111, 22222222, 33333333, 44444444, 55555555, 66666666, 77777777, 88888888,
99999999

000000000, 111111111, 222222222, 333333333, 444444444, 555555555, 666666666, 777777777,
888888888, 999999999

0000000000, 1111111111, 2222222222, 3333333333, 4444444444, 5555555555, 6666666666,
7777777777, 8888888888, 9999999999

Romania Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 4- to 12-character alphanumeric pattern that matches the
Romania VAT Number format with checksum validation.

Table 45-900 Romania Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[Rr][Oo][1-9]\d{1,9}

[Rr][Oo] [1-9]\d{1,9}

Table 45-901 Romania Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Romania VAT Number Validation Check Computes the checksum and validates the pattern against
it.
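
The checksum rule behind this validator is not described in the guide. The Python sketch below assumes the commonly published control-digit scheme for Romanian fiscal codes: the digits before the check digit are right-aligned against the constant 753217532, the products are summed, and the sum multiplied by 10, modulo 11, gives the final digit (a result of 10 is written as 0). The function name `romania_vat_checksum_ok` is hypothetical, and agreement with the product's validator is an assumption.

```python
import re

CIF_WEIGHTS = (7, 5, 3, 2, 1, 7, 5, 3, 2)


def romania_vat_checksum_ok(candidate):
    """Validate an RO-prefixed number against a weighted mod-11 rule."""
    # Accept both table patterns: RO followed directly or by one space.
    m = re.fullmatch(r"[Rr][Oo] ?([1-9]\d{1,9})", candidate)
    if not m:
        return False
    number = m.group(1)
    # Left-pad the digits before the check digit to nine positions.
    body = number[:-1].rjust(9, "0")
    total = sum(int(d) * w for d, w in zip(body, CIF_WEIGHTS))
    check = (total * 10) % 11
    if check == 10:
        check = 0
    return check == int(number[-1])
```

Because the numeric part can be as short as two digits, the padding step matters: the weights always line up with the rightmost digits, whatever the length of the number.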

Romania Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 4- to 12-character alphanumeric pattern that matches the
Romania VAT Number format with checksum validation. It checks for common test patterns,
and also requires the presence of related keywords.

Table 45-902 Romania Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Rr][Oo][1-9]\d{1,9}

[Rr][Oo] [1-9]\d{1,9}

Table 45-903 Romania Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.



Exclude ending characters Data ending with any of the following values is not
matched:

00, 11, 22, 33, 44, 55, 66, 77, 88, 99

000, 111, 222, 333, 444, 555, 666, 777, 888, 999

0000, 1111, 2222, 3333, 4444, 5555, 6666, 7777, 8888, 9999

00000, 11111, 22222, 33333, 44444, 55555, 66666, 77777, 88888, 99999

000000, 111111, 222222, 333333, 444444, 555555, 666666, 777777, 888888, 999999

0000000, 1111111, 2222222, 3333333, 4444444, 5555555, 6666666, 7777777, 8888888, 9999999

00000000, 11111111, 22222222, 33333333, 44444444, 55555555, 66666666, 77777777, 88888888,
99999999

000000000, 111111111, 222222222, 333333333, 444444444, 555555555, 666666666, 777777777,
888888888, 999999999

0000000000, 1111111111, 2222222222, 3333333333, 4444444444, 5555555555, 6666666666,
7777777777, 8888888888, 9999999999

Romania VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat number, value added tax, vat, VAT, VAT#, vat#, VATIN, vatin, fiscal identification
code, fiscal code, unique identification code, unique registration code

CIF, cif, CUI, cui, TVA, tva, TVA#, tva#, taxa pe valoare
adaugata, cod fiscal, cod fiscal de identificare, cod
fiscal identificare, Cod Unic de Înregistrare, cod unic
de identificare, cod unic identificare, cod unic de
înregistrare, cod unic înregistrare

Romanian Numerical Personal Code


In Romania, each citizen has a unique numerical personal code. The number is used by
authorities, health care, schools, universities, banks, and insurance companies for customer
identification.
The Romanian Numerical Personal Code data identifier detects a 13-digit number that matches
the CNP format.
The Romanian Numerical Personal Code system data identifier provides three breadths of
detection:
■ The wide breadth detects a 13-digit number without checksum validation.
See “Romanian Numerical Personal Code wide breadth” on page 1425.
■ The medium breadth detects a 13-digit number with checksum validation.
See “Romanian Numerical Personal Code medium breadth” on page 1425.
■ The narrow breadth detects a 13-digit number with checksum validation. It also requires
the presence of CNP-related keywords.
See “Romanian Numerical Personal Code narrow breadth” on page 1426.

Romanian Numerical Personal Code wide breadth


The wide breadth detects a 13-digit number without checksum validation.

Table 45-904 Romanian Numerical Personal Code wide-breadth pattern

Pattern

[1-9]\d\d[0-1]\d[0-3]\d{7}

Table 45-905 Romanian Numerical Personal Code wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.
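
Unlike the plain \d{13} pattern used by the Romania National Identification Number data identifier, this pattern constrains individual digit positions. The Python sketch below rewrites the same expression with named groups to make those positions visible. The group names reflect the usual reading of a CNP (a leading sex and century digit followed by a YYMMDD birth date) and are illustrative labels only, as is the helper name `cnp_fields`.

```python
import re

# The wide-breadth pattern from Table 45-904, copied as written.
WIDE = re.compile(r"[1-9]\d\d[0-1]\d[0-3]\d{7}")

# The same expression split into named groups (names are illustrative).
WIDE_NAMED = re.compile(
    r"(?P<sex_century>[1-9])(?P<year>\d\d)(?P<month>[0-1]\d)"
    r"(?P<day>[0-3]\d)(?P<rest>\d{6})"
)


def cnp_fields(number):
    """Return the labeled digit groups, or None if the pattern fails."""
    m = WIDE_NAMED.fullmatch(number)
    return m.groupdict() if m else None
```

The pattern is deliberately loose: it accepts a month such as 19 or a day such as 39, so it narrows the candidate set without fully validating the date.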

Romanian Numerical Personal Code medium breadth


The medium breadth detects a 13-digit number with checksum validation.

Table 45-906 Romanian Numerical Personal Code medium-breadth pattern

Pattern

[1-9]\d\d[0-1]\d[0-3]\d{7}

Table 45-907 Romanian Numerical Personal Code medium-breadth validators

Mandatory validator Description

Romanian Numerical Personal Code Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.

Romanian Numerical Personal Code narrow breadth


The narrow breadth detects a 13-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-908 Romanian Numerical Personal Code narrow-breadth pattern

Pattern

[1-9]\d\d[0-1]\d[0-3]\d{7}

Table 45-909 Romanian Numerical Personal Code narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Romanian Numerical Personal Code Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Personal Numeric Code, unique identification number, CNP, CNP#, PIN, PIN#, Insurance
Number, insurancenumber#, unique identity number, uniqueidentityno#, Cod Numeric
Personal, cod identificare personal, cod unic identificare, număr personal unic, număr
identitate, număr identificare personal, număridentitate#, CodNumericPersonal#,
numărpersonalunic#

Russian Passport Identification Number


Russia issues two types of passports: domestic and international. Every Russian citizen has
a domestic passport. It is the main document used for personal identification.
The Russian Passport Identification Number data identifier detects a 10-digit number that
matches the Russian Passport Identification Number format.
The Russian Passport Identification Number data identifier provides two breadths of detection:
■ The wide breadth detects a 10-digit number without checksum validation.
See “Russian Passport Identification Number wide breadth” on page 1427.
■ The narrow breadth detects a 10-digit number with checksum validation. It also requires
the presence of related keywords.
See “Russian Passport Identification Number narrow breadth” on page 1427.

Russian Passport Identification Number wide breadth


The wide breadth detects a 10-digit number without checksum validation.

Table 45-910 Russian Passport Identification Number wide-breadth patterns

Pattern

\d{10}

\d{4}[ ]\d{6}

\d{2}[- ]\d{2}[ ]\d{6}

Table 45-911 Russian Passport Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.
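
The three patterns and the Duplicate digits validator can be exercised together in a few lines. The Python sketch below is an approximation: `passes_passport_wide` is a hypothetical name, and the validator is assumed to reject a candidate only when every digit in it is identical.

```python
import re

# Patterns from Table 45-910, copied as written.
PASSPORT_PATTERNS = [r"\d{10}", r"\d{4}[ ]\d{6}", r"\d{2}[- ]\d{2}[ ]\d{6}"]
PASSPORT_RE = re.compile("|".join(f"(?:{p})" for p in PASSPORT_PATTERNS))


def passes_passport_wide(candidate):
    """Apply the wide-breadth patterns and the duplicate-digits check."""
    if not PASSPORT_RE.fullmatch(candidate):
        return False
    digits = re.sub(r"\D", "", candidate)
    # Duplicate digits: reject a number made of one repeated digit.
    return len(set(digits)) > 1
```

The separators are position-specific: a space or hyphen is allowed between the first two digit pairs, but only a space is allowed before the final six digits, so 4510-123456 does not qualify.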

Russian Passport Identification Number narrow breadth


The narrow breadth detects a 10-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-912 Russian Passport Identification Number narrow-breadth patterns

Pattern

\d{10}

\d{4}[ ]\d{6}

\d{2}[- ]\d{2}[ ]\d{6}

Table 45-913 Russian Passport Identification Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Find keywords If you select this option, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

passport number, passport no, passport ID, passportnumber#, passportno#, russian passport
ID, паспорт нет, паспорт, номер паспорта, паспорт ID, Российской паспорт, Русский номер
паспорта, паспорт#, паспортID#, номерпаспорта#

Russian Taxpayer Identification Number


The Russian Taxpayer Identification Number (TIN or INN) is a multi-digit number that enables
the tax inspectorate to identify the tax status of legal entities and individuals.
The Russian Taxpayer Identification Number data identifier detects a 10- or 12-digit number
that matches the Russian Taxpayer Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 10- or 12-digit number without checksum validation.
See “Russian Taxpayer Identification Number wide breadth” on page 1429.
■ The medium breadth validates the detected number using the final check digit and eliminates
common test numbers.
See “Russian Taxpayer Identification Number medium breadth” on page 1429.
■ The narrow breadth detects a 10- or 12-digit number with checksum validation. It also
requires the presence of related keywords.
See “Russian Taxpayer Identification Number narrow breadth” on page 1429.

Russian Taxpayer Identification Number wide breadth


The wide breadth detects a 10- or 12-digit number without checksum validation.

Table 45-914 Russian Taxpayer Identification Number wide-breadth patterns

Pattern

\d{10}

\d{12}

\d{3}[ -]\d{3}[ -]\d{3}[ -]\d{3}

Table 45-915 Russian Taxpayer Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Russian Taxpayer Identification Number medium breadth


The medium breadth detects a 10- or 12-digit number with checksum validation.

Table 45-916 Russian Taxpayer Identification Number medium-breadth patterns

Pattern

\d{10}

\d{12}

\d{3}[ -]\d{3}[ -]\d{3}[ -]\d{3}

Table 45-917 Russian Taxpayer Identification Number medium-breadth validators

Mandatory validator Description

Russian Taxpayer Identification Number Validation Check Computes the checksum and
validates the pattern against it.

Number delimiter Validates a match by checking the surrounding characters.
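
The guide does not describe the checksum itself. The Python sketch below assumes the commonly published INN control-digit rules: a 10-digit number carries one check digit and a 12-digit number carries two, each computed as a weighted sum modulo 11, then modulo 10. The weight tables and the function name `inn_checksum_ok` are assumptions for illustration and are not taken from the product.

```python
import re

W10 = (2, 4, 10, 3, 5, 9, 4, 6, 8)            # 10-digit: digit 10
W11 = (7, 2, 4, 10, 3, 5, 9, 4, 6, 8)         # 12-digit: digit 11
W12 = (3, 7, 2, 4, 10, 3, 5, 9, 4, 6, 8)      # 12-digit: digit 12


def _check_digit(digits, weights):
    # zip stops at the shorter sequence, so only leading digits are used.
    return sum(d * w for d, w in zip(digits, weights)) % 11 % 10


def inn_checksum_ok(candidate):
    """Validate a 10- or 12-digit number, ignoring space/hyphen separators."""
    stripped = re.sub(r"[ -]", "", candidate)
    if not (stripped.isascii() and stripped.isdigit()):
        return False
    digits = [int(c) for c in stripped]
    if len(digits) == 10:
        return digits[9] == _check_digit(digits, W10)
    if len(digits) == 12:
        return (digits[10] == _check_digit(digits, W11)
                and digits[11] == _check_digit(digits, W12))
    return False
```

The separators are removed first so that the grouped form in the third pattern (for example 500-100-732-259) is checked the same way as the unbroken 12-digit form.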

Russian Taxpayer Identification Number narrow breadth


The narrow breadth detects a 10- or 12-digit number with checksum validation. It also requires
the presence of related keywords.

Table 45-918 Russian Taxpayer Identification Number narrow-breadth patterns

Pattern

\d{10}

\d{12}

\d{3}[ -]\d{3}[ -]\d{3}[ -]\d{3}

Table 45-919 Russian Taxpayer Identification Number narrow-breadth validators

Mandatory validator Description

Russian Taxpayer Identification Number Validation Check Computes the checksum and
validates the pattern against it.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords If you select this option, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

TIN, taxpayer number, taxpayer ID, taxpayer no, tax ID, tin, tinno#, inn, inn#,
taxpayerno#, taxid#, taxpayeridno#, taxpayerid#, НДС, номер налогоплательщика,
Налогоплательщика ИД, налог число, налогчисло#, ИНН#, НДС#

SEPA Creditor Identifier Number North


The Single Euro Payments Area (SEPA) is a payments system created by the European Union
that harmonizes the way cashless payments transact between Euro countries. SEPA North is
for the United Kingdom, Sweden, Denmark, Finland, and Ireland. European consumers,
businesses, and government agents who make payments by direct debit, credit card, or
credit transfer use the SEPA architecture. SEPA is approved and regulated by the European
Commission.
The SEPA Creditor Identifier Number North data identifier detects a unique alphanumeric
string that matches the SEPA Creditor Identifier North format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier North format without checksum validation.
See “SEPA Creditor Identifier Number North wide breadth” on page 1431.
■ The medium breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier North format with checksum validation.
See “SEPA Creditor Identifier Number North medium breadth” on page 1433.
■ The narrow breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier North format with checksum validation. It also requires the presence of related
keywords.
See “SEPA Creditor Identifier Number North narrow breadth” on page 1435.

SEPA Creditor Identifier Number North wide breadth


The wide breadth detects a unique alphanumeric string that matches the SEPA Creditor Identifier
North format without checksum validation.

Table 45-920 SEPA Creditor Identifier Number North wide-breadth patterns

Pattern

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Ss][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d

[Ss][Ee]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Ii][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d

[Ii][Ee]\d\d\d\d\d\d\d\d\d\d\d

[Ff][Ii]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d

[Ff][Ii]\d\d\d\d\d\d\d\d\d\d\d\d\d

[Dd][Kk]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d\d

[Dd][Kk]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d\d

Table 45-921 SEPA Creditor Identifier Number North wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.
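
The table above says only that the Number delimiter validator checks the characters around a match; the exact rule is not given in this section. The following Python sketch illustrates one plausible reading, under the assumption (not confirmed by this guide) that a match is accepted only when no letter or digit directly precedes or follows it. The name find_delimited is hypothetical.

```python
import re

# One of the wide-breadth patterns listed above (Ireland, ZZZ business code).
IE_PATTERN = r"[Ii][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d"

# Assumption: "delimited" means no letter or digit touches the match on either side.
DELIMITED = re.compile(r"(?<![A-Za-z0-9])" + IE_PATTERN + r"(?![A-Za-z0-9])")


def find_delimited(text):
    """Return every candidate in text that is bounded by non-alphanumeric characters."""
    return DELIMITED.findall(text)
```

Under this reading, a candidate embedded in a longer alphanumeric run is rejected, which reduces accidental matches inside unrelated codes.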

SEPA Creditor Identifier Number North medium breadth


The medium breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier North format with checksum validation.

Table 45-922 SEPA Creditor Identifier Number North medium-breadth patterns

Pattern

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Ss][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d

[Ss][Ee]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Ii][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d

[Ii][Ee]\d\d\d\d\d\d\d\d\d\d\d

[Ff][Ii]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d

[Ff][Ii]\d\d\d\d\d\d\d\d\d\d\d\d\d

[Dd][Kk]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d\d

[Dd][Kk]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d\d

Table 45-923 SEPA Creditor Identifier Number North medium-breadth validators

Mandatory validator Description

SEPA Creditor Number Validation Check Computes the checksum and validates the pattern against
it.
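
This guide does not describe how the SEPA Creditor Number Validation Check computes the checksum. In the publicly documented creditor identifier layout, positions 3 and 4 hold two check digits that follow ISO 7064 Mod 97-10, computed over the national identifier and the country code while skipping the three-character business code. The Python sketch below assumes the validator follows that public scheme; the function names are hypothetical.

```python
def _to_digits(s):
    # Letters map to 10..35 (A=10, ..., Z=35); digits are kept as they are.
    return "".join(str(int(ch, 36)) for ch in s.upper())


def sepa_creditor_check_digits(country, national_id):
    """Compute the two check digits for a country code and a national identifier."""
    remainder = int(_to_digits(national_id + country + "00")) % 97
    return f"{98 - remainder:02d}"


def is_valid_sepa_creditor_id(candidate):
    """Compare positions 3-4 with the value computed from the rest of the string.

    The business code in positions 5-7 (for example ZZZ) is left out of the
    calculation.
    """
    s = candidate.replace(" ", "").upper()
    if len(s) < 8 or not (s.isascii() and s.isalnum()) or not s[2:4].isdigit():
        return False
    return sepa_creditor_check_digits(s[:2], s[7:]) == s[2:4]
```

Because the business code is excluded, the same national identifier yields the same check digits whether the pattern contains ZZZ or a three-digit business code.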

SEPA Creditor Identifier Number North narrow breadth


The narrow breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier North format with checksum validation. It also requires the presence of related
keywords.

Table 45-924 SEPA Creditor Identifier Number North narrow-breadth patterns

Pattern

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Gg][Bb]\d\d\d\d\d\w\w\w\w\w\w\w\d\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Ss][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d

[Ss][Ee]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Ii][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d

[Ii][Ee]\d\d\d\d\d\d\d\d\d\d\d

[Ff][Ii]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d

[Ff][Ii]\d\d\d\d\d\d\d\d\d\d\d\d\d

[Dd][Kk]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d\d

[Dd][Kk]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d\d

Table 45-925 SEPA Creditor Identifier Number North narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

SEPA Creditor Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

sepa, SEPA, SEPA Creditor Identifier, Creditor ID, SEPA ID, Creditor Identifier, sep

SEPA-Gläubiger-Identifikator, Gläubiger-ID, SEPA-ID, Gläubiger-Kennung

ID créancier, ID SEPA, Identifiant du créancie

SEPA Krediter Identifizéierer, Kreditergeld, Krediter Identifizéierer

SEPA kreditoridentifikator, Kreditoridentifikator

Velkojan tunnus, SEPA-tunnus, Velkojan tunniste

ID Creidiúnaí, Aithnitheoir Creidiúnaí

ID del creditore, Identificatore del creditore

Identificador de acreedor SEPA, ID del acreedor, ID de SEPA, Identificador del acreedor

Identificador Credor SEPA, Identificador do Credor
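
The Find keywords validator requires at least one listed keyword or key phrase to be present. This section does not state how close the keyword must be to the match or whether comparison is case-sensitive. The Python sketch below assumes a plain, case-sensitive substring test over the inspected text and uses only a subset of the inputs; has_keyword is a hypothetical name.

```python
# A subset of the keyword inputs listed above.
KEYWORDS = [
    "sepa", "SEPA", "SEPA Creditor Identifier", "Creditor ID",
    "Gläubiger-ID", "ID créancier", "Velkojan tunnus",
]


def has_keyword(text, keywords=KEYWORDS):
    """Return True if at least one keyword or key phrase occurs in text."""
    # Assumption: simple substring containment, case-sensitive as listed.
    return any(kw in text for kw in keywords)
```

A substring test also fires on words that merely contain a keyword (for example "separate" contains "sepa"), so a production check would likely add word-boundary handling.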

SEPA Creditor Identifier Number South


The Single Euro Payments Area (SEPA) is a payments system created by the European Union
that harmonizes the way cashless payments transact between Euro countries. SEPA South
is for Italy, Spain, and Portugal. European consumers, businesses, and government agents
who make payments by direct debit, credit card, or through credit transfers use the SEPA
architecture. The Single Euro Payments Area is approved and regulated by the European
Commission.
The SEPA Creditor Identifier Number South data identifier detects a unique alphanumeric
string that matches the SEPA Creditor Identifier South format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier South format without checksum validation.
See “SEPA Creditor Identifier Number South wide breadth” on page 1438.
■ The medium breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier South format with checksum validation.
See “SEPA Creditor Identifier Number South medium breadth” on page 1439.
■ The narrow breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier South format with checksum validation. It also requires the presence of related
keywords.
See “SEPA Creditor Identifier Number South narrow breadth” on page 1439.

SEPA Creditor Identifier Number South wide breadth


The wide breadth detects a unique alphanumeric string that matches the SEPA Creditor Identifier
South format without checksum validation.

Table 45-926 SEPA Creditor Identifier Number South wide-breadth patterns

Pattern

[Pp][Tt]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d

[Pp][Tt]\d\d\d\d\d\d\d\d\d\d\d

[Ii][Tt]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Ii][Tt]\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Ee][Ss]\d\d[Zz][Zz][Zz][A-Za-z]\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d[Zz][Zz][Zz][LlMmKk]\w\w\w\w\w\w\w[A-Za-z]

[Ee][Ss]\d\d[Zz][Zz][Zz][XxYyZz]\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d\d\d\d[A-Za-z]\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d\d\d\d\d\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d\d\d\d[LlMmKk]\w\w\w\w\w\w\w[A-Za-z]

[Ee][Ss]\d\d\d\d\d[XxYyZz]\d\d\d\d\d\d\d[A-Za-z]

Table 45-927 SEPA Creditor Identifier Number South wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.


SEPA Creditor Identifier Number South medium breadth


The medium breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier South format with checksum validation.

Table 45-928 SEPA Creditor Identifier Number South medium-breadth patterns

Pattern

[Pp][Tt]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d

[Pp][Tt]\d\d\d\d\d\d\d\d\d\d\d

[Ii][Tt]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Ii][Tt]\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Ee][Ss]\d\d[Zz][Zz][Zz][A-Za-z]\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d[Zz][Zz][Zz][LlMmKk]\w\w\w\w\w\w\w[A-Za-z]

[Ee][Ss]\d\d[Zz][Zz][Zz][XxYyZz]\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d\d\d\d[A-Za-z]\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d\d\d\d\d\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d\d\d\d[LlMmKk]\w\w\w\w\w\w\w[A-Za-z]

[Ee][Ss]\d\d\d\d\d[XxYyZz]\d\d\d\d\d\d\d[A-Za-z]

Table 45-929 SEPA Creditor Identifier Number South medium-breadth validators

Mandatory validator Description

SEPA Creditor Number Validation Check Computes the checksum and validates the pattern against
it.

SEPA Creditor Identifier Number South narrow breadth


The narrow breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier South format with checksum validation. It also requires the presence of related
keywords.

Table 45-930 SEPA Creditor Identifier Number South narrow-breadth patterns

Pattern

[Pp][Tt]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d

[Pp][Tt]\d\d\d\d\d\d\d\d\d\d\d

[Ii][Tt]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Ii][Tt]\d\d\d\d\d\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Ee][Ss]\d\d[Zz][Zz][Zz][A-Za-z]\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d[Zz][Zz][Zz][LlMmKk]\w\w\w\w\w\w\w[A-Za-z]

[Ee][Ss]\d\d[Zz][Zz][Zz][XxYyZz]\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d\d\d\d[A-Za-z]\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d\d\d\d\d\d\d\d\d\d\d\d[A-Za-z]

[Ee][Ss]\d\d\d\d\d[LlMmKk]\w\w\w\w\w\w\w[A-Za-z]

[Ee][Ss]\d\d\d\d\d[XxYyZz]\d\d\d\d\d\d\d[A-Za-z]

Table 45-931 SEPA Creditor Identifier Number South narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

SEPA Creditor Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

sepa, SEPA, SEPA Creditor Identifier, Creditor ID, SEPA ID, Creditor Identifier, sep

SEPA-Gläubiger-Identifikator, Gläubiger-ID, SEPA-ID, Gläubiger-Kennung

ID créancier, ID SEPA, Identifiant du créancie

SEPA Krediter Identifizéierer, Kreditergeld, Krediter Identifizéierer

SEPA kreditoridentifikator, Kreditoridentifikator

Velkojan tunnus, SEPA-tunnus, Velkojan tunniste

ID Creidiúnaí, Aithnitheoir Creidiúnaí

ID del creditore, Identificatore del creditore

Identificador de acreedor SEPA, ID del acreedor, ID de SEPA, Identificador del acreedor

Identificador Credor SEPA, Identificador do Credor

SEPA Creditor Identifier Number West


The Single Euro Payments Area (SEPA) is a payments system created by the European Union
that harmonizes the way cashless payments transact between Euro countries. SEPA West is
for Germany, France, the Netherlands, Belgium, Austria, and Luxembourg. European consumers,
businesses, and government agents who make payments by direct debit, credit card, or through
credit transfers use the SEPA architecture. The Single Euro Payments Area is approved and
regulated by the European Commission.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier West format without checksum validation.
See “SEPA Creditor Identifier Number West wide breadth” on page 1442.
■ The medium breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier West format with checksum validation.
See “SEPA Creditor Identifier Number West medium breadth” on page 1443.
■ The narrow breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier West format with checksum validation. It also requires the presence of related
keywords.
See “SEPA Creditor Identifier Number West narrow breadth” on page 1443.

SEPA Creditor Identifier Number West wide breadth


The wide breadth detects a unique alphanumeric string that matches the SEPA Creditor Identifier
West format without checksum validation.

Table 45-932 SEPA Creditor Identifier Number West wide-breadth patterns

Pattern

[Dd][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d

[Dd][Ee]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Nn][Ll]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d\d

[Nn][Ll]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Aa][Tt]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d

[Aa][Tt]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Ll][Uu]\d\d\w\w\w0\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Bb][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d

[Bb][Ee]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Bb][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d\d\d

[Bb][Ee]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Ff][Rr]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w

[Ff][Rr]\d\d\d\d\d\w\w\w\w\w\w

Table 45-933 SEPA Creditor Identifier Number West wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.


SEPA Creditor Identifier Number West medium breadth


The medium breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier West format with checksum validation.

Table 45-934 SEPA Creditor Identifier Number West medium-breadth patterns

Pattern

[Dd][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d

[Dd][Ee]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Nn][Ll]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d\d

[Nn][Ll]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Aa][Tt]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d

[Aa][Tt]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Ll][Uu]\d\d\w\w\w0\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Bb][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d

[Bb][Ee]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Bb][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d\d\d

[Bb][Ee]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Ff][Rr]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w

[Ff][Rr]\d\d\d\d\d\w\w\w\w\w\w

Table 45-935 SEPA Creditor Identifier Number West medium-breadth validators

Mandatory validator Description

SEPA Creditor Number Validation Check Computes the checksum and validates the pattern against
it.

SEPA Creditor Identifier Number West narrow breadth


The narrow breadth detects a unique alphanumeric string that matches the SEPA Creditor
Identifier West format with checksum validation. It also requires the presence of related
keywords.

Table 45-936 SEPA Creditor Identifier Number West narrow-breadth patterns

Pattern

[Dd][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d

[Dd][Ee]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Nn][Ll]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d\d

[Nn][Ll]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Aa][Tt]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d

[Aa][Tt]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Ll][Uu]\d\d\w\w\w0\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w\w

[Bb][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d

[Bb][Ee]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Bb][Ee]\d\d[Zz][Zz][Zz]\d\d\d\d\d\d\d\d\d\d\d\d\d

[Bb][Ee]\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d

[Ff][Rr]\d\d[Zz][Zz][Zz]\w\w\w\w\w\w

[Ff][Rr]\d\d\d\d\d\w\w\w\w\w\w

Table 45-937 SEPA Creditor Identifier Number West narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

SEPA Creditor Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

sepa, SEPA, SEPA Creditor Identifier, Creditor ID, SEPA ID, Creditor Identifier, sep

SEPA-Gläubiger-Identifikator, Gläubiger-ID, SEPA-ID, Gläubiger-Kennung

ID créancier, ID SEPA, Identifiant du créancie

SEPA Krediter Identifizéierer, Kreditergeld, Krediter Identifizéierer

SEPA kreditoridentifikator, Kreditoridentifikator

Velkojan tunnus, SEPA-tunnus, Velkojan tunniste

ID Creidiúnaí, Aithnitheoir Creidiúnaí

ID del creditore, Identificatore del creditore

Identificador de acreedor SEPA, ID del acreedor, ID de SEPA, Identificador del acreedor

Identificador Credor SEPA, Identificador do Credor

Serbia Unique Master Citizen Number


The Serbian Unique Master Citizen Number is a unique identifier for Serbian citizens. It is
assigned to every citizen of Serbia at birth or upon acquiring citizenship.
The Serbia Unique Master Citizen Number data identifier detects a 13-digit number that matches
the Serbian Unique Master Citizen Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 13-digit number that matches the Serbian Unique Master Citizen
Number format without checksum validation. It checks for common test numbers.
See “Serbia Unique Master Citizen Number wide breadth” on page 1446.
■ The medium breadth detects a 13-digit number that matches the Serbian Unique Master
Citizen Number format with checksum validation.
See “Serbia Unique Master Citizen Number medium breadth” on page 1446.
■ The narrow breadth detects a 13-digit number that matches the Serbian Unique Master
Citizen Number format with checksum validation. It checks for common test numbers, and
also requires the presence of related keywords.
See “Serbia Unique Master Citizen Number narrow breadth” on page 1447.

Serbia Unique Master Citizen Number wide breadth


The wide breadth detects a 13-digit number that matches the Serbian Unique Master Citizen
Number format without checksum validation. It checks for common test numbers.

Table 45-938 Serbia Unique Master Citizen Number wide-breadth patterns

Pattern

0[1-9]0[1-9]\d\d\d[07]\d\d\d\d\d

[12][0-9]0[1-9]\d\d\d[07]\d\d\d\d\d

3[01]0[1-9]\d\d\d[07]\d\d\d\d\d

0[1-9]1[012]\d\d\d[07]\d\d\d\d\d

[12][0-9]1[012]\d\d\d[07]\d\d\d\d\d

3[01]1[012]\d\d\d[07]\d\d\d\d\d

Table 45-939 Serbia Unique Master Citizen Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of numbers is not all the same.
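
The six wide-breadth patterns differ only in their first four digits. Those digits appear to enumerate a day (01 to 31) followed by a month (01 to 12), which is consistent with the publicly known DDMMYYYRRBBBK layout of this number; that reading is an assumption, because this guide does not explain the fields. The Python sketch below shows the six patterns, one combined expression that accepts the same strings, and a simple reading of the Duplicate digits check. The names are hypothetical.

```python
import re

# The six wide-breadth patterns from the table above.
WIDE_PATTERNS = [
    r"0[1-9]0[1-9]\d\d\d[07]\d\d\d\d\d",
    r"[12][0-9]0[1-9]\d\d\d[07]\d\d\d\d\d",
    r"3[01]0[1-9]\d\d\d[07]\d\d\d\d\d",
    r"0[1-9]1[012]\d\d\d[07]\d\d\d\d\d",
    r"[12][0-9]1[012]\d\d\d[07]\d\d\d\d\d",
    r"3[01]1[012]\d\d\d[07]\d\d\d\d\d",
]

# Combined form: day 01-31, month 01-12, three digits, a 0 or 7, five digits.
COMBINED = re.compile(r"(0[1-9]|[12][0-9]|3[01])(0[1-9]|1[012])\d{3}[07]\d{5}")


def matches_wide_pattern(s):
    """Return True if s fully matches any of the six listed patterns."""
    return any(re.fullmatch(p, s) for p in WIDE_PATTERNS)


def all_same_digit(s):
    """Duplicate digits check: True when every character in s is identical."""
    return len(set(s)) == 1
```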

Serbia Unique Master Citizen Number medium breadth


The medium breadth detects a 13-digit number that matches the Serbian Unique Master Citizen
Number format with checksum validation.

Table 45-940 Serbia Unique Master Citizen Number medium-breadth patterns

Pattern

0[1-9]0[1-9]\d\d\d[07]\d\d\d\d\d

[12][0-9]0[1-9]\d\d\d[07]\d\d\d\d\d

3[01]0[1-9]\d\d\d[07]\d\d\d\d\d

0[1-9]1[012]\d\d\d[07]\d\d\d\d\d

[12][0-9]1[012]\d\d\d[07]\d\d\d\d\d

3[01]1[012]\d\d\d[07]\d\d\d\d\d

Table 45-941 Serbia Unique Master Citizen Number medium-breadth validators

Mandatory validator Description

Slovenia Unique Master Citizen Number Validation Check Computes the checksum and validates the pattern against it.
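
This guide does not give the checksum formula that the validator applies. The publicly documented Unique Master Citizen Number scheme, which is shared by the successor states of the former Yugoslavia, derives the thirteenth digit from the first twelve with a weighted modulo-11 sum. The Python sketch below assumes the validator follows that public scheme, including the common convention that a result of 10 or 11 maps to 0; the function names are hypothetical.

```python
def umcn_check_digit(first12):
    """Compute the 13th digit from the first 12 digits (public scheme, assumed)."""
    d = [int(ch) for ch in first12]
    # Digits i and i+6 share a weight: 7, 6, 5, 4, 3, 2.
    total = sum(w * (d[i] + d[i + 6]) for i, w in enumerate((7, 6, 5, 4, 3, 2)))
    m = 11 - (total % 11)
    # Assumption: results of 10 or 11 map to 0.
    return m if m <= 9 else 0


def is_valid_umcn(candidate):
    """Return True if the 13-digit candidate carries a consistent check digit."""
    if len(candidate) != 13 or not (candidate.isascii() and candidate.isdigit()):
        return False
    return umcn_check_digit(candidate[:12]) == int(candidate[12])
```

For the synthetic value 010199071000, the weighted sum is 102, so the check digit is 11 - (102 mod 11) = 8.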

Serbia Unique Master Citizen Number narrow breadth


The narrow breadth detects a 13-digit number that matches the Serbian Unique Master Citizen
Number format with checksum validation. It checks for common test numbers, and also requires
the presence of related keywords.

Table 45-942 Serbia Unique Master Citizen Number narrow-breadth patterns

Pattern

0[1-9]0[1-9]\d\d\d[07]\d\d\d\d\d

[12][0-9]0[1-9]\d\d\d[07]\d\d\d\d\d

3[01]0[1-9]\d\d\d[07]\d\d\d\d\d

0[1-9]1[012]\d\d\d[07]\d\d\d\d\d

[12][0-9]1[012]\d\d\d[07]\d\d\d\d\d

3[01]1[012]\d\d\d[07]\d\d\d\d\d

Table 45-943 Serbia Unique Master Citizen Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of numbers is not all the same.

Slovenia Unique Master Citizen Number Validation Check Computes the checksum and validates the pattern against it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

unique master citizen number, unique identification number, unique id number,
unique citizen number, unique citizen Number, uniqueid#, personalid#,
personal identification number, national identification number, nationalid#

јединствен мајстор грађанин Број, Јединствен матични број, јединствен број ид,
Национални идентификациони број

Serbia Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. In Serbia, VAT is
administered by the Tax Administration department of the Ministry of Finance.
The Serbia Value Added Tax (VAT) Number data identifier detects a nine-digit number or
11-character alphanumeric pattern that matches the Serbian VAT Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a nine-digit number or 11-character alphanumeric pattern that
matches the Serbian VAT Number format without checksum validation. It checks for common
test numbers and patterns.
See “Serbia Value Added Tax (VAT) Number wide breadth” on page 1449.
■ The medium breadth detects a nine-digit number or 11-character alphanumeric pattern
that matches the Serbian VAT Number format with checksum validation.
See “Serbia Value Added Tax (VAT) Number medium breadth” on page 1449.
■ The narrow breadth detects a nine-digit number or 11-character alphanumeric pattern that
matches the Serbian VAT Number format with checksum validation. It checks for common
test numbers and patterns, and also requires the presence of related keywords.
See “Serbia Value Added Tax (VAT) Number narrow breadth” on page 1450.

Serbia Value Added Tax (VAT) Number wide breadth


The wide breadth detects a nine-digit number or 11-character alphanumeric pattern that
matches the Serbian VAT Number format without checksum validation. It checks for common
test numbers and patterns.

Table 45-944 Serbia Value Added Tax (VAT) Number wide-breadth patterns

Pattern

\d{9}

[Rr][Ss]\d{9}

[Rr][Ss] \d{9}

[Ss][Rr]\d{9}

[Ss][Rr] \d{9}

Table 45-945 Serbia Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

111111111, 222222222, 333333333, 444444444, 555555555, 666666666, 777777777,
888888888, 999999999

Serbia Value Added Tax (VAT) Number medium breadth


The medium breadth detects a nine-digit number or 11-character alphanumeric pattern that
matches the Serbian VAT Number format with checksum validation.

Table 45-946 Serbia Value Added Tax (VAT) Number medium-breadth patterns

Pattern

\d{9}

[Rr][Ss]\d{9}

[Rr][Ss] \d{9}

[Ss][Rr]\d{9}

[Ss][Rr] \d{9}

Table 45-947 Serbia Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Serbia Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against it.
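
The checksum formula used by this validator is not given in this guide. The nine-digit Serbian tax identification number (PIB) is publicly described as using the ISO 7064 MOD 11,10 check-digit system. The Python sketch below assumes the validator follows that scheme and that an optional RS or SR prefix is ignored; strip_prefix and pib_checksum_ok are hypothetical names.

```python
def strip_prefix(candidate):
    """Remove spaces and an optional RS or SR country prefix."""
    s = candidate.replace(" ", "")
    return s[2:] if s[:2].upper() in ("RS", "SR") else s


def pib_checksum_ok(digits):
    """ISO 7064 MOD 11,10 check over a nine-digit string (assumed scheme)."""
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        return False
    p = 10
    for ch in digits[:8]:
        s = (p + int(ch)) % 10
        if s == 0:
            s = 10
        p = (2 * s) % 11
    # The ninth digit is valid when it brings the running value to 1 modulo 10.
    return (p + int(digits[8])) % 10 == 1
```

For the synthetic value 12345678, the running value after eight digits is 3, so the only check digit that passes is 8.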

Serbia Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a nine-digit number or 11-character alphanumeric pattern that
matches the Serbian VAT Number format with checksum validation. It checks for common test
numbers and patterns, and also requires the presence of related keywords.

Table 45-948 Serbia Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

\d{9}

[Rr][Ss]\d{9}

[Rr][Ss] \d{9}

[Ss][Rr]\d{9}

[Ss][Rr] \d{9}

Table 45-949 Serbia Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

111111111, 222222222, 333333333, 444444444, 555555555, 666666666, 777777777,
888888888, 999999999

Serbia Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat number, value added tax, vat, VAT, VAT#, vat#, VATIN, vatin,
tax identification number, tin, tax number, tax id

poreski identifikacioni broj, PORESKI IDENTIFIKACIONI BROJ, Poreski br., ПДВ број,
Порез на додату вредност, PDV broj, Porez na dodatu vrednost, porez na dodatu vrednost,
PDV, pdv, ПДВ, порески идентификациони број, PIB, pib, пиб, poreski broj, порески број

Singapore NRIC data identifier


The Singapore NRIC (National Registration Identity Card) is the identity document used in
Singapore. The NRIC is required for some government procedures and for commercial
transactions such as opening a bank account. It is also used to gain entry to premises, where
it is surrendered or exchanged for an entry pass.
The wide breadth of the Singapore NRIC data identifier detects nine characters in the pattern
LDDDDDDDL. The last character is used to validate a checksum.

Table 45-950 Singapore NRIC wide-breadth pattern

Pattern

[SFTGsftg]\d{7}\w

Table 45-951 Singapore NRIC wide-breadth validator

Mandatory validator Description

Singapore NRIC Computes the checksum and validates the pattern against
it.
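
The guide states that the last character validates a checksum but does not give the calculation. The Python sketch below follows the widely published NRIC check-letter scheme (weights 2, 7, 6, 5, 4, 3, 2, an offset of 4 for the T and G series, and a lookup table per series); it assumes the Singapore NRIC validator works the same way. The names are hypothetical.

```python
WEIGHTS = (2, 7, 6, 5, 4, 3, 2)

# Check-letter tables indexed by (weighted sum mod 11), per leading letter.
CHECK_LETTERS = {
    "S": "JZIHGFEDCBA",
    "T": "JZIHGFEDCBA",
    "F": "XWUTRQPNMLK",
    "G": "XWUTRQPNMLK",
}


def is_valid_nric(candidate):
    """Return True if the final letter agrees with the weighted digit sum."""
    s = candidate.upper()
    digits = s[1:8]
    if len(s) != 9 or s[0] not in CHECK_LETTERS:
        return False
    if not (digits.isascii() and digits.isdigit()):
        return False
    total = sum(w * int(ch) for w, ch in zip(WEIGHTS, digits))
    if s[0] in "TG":
        total += 4  # offset applied to the T and G series
    return CHECK_LETTERS[s[0]][total % 11] == s[8]
```

The listed pattern ends in \w, so it also admits a trailing digit or underscore; under this scheme only a letter from the matching table passes the checksum step.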

Slovakia Driver's Licence Number


A Slovak driver's license is a document that confirms the holder's right to drive motor
vehicles. Slovak driver's licenses are granted by the Ministry of Interior.
The Slovakia Driver's Licence Number data identifier detects an eight-character alphanumeric
pattern that matches the Slovak driver's license number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern that matches the Slovak
driver's license number format. It checks for common test patterns.
See “Slovakia Driver's Licence Number wide breadth” on page 1452.
■ The narrow breadth detects an eight-character alphanumeric pattern that matches the
Slovak driver's license number format. It checks for common test patterns, and also requires
the presence of related keywords.
See “Slovakia Driver's Licence Number narrow breadth” on page 1452.

Slovakia Driver's Licence Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern that matches the Slovak
driver's license number format. It checks for common test patterns.

Table 45-952 Slovakia Driver's Licence Number wide-breadth patterns

Pattern

[A-Za-z]\d{7}

Table 45-953 Slovakia Driver's Licence Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111, 2222222, 3333333, 4444444, 5555555, 6666666, 7777777, 8888888, 9999999
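
The Exclude ending characters validator rejects matches that end with one of the listed test values. The Python sketch below combines the wide-breadth pattern with that rule; it is an illustration under the assumption that the comparison is a plain suffix test, and passes_wide_checks is a hypothetical name.

```python
import re

# Wide-breadth pattern from the table above: one letter followed by seven digits.
PATTERN = re.compile(r"[A-Za-z]\d{7}")

# "1111111" through "9999999", as listed for the validator.
EXCLUDED_ENDINGS = tuple(str(d) * 7 for d in range(1, 10))


def passes_wide_checks(candidate):
    """Pattern match plus the exclude-ending-characters rule."""
    if not PATTERN.fullmatch(candidate):
        return False
    # Assumption: the rule is a simple suffix comparison against the listed values.
    return not candidate.endswith(EXCLUDED_ENDINGS)
```

Note that the list does not contain 0000000, so a value such as A0000000 passes this sketch.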

Slovakia Driver's Licence Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern that matches the Slovak
driver's license number format. It checks for common test patterns, and also requires the
presence of related keywords.

Table 45-954 Slovakia Driver's Licence Number narrow-breadth patterns

Pattern

[A-Za-z]\d{7}

Table 45-955 Slovakia Driver's Licence Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111, 2222222, 3333333, 4444444, 5555555, 6666666, 7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

driver license, driver licence, drivers license, drivers licence, driving license,
driving licence, driver license number, driver licence number, drivers license number,
drivers licence number, driving license number, driving licence number, driver's license,
driver's licence, Driver's License, Driver's Licence, driver's license number,
driver's licence number, Driver's License Number, Driver's Licence Number, DLNo#, dlno#,
drivers lic., driver permit, drivers permit, driving permit, license number, licence number

vodičský preukaz, Vodičský preukaz, VODIČSKÝ PREUKAZ, číslo vodičského preukazu,
ovládače lic., povolenie vodiča, povolenia vodičov, povolenie na jazdu, povolenie jazdu,
číslo licencie

Slovakia National Identification Number


In Slovakia, the state authorities issue an identification card to every citizen at 15 years of
age. The identification number is used in the Slovak Republic as the primary unique identifier
for every person by government institutions, banks, and so on.
The Slovakia National Identification Number data identifier detects either an eight-character
alphanumeric pattern or a 9- to 10-digit number that matches the Slovakia National Identification
Number format.
The Slovakia National Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects either an eight-character alphanumeric pattern or a 9- to 10-digit
number without checksum validation.
See “Slovakia National Identification Number wide breadth” on page 1454.
■ The medium breadth detects either an eight-character alphanumeric pattern or a 9- to 10-digit
number with checksum validation.
See “Slovakia National Identification Number medium breadth” on page 1455.
■ The narrow breadth detects either an eight-character alphanumeric pattern or a 9- to 10-digit
number with checksum validation. It also requires the presence of related keywords.
See “Slovakia National Identification Number narrow breadth” on page 1455.

Slovakia National Identification Number wide breadth


The wide breadth detects either an eight-character alphanumeric pattern or a 9- to 10-digit
number without checksum validation.

Table 45-956 Slovakia National Identification Number wide-breadth patterns

Patterns

\d{10}

\d{9}

[A-Za-z]{2} \d{6}

[A-Za-z]{2}/\d{6}

\d{6}/\d{3}

\d{6}-\d{3}

\d{6} \d{3}

\d{6}/\d{4}

\d{6}-\d{4}

\d{6} \d{4}

Table 45-957 Slovakia National Identification Number wide-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.
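
The following Python sketch shows one way the wide-breadth patterns and validators could fit together: the patterns from Table 45-956 are combined into a single alternation, and candidates whose digits are all identical are discarded. The helper names and the digit-only lookarounds used for the Number delimiter validator are assumptions for illustration, not the product's implementation.

```python
import re

# Patterns from Table 45-956, joined into one alternation.
PATTERNS = [
    r"\d{10}", r"\d{9}",
    r"[A-Za-z]{2} \d{6}", r"[A-Za-z]{2}/\d{6}",
    r"\d{6}/\d{4}", r"\d{6}-\d{4}", r"\d{6} \d{4}",
    r"\d{6}/\d{3}", r"\d{6}-\d{3}", r"\d{6} \d{3}",
]

# Assumed approximation of the Number delimiter validator: the candidate
# must not be preceded or followed by another digit.
CANDIDATE = re.compile(r"(?<!\d)(?:" + "|".join(PATTERNS) + r")(?!\d)")


def has_duplicate_digits(candidate):
    """True when every digit in the candidate is the same (for example 111111/1111)."""
    digits = [c for c in candidate if c.isdigit()]
    return len(set(digits)) == 1


def find_wide_matches(text):
    """Return candidates that pass the Duplicate digits check (hypothetical helper)."""
    return [
        m.group()
        for m in CANDIDATE.finditer(text)
        if not has_duplicate_digits(m.group())
    ]
```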



Slovakia National Identification Number medium breadth


The medium breadth detects either an eight-character alphanumeric pattern or a 9- to 10-digit
number with checksum validation.

Table 45-958 Slovakia National Identification Number medium-breadth patterns

Patterns

\d{10}

\d{9}

[A-Za-z]{2} \d{6}

[A-Za-z]{2}/\d{6}

\d{6}/\d{3}

\d{6}-\d{3}

\d{6} \d{3}

\d{6}/\d{4}

\d{6}-\d{4}

\d{6} \d{4}

Table 45-959 Slovakia National Identification Number medium-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Slovakia National Identification Number Validation Check Computes the checksum and validates the pattern against
it.
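
This guide does not spell out the checksum that the validation check computes. As a hedged illustration only, the sketch below applies the publicly documented convention for ten-digit Slovak birth numbers (rodné číslo), under which the whole number is divisible by 11. Whether the product uses exactly this rule is an assumption, as are the function name and the sample number. Nine-digit numbers carry no check digit under this convention, so the sketch reports them as unverifiable by returning False.

```python
import re


def ten_digit_checksum_ok(candidate):
    """Check the divisible-by-11 convention for a ten-digit birth number.

    Assumed rule, not taken from this guide. Separators (space, slash,
    hyphen) are removed before the check.
    """
    digits = re.sub(r"[ /-]", "", candidate)
    if not digits.isdigit() or len(digits) != 10:
        return False  # nothing to verify for other lengths
    return int(digits) % 11 == 0
```

For example, the synthetic value 780123/3550 passes this check, while 780123/3551 does not.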

Slovakia National Identification Number narrow breadth


The narrow breadth detects either an eight-character alphanumeric pattern or a 9- to 10-digit
number with checksum validation. It also requires the presence of related keywords.

Table 45-960 Slovakia National Identification Number narrow-breadth patterns

Patterns

\d{10}

\d{9}

[A-Za-z]{2} \d{6}

[A-Za-z]{2}/\d{6}

\d{6}/\d{3}

\d{6}-\d{3}

\d{6} \d{3}

\d{6}/\d{4}

\d{6}-\d{4}

\d{6} \d{4}

Table 45-961 Slovakia National Identification Number narrow-breadth validators

Mandatory validator Description

Slovakia National Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.


Find keywords At least one of the following keywords or key phrases must
be present for the data to match when you use this option.

Inputs:

id number, identity card number, identity card no,


national identity card number, national identity card
no, national identification number, national
identification no, identification number, identification
no

identifikačné číslo, személyi igazolvány száma,


személyigazolvány szám, číslo občianského preukazu,
identifikačná karta č, személyi igazolvány szám,
nemzeti személyi igazolvány száma, číslo národnej
identifikačnej karty, národná identifikačná karta č,
nemzeti személyazonosító igazolvány, nemzeti
azonosító szám, národné identifikačné číslo, národná
identifikačná značka č, nemzeti azonosító szám,
azonosító szám, identifikačné číslo, rodné číslo, RČ

Number delimiter Validates a match by checking the surrounding numbers.

Slovakia Passport Number


Slovak passports are issued to citizens of Slovakia to facilitate international travel.
The Slovakia Passport Number data identifier detects an eight- or nine-character alphanumeric
pattern that matches the Slovak passport number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight- or nine-character alphanumeric pattern that matches
the Slovak passport number format. It checks for common test patterns.
See “Slovakia Passport Number wide breadth” on page 1458.
■ The narrow breadth detects an eight- or nine-character alphanumeric pattern that matches
the Slovak passport number format. It checks for common test patterns, and also requires
the presence of related keywords.
See “Slovakia Passport Number narrow breadth” on page 1458.

Slovakia Passport Number wide breadth


The wide breadth detects an eight- or nine-character alphanumeric pattern that matches the
Slovak passport number format. It checks for common test patterns.

Table 45-962 Slovakia Passport Number wide-breadth patterns

Pattern

[A-Za-z]{2}\d{7}

[A-Za-z]\d{7}

Table 45-963 Slovakia Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111, 2222222, 3333333, 4444444, 5555555,


6666666, 7777777, 8888888, 9999999

Slovakia Passport Number narrow breadth


The narrow breadth detects an eight- or nine-character alphanumeric pattern that matches
the Slovak passport number format. It checks for common test patterns, and also requires the
presence of related keywords.

Table 45-964 Slovakia Passport Number narrow-breadth patterns

Pattern

[A-Za-z]{2}\d{7}

[A-Za-z]\d{7}

Table 45-965 Slovakia Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.




Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111, 2222222, 3333333, 4444444, 5555555,


6666666, 7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

passport, passport number, passport no, passportno,
passport#, Passport, Passport No., PASSPORT

PASSEPORT, passeport, cestovný pas, číslo pasu,
pas č, Číslo pasu, PAS, CESTOVNÝ PAS, Passeport
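
To show how the narrow breadth builds on the wide breadth, the following Python sketch adds a keyword requirement to the pattern and ending-character checks. It treats a keyword as present when it appears anywhere in the scanned text as a case-sensitive substring; that matching rule, the helper name, and the lookarounds standing in for the Number delimiter validator are assumptions for illustration, not documented product behavior.

```python
import re

# The two patterns from Table 45-964 ([A-Za-z]{2}\d{7} and [A-Za-z]\d{7})
# written as one expression, with assumed delimiter lookarounds.
PATTERN = re.compile(r"(?<![A-Za-z0-9])[A-Za-z]{1,2}\d{7}(?![A-Za-z0-9])")

EXCLUDED_ENDINGS = tuple(str(d) * 7 for d in range(1, 10))

# Keyword inputs from the Find keywords validator.
KEYWORDS = [
    "passport", "passport number", "passport no", "passportno", "passport#",
    "Passport", "Passport No.", "PASSPORT", "PASSEPORT", "passeport",
    "cestovný pas", "číslo pasu", "pas č", "Číslo pasu", "PAS",
    "CESTOVNÝ PAS", "Passeport",
]


def find_narrow_matches(text):
    """Return matches only when at least one keyword is present (hypothetical helper)."""
    if not any(keyword in text for keyword in KEYWORDS):
        return []
    return [
        m.group()
        for m in PATTERN.finditer(text)
        if not m.group().endswith(EXCLUDED_ENDINGS)
    ]
```

Plain substring matching is deliberately crude; it is used here only to keep the sketch short.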

Slovakia Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. For Slovakia, VAT is
administered by the tax office for the region in which the business is established.
The Slovakia Value Added Tax (VAT) Number data identifier detects a 12-character
alphanumeric pattern that matches the Slovakia VAT number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 12-character alphanumeric pattern that matches the Slovakia
VAT number format without checksum validation. It checks for common test patterns.
See “Slovakia Value Added Tax (VAT) Number wide breadth” on page 1460.
■ The medium breadth detects a 12-character alphanumeric pattern that matches the Slovakia
VAT number format with checksum validation.
See “Slovakia Value Added Tax (VAT) Number medium breadth” on page 1460.
■ The narrow breadth detects a 12-character alphanumeric pattern that matches the Slovakia
VAT number format with checksum validation. It checks for common test patterns, and also
requires the presence of related keywords.
See “Slovakia Value Added Tax (VAT) Number narrow breadth” on page 1460.

Slovakia Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 12-character alphanumeric pattern that matches the Slovakia VAT
number format without checksum validation. It checks for common test patterns.

Table 45-966 Slovakia Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Ss][Kk][1-9][0-9][234789]\d{7}

Table 45-967 Slovakia Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111, 2222222, 3333333, 4444444, 5555555,


6666666, 7777777, 8888888, 9999999

Slovakia Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 12-character alphanumeric pattern that matches the Slovakia
VAT number format with checksum validation.

Table 45-968 Slovakia Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[Ss][Kk][1-9][0-9][234789]\d{7}

Table 45-969 Slovakia Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Slovakia Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against
it.
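
The checksum itself is not described in this guide. As an illustration under that caveat, the sketch below uses the commonly published rule for Slovak VAT numbers, in which the ten digits after the SK prefix form a number that is divisible by 11. The assumption that the product applies this rule, the function name, and the sample value are all illustrative.

```python
import re

# Pattern from Table 45-968, with the digits captured for the check.
PATTERN = re.compile(r"[Ss][Kk]([1-9][0-9][234789]\d{7})")


def sk_vat_checksum_ok(candidate):
    """Assumed rule: the ten-digit part must be divisible by 11."""
    match = PATTERN.fullmatch(candidate)
    return bool(match) and int(match.group(1)) % 11 == 0
```

Under this rule, SK2022749619 passes, and changing its last digit makes it fail.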

Slovakia Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 12-character alphanumeric pattern that matches the Slovakia
VAT number format with checksum validation. It checks for common test patterns, and also
requires the presence of related keywords.

Table 45-970 Slovakia Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Ss][Kk][1-9][0-9][234789]\d{7}

Table 45-971 Slovakia Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111, 2222222, 3333333, 4444444, 5555555,


6666666, 7777777, 8888888, 9999999

Slovakia Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat number, VAT number, číslo DPH, vat, vat#, vatno#,


value added tax number, číslo dane z pridanej hodnoty,
VAT, VAT#, identifikačné číslo vat, vat no, VATIN, vatin,
value added tax, dph, DPH, daň z pridanej hodnoty,
daň pridanej hodnoty, číslo dane pridanej hodnoty,
identifikačné číslo DPH

Slovenia Passport Number


Slovenian passports are issued to citizens of Slovenia to facilitate international travel.
The Slovenia Passport Number data identifier detects a nine-character alphanumeric pattern
that matches the Slovenian passport number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a nine-character alphanumeric pattern that matches the Slovenian
passport number format. It checks for common test patterns.
See “Slovenia Passport Number wide breadth” on page 1462.
■ The narrow breadth detects a nine-character alphanumeric pattern that matches the
Slovenian passport number format. It checks for common test patterns, and also requires
the presence of related keywords.
See “Slovenia Passport Number narrow breadth” on page 1462.

Slovenia Passport Number wide breadth


The wide breadth detects a nine-character alphanumeric pattern that matches the Slovenian
passport number format. It checks for common test patterns.

Table 45-972 Slovenia Passport Number wide-breadth patterns

Pattern

[Pp][Bb]\d{7}

Table 45-973 Slovenia Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111, 2222222, 3333333, 4444444, 5555555,


6666666, 7777777, 8888888, 9999999

Slovenia Passport Number narrow breadth


The narrow breadth detects a nine-character alphanumeric pattern that matches the Slovenian
passport number format. It checks for common test patterns, and also requires the presence
of related keywords.

Table 45-974 Slovenia Passport Number narrow-breadth patterns

Pattern

[Pp][Bb]\d{7}

Table 45-975 Slovenia Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111, 2222222, 3333333, 4444444, 5555555,


6666666, 7777777, 8888888, 9999999


Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

Passport, passport number, passport, passport no,


passport book, passport#, passportno, passport card

številka potnega lista, potni list, knjiga potnega lista,


potni list #, passeport, Passeport

Slovenia Tax Identification Number


The Slovenia Tax Identification Number is a unique identifier of individuals and legal entities
for tax purposes. The Financial Administration of the Republic of Slovenia issues and
administers tax identification numbers in Slovenia.
The Slovenia Tax Identification Number data identifier detects an eight-digit number that
matches the Slovenian tax identification number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight-digit number that matches the Slovenian tax identification
number format without checksum validation. It checks for common test numbers.
See “Slovenia Tax Identification Number wide breadth” on page 1463.
■ The medium breadth detects an eight-digit number that matches the Slovenian tax
identification number format with checksum validation.
See “Slovenia Tax Identification Number medium breadth” on page 1464.
■ The narrow breadth detects an eight-digit number that matches the Slovenian tax
identification number format with checksum validation. It checks for common test numbers,
and also requires the presence of related keywords.
See “Slovenia Tax Identification Number narrow breadth” on page 1464.

Slovenia Tax Identification Number wide breadth


The wide breadth detects an eight-digit number that matches the Slovenian tax identification
number format without checksum validation. It checks for common test numbers.

Table 45-976 Slovenia Tax Identification Number wide-breadth patterns

Pattern

[1-9]\d{7}

Table 45-977 Slovenia Tax Identification Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

11111111, 22222222, 33333333, 44444444, 55555555,


66666666, 77777777, 88888888, 99999999

Slovenia Tax Identification Number medium breadth


The medium breadth detects an eight-digit number that matches the Slovenian tax identification
number format with checksum validation.

Table 45-978 Slovenia Tax Identification Number medium-breadth patterns

Pattern

[1-9]\d{7}

Table 45-979 Slovenia Tax Identification Number medium-breadth validators

Mandatory validator Description

Slovenia Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.
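
This guide does not document the checksum algorithm. The sketch below shows the commonly published weighted modulo-11 scheme for Slovenian tax numbers as one plausible reading: the first seven digits are multiplied by the weights 8 down to 2, and the check digit is 11 minus the remainder of the sum divided by 11, with a result of 10 mapped to 0 and a result of 11 treated as invalid. That the product uses this scheme is an assumption, as are the function name and sample value.

```python
import re

# Pattern from Table 45-978.
PATTERN = re.compile(r"[1-9]\d{7}")


def si_tax_checksum_ok(number):
    """Assumed weighted modulo-11 check for an eight-digit tax number."""
    if not PATTERN.fullmatch(number):
        return False
    # Weights 8, 7, 6, 5, 4, 3, 2 applied to the first seven digits.
    total = sum(int(d) * w for d, w in zip(number[:7], range(8, 1, -1)))
    check = 11 - total % 11
    if check == 11:
        return False  # no valid check digit exists in this scheme
    if check == 10:
        check = 0
    return check == int(number[7])
```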

Slovenia Tax Identification Number narrow breadth


The narrow breadth detects an eight-digit number that matches the Slovenian tax identification
number format with checksum validation. It checks for common test numbers, and also requires
the presence of related keywords.

Table 45-980 Slovenia Tax Identification Number narrow-breadth patterns

Pattern

[1-9]\d{7}

Table 45-981 Slovenia Tax Identification Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

11111111, 22222222, 33333333, 44444444, 55555555,


66666666, 77777777, 88888888, 99999999

Slovenia Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

tax number, Slovenia TIN, SloveniaTIN, TIN#, tin#, tin,


TIN, Slovenian Tax Number, Tax Number, tax
identification number

identifikacijska številka davka, Slovenska davčna


številka, Davčna številka

Slovenia Unique Master Citizen Number


The unique master citizen number is a unique identification number assigned to every citizen
of Slovenia at birth or on acquiring citizenship.
The Slovenia Unique Master Citizen Number data identifier detects a 13-digit number that
matches the Slovenia Unique Master Citizen Number format.
The Slovenia Unique Master Citizen Number data identifier provides three breadths of detection:
■ The wide breadth detects a 13-digit number without checksum validation.
See “Slovenia Unique Master Citizen Number wide breadth” on page 1465.
■ The medium breadth detects a 13-digit number with checksum validation.
See “Slovenia Unique Master Citizen Number medium breadth” on page 1466.
■ The narrow breadth detects a 13-digit number with checksum validation. It also requires
the presence of related keywords.
See “Slovenia Unique Master Citizen Number narrow breadth” on page 1466.

Slovenia Unique Master Citizen Number wide breadth


The wide breadth detects a 13-digit number without checksum validation.

Table 45-982 Slovenia Unique Master Citizen Number wide-breadth pattern

Pattern

\d{7}[05]\d{5}

Table 45-983 Slovenia Unique Master Citizen Number wide-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.

Slovenia Unique Master Citizen Number medium breadth


The medium breadth detects a 13-digit number with checksum validation.

Table 45-984 Slovenia Unique Master Citizen Number medium-breadth pattern

Pattern

\d{7}[05]\d{5}

Table 45-985 Slovenia Unique Master Citizen Number medium-breadth validator

Mandatory validator Description

Slovenia Unique Master Citizen Number Validation Check Computes the checksum and validates the pattern against
it.
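
The guide does not state how the checksum is computed. As a hedged example, the sketch below follows the widely published scheme for the unique master citizen number (EMŠO): the first twelve digits are weighted 7, 6, 5, 4, 3, 2, 7, 6, 5, 4, 3, 2, and the control digit is 11 minus the remainder of the weighted sum divided by 11. Treating a result of 11 as 0 and a result of 10 as invalid is an assumption of this sketch, as are the function name and the sample value.

```python
import re

# Pattern from Table 45-984.
PATTERN = re.compile(r"\d{7}[05]\d{5}")
WEIGHTS = (7, 6, 5, 4, 3, 2, 7, 6, 5, 4, 3, 2)


def emso_checksum_ok(number):
    """Assumed weighted modulo-11 check for a 13-digit number."""
    if not PATTERN.fullmatch(number):
        return False
    # zip stops after the twelve weighted digits; the 13th is the control digit.
    total = sum(int(d) * w for d, w in zip(number, WEIGHTS))
    control = 11 - total % 11
    if control == 11:
        control = 0
    # A control value of 10 never equals a single digit, so it is rejected.
    return control == int(number[12])
```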

Slovenia Unique Master Citizen Number narrow breadth


The narrow breadth detects a 13-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-986 Slovenia Unique Master Citizen Number narrow-breadth pattern

Pattern

\d{7}[05]\d{5}

Table 45-987 Slovenia Unique Master Citizen Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding numbers.




Slovenia Unique Master Citizen Number Validation Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords At least one of the following keywords or key phrases must
be present for the data to match when you use this option.

Inputs:

unique master citizen number, unique identification


number, unique id number, unique citizen number

EMŠO, emšo, edinstvena številka državljana, enotna


identifikacijska številka, Enotna maticna številka
obcana, enotna maticna številka obcana, številka
državljana, edinstvena identifikacijska številka

Slovenia Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. For Slovenia, VAT is
administered by the tax office for the region in which the business is established.
The Slovenia Value Added Tax (VAT) Number data identifier detects a 10-character
alphanumeric pattern that matches the Slovenian VAT number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 10-character alphanumeric pattern that matches the Slovenian
VAT number format without checksum validation. It checks for common test patterns.
See “Slovenia Value Added Tax (VAT) Number wide breadth” on page 1468.
■ The medium breadth detects a 10-character alphanumeric pattern that matches the
Slovenian VAT number format with checksum validation.
See “Slovenia Value Added Tax (VAT) Number medium breadth” on page 1468.
■ The narrow breadth detects a 10-character alphanumeric pattern that matches the Slovenian
VAT number format with checksum validation. It checks for common test patterns, and also
requires the presence of related keywords.
See “Slovenia Value Added Tax (VAT) Number narrow breadth” on page 1469.

Slovenia Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 10-character alphanumeric pattern that matches the Slovenian
VAT number format without checksum validation. It checks for common test patterns.

Table 45-988 Slovenia Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Ss][Ii]\d{8}

[Ss][Ii] \d{8}

Table 45-989 Slovenia Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

11111111, 22222222, 33333333, 44444444, 55555555,


66666666, 77777777, 88888888, 99999999

Slovenia Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 10-character alphanumeric pattern that matches the Slovenian
VAT number format with checksum validation.

Table 45-990 Slovenia Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[Ss][Ii]\d{8}

[Ss][Ii] \d{8}

Table 45-991 Slovenia Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Slovenia Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against
it.

Slovenia Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 10-character alphanumeric pattern that matches the Slovenian
VAT number format with checksum validation. It checks for common test patterns, and also
requires the presence of related keywords.

Table 45-992 Slovenia Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Ss][Ii]\d{8}

[Ss][Ii] \d{8}

Table 45-993 Slovenia Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

11111111, 22222222, 33333333, 44444444, 55555555,


66666666, 77777777, 88888888, 99999999

Slovenia Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat, vat number, value added tax number, vat


identification number, številka davka na dodano
vrednost, vat#, vat no, DDV št, slovenia vat št

South African Personal Identification Number


Every citizen has a national identification number in South Africa. The number serves as proof
of identification.
The South African Personal Identification Number data identifier detects a 13-digit number that
matches the South African Personal Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 13-digit number without checksum validation.
See “South African Personal Identification Number wide breadth” on page 1470.
■ The medium breadth detects a 13-digit number with checksum validation.
See “South African Personal Identification Number medium breadth” on page 1470.
■ The narrow breadth detects a 13-digit number that passes checksum validation. It also
requires the presence of related keywords.
See “South African Personal Identification Number narrow breadth” on page 1471.

South African Personal Identification Number wide breadth


The wide breadth detects a 13-digit number without checksum validation.

Table 45-994 South African Personal Identification Number wide-breadth patterns

Patterns

[0123678]\d{8}

[0123678]\d{3}-\d{4}-\d

Table 45-995 South African Personal Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

South African Personal Identification Number medium breadth


The medium breadth detects a 13-digit number with checksum validation.

Table 45-996 South African Personal Identification Number medium-breadth patterns

Patterns

\d{6}[ -]\d{4}[ -][01]\d{2}

\d{10}[01]\d{2}

Table 45-997 South African Personal Identification Number medium-breadth validators

Mandatory validators Description

South African Personal Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding numbers.
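
The guide does not name the checksum algorithm. South African identity numbers are publicly documented as ending in a Luhn check digit, so the following sketch uses the Luhn algorithm as an assumed stand-in for the validation check. The function names and the sample value are illustrative.

```python
import re

# Patterns from Table 45-996, joined into one alternation.
PATTERN = re.compile(r"\d{6}[ -]\d{4}[ -][01]\d{2}|\d{10}[01]\d{2}")


def luhn_ok(digits):
    """Standard Luhn check: double every second digit from the right."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def sa_id_checksum_ok(candidate):
    """Match a medium-breadth pattern, then apply the assumed Luhn check."""
    if not PATTERN.fullmatch(candidate):
        return False
    return luhn_ok(re.sub(r"[ -]", "", candidate))
```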



South African Personal Identification Number narrow breadth


The narrow breadth detects a 13-digit number that passes checksum validation. It also requires
the presence of related keywords.

Table 45-998 South African Personal Identification Number narrow-breadth patterns

Patterns

\d{6}[ -]\d{4}[ -][01]\d{2}

\d{10}[01]\d{2}

Table 45-999 South African Personal Identification Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

South African Personal Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords If you select this option, at least one of the
following keywords or key phrases must be present for the
data to be matched.

Inputs:

national identification number, national identity


number, national insurance number, personal identity
number, personal identification number, insurance
number, nationalid#, personalidentityno#, unique
identity number, uniqueidentityno#

nasionale identifikasie nommer, nasionale


identiteitsnommer, versekering aantal, persoonlike
identiteitsnommer, unieke identiteitsnommer,
identiteitsnommer, identiteitsnommer#,
versekeringaantal#, nasionaleidentiteitsnommer#

South Korea Resident Registration Number


The South Korea Resident Registration Number is a 13-digit number issued to all residents
of the Republic of Korea. Similar to national identification numbers in other countries, it is used
to identify people in various private transactions such as in banking and employment. It is also
used extensively for online identification purposes.

The South Korea Resident Registration Number data identifier detects a 13-digit number that
matches the South Korea Resident Registration Number format.
This data identifier provides three breadths of detection:
■ The wide breadth matches numbers with duplicate digit validation.
See “South Korea Resident Registration Number wide breadth” on page 1472.
■ The medium breadth matches numbers with checksum validation.
See “South Korea Resident Registration Number medium breadth” on page 1472.
■ The narrow breadth matches numbers with checksum validation. It also requires the
presence of related keywords.
See “South Korea Resident Registration Number narrow breadth” on page 1473.

South Korea Resident Registration Number wide breadth


The wide breadth detects a 13-digit number that contains an encoded birth date, gender, and
origin of birth. It validates the number with duplicate digit validation.

Table 45-1000 South Korea Resident Registration Number wide-breadth patterns

Patterns

\d{2}[01]\d[0123]\d{8}

\d{2}[01]\d[0123]\d-\d{7}

\d\d[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d[ ]\d{7}

Table 45-1001 South Korea Resident Registration Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

South Korea Resident Registration Number medium breadth


The medium breadth detects a 13-digit number that contains an encoded birth date, gender, and
origin of birth. It also validates the pattern using a checksum.

Table 45-1002 South Korea Resident Registration Number medium-breadth patterns

Pattern

\d{2}[01]\d[0123]\d{8}

\d{2}[01]\d[0123]\d-\d{7}

\d\d[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d[ ]\d{7}

Table 45-1003 South Korea Resident Registration Number medium-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding numbers.

Advanced KRRN Validation Computes the checksum and validates the pattern against it.
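
The guide does not describe what Advanced KRRN Validation computes. As an illustration, the sketch below uses the commonly published check-digit scheme for resident registration numbers: the first twelve digits are weighted 2, 3, 4, 5, 6, 7, 8, 9, 2, 3, 4, 5, and the check digit is (11 minus the remainder of the weighted sum divided by 11) modulo 10. Whether the validator applies this scheme, or additional checks, is an assumption. The function name is hypothetical and the sample numbers are synthetic.

```python
import re

# The patterns in Table 45-1002 differ only in the optional separator,
# so they are written here as one expression.
PATTERN = re.compile(r"\d{2}[01]\d[0123]\d[ -]?\d{7}")
WEIGHTS = (2, 3, 4, 5, 6, 7, 8, 9, 2, 3, 4, 5)


def krrn_checksum_ok(candidate):
    """Assumed weighted modulo-11 check for a 13-digit number."""
    if not PATTERN.fullmatch(candidate):
        return False
    digits = re.sub(r"[ -]", "", candidate)
    # zip stops after the twelve weighted digits; the 13th is the check digit.
    total = sum(int(d) * w for d, w in zip(digits, WEIGHTS))
    return (11 - total % 11) % 10 == int(digits[12])
```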

South Korea Resident Registration Number narrow breadth


The narrow breadth detects a 13-digit number that contains an encoded birth date, gender, and
origin of birth. It also validates the pattern using a checksum, and requires the presence of
related keywords.

Table 45-1004 South Korea Resident Registration Number narrow-breadth patterns

Patterns

\d{2}[01]\d[0123]\d{8}

\d{2}[01]\d[0123]\d-\d{7}

\d\d[01]\d[0123]\d-\d{7}

\d{2}[01]\d[0123]\d[ ]\d{7}

Table 45-1005 South Korea Resident Registration Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding numbers.

Advanced KRRN Validation Computes the checksum and validates the pattern against it.

Duplicate digits Ensures that a string of digits is not all the same.
Library of system data identifiers 1474
Spain Value Added Tax (VAT) Number

Table 45-1005 South Korea Resident Registration Number narrow-breadth validators


(continued)

Mandatory validators Description

Find keywords At least one of the following keywords or key


phrases must be present for he data to match when
you use this option.

Inputs:

주민등록번호,주민번호

Resident Registration Number, Resident Number
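
The keyword requirement can be pictured with the following minimal Python sketch. It only tests whether any listed keyword occurs anywhere in the inspected text, ignoring case. The function name `has_keyword` is hypothetical, and the product may apply additional rules (for example proximity or word boundaries) that this sketch does not model.

```python
# Keywords copied from the Inputs list above.
KRRN_KEYWORDS = (
    "주민등록번호",
    "주민번호",
    "Resident Registration Number",
    "Resident Number",
)


def has_keyword(text: str, keywords=KRRN_KEYWORDS) -> bool:
    """Return True when at least one keyword occurs in the text (case-insensitive)."""
    lowered = text.casefold()
    return any(keyword.casefold() in lowered for keyword in keywords)
```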

Spain Value Added Tax (VAT) Number


VAT is a consumption tax that is borne by the end consumer. VAT is paid for each transaction
in the manufacturing and distribution process. VAT in Spain is overseen by the State Tax
Administration Agency.
The Spain Value Added Tax (VAT) Number data identifier detects an 11-character alphanumeric
pattern that matches the Spanish VAT number format.
The Spain Value Added Tax (VAT) Number data identifier provides three breadths of detection:
■ The wide breadth detects an 11-character alphanumeric pattern without checksum validation.
See “Spain Value Added Tax (VAT) Number wide breadth” on page 1474.
■ The medium breadth detects an 11-character alphanumeric pattern with checksum
validation.
See “Spain Value Added Tax (VAT) Number medium breadth” on page 1475.
■ The narrow breadth detects an 11-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.
See “Spain Value Added Tax (VAT) Number narrow breadth” on page 1476.

Spain Value Added Tax (VAT) Number wide breadth


The wide breadth detects an 11-character alphanumeric pattern without checksum validation.

Table 45-1006 Spain Value Added Tax (VAT) Number wide-breadth patterns

Patterns

[Ee][Ss][0-9A-Za-z]\d{7}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]\d{7}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]-\d{7}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]-\d{2}.\d{3}.\d{2}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]-\d{2},\d{3},\d{2}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]-\d{2}/\d{5}[0-9A-Za-z]

Table 45-1007 Spain Value Added Tax (VAT) Number wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333, 444444444,
555555555, 666666666, 777777777, 888888888, 999999999
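
A minimal Python sketch of the Exclude ending characters rule follows. It assumes that separators are ignored when the ending is compared, which this guide does not state. The name `has_excluded_ending` is hypothetical.

```python
import re

# The ten excluded endings listed above: nine repetitions of each digit.
EXCLUDED_ENDINGS = tuple(str(d) * 9 for d in range(10))


def has_excluded_ending(candidate: str) -> bool:
    """Return True when the candidate ends with one of the excluded values."""
    # Assumption: spaces, hyphens, dots, commas, and slashes are ignored.
    normalized = re.sub(r"[^0-9A-Za-z]", "", candidate)
    return normalized.endswith(EXCLUDED_ENDINGS)
```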

Spain Value Added Tax (VAT) Number medium breadth


The medium breadth detects an 11-character alphanumeric pattern with checksum validation.

Table 45-1008 Spain Value Added Tax (VAT) Number medium-breadth patterns

Patterns

[Ee][Ss][0-9A-Za-z]\d{7}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]\d{7}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]-\d{7}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]-\d{2}.\d{3}.\d{2}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]-\d{2},\d{3},\d{2}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]-\d{2}/\d{5}[0-9A-Za-z]

Table 45-1009 Spain Value Added Tax (VAT) Number medium-breadth validator

Mandatory validator Description

Spain VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Spain Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects an 11-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.

Table 45-1010 Spain Value Added Tax (VAT) Number narrow-breadth patterns

Patterns

[Ee][Ss][0-9A-Za-z]\d{7}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]\d{7}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]-\d{7}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]-\d{2}.\d{3}.\d{2}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]-\d{2},\d{3},\d{2}[0-9A-Za-z]

[Ee][Ss] [0-9A-Za-z]-\d{2}/\d{5}[0-9A-Za-z]

Table 45-1011 Spain Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validators Description

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

spain vat number, Spanish vat number, VAT Number, vat no, VAT#, VAT,
value added tax number, value added tax

Número IVA españa, Número de IVA español, español Número IVA,
Número de valor agregado, IVA, Número IVA, Número impuesto sobre valor añadido,
Impuesto valor agregado, Impuesto sobre valor añadido, valor añadido el impuesto,
valor añadido el impuesto numero

Number delimiter Validates a match by checking the surrounding characters.


Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333, 444444444,
555555555, 666666666, 777777777, 888888888, 999999999

Spain VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Spain Driver's Licence Number


The Spain Driver's Licence Number is the identification number on an individual's driver's
license, which is issued in Spain by the Dirección General de Tráfico (DGT).
The Spain Driver's Licence Number data identifier detects a nine-character alphanumeric
pattern that matches the Spain Driver's Licence Number format.
The Spain Driver's Licence Number data identifier provides two breadths of detection:
■ The wide breadth detects a nine-character alphanumeric pattern without checksum
validation. It also requires the presence of related keywords.
See “Spain Driver's Licence Number wide breadth” on page 1477.
■ The narrow breadth detects a nine-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.
See “Spain Driver's Licence Number narrow breadth” on page 1478.

Spain Driver's Licence Number wide breadth


The wide breadth detects a nine-character alphanumeric pattern without checksum validation.
It also requires the presence of related keywords.

Table 45-1012 Spain Driver's Licence Number wide-breadth pattern

Patterns

\d{8}\w

\d{8}[- ]\w

\d{8}[ ][-]\w

\d{8}[ ][-][ ]\w


Table 45-1013 Spain Driver's Licence Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

DLNo#, dlno#, DL#, Drivers Lic., driver licence, driver license, drivers licence,
drivers license, driver's licence, driver's license, driving licence, driving license,
driver licence number, driver license number, drivers licence number,
drivers license number, driver's licence number, driver's license number,
driving licence number, driving license number, driving permit, driving permit number

permiso de conducción, permiso conducción, Número licencia conducir,
Número de carnet de conducir, Número carnet conducir, licencia conducir,
Número de permiso de conducir, Número de permiso conducir, Número permiso conducir,
permiso conducir, licencia de manejo, el carnet de conducir, carnet conducir

Spain Driver's Licence Number narrow breadth


The narrow breadth detects a nine-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.

Table 45-1014 Spain Driver's Licence Number narrow-breadth patterns

Patterns

\d{8}\w

\d{8}[- ]\w

\d{8}[ ][-]\w

\d{8}[ ][-][ ]\w


Table 45-1015 Spain Driver's Licence Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

DNI control key check Computes the control key and checks if it is valid.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

DLNo#, dlno#, DL#, Drivers Lic., driver licence, driver license, drivers licence,
drivers license, driver's licence, driver's license, driving licence, driving license,
driver licence number, driver license number, drivers licence number,
drivers license number, driver's licence number, driver's license number,
driving licence number, driving license number, driving permit, driving permit number

permiso de conducción, permiso conducción, Número licencia conducir,
Número de carnet de conducir, Número carnet conducir, licencia conducir,
Número de permiso de conducir, Número de permiso conducir, Número permiso conducir,
permiso conducir, licencia de manejo, el carnet de conducir, carnet conducir

Spanish Customer Account Number


The Spanish customer account number is the standard customer bank account number used
across Spain.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 20-digit number without checksum validation.
See “Spanish Customer Account Number wide breadth” on page 1480.
■ The medium breadth detects a 20-digit number with checksum validation.
See “Spanish Customer Account Number medium breadth” on page 1480.
■ The narrow breadth detects a 20-digit number with checksum validation. It also requires
the presence of related keywords.
See “Spanish Customer Account Number narrow breadth” on page 1481.

Spanish Customer Account Number wide breadth


The wide breadth detects a 20-digit number without checksum validation.

Table 45-1016 Spanish Customer Account Number wide-breadth patterns

Patterns

\d{20}

\d{4}[ -/]\d{4}[ -/]\d{2}[ -/]\d{10}

0128[ -/]\d{4}[ -/]\d{2}[ -/]\d{10}

0128\d{16}

Table 45-1017 Spanish Customer Account Number wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Spanish Customer Account Number medium breadth


The medium breadth detects a 20-digit number with checksum validation.

Table 45-1018 Spanish Customer Account Number medium-breadth patterns

Patterns

\d{20}

\d{4}[ -/]\d{4}[ -/]\d{2}[ -/]\d{10}

0128[ -/]\d{4}[ -/]\d{2}[ -/]\d{10}

0128\d{16}

Table 45-1019 Spanish Customer Account Number medium-breadth validator

Mandatory validator Description

Spanish Customer Account Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.
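
This guide does not spell out the checksum used by the Spanish Customer Account Number Validation Check. The Python sketch below uses the publicly documented rule for the Spanish Código Cuenta Cliente (4-digit bank, 4-digit branch, 2 control digits, 10-digit account) and assumes the validator follows it. The names `ccc_checksum_ok` and `_control_digit` are hypothetical.

```python
# Weights from the publicly documented CCC rule (assumed to apply here).
CCC_WEIGHTS = (1, 2, 4, 8, 5, 10, 9, 7, 3, 6)


def _control_digit(ten_digits: str) -> int:
    """Compute one control digit over a 10-digit string."""
    total = sum(w * int(d) for w, d in zip(CCC_WEIGHTS, ten_digits))
    value = 11 - total % 11
    if value == 11:
        return 0
    if value == 10:
        return 1
    return value


def ccc_checksum_ok(candidate: str) -> bool:
    """Validate both control digits of a 20-digit account number."""
    digits = "".join(ch for ch in candidate if ch.isdigit())
    if len(digits) != 20:
        return False
    bank_branch, control, account = digits[:8], digits[8:10], digits[10:]
    first = _control_digit("00" + bank_branch)  # bank and branch, zero-padded
    second = _control_digit(account)
    return control == f"{first}{second}"
```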


Spanish Customer Account Number narrow breadth


The narrow breadth detects a 20-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-1020 Spanish Customer Account Number narrow-breadth patterns

Pattern

\d{20}

\d{4}[ -/]\d{4}[ -/]\d{2}[ -/]\d{10}

0128[ -/]\d{4}[ -/]\d{2}[ -/]\d{10}

0128\d{16}

Table 45-1021 Spanish Customer Account Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Spanish Customer Account Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to match when you use this option.

Inputs:

customer account number, account code, customer account ID, customer bank account ID,
bank account number, spanish customer bank code, account number, accountno#,
accountnumber#

número cuenta cliente, código cuenta, cuenta cliente ID, número cuenta bancaria cliente,
código cuenta bancaria

Spanish DNI ID
The Spanish DNI ID appears on the Documento nacional de identidad (DNI) and is issued by
the Spanish Hacienda Publica to every citizen of Spain. It is the most important unique identifier
in Spain and is used for opening accounts, signing contracts, paying taxes, and voting in elections.
The Spanish DNI ID data identifier provides two breadths of detection:
■ The wide breadth detects an 8-digit number followed by a hyphen and letter. The last letter
must match a checksum algorithm.
See “Spanish DNI ID wide breadth” on page 1482.
■ The narrow breadth detects an 8-digit number followed by a hyphen and letter. The last
letter must match a checksum algorithm. It also requires the presence of Spanish DNI-related
keywords.
See “Spanish DNI ID narrow breadth” on page 1482.

Spanish DNI ID wide breadth


The wide breadth detects an 8-digit number followed by a hyphen and letter. The last letter
must match a checksum algorithm.

Table 45-1022 Spanish DNI ID wide-breadth patterns

Pattern

\d{7}\w

\d{7}[- ]\w

\d{7}[ ][-]\w

\d{7}[ ][-][ ]\w

Table 45-1023 Spanish DNI ID wide-breadth validator

Mandatory validator Description

DNI control key check Computes the control key and checks if it is valid.
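
The DNI control key is commonly documented as a letter selected by the number modulo 23. The Python sketch below applies that public rule and assumes the DNI control key check validator works the same way. It accepts 7 or 8 digits so that it covers both the description and the listed patterns. The name `dni_control_key_ok` is hypothetical.

```python
import re

# Letter table for the publicly documented DNI rule: index = number % 23.
DNI_CONTROL_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"


def dni_control_key_ok(candidate: str) -> bool:
    """Check that the trailing letter matches the number modulo 23."""
    compact = re.sub(r"[\s-]", "", candidate).upper()
    match = re.fullmatch(r"(\d{7,8})([A-Z])", compact)
    if match is None:
        return False
    number, letter = match.groups()
    return DNI_CONTROL_LETTERS[int(number) % 23] == letter
```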

Spanish DNI ID narrow breadth


The narrow breadth detects an 8-digit number followed by a hyphen and letter. The last letter
must match a checksum algorithm. It also requires the presence of Spanish DNI-related
keywords.

Table 45-1024 Spanish DNI ID narrow-breadth patterns

Pattern

\d{7}\w

\d{7}[- ]\w

\d{7}[ ][-]\w

\d{7}[ ][-][ ]\w

Table 45-1025 Spanish DNI ID narrow-breadth validators

Mandatory validators Description

DNI control key check Computes the control key and checks if it is valid.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

DNI, National Identification Number, national identity number, insurance number,
personal identification number, national identity, personal identity no,
unique identity number, nationalidno#, uniqueid#, DNI#, nationalID#, DNINúmero#,
Identidadúnico#, NIE ID, Spanish NIE ID, Spanish NIE Number, NIE, NIE#, NIEnúmero#,
NIE número, Documento Nacional de Identidad, Identidad único, Número nacional identidad,
DNI Número

Spanish Passport Number


Spanish passports are issued to Spanish citizens for the purpose of travel outside Spain.
The Spanish Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects a valid Spanish Passport Number pattern.
See “Spanish Passport Number wide breadth” on page 1483.
■ The narrow breadth detects a valid Spanish Passport Number pattern. It also requires the
presence of related keywords.
See “Spanish Passport Number narrow breadth” on page 1484.

Spanish Passport Number wide breadth


The wide breadth detects a valid Spanish Passport Number pattern.

Table 45-1026 Spanish Passport Number wide-breadth patterns

Patterns

\l{2}\d{6}

\l{2}-\d{6}

\l{2} \d{6}

\l{3}\d{6}

\l{3}-\d{6}

\l{3} \d{6}

Table 45-1027 Spanish Passport Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.
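
To experiment with these patterns outside the product, they can be approximated in a standard regular expression engine. The Python sketch below assumes that \l stands for any single letter and models the Number delimiter validator as "no letter or digit immediately before or after the match". Both points are assumptions, and the names `PASSPORT_PATTERN` and `find_passport_candidates` are hypothetical.

```python
import re
from typing import List

# Two or three letters, an optional hyphen or space, then six digits.
# The lookbehind and lookahead approximate the Number delimiter check.
PASSPORT_PATTERN = re.compile(
    r"(?<![0-9A-Za-z])[A-Za-z]{2,3}[- ]?\d{6}(?![0-9A-Za-z])"
)


def find_passport_candidates(text: str) -> List[str]:
    """Return every substring that fits the approximated wide-breadth patterns."""
    return [match.group(0) for match in PASSPORT_PATTERN.finditer(text)]
```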

Spanish Passport Number narrow breadth


The narrow breadth detects a valid Spanish Passport Number pattern. It also requires the
presence of related keywords.

Table 45-1028 Spanish Passport Number narrow-breadth patterns

Patterns

\l{2}\d{6}

\l{2}-\d{6}

\l{2} \d{6}

\l{3}\d{6}

\l{3}-\d{6}

\l{3} \d{6}

Table 45-1029 Spanish Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.


Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

passport, Passport, Spain Passport, spain passport, passport book, Passport Book,
passport number, passport no, Passport Number, libreta pasaporte, número pasaporte,
Número Pasaporte, España pasaporte, pasaporte

Spanish Social Security Number


The Spanish Social Security Number is a 12-digit number assigned to Spanish workers to
allow access to the Spanish healthcare system.
The Spanish Social Security Number system data identifier provides three breadths of detection:
■ The wide breadth detects a 12-digit number without checksum validation.
See “Spanish Social Security Number wide breadth” on page 1485.
■ The medium breadth detects a 12-digit number with checksum validation.
See “Spanish Social Security Number medium breadth” on page 1486.
■ The narrow breadth detects a 12-digit number that passes checksum validation. It also
requires the presence of Spanish Social Security Number-related keywords.
See “Spanish Social Security Number narrow breadth” on page 1486.

Spanish Social Security Number wide breadth


The wide breadth detects a 12-digit number without checksum validation.

Table 45-1030 Spanish Social Security Number wide-breadth patterns

Pattern

\d{12}

\d{2}[/]\d{8}[/]\d{2}

\d{2}[-]\d{8}[-]\d{2}

Table 45-1031 Spanish Social Security Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Spanish Social Security Number medium breadth


The medium breadth detects a 12-digit number with checksum validation.

Table 45-1032 Spanish Social Security Number medium-breadth patterns

Pattern

\d{12}

\d{2}[/]\d{8}[/]\d{2}

\d{2}[-]\d{8}[-]\d{2}

Table 45-1033 Spanish Social Security Number medium-breadth validators

Mandatory validator Description

Number Delimiter Validates a match by checking the surrounding characters.

Spanish SSN Number Validation Check Computes the checksum and validates the pattern against
it.
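
The checksum behind the Spanish SSN Number Validation Check is not described in this guide. The Python sketch below applies the publicly documented rule for Spanish Social Security Numbers (2-digit province, 8-digit number, 2 control digits computed modulo 97) and assumes the validator follows it. The name `spanish_ssn_checksum_ok` is hypothetical.

```python
def spanish_ssn_checksum_ok(candidate: str) -> bool:
    """Validate the two control digits of a 12-digit Spanish SSN."""
    digits = "".join(ch for ch in candidate if ch.isdigit())
    if len(digits) != 12:
        return False
    province = int(digits[:2])
    number = int(digits[2:10])
    control = int(digits[10:])
    if number < 10_000_000:
        # Short numbers are added to the province scaled by ten million.
        base = number + province * 10_000_000
    else:
        # Otherwise province and number are simply concatenated.
        base = int(digits[:10])
    return base % 97 == control
```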

Spanish Social Security Number narrow breadth


The narrow breadth detects a 12-digit number that passes checksum validation. It also requires
the presence of Spanish Social Security Number-related keywords.

Table 45-1034 Spanish Social Security Number narrow breadth patterns

Pattern

\d{12}

\d{2}[/]\d{8}[/]\d{2}

\d{2}[-]\d{8}[-]\d{2}

Table 45-1035 Spanish Social Security Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number Delimiter Validates a match by checking the surrounding characters.

Spanish SSN Number Validation Check Computes the checksum and validates the pattern against
it.

Find Keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

SSN, social security number, SSN#, social security no., socialsecurityno#,
Social Security Number, Social Security No., Número de la Seguridad Social,
número de la seguridad social

Spanish Tax Identification (CIF)


The Spanish Tax Identification corporate tax identifier (CIF) is equivalent to the VAT number,
required for running a business in Spain. This identifier is a company's identification for tax
purposes and is required for any legal transactions.
The Spanish Tax Identification (CIF) system data identifier provides three breadths of detection:
■ The wide breadth detects a 9-character alphanumeric identifier without checksum validation.
See “Spanish Tax Identification (CIF) wide breadth” on page 1487.
■ The medium breadth detects a 9-character alphanumeric identifier with checksum validation.
See “Spanish Tax Identification (CIF) medium breadth” on page 1488.
■ The narrow breadth detects a 9-character alphanumeric identifier with checksum validation. It
also requires the presence of CIF-related keywords.
See “Spanish Tax Identification (CIF) narrow breadth” on page 1489.

Spanish Tax Identification (CIF) wide breadth


The wide breadth detects a 9-character alphanumeric identifier without checksum validation.

Table 45-1036 Spanish Tax Identification (CIF) wide-breadth patterns

Pattern

[KPQS]\d{7}[A-J]

[KPQS]-\d{7}[A-J]

[ABEH]\d{7}[0-9]

[ABEH]-\d{7}[0-9]

[CDFGJLMNRUVW]\d{7}[A-J0-9]

[CDFGJLMNRUVW]-\d{7}[A-J0-9]

Table 45-1037 Spanish Tax Identification (CIF) wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Spanish Tax Identification (CIF) medium breadth


The medium breadth detects a 9-character alphanumeric identifier with checksum validation.

Table 45-1038 Spanish Tax Identification (CIF) medium-breadth patterns

Pattern

[KPQS]\d{7}[A-J]

[KPQS]-\d{7}[A-J]

[ABEH]\d{7}[0-9]

[ABEH]-\d{7}[0-9]

[CDFGJLMNRUVW]\d{7}[A-J0-9]

[CDFGJLMNRUVW]-\d{7}[A-J0-9]

Table 45-1039 Spanish Tax Identification (CIF) medium-breadth validators

Mandatory validator Description

Number Delimiter Validates a match by checking the surrounding characters.

Spanish Tax ID Number Validation Check Computes the checksum and validates the pattern against
it.
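
The Spanish Tax ID Number Validation Check is not detailed in this guide. The Python sketch below uses the publicly documented CIF control rule and assumes the validator follows it. The grouping of leading letters into letter-only, digit-only, and either-form control characters mirrors the patterns above. The name `cif_checksum_ok` is hypothetical.

```python
import re

# Control letters for the publicly documented CIF rule: index = control digit.
CIF_CONTROL_LETTERS = "JABCDEFGHI"
CIF_PATTERN = re.compile(r"([ABCDEFGHJKLMNPQRSUVW])-?(\d{7})([0-9A-J])")


def cif_checksum_ok(candidate: str) -> bool:
    """Validate the control character of a CIF against its seven digits."""
    match = CIF_PATTERN.fullmatch(candidate.strip())
    if match is None:
        return False
    prefix, body, control = match.groups()
    # Digits in positions 2, 4, 6 are summed as they are.
    even_sum = sum(int(d) for d in body[1::2])
    # Digits in positions 1, 3, 5, 7 are doubled and their digits summed.
    odd_sum = sum(sum(divmod(2 * int(d), 10)) for d in body[0::2])
    digit = (10 - (even_sum + odd_sum) % 10) % 10
    if prefix in "KPQS":
        return control == CIF_CONTROL_LETTERS[digit]
    if prefix in "ABEH":
        return control == str(digit)
    return control in (str(digit), CIF_CONTROL_LETTERS[digit])
```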

Spanish Tax Identification (CIF) narrow breadth


The narrow breadth detects a 9-character alphanumeric identifier with checksum validation. It also
requires the presence of CIF-related keywords.

Table 45-1040 Spanish Tax Identification (CIF) narrow-breadth patterns

Pattern

[KPQS]\d{7}[A-J]

[KPQS]-\d{7}[A-J]

[ABEH]\d{7}[0-9]

[ABEH]-\d{7}[0-9]

[CDFGJLMNRUVW]\d{7}[A-J0-9]

[CDFGJLMNRUVW]-\d{7}[A-J0-9]

Table 45-1041 Spanish Tax Identification (CIF) narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number Delimiter Validates a match by checking the surrounding characters.

Spanish Tax ID Number Validation Check Computes the checksum and validates the pattern against
it.

Find Keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

tax ID, tax ID number, CIF ID, CIF no, spanish CIF ID,
cif, tax file no, spanish CIF number, tax file number,
spanish CIF no, tax no, tax number, tax id, taxid#,
taxno#, CIFid#, CIFID#, spanishCIFID#, spanishCIFno#,
cifid#, número de contribuyente, número de impuesto
corporativo, número de Identificación fiscal, CIF
número, CIFnúmero#

Sri Lanka National Identity Number


The National Identity Card (NIC) is the identity document used in Sri Lanka. It is compulsory
for all Sri Lankan citizens who are 16 years of age and older to have their NICs. NICs are
issued by the Department for Registration of Persons.
The Sri Lanka National Identity Number data identifier detects a 10- or 12-character
alphanumeric pattern that matches the Sri Lankan National Identity Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 10- or 12-character alphanumeric pattern that matches the Sri
Lanka National Identity Number format without checksum validation. It checks for common
test patterns.
See “Sri Lanka National Identity Number wide breadth” on page 1490.
■ The medium breadth detects a 10- or 12-character alphanumeric pattern that matches the
Sri Lanka National Identity Number format with checksum validation.
See “Sri Lanka National Identity Number medium breadth” on page 1491.
■ The narrow breadth detects a 10- or 12-character alphanumeric pattern that matches the
Sri Lanka National Identity Number format with checksum validation. It checks for common
test patterns, and also requires the presence of related keywords.
See “Sri Lanka National Identity Number narrow breadth” on page 1491.

Sri Lanka National Identity Number wide breadth


The wide breadth detects a 10- or 12-character alphanumeric pattern that matches the Sri
Lanka National Identity Number format without checksum validation. It checks for common
test patterns.

Table 45-1042 Sri Lanka National Identity Number wide-breadth patterns

Pattern

\d\d\d\d\d\d\d\d\d[VvXx]

[2-9]\d\d\d\d\d\d\d\d\d\d\d

Table 45-1043 Sri Lanka National Identity Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of numbers is not all the same.

Exclude ending characters Data ending with any of the following list of values is not
matched:

111111111, 222222222, 333333333, 444444444, 555555555,
666666666, 777777777, 888888888, 999999999

Sri Lanka National Identity Number medium breadth


The medium breadth detects a 10- or 12-character alphanumeric pattern that matches the Sri
Lanka National Identity Number format with checksum validation.

Table 45-1044 Sri Lanka National Identity Number medium-breadth patterns

Pattern

\d\d\d\d\d\d\d\d\d[VvXx]

[2-9]\d\d\d\d\d\d\d\d\d\d\d

Table 45-1045 Sri Lanka National Identity Number medium-breadth validators

Mandatory validator Description

Sri Lanka National Identification Number Validation Check Computes the checksum and validates
the pattern against it.

Sri Lanka National Identity Number narrow breadth


The narrow breadth detects a 10- or 12-character alphanumeric pattern that matches the Sri
Lanka National Identity Number format with checksum validation. It checks for common test
patterns, and also requires the presence of related keywords.

Table 45-1046 Sri Lanka National Identity Number narrow-breadth patterns

Pattern

\d\d\d\d\d\d\d\d\d[VvXx]

[2-9]\d\d\d\d\d\d\d\d\d\d\d

Table 45-1047 Sri Lanka National Identity Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of numbers is not all the same.

Exclude ending characters Data ending with any of the following list of values is not
matched:

111111111, 222222222, 333333333, 444444444, 555555555,
666666666, 777777777, 888888888, 999999999

Sri Lanka National Identification Number Validation Check Computes the checksum and validates
the pattern against it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

identity card number, national identity card number, national identification number,
personal identification number, nic

Sweden Driver's Licence Number


In Sweden, a driving license is required when operating a car, motorcycle, or moped on public
roads. Driving licenses are issued by the Swedish Transport Agency (Transportstyrelsen).
The Sweden Driver's Licence Number data identifier detects a 10-digit number that matches
the Sweden Driver's License Number format.
The Sweden Driver's Licence Number data identifier provides three breadths of detection:
■ The wide breadth detects a 10-digit number without checksum validation.
See “Sweden Driver's Licence Number wide breadth” on page 1493.
■ The medium breadth detects a 10-digit number with checksum validation.
See “Sweden Driver's Licence Number medium breadth” on page 1493.
■ The narrow breadth detects a 10-digit number with checksum validation. It also requires
the presence of related keywords.
See “Sweden Driver's Licence Number narrow breadth” on page 1493.

Sweden Driver's Licence Number wide breadth


The wide breadth detects a 10-digit number without checksum validation.

Table 45-1048 Sweden Driver's Licence Number wide-breadth patterns

Patterns

\d{6}-\d{4}

\d{6}+\d{4}

Table 45-1049 Sweden Driver's Licence Number wide-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Sweden Driver's Licence Number medium breadth


The medium breadth detects a 10-digit number with checksum validation.

Table 45-1050 Sweden Driver's Licence Number medium-breadth patterns

Patterns

\d{6}-\d{4}

\d{6}+\d{4}

Table 45-1051 Sweden Driver's Licence Number medium-breadth validator

Mandatory validator Description

Sweden Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Sweden Driver's Licence Number narrow breadth


The narrow breadth detects a 10-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-1052 Sweden Driver's Licence Number narrow-breadth patterns

Patterns

\d{6}-\d{4}

\d{6}+\d{4}

Table 45-1053 Sweden Driver's Licence Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Sweden Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

driver license, driver license number, drivers lic., drivers license,
drivers license number, driving license number, DLNo#, dlno#

ajokortti, permis de conducere, ajokortin numero, kuljettajat lic., drivere lic.,
körkort, numărul permisului de conducere, שאָפער דערלויבעניש נומער, körkort nummer,
förare lic., דריווערס דערלויבעניש, körkortsnummer

Sweden Tax Identification Number


Sweden uses tax identification numbers (TINs) to identify taxpayers and facilitate the
administration of their national tax affairs. TINs are also useful for identifying taxpayers who
invest in other EU countries and are more reliable than other identifiers such as name and
address.
The Sweden Tax Identification Number data identifier detects a 10- or 12-digit number that
matches the Sweden Tax Identification Number format.
The Sweden Tax Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects a 10- or 12-digit number without checksum validation.
See “Sweden Tax Identification Number wide breadth” on page 1495.
■ The medium breadth detects a 10- or 12-digit number with checksum validation.
See “Sweden Tax Identification Number medium breadth” on page 1495.
■ The narrow breadth detects a 10- or 12-digit number with checksum validation. It also
requires the presence of related keywords.
See “Sweden Tax Identification Number narrow breadth” on page 1496.

Sweden Tax Identification Number wide breadth


The wide breadth detects a 10- or 12-digit number without checksum validation.

Table 45-1054 Sweden Tax Identification Number wide-breadth patterns

Patterns

\d{8}-\d{4}

\d{6}-\d{4}

\d{8}+\d{4}

\d{6}+\d{4}

Table 45-1055 Sweden Tax Identification Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Sweden Tax Identification Number medium breadth


The medium breadth detects a 10- or 12-digit number with checksum validation.

Table 45-1056 Sweden Tax Identification Number medium-breadth patterns

Patterns

\d{8}-\d{4}

\d{6}-\d{4}

\d{8}+\d{4}

\d{6}+\d{4}

Table 45-1057 Sweden Tax Identification Number medium-breadth validator

Mandatory validator Description

Sweden Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Sweden Tax Identification Number narrow breadth


The narrow breadth detects a 10- or 12-digit number with checksum validation. It also requires
the presence of related keywords.

Table 45-1058 Sweden Tax Identification Number narrow-breadth patterns

Patterns

\d{8}-\d{4}

\d{6}-\d{4}

\d{8}+\d{4}

\d{6}+\d{4}

Table 45-1059 Sweden Tax Identification Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Sweden Tax Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

tin, tin number, tin no, tin#, sweden tin, sweden tin
number, sweden tin no, sweden tin#

skattebetalarens identifikationsnummer, sverige TIN,
TIN-nummer, nummer

Sweden Value Added Tax (VAT) Number


VAT is a consumption tax that is borne by the end consumer. VAT is paid for each transaction
in the manufacturing and distribution process.
The Sweden Value Added Tax (VAT) Number data identifier detects a 14-character
alphanumeric pattern that matches the Sweden VAT Number format.
The Sweden Value Added Tax (VAT) Number data identifier provides three breadths of
detection:
■ The wide breadth detects a 14-character alphanumeric pattern beginning with SE and
followed by 12 digits without checksum validation.
See “Sweden Value Added Tax (VAT) Number wide breadth” on page 1497.
■ The medium breadth detects a 14-character alphanumeric pattern beginning with SE and
followed by 12 digits with checksum validation.
See “Sweden Value Added Tax (VAT) Number medium breadth” on page 1497.
■ The narrow breadth detects a 14-character alphanumeric pattern beginning with SE and
followed by 12 digits with checksum validation. It also requires the presence of related
keywords.
See “Sweden Value Added Tax (VAT) Number narrow breadth” on page 1498.

Sweden Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 14-character alphanumeric pattern beginning with SE and followed
by 12 digits without checksum validation.

Table 45-1060 Sweden Value Added Tax (VAT) Number wide-breadth pattern

Pattern

[Ss][Ee]\d{12}

Table 45-1061 Sweden Value Added Tax (VAT) Number wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000000, 111111111111, 222222222222,
333333333333, 444444444444, 555555555555,
666666666666, 777777777777, 888888888888,
999999999999
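The interaction between the pattern and the Exclude ending characters validator can be sketched as follows. This is a minimal Python illustration, not the product implementation: the lookarounds are an assumed approximation of the Number delimiter validator, and `find_vat_wide` is a hypothetical name.

```python
import re

# Pattern from the wide-breadth table; the lookarounds approximate the
# Number delimiter validator (an assumption, not the product's exact rule).
VAT_WIDE = re.compile(r"(?<![0-9A-Za-z])[Ss][Ee]\d{12}(?![0-9A-Za-z])")

# The ten excluded endings: twelve repetitions of a single digit.
EXCLUDED_ENDINGS = tuple(str(d) * 12 for d in range(10))


def find_vat_wide(text: str) -> list[str]:
    """Return SE + 12-digit matches that do not end in an excluded value."""
    return [
        m.group()
        for m in VAT_WIDE.finditer(text)
        if not m.group().endswith(EXCLUDED_ENDINGS)
    ]


if __name__ == "__main__":
    print(find_vat_wide("VAT SE556677889901, test SE111111111111"))
```

In this sketch the second value is dropped because it ends with a repeated-digit string from the exclusion list.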

Sweden Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 14-character alphanumeric pattern beginning with SE and
followed by 12 digits with checksum validation.

Table 45-1062 Sweden Value Added Tax (VAT) Number medium-breadth pattern

Pattern

[Ss][Ee]\d{12}

Table 45-1063 Sweden Value Added Tax (VAT) Number medium-breadth validator

Mandatory validator Description

Sweden Value Added Tax Number Validation Check Computes the checksum and validates the pattern against
it.

Sweden Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 14-character alphanumeric pattern beginning with SE and followed
by 12 digits with checksum validation. It also requires the presence of related keywords.

Table 45-1064 Sweden Value Added Tax (VAT) Number narrow-breadth pattern

Pattern

[Ss][Ee]\d{12}

Table 45-1065 Sweden Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000000, 111111111111, 222222222222,
333333333333, 444444444444, 555555555555,
666666666666, 777777777777, 888888888888,
999999999999

Sweden Value Added Tax Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

vat no., vat number, moms#, vat#, value added tax
number, vat no, sweden vat, sverige moms, sweden
vat number, sverige momsnummer, sweden vat no,
sverige moms nr, sweden vat#, sweden vat nummer,
sweden momsnummmer, sweden vat no., sweden
value added tax number, momsregistreringsnummer

Swedish Passport Number


Swedish passports are issued to nationals of Sweden for the purpose of international travel.
Besides serving as proof of Swedish citizenship, they facilitate the process of securing
assistance from Swedish consular officials abroad, or from other European Union member
states if no Swedish consular official is available.
The Swedish Passport Number data identifier detects a valid Swedish Passport Number
pattern.
The Swedish Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects a valid Swedish Passport Number pattern.
See “Swedish Passport Number wide breadth” on page 1499.
■ The narrow breadth detects a valid Swedish Passport Number pattern. It also requires the
presence of related keywords.
See “Swedish Passport Number narrow breadth” on page 1500.

Swedish Passport Number wide breadth


The wide breadth detects a valid Swedish Passport Number pattern.

Table 45-1066 Swedish Passport Number wide-breadth patterns

Patterns

\d{8}

\d{2}-\d{6}

\l{2}-\d{6}

Table 45-1067 Swedish Passport Number wide-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Swedish Passport Number narrow breadth


The narrow breadth detects a valid Swedish Passport Number pattern. It also requires the
presence of related keywords.

Table 45-1068 Swedish Passport Number narrow-breadth patterns

Patterns

\d{8}

\d{2}-\d{6}

\l{2}-\d{6}

Table 45-1069 Swedish Passport Number narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

passport, Passport, Sweden Passport, Swedish
passport, passport number, passport no, Passport
Number

Passnummer, pass, sverige pass, SVERIGE PASS,
sverige Passnummer

Sweden Personal Identification Number


The Sweden Personal Identification Number is the unique national identifier for every
Swedish citizen. The number is used by authorities, health care, schools, universities, banks, and
insurance companies for customer identification.
The Sweden Personal Identification Number data identifier detects a 10- or 12-digit number
that matches the Sweden Personal Identification Number format.
The Sweden Personal Identification Number system data identifier provides three breadths of
detection:
■ The wide breadth detects a 10- or 12-digit number without checksum validation.
See “Sweden Personal Identification Number wide breadth” on page 1501.
■ The medium breadth detects a 10- or 12-digit number with checksum validation.
See “Sweden Personal Identification Number medium breadth” on page 1502.
■ The narrow breadth detects a 10- or 12-digit number with checksum validation. It also
requires the presence of related keywords.
See “Sweden Personal Identification Number narrow breadth” on page 1502.

Sweden Personal Identification Number wide breadth


The wide breadth detects a 10- or 12-digit number without checksum validation.

Table 45-1070 Sweden Personal Identification Number wide-breadth patterns

Pattern

\d\d[01]\d[01236789]\d[-]\d\d\d\d

\d\d[01]\d[01236789]\d[+]\d\d\d\d

\d\d[01]\d[01236789]\d\d\d\d\d

[12][098]\d\d[01]\d[01236789]\d[-]\d\d\d\d

[12][098]\d\d[01]\d[01236789]\d[+]\d\d\d\d

[12][098]\d\d[01]\d[01236789]\d\d\d\d\d

Table 45-1071 Sweden Personal Identification Number wide-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Sweden Personal Identification Number medium breadth


The medium breadth detects a 10- or 12-digit number with checksum validation.

Table 45-1072 Sweden Personal Identification Number medium-breadth patterns

Pattern

\d\d[01]\d[01236789]\d[-]\d\d\d\d

\d\d[01]\d[01236789]\d[+]\d\d\d\d

\d\d[01]\d[01236789]\d\d\d\d\d

[12][098]\d\d[01]\d[01236789]\d[-]\d\d\d\d

[12][098]\d\d[01]\d[01236789]\d[+]\d\d\d\d

[12][098]\d\d[01]\d[01236789]\d\d\d\d\d

Table 45-1073 Sweden Personal Identification Number medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Sweden Personal Identification Number Validation Check Computes the checksum and validates the pattern against
it.
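This guide does not state which checksum the Validation Check validator computes. The following Python sketch assumes the publicly documented Luhn (mod 10) check that is applied to the last 10 digits of a Swedish personal identity number. It is an illustration under that assumption, not the product's implementation, and the function names are hypothetical.

```python
import re


def luhn_valid(digits: str) -> bool:
    """Luhn check over a 10-digit string: double digits at even indexes."""
    total = 0
    for index, char in enumerate(digits):
        value = int(char)
        if index % 2 == 0:
            value *= 2
            if value > 9:
                value -= 9  # same as summing the two digits of the product
        total += value
    return total % 10 == 0


def personal_id_checksum_ok(candidate: str) -> bool:
    """Validate a 10- or 12-digit candidate, with or without '-' or '+'."""
    digits = re.sub(r"[-+]", "", candidate)
    if len(digits) == 12:
        digits = digits[2:]  # assumption: century digits are not checksummed
    if len(digits) != 10 or not digits.isdigit():
        return False
    return luhn_valid(digits)


if __name__ == "__main__":
    print(personal_id_checksum_ok("811228-9874"))
    print(personal_id_checksum_ok("811228-9873"))
```

A checksum step like this is what separates the medium breadth from the wide breadth: a candidate that matches a pattern but fails the check is not reported.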

Sweden Personal Identification Number narrow breadth


The narrow breadth detects a 10- or 12-digit number with checksum validation. It also requires
the presence of related keywords.

Table 45-1074 Sweden Personal Identification Number narrow-breadth patterns

Pattern

\d\d[01]\d[01236789]\d[-]\d\d\d\d

\d\d[01]\d[01236789]\d[+]\d\d\d\d

\d\d[01]\d[01236789]\d\d\d\d\d

[12][098]\d\d[01]\d[01236789]\d[-]\d\d\d\d

[12][098]\d\d[01]\d[01236789]\d[+]\d\d\d\d

[12][098]\d\d[01]\d[01236789]\d\d\d\d\d

Table 45-1075 Sweden Personal Identification Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Sweden Personal Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

personal ID number, identification number, personal
ID no, personal id no, identity no, identification no,
personal identification no, person id no

personnummer ID, personligt id-nummer, unikt
id-nummer, personnummer, identifikationsnumret,
personnummer#, identifikationsnumret#

SWIFT Code
The SWIFT Code is a unique identifier for banks and is managed by the Society for Worldwide
Interbank Financial Telecommunications (SWIFT). The SWIFT Code is required for monetary
transfers between financial institutions. It is also known as the Bank Identifier Code (BIC).
The SWIFT Code data identifier detects an 8- or 11-character alphanumeric pattern that
matches the SWIFT Code format.
This data identifier provides two breadths of detection:
■ The wide breadth detects an 8- or 11-character alphanumeric pattern without
checksum validation. It requires the presence of related keywords.
See “SWIFT Code wide breadth” on page 1503.
■ The narrow breadth detects an 8- or 11-character alphanumeric pattern without checksum
validation. It requires the presence of related keywords.
See “SWIFT Code narrow breadth” on page 1504.

SWIFT Code wide breadth


The wide breadth of the SWIFT Code data identifier detects 8- or 11-character alphanumeric
patterns. The 5th and 6th characters are the country code. This breadth also requires the
presence of a SWIFT-related keyword.

Table 45-1076 SWIFT Code wide-breadth patterns

Pattern

[A-Z]{6}\w{2}

[A-Z]{6}\w{5}

Table 45-1077 SWIFT Code wide-breadth validators

Mandatory validators Description

Require beginning characters With this option selected, any of the following list of values are required at the
beginning of the matched data.

Inputs:

af, ax, al, dz, as, ad, ao, ai, aq, ag, ar, am, aw, au, at, az, bs, bh, bd, bb, by,
be, bz, bj, bm, bt, bo, ba, bw, bv, br, io, bn, bg, bf, bi, kh, cm, ca, cv, ky, cf,
td, cl, cn, cx, cc, co, km, cg, cd, ck, cr, ci, hr, cu, cy, cz, dk, dj, dm, do, ec,
eg, sv, gq, er, ee, et, fk, fo, fj, fi, fr, gf, pf, tf, ga, gm, ge, de, gh, gi, gr, gl, gd,
gp, gu, gt, gg, gn, gw, gy, ht, hm, va, hn, hk, hu, is, in, id, ir, iq, ie, im, il, it,
jm, jp, je, jo, kz, ke, ki, kp, kr, kw, kg, la, lv, lb, ls, lr, ly, li, lt, lu, mo, mk, mg,
mw, my, mv, ml, mt, mh, mq, mr, mu, yt, mx, md, mc, mn, me, ms, ma, mz,
mm, na, nr, np, nl, an, nc, nz, ni, ne, ng, nu, nf, mp, no, om, pk, pw, ps, pa,
pg, py, pe, ph, pn, pl, pt, pr, qa, re, ro, ru, rw, sh, kn, lc, pm, vc, ws, sm, st,
sa, sn, rs, sc, sl, sg, sk, si, sb, so, za, gs, es, lk, sd, sr, sj, sz, se, ch, sy, tw,
tj, tz, th, tl, tg, tk, to, tt, tn, tr, tm, tc, tv, ug, ua, ae, gb, us, um, uy, uz, vu, ve,
vn, vg, vi, wf, eh, ye, zm, zw

Find keywords With this option selected, at least one of the following keywords or key phrases
must be present for the data to be matched.

Inputs:

bic, bic#, international organization for standardization 9362, iso 9362,
iso9362, swift, swift#, swiftcode, swiftnumber, swiftroutingnumber
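The wide-breadth logic described above (pattern, country code check, keyword requirement) can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's matching engine: the country code is checked at the 5th and 6th characters as the description states, `\w` is narrowed to uppercase letters and digits, the country list is abbreviated to a few entries from the table, and keyword matching is a simple case-insensitive substring search anywhere in the text. The name `find_swift_wide` is hypothetical.

```python
import re

# 8- or 11-character candidates; [A-Z0-9] is an assumed narrowing of \w.
SWIFT_WIDE = re.compile(r"\b[A-Z]{6}[A-Z0-9]{2}(?:[A-Z0-9]{3})?\b")

# Abbreviated for illustration; the validator table lists the full set.
COUNTRY_CODES = {"ch", "de", "gb", "se", "tw", "us"}

KEYWORDS = (
    "bic", "bic#", "international organization for standardization 9362",
    "iso 9362", "iso9362", "swift", "swift#", "swiftcode", "swiftnumber",
    "swiftroutingnumber",
)


def find_swift_wide(text: str) -> list[str]:
    """Return SWIFT-shaped codes when at least one keyword is present."""
    lowered = text.lower()
    if not any(keyword in lowered for keyword in KEYWORDS):
        return []
    return [
        m.group()
        for m in SWIFT_WIDE.finditer(text)
        if m.group()[4:6].lower() in COUNTRY_CODES  # 5th and 6th characters
    ]


if __name__ == "__main__":
    print(find_swift_wide("SWIFT: DEUTDEFF500"))
```

Without a keyword in the surrounding text, the same code is not reported, which is how both SWIFT Code breadths limit false positives on ordinary uppercase words.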

SWIFT Code narrow breadth


The narrow breadth of the SWIFT Code data identifier detects 8- or 11-character strings. The
5th and 6th characters are letters referring to a country code. This breadth also requires the
presence of specific SWIFT-related keywords.

Table 45-1078 SWIFT Code narrow-breadth patterns

Patterns

[A-Z]{6}\w{2}

[A-Z]{6}\w{5}

Table 45-1079 SWIFT Code narrow-breadth validators

Validators Description

Require beginning characters With this option selected, any of the following list of values are required at the
beginning of the matched data.

Inputs:

af, ax, al, dz, as, ad, ao, ai, aq, ag, ar, am, aw, au, at, az, bs, bh, bd, bb, by,
be, bz, bj, bm, bt, bo, ba, bw, bv, br, io, bn, bg, bf, bi, kh, cm, ca, cv, ky, cf,
td, cl, cn, cx, cc, co, km, cg, cd, ck, cr, ci, hr, cu, cy, cz, dk, dj, dm, do, ec,
eg, sv, gq, er, ee, et, fk, fo, fj, fi, fr, gf, pf, tf, ga, gm, ge, de, gh, gi, gr, gl, gd,
gp, gu, gt, gg, gn, gw, gy, ht, hm, va, hn, hk, hu, is, in, id, ir, iq, ie, im, il, it,
jm, jp, je, jo, kz, ke, ki, kp, kr, kw, kg, la, lv, lb, ls, lr, ly, li, lt, lu, mo, mk, mg,
mw, my, mv, ml, mt, mh, mq, mr, mu, yt, mx, md, mc, mn, me, ms, ma, mz,
mm, na, nr, np, nl, an, nc, nz, ni, ne, ng, nu, nf, mp, no, om, pk, pw, ps, pa,
pg, py, pe, ph, pn, pl, pt, pr, qa, re, ro, ru, rw, sh, kn, lc, pm, vc, ws, sm, st,
sa, sn, rs, sc, sl, sg, sk, si, sb, so, za, gs, es, lk, sd, sr, sj, sz, se, ch, sy, tw,
tj, tz, th, tl, tg, tk, to, tt, tn, tr, tm, tc, tv, ug, ua, ae, gb, us, um, uy, uz, vu, ve,
vn, vg, vi, wf, eh, ye, zm, zw

Find keywords With this option selected, at least one of the following keywords or key phrases
must be present for the data to be matched.
Inputs:

bic#, international organization for standardization 9362, iso 9362, iso9362,
swift#, swiftcode, swiftnumber, swiftroutingnumber, swift code, swift
number, swift routing number, bic number, bic code, bic #

Swiss AHV Number


The Swiss Alters- und Hinterlassenenversicherungsnummer (AHV) is an identifier for the social
security system in Switzerland for the aged and bereaved. The AHV also serves as a tax
identification number for individuals.
The Swiss AHV Number data identifier detects an 11-digit number that matches the AHV
Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an 11-digit number with checksum validation.
See “Swiss AHV Number wide breadth” on page 1506.
■ The narrow breadth detects an 11-digit number with checksum validation. It also requires
the presence of related keywords.
See “Swiss AHV Number narrow breadth” on page 1506.

Swiss AHV Number wide breadth


The wide breadth detects an 11-digit number with checksum validation.

Table 45-1080 Swiss AHV Number wide-breadth patterns

Pattern

\d{3}.\d{2}.\d{3}.\d{3}

\d{11}

Table 45-1081 Swiss AHV Number wide-breadth validators

Mandatory validator Description

Swiss AHV Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.

Swiss AHV Number narrow breadth


The narrow breadth detects an 11-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-1082 Swiss AHV Number narrow-breadth patterns

Pattern

\d{3}.\d{2}.\d{3}.\d{3}

\d{11}

Table 45-1083 Swiss AHV Number narrow-breadth validators

Mandatory validator Description

Swiss AHV Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding characters.



Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

Numéro AVS, identifiant national, numéro de sécurité
sociale, Numéro AVH, AVS number, insurance number,
national identifier, national insurance number, social
security number, AVH number, AHV-Nummer,
Personenidentifikationsnummer, Schweizer
Registrierungsnummer, AHV number, Swiss
registration number, PIN, AVH, AVS, numéro
d'assurance vieillesse, numéro d'assuré

Swiss Social Security Number (AHV)


The Swiss Alters- und Hinterlassenenversicherungsnummer (AHV) is an identifier for the social
security system in Switzerland for the aged and bereaved. The AHV also serves as a tax
identification number for individuals.
The Swiss Social Security Number (AHV) data identifier detects a 13-digit number that matches
the AHV format.
The Swiss Social Security Number system data identifier provides three breadths of detection:
■ The wide breadth detects a 13-digit number without checksum validation.
See “Swiss Social Security Number (AHV) wide breadth” on page 1507.
■ The medium breadth detects a 13-digit number with checksum validation.
See “Swiss Social Security Number (AHV) medium breadth” on page 1508.
■ The narrow breadth detects a 13-digit number with checksum validation. It also requires
the presence of related keywords.
See “Swiss Social Security Number (AHV) narrow breadth” on page 1508.

Swiss Social Security Number (AHV) wide breadth


The wide breadth detects a 13-digit number without checksum validation.

Table 45-1084 Swiss Social Security Number (AHV) wide-breadth patterns

Pattern

[7][5][6]\d{10}

[7][5][6][.]\d{4}[.]\d{4}[.]\d{2}

[Cc][Hh][Ee]-\d\d\d[.]\d\d\d[.]\d\d\d

[Cc][Hh][Ee]\d\d\d\d\d\d\d\d\d

Table 45-1085 Swiss Social Security Number (AHV) wide-breadth validator

Validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Swiss Social Security Number (AHV) medium breadth


The medium breadth detects a 13-digit number with checksum validation.

Table 45-1086 Swiss Social Security Number (AHV) medium-breadth patterns

Pattern

[7][5][6]\d{10}

[7][5][6][.]\d{4}[.]\d{4}[.]\d{2}

[Cc][Hh][Ee]-\d\d\d[.]\d\d\d[.]\d\d\d

[Cc][Hh][Ee]\d\d\d\d\d\d\d\d\d

Table 45-1087 Swiss Social Security Number (AHV) medium-breadth validators

Validator Description

Number delimiter Validates a match by checking the surrounding characters.

Swiss Social Security Number Validation Check Computes the checksum and validates the pattern against
it.
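This guide does not describe the checksum that the Validation Check validator computes. The following Python sketch assumes the EAN-13 check digit that is publicly documented for the 13-digit number beginning with 756. It covers only the two 756 patterns, not the CHE patterns, and is an illustration rather than the product's implementation. The function names are hypothetical.

```python
import re

# The two 756-prefixed patterns from the medium-breadth table.
AHV13 = re.compile(r"756\d{10}|756[.]\d{4}[.]\d{4}[.]\d{2}")


def ean13_valid(digits: str) -> bool:
    """EAN-13 check: weights 1,3,1,3,... over the first 12 digits."""
    if len(digits) != 13 or not digits.isdigit():
        return False
    total = sum(
        int(char) * (3 if index % 2 else 1)
        for index, char in enumerate(digits[:12])
    )
    return (10 - total % 10) % 10 == int(digits[12])


def ahv13_checksum_ok(candidate: str) -> bool:
    """Check the pattern first, then the assumed EAN-13 check digit."""
    if not AHV13.fullmatch(candidate):
        return False
    return ean13_valid(candidate.replace(".", ""))


if __name__ == "__main__":
    print(ahv13_checksum_ok("756.9217.0769.85"))
```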

Swiss Social Security Number (AHV) narrow breadth


The narrow breadth detects a 13-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-1088 Swiss Social Security Number (AHV) narrow-breadth patterns

Pattern

[7][5][6]\d{10}

[7][5][6][.]\d{4}[.]\d{4}[.]\d{2}

[Cc][Hh][Ee]-\d\d\d[.]\d\d\d[.]\d\d\d

[Cc][Hh][Ee]\d\d\d\d\d\d\d\d\d

Table 45-1089 Swiss Social Security Number (AHV) narrow-breadth validators

Validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Swiss Social Security Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

AHV, SSN, PID, insurance number, personalIdno#,
social security number, personal ID number, personal
identification no., insuranceno#, uniqueIdno#, unique
identification no., AVS, AHV number, AVS number,
social security number, personalidno#, personal
identity no, tax identification number, tax number, tax
id, tax identification no., tax no.

Identifikationsnummer, sozialversicherungsnummer,
identification personnelle ID,
Steueridentifikationsnummer, Steuer ID, codice fiscale,
Steuernummer

Switzerland Health Insurance Card Number


Swiss insurance providers issue health insurance cards to their customers. Swiss health
insurance cards can also be used to access European health services.
The Switzerland Health Insurance Card Number data identifier detects a 20-digit number that
matches the Swiss health insurance card number format.
This data identifier provides the following breadths of detection:


■ The wide breadth detects a 20-digit number that matches the Swiss health insurance card
number format. It checks for common test numbers.
See “Switzerland Health Insurance Card Number wide breadth” on page 1510.
■ The narrow breadth detects a 20-digit number that matches the Swiss health insurance
card number format. It checks for common test numbers, and also requires the presence
of related keywords.
See “Switzerland Health Insurance Card Number narrow breadth” on page 1510.

Switzerland Health Insurance Card Number wide breadth


The wide breadth detects a 20-digit number that matches the Swiss health insurance card
number format. It checks for common test numbers.

Table 45-1090 Switzerland Health Insurance Card Number wide-breadth patterns

Pattern

807560\d{14}

Table 45-1091 Switzerland Health Insurance Card Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

80756000000000000000, 80756011111111111111,
80756022222222222222, 80756033333333333333,
80756044444444444444, 80756055555555555555,
80756066666666666666, 80756077777777777777,
80756088888888888888, 80756099999999999999

Switzerland Health Insurance Card Number narrow breadth


The narrow breadth detects a 20-digit number that matches the Swiss health insurance card
number format. It checks for common test numbers, and also requires the presence of related
keywords.

Table 45-1092 Switzerland Health Insurance Card Number narrow-breadth patterns

Pattern

807560\d{14}

Table 45-1093 Switzerland Health Insurance Card Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

80756000000000000000, 80756011111111111111,
80756022222222222222, 80756033333333333333,
80756044444444444444, 80756055555555555555,
80756066666666666666, 80756077777777777777,
80756088888888888888, 80756099999999999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

medical account number, health insurance card
number, insurance card number, hic, hic#,
medizinische Kontonummer,
Krankenversicherungskarte Nummer, insurance card
number, numero conto medico, tessera sanitaria
assicurazione numero, assicurazione sanitaria numero,
ehic, ehic#, ehic number

Switzerland Passport Number


Swiss passports are issued to citizens of Switzerland to facilitate international travel.
The Switzerland Passport Number data identifier detects an eight-character alphanumeric
pattern that matches the Swiss passport number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern that matches the Swiss
passport number format. It checks for common test patterns.
See “Switzerland Passport Number wide breadth” on page 1512.
■ The narrow breadth detects an eight-character alphanumeric pattern that matches the
Swiss passport number format. It checks for common test patterns, and also requires the
presence of related keywords.
See “Switzerland Passport Number narrow breadth” on page 1512.

Switzerland Passport Number wide breadth


The wide breadth detects an eight-character alphanumeric pattern that matches the Swiss
passport number format. It checks for common test patterns.

Table 45-1094 Switzerland Passport Number wide-breadth patterns

Pattern

[a-zA-Z]\d{7}

Table 45-1095 Switzerland Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111, 2222222, 3333333, 4444444, 5555555,
6666666, 7777777, 8888888, 9999999

Switzerland Passport Number narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern that matches the Swiss
passport number format. It checks for common test patterns, and also requires the presence
of related keywords.

Table 45-1096 Switzerland Passport Number narrow-breadth patterns

Pattern

[a-zA-Z]\d{7}

Table 45-1097 Switzerland Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.



Exclude ending characters Data ending with any of the following list of values is not
matched:

1111111, 2222222, 3333333, 4444444, 5555555,
6666666, 7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

Passport, passport number, passport, passport no,
passport#, passportno, Passport No., Passport No,
PASSPORT

Passeport, passeport, numéro passeport, numéro de
passeport, passeport#, No de passeport, No de
passeport., Numéro de passeport

PASSEPORT, LIVRE DE PASSEPORT, Pass

Passnummer, Pass#, Pass Nr., Pass Nr, PASS

Passaporto, Numero di passaporto, passaporto,
Passaporto n, Passaporto n., passaporto#, Passaport,
numero passaporto, numero di passaporto, numero
passaporto, passaporto n, PASSAPORTO

Reisepass, Reisepass#, REISEPASS

Switzerland Value Added Tax (VAT) Number


Value Added Tax (VAT) is a consumption tax that is borne by the end consumer. VAT is paid
for each transaction in the manufacturing and distribution process. For Switzerland, VAT is
administered by the Federal Statistical Office for the region in which the business is established.
The Switzerland Value Added Tax (VAT) Number data identifier detects a 15- or 16-character
alphanumeric pattern that matches the Swiss VAT number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 15- or 16-character alphanumeric pattern that matches the
Swiss VAT number format without checksum validation.
See “Switzerland Value Added Tax (VAT) Number wide breadth” on page 1514.
■ The medium breadth detects a 15- or 16-character alphanumeric pattern that matches the
Swiss VAT number format with checksum validation.
See “Switzerland Value Added Tax (VAT) Number medium breadth” on page 1514.
■ The narrow breadth detects a 15- or 16-character alphanumeric pattern that matches the
Swiss VAT number format with checksum validation. It also requires the presence of
related keywords.
See “Switzerland Value Added Tax (VAT) Number narrow breadth” on page 1515.

Switzerland Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 15- or 16-character alphanumeric pattern that matches the Swiss
VAT number format without checksum validation.

Table 45-1098 Switzerland Value Added Tax (VAT) Number wide-breadth patterns

Pattern

[Cc][Hh][Ee]-\d{3}[.]\d{3}[.]\d{3} [Tt][Vv][Aa]

[Cc][Hh][Ee]-\d{3}[.]\d{3}[.]\d{3} [Mm][Ww][Ss][Tt]

[Cc][Hh][Ee]-\d{3}[.]\d{3}[.]\d{3} [Ii][Vv][Aa]

Table 45-1099 Switzerland Value Added Tax (VAT) Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Switzerland Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 15- or 16-character alphanumeric pattern that matches the
Swiss VAT number format with checksum validation.

Table 45-1100 Switzerland Value Added Tax (VAT) Number medium-breadth patterns

Pattern

[Cc][Hh][Ee]-\d{3}[.]\d{3}[.]\d{3} [Tt][Vv][Aa]

[Cc][Hh][Ee]-\d{3}[.]\d{3}[.]\d{3} [Mm][Ww][Ss][Tt]

[Cc][Hh][Ee]-\d{3}[.]\d{3}[.]\d{3} [Ii][Vv][Aa]

Table 45-1101 Switzerland Value Added Tax (VAT) Number medium-breadth validators

Mandatory validator Description

Switzerland Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against
it.
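This guide does not specify the checksum behind the Validation Check validator. The following Python sketch assumes the publicly documented modulus 11 check for the Swiss enterprise identification number (weights 5, 4, 3, 2, 7, 6, 5, 4 over the first eight digits, with the ninth digit as the check digit). It is an illustration under that assumption, not the product's implementation, and the name `swiss_vat_checksum_ok` is hypothetical.

```python
import re

# The three medium-breadth patterns folded into one expression.
SWISS_VAT = re.compile(
    r"[Cc][Hh][Ee]-(\d{3})[.](\d{3})[.](\d{3}) "
    r"(?:[Tt][Vv][Aa]|[Mm][Ww][Ss][Tt]|[Ii][Vv][Aa])"
)

# Assumed weights for the modulus 11 check digit.
WEIGHTS = (5, 4, 3, 2, 7, 6, 5, 4)


def swiss_vat_checksum_ok(candidate: str) -> bool:
    """Match the pattern, then verify the ninth digit against modulus 11."""
    match = SWISS_VAT.fullmatch(candidate)
    if not match:
        return False
    digits = "".join(match.groups())
    total = sum(int(d) * w for d, w in zip(digits, WEIGHTS))
    check = (11 - total % 11) % 11
    # A computed value of 10 cannot be represented by a single digit.
    return check != 10 and check == int(digits[8])


if __name__ == "__main__":
    print(swiss_vat_checksum_ok("CHE-109.322.551 MWST"))
```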

Switzerland Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 15- or 16-character alphanumeric pattern that matches the Swiss
VAT number format with checksum validation. It requires the presence of related keywords.

Table 45-1102 Switzerland Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Cc][Hh][Ee]-\d{3}[.]\d{3}[.]\d{3} [Tt][Vv][Aa]

[Cc][Hh][Ee]-\d{3}[.]\d{3}[.]\d{3} [Mm][Ww][Ss][Tt]

[Cc][Hh][Ee]-\d{3}[.]\d{3}[.]\d{3} [Ii][Vv][Aa]

Table 45-1103 Switzerland Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Switzerland Value Added Tax (VAT) Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

vat, vat number, vat#, value added tax number, VAT,
VAT#, vat registration number, VAT Number

T.V.A, numéro TVA, T.V.A#, numéro taxe valeur
ajoutée, T.V.A., taxe sur la valeur ajoutée, T.V.A#,
numéro enregistrement TVA, Numéro TVA

I.V.A, Partita IVA, I.V.A#, numero IVA

MwSt, Umsatzsteuer-Identifikationsnummer, MwSt#,
Mehrwertsteuer-Nummer, Mehrwertsteuer, VAT
Registrierungsnummer,
Umsatzsteuer-Identifikationsnummer

Taiwan ROC ID
In Taiwan, an ID card is mandatory for all citizens who are over 14 years old. The ID card has
been uniformly numbered since 1965.
The Taiwan ROC ID data identifier detects the presence of a Taiwan identification number based
on two types of common ID patterns. The last character matched is used to validate a checksum.

The Taiwan ROC ID data identifier provides two breadths of detection:


■ The wide breadth detects a Taiwan ROC ID number with checksum validation.
See “Taiwan ROC ID wide breadth” on page 1516.
■ The narrow breadth detects a Taiwan ROC ID number with checksum validation. It also
requires the presence of related keywords.
See “Taiwan ROC ID narrow breadth” on page 1516.

Taiwan ROC ID wide breadth


The wide breadth detects a Taiwan ROC ID number with checksum validation.

Table 45-1104 Taiwan ROC ID wide-breadth patterns

Patterns

[A-Z][12][0-3]\d{7}

[A-Z][ABCD]\d{8}

Table 45-1105 Taiwan ROC ID wide-breadth validator

Validator Description

Taiwan ID Computes the checksum and validates the pattern against it.
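
As an illustration of how the last character can serve as a check digit, the following Python sketch applies the commonly published Taiwan national ID rule to numbers that match the first pattern (a letter followed by nine digits). The letter-to-number table and weights are assumptions drawn from that public rule. The function name is hypothetical, and the second pattern (a letter in the second position) is not handled here.

```python
import re

# Assumption: public letter-to-code table (A=10 ... O=35) for the national ID.
LETTER_CODES = dict(zip("ABCDEFGHJKLMNPQRSTUVXYWZIO", range(10, 36)))
DIGIT_WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 1, 1)

def taiwan_id_checksum_ok(value: str) -> bool:
    """Check a letter-plus-nine-digits ID; the last digit is the check digit."""
    if not re.fullmatch(r"[A-Z][12][0-9]{8}", value):
        return False
    code = LETTER_CODES[value[0]]
    # The two digits of the letter code are weighted 1 and 9.
    total = code // 10 + (code % 10) * 9
    total += sum(w * int(d) for w, d in zip(DIGIT_WEIGHTS, value[1:]))
    return total % 10 == 0
```

With this sketch, `A123456789` passes the check and `A123456788` fails it.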

Taiwan ROC ID narrow breadth


The narrow breadth detects a Taiwan ROC ID number with checksum validation. It also requires
the presence of Taiwan ROC ID-related keywords.

Table 45-1106 Taiwan ROC ID narrow-breadth patterns

Patterns

[A-Z][12][0-3]\d{7}

[A-Z][ABCD]\d{8}

Table 45-1107 Taiwan ROC ID narrow-breadth validators

Validator Description

Taiwan ID Computes the checksum and validates the pattern against it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

中華民國國民身分證

ROC ID, National Identification Card, ROCID#

Thailand Passport Number


The Thai passport is issued to citizens and nationals of Thailand by the Passport Division of
the Department of Consular Affairs of the Ministry of Foreign Affairs.
The Thailand Passport Number data identifier detects a seven-, eight-, or nine-character
alphanumeric pattern that matches the Thailand Passport Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a seven-, eight-, or nine-character alphanumeric pattern that
matches the Thailand Passport Number format. It checks for common test patterns.
See “Thailand Passport Number wide breadth” on page 1517.
■ The narrow breadth detects a seven-, eight-, or nine-character alphanumeric pattern that
matches the Thailand Passport Number format. It checks for common test patterns, and
also requires the presence of related keywords.
See “Thailand Passport Number narrow breadth” on page 1518.

Thailand Passport Number wide breadth


The wide breadth detects a seven-, eight-, or nine-character alphanumeric pattern that matches
the Thailand Passport Number format. It checks for common test patterns.

Table 45-1108 Thailand Passport Number wide-breadth patterns

Pattern

[A-Za-z]\d\d\d\d\d\d\d

[A-Za-z][A-Za-z]\d\d\d\d\d\d\d

[A-Za-z]\d\d\d\d\d\d

Table 45-1109 Thailand Passport Number wide-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555, 666666, 777777, 888888, 999999

0000000, 1111111, 2222222, 3333333, 4444444, 5555555, 6666666, 7777777, 8888888, 9999999
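
The combination of the patterns and the Exclude ending characters validator can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, every letter class is treated as [A-Za-z] (an assumption), and the Number delimiter check is approximated by requiring the whole string to match.

```python
import re

PATTERNS = (r"[A-Za-z]\d{7}", r"[A-Za-z]{2}\d{7}", r"[A-Za-z]\d{6}")

# Six and seven repetitions of each digit, as listed in the validator table.
EXCLUDED_ENDINGS = tuple(str(d) * n for n in (6, 7) for d in range(10))

def thai_passport_candidate_ok(value: str) -> bool:
    """Match one of the patterns, then drop common test values."""
    if not any(re.fullmatch(p, value) for p in PATTERNS):
        return False
    return not value.endswith(EXCLUDED_ENDINGS)
```

Under these assumptions, `AA1234567` is kept, while `A1111111` and `B0000000` are discarded as test patterns.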

Thailand Passport Number narrow breadth


The narrow breadth detects a seven-, eight-, or nine-character alphanumeric pattern that
matches the Thailand Passport Number format. It checks for common test patterns, and also
requires the presence of related keywords.

Table 45-1110 Thailand Passport Number narrow-breadth patterns

Pattern

[A-Za-z]\d\d\d\d\d\d\d

[A-Za-z][A-Za-z]\d\d\d\d\d\d\d

[A-Za-z]\d\d\d\d\d\d

Table 45-1111 Thailand Passport Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000, 111111, 222222, 333333, 444444, 555555, 666666, 777777, 888888, 999999

0000000, 1111111, 2222222, 3333333, 4444444, 5555555, 6666666, 7777777, 8888888, 9999999

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

passport, passport number, passport no, passportno, passport no., passport#, passportno#

หนังสือเดินทาง, หมายเลขหนังสือเดินทาง

Thailand Personal Identification Number


The Thailand Personal Identification Number is a unique personal identifier assigned at birth
or upon receiving Thai citizenship.
The Thailand Personal Identification Number data identifier detects a 13-digit number that
matches the Thailand Personal Identification Number format.
The Thailand Personal Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects a 13-digit number without checksum validation.
See “Thailand Personal Identification Number wide breadth” on page 1519.
■ The medium breadth detects a 13-digit number with checksum validation.
See “Thailand Personal Identification Number medium breadth” on page 1520.
■ The narrow breadth detects a 13-digit number with checksum validation. It also requires
the presence of related keywords.
See “Thailand Personal Identification Number narrow breadth” on page 1520.

Thailand Personal Identification Number wide breadth


The wide breadth detects a 13-digit number without checksum validation.

Table 45-1112 Thailand Personal Identification Number wide-breadth patterns

Pattern

[1-8]\d{12}

[1-8][ -]\d{4}[ -]\d{5}[ -]\d{2}[ -]\d



Table 45-1113 Thailand Personal Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Thailand Personal Identification Number medium breadth


The medium breadth detects a 13-digit number with checksum validation.

Table 45-1114 Thailand Personal Identification Number medium-breadth patterns

Pattern

[1-8]\d{12}

[1-8][ -]\d{4}[ -]\d{5}[ -]\d{2}[ -]\d

Table 45-1115 Thailand Personal ID Number medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Thailand Personal ID Number Validation Check Computes the checksum and validates the pattern against
it.
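
A checksum of this kind can be sketched in Python as follows. The sketch assumes the commonly published modulus-11 rule for the Thai 13-digit number (weights 13 down to 2 over the first twelve digits). The function name is hypothetical, and the product's validator may be implemented differently.

```python
def thai_pid_checksum_ok(value: str) -> bool:
    """Validate the 13th digit; spaces and dashes between groups are ignored."""
    digits = [int(ch) for ch in value if ch.isdigit()]
    if len(digits) != 13:
        return False
    # Weights 13, 12, ..., 2 applied to the first twelve digits.
    total = sum(d * w for d, w in zip(digits[:12], range(13, 1, -1)))
    return (11 - total % 11) % 10 == digits[12]
```

For example, `1234567890121` and its delimited form `1-2345-67890-12-1` both pass, while `1234567890122` fails.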

Thailand Personal Identification Number narrow breadth


The narrow breadth detects a 13-digit number with checksum validation. It also requires the
presence of a Thai Personal ID Number-related keyword.

Table 45-1116 Thailand Personal Identification Number narrow-breadth patterns

Pattern

[1-8]\d{12}

[1-8][ -]\d{4}[ -]\d{5}[ -]\d{2}[ -]\d

Table 45-1117 Thailand Personal Identification Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.



Thailand Personal ID Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

PID, Insurance Number, Personal ID Number, personal identification no., unique
identification no., personalidno#, insuranceno#, personalIdno#, uniqueIdno#,
personal identity no

ประกันภัยจำนวน, หมายเลขประจำตัวส่วนบุคคล, หมายเลขประจำตัวที่ไม่ซ้ำกัน, ประกันภัยจำนวน#,
หมายเลขประจำตัวส่วนบุคคล#, หมายเลขประจำตัวที่ไม่ซ้ำกัน#

Turkish Identification Number


The Turkish Identification Number (T.C. Kimlik No.) is a unique 11-digit personal identification
number that is assigned to every citizen of Turkey.
The Turkish Identification Number data identifier detects an 11-digit number that matches the
Turkish Identification Number format.
The Turkish Identification Number data identifier provides three breadths of detection:
■ The wide breadth detects an 11-digit number without checksum validation.
See “Turkish Identification Number wide breadth” on page 1521.
■ The medium breadth detects an 11-digit number with checksum validation.
See “Turkish Identification Number medium breadth” on page 1522.
■ The narrow breadth detects an 11-digit number with checksum validation. It also requires
the presence of related keywords.
See “Turkish Identification Number narrow breadth” on page 1522.

Turkish Identification Number wide breadth


The wide breadth detects an 11-digit number without checksum validation.

Table 45-1118 Turkish Identification Number wide-breadth pattern

Pattern

[123456789]\d{10}

Table 45-1119 Turkish Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Turkish Identification Number medium breadth


The medium breadth detects an 11-digit number with checksum validation.

Table 45-1120 Turkish Identification Number medium-breadth pattern

Pattern

[123456789]\d{10}

Table 45-1121 Turkish Identification Number medium-breadth validators

Mandatory validator Description

Turkish Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Number delimiter Validates a match by checking the surrounding numbers.
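
The checksum step can be illustrated with the following Python sketch. It assumes the publicly documented rule for the Turkish Identification Number, in which the tenth and eleventh digits are derived from the preceding digits. The function name is hypothetical and the product's validator may differ.

```python
def turkish_id_checksum_ok(value: str) -> bool:
    """Validate the two trailing check digits of an 11-digit number."""
    if len(value) != 11 or not (value.isascii() and value.isdigit()):
        return False
    if value[0] == "0":
        return False
    d = [int(ch) for ch in value]
    # Tenth digit: 7 * (1st+3rd+5th+7th+9th) - (2nd+4th+6th+8th), modulo 10.
    tenth = (7 * sum(d[0:9:2]) - sum(d[1:8:2])) % 10
    # Eleventh digit: sum of the first ten digits, modulo 10.
    return d[9] == tenth and d[10] == sum(d[:10]) % 10
```

For example, `10000000146` passes both checks, and `10000000147` fails the final one.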

Turkish Identification Number narrow breadth


The narrow breadth detects an 11-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-1122 Turkish Identification Number narrow-breadth patterns

Pattern

[123456789]\d{10}

Table 45-1123 Turkish Identification Number narrow-breadth validators

Mandatory validator Description

Turkish Identification Number Validation Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Identification Number, Personal identification number, Citizen ID, personal id no, id no#,
citizen id no, identity number, Personal identity no.

Kimlik Numarası, Türkiye Cumhuriyeti Kimlik Numarası, vatandaş kimliği, kişisel kimlik no,
kimlik Numarası#, vatandaş kimlik numarası, Kişisel kimlik Numarası

UK Bank Account Number Sort Code


Sort codes are bank codes used to route money transfers between banks within their respective
countries via their respective clearance organizations.
The UK Bank Account Number Sort Code data identifier detects a six-digit number that matches
the UK Bank Account Number Sort Code format.
The UK Bank Account Number Sort Code data identifier provides three breadths of detection:
■ The wide breadth detects a six-digit number without checksum validation.
See “UK Bank Account Number Sort Code wide breadth” on page 1523.
■ The medium breadth detects a six-digit number with checksum validation.
See “UK Bank Account Number Sort Code medium breadth” on page 1524.
■ The narrow breadth detects a six-digit number with checksum validation. It also requires
the presence of related keywords.
See “UK Bank Account Number Sort Code narrow breadth” on page 1524.

UK Bank Account Number Sort Code wide breadth


The wide breadth detects a six-digit number without checksum validation.

Table 45-1124 UK Bank Account Number Sort Code wide-breadth patterns

Patterns

\d{2}-\d{2}-\d{2}

\d{6}

Table 45-1125 UK Bank Account Number Sort Code wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding numbers.

Duplicate digits Ensures that a string of digits is not all the same.
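
The following Python sketch shows one plausible reading of how the two wide-breadth validators narrow the pattern matches. It assumes that the Number delimiter check rejects a candidate that is directly adjacent to another letter or digit. The guide does not spell out the exact rule, so treat the lookarounds and the function name as illustrative.

```python
import re

# Assumption: a match must not touch another alphanumeric character.
SORT_CODE_RE = re.compile(
    r"(?<![0-9A-Za-z])(?:\d{2}-\d{2}-\d{2}|\d{6})(?![0-9A-Za-z])"
)

def find_sort_codes(text: str) -> list:
    """Return delimited six-digit candidates whose digits are not all equal."""
    hits = []
    for match in SORT_CODE_RE.finditer(text):
        digits = match.group().replace("-", "")
        if len(set(digits)) > 1:  # Duplicate digits check
            hits.append(match.group())
    return hits
```

Under these assumptions, `find_sort_codes("sort code 40-47-84 ref")` returns the single candidate `40-47-84`, while a seven-digit run or `111111` yields no match.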

UK Bank Account Number Sort Code medium breadth


The medium breadth detects a six-digit number with checksum validation.

Table 45-1126 UK Bank Account Number Sort Code medium-breadth patterns

Patterns

\d{2}-\d{2}-\d{2}

\d{6}

Table 45-1127 UK Bank Account Number Sort Code medium-breadth validator

Mandatory validator Description

UK Bank Account Number Sort Code Check Computes the checksum and validates the pattern against
it.

UK Bank Account Number Sort Code narrow breadth


The narrow breadth detects a six-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-1128 UK Bank Account Number Sort Code narrow-breadth patterns

Patterns

\d{2}-\d{2}-\d{2}

\d{6}

Table 45-1129 UK Bank Account Number Sort Code narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding numbers.

UK Bank Account Number Sort Code Check Computes the checksum and validates the pattern against
it.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords At least one of the following keywords or key phrases must
be present for the data to match:

Inputs:

uk bank account sort code, uk bank sortcode, sort code, sortcode, sorting code

UK Drivers Licence Number


The UK Drivers Licence Number is the identification number for an individual's driver's license
issued by the Driver and Vehicle Licensing Agency of the United Kingdom.
The UK Drivers Licence Number data identifier detects a 16-character alphanumeric pattern
that matches the UK Drivers Licence number format.
The UK Drivers Licence Number data identifier provides three breadths of validation:
■ The wide breadth detects a 16-character alphanumeric pattern without validation.
See “UK Drivers Licence Number wide breadth” on page 1525.
■ The medium breadth detects a 16-character alphanumeric pattern with checksum validation.
See “UK Drivers Licence Number medium breadth” on page 1526.
■ The narrow breadth detects a 16-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.
See “UK Drivers Licence Number narrow breadth” on page 1526.

UK Drivers Licence Number wide breadth


The wide breadth detects 16-character alphanumeric patterns in the following format:
AAAAAD[0,1,5,6]DDDDAAALL, where A is an alphanumeric character, D a digit, and L a letter.

Note: This breadth option does not include any validators.



Table 45-1130 UK Drivers Licence Number wide-breadth patterns

Pattern

\w{5}\d[0156]\d{4}\w{3}\l{2}

\w{5} \d[0156]\d{4} \w{3}\l{2}

\w{5}\d[0156]\d{4}\w{3}\l{2}\d{2}

\w{5} \d[0156]\d{4} \w{3}\l{2}\d{2}

UK Drivers Licence Number medium breadth


The medium breadth detects 16-character alphanumeric patterns in the following format:
AAAAAD[0,1,5,6]DDDDAAALL, where A is an alphanumeric character, D a digit, and L a letter.
The second digit in the numeric section is restricted to 0, 1, 5, or 6. In addition, the 4th and
5th digits in the numeric section must be between 01 and 31, inclusive.

Table 45-1131 UK Drivers Licence Number medium-breadth patterns

Pattern

\w{5}\d[0156]\d{4}\w{3}\l{2}

\w{5} \d[0156]\d{4} \w{3}\l{2}

\w{5}\d[0156]\d{4}\w{3}\l{2}\d{2}

\w{5} \d[0156]\d{4} \w{3}\l{2}\d{2}

Table 45-1132 UK Drivers Licence Number medium-breadth validator

Mandatory validator Description

UK Drivers License Every UK drivers license must be 16 characters and the number at the 8th and 9th
position must be larger than 00 and smaller than 32.
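
The day-of-month rule can be sketched in Python as follows. The sketch makes two assumptions: the "8th and 9th position" is counted from zero (which places it on the 4th and 5th digits of the numeric section), and the `\l` letter class is written as `[A-Za-z]` because Python's `re` module has no `\l`. The function name is hypothetical, and only the 16-character patterns are covered.

```python
import re

# 16-character form only; an optional space after each group is removed first.
LICENCE_RE = re.compile(r"\w{5}\d[0156]\d{4}\w{3}[A-Za-z]{2}")

def uk_licence_day_ok(value: str) -> bool:
    """Check the pattern, then require the two day digits to be 01-31."""
    compact = value.replace(" ", "")
    if not LICENCE_RE.fullmatch(compact):
        return False
    day = int(compact[8:10])  # zero-based positions 8 and 9
    return 0 < day < 32
```

For example, `MORGA657054SM9IJ` passes (day 05), while `MORGA657004SM9IJ` (day 00) and `MORGA657324SM9IJ` (day 32) are rejected.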

UK Drivers Licence Number narrow breadth


The narrow breadth detects 16-character alphanumeric patterns in the following format:
AAAAAD[0,1,5,6]DDDDAAALL, where A is an alphanumeric character, D is a digit, and L is
a letter.
The second digit in the numeric section is restricted to 0, 1, 5, or 6. In addition, the 4th and
5th digits in the numeric section must be between 01 and 31, inclusive.
The narrow breadth also requires the presence of both a driver's license-related keyword AND
a UK-related keyword.

Table 45-1133 UK Drivers Licence Number narrow-breadth patterns

Pattern

\w{5}\d[0156]\d{4}\w{3}\l{2}

\w{5} \d[0156]\d{4} \w{3}\l{2}

\w{5}\d[0156]\d{4}\w{3}\l{2}\d{2}

\w{5} \d[0156]\d{4} \w{3}\l{2}\d{2}

Table 45-1134 UK Drivers Licence Number narrow-breadth validators

Mandatory validator Description

UK Drivers License Every UK drivers license must be 16 characters and the number at the 8th and 9th
position must be larger than 00 and smaller than 32.

Find keywords: driver's license-related At least one of the following keywords or key phrases must
be present for the data to match:

Inputs:

driver license, drivers license, driver's license, driver licenses, drivers licenses,
driver's licenses, driver licence, drivers licence, driver's licence, driver licences,
drivers licences, driver's licences, dl#, dls#, lic#, lics#

Find keywords: UK-related At least one of the following keywords or key phrases must be present for
the data to match:

Inputs:

british, the united kingdom, uk, united kingdom, unitedkingdom

UK Electoral Roll Number


The Electoral Roll Number is the identification number issued to an individual for UK election
registration. The format of this number is specified by the UK Government Standards of the
UK Cabinet Office.
The UK Electoral Roll Number data identifier detects the presence of a UK Electoral Roll Number.
It implements a pattern to detect strings consisting of 2 to 3 letters, followed by 1 to 4 digits.

Table 45-1135 UK Electoral Roll Number narrow-breadth pattern

Pattern

\l{2,3}\d{1,4}

The narrow breadth of the Electoral Roll Number data identifier implements two validators to
require the presence of an electoral number-related keyword and a UK-related keyword.

Table 45-1136 UK Electoral Roll Number narrow-breadth validators

Mandatory validators Description

Find keywords: electoral number-related At least one of the following keywords or key phrases must
be present for the data to match:

electoral #, electoral number, electoral roll #, electoral roll no., electoral roll
number, electoral roll#, electoral#, electoralnumber, electoralroll#,
electoralrollno

Find keywords: UK-related At least one of the following keywords or key phrases must be present for the data
to match:

british, the united kingdom, uk, united kingdom, unitedkingdom
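
Because the pattern itself is very permissive, the two keyword validators do most of the work. The following Python sketch shows the AND relationship between them. It is only an approximation: the guide does not describe how keywords are matched, so the case-insensitive whole-word matching, the use of `[A-Za-z]` for `\l`, and the function names are all assumptions.

```python
import re

ROLL_RE = re.compile(r"[A-Za-z]{2,3}\d{1,4}")

ELECTORAL_KEYWORDS = (
    "electoral #", "electoral number", "electoral roll #", "electoral roll no.",
    "electoral roll number", "electoral roll#", "electoral#", "electoralnumber",
    "electoralroll#", "electoralrollno",
)
UK_KEYWORDS = ("british", "the united kingdom", "uk", "united kingdom", "unitedkingdom")

def has_keyword(text: str, keywords) -> bool:
    """Assumed rule: case-insensitive match not embedded in a longer word."""
    lowered = text.lower()
    return any(
        re.search(r"(?<!\w)" + re.escape(k) + r"(?!\w)", lowered) for k in keywords
    )

def electoral_roll_match(text: str) -> bool:
    """Require the pattern AND one keyword from each of the two lists."""
    return (
        ROLL_RE.search(text) is not None
        and has_keyword(text, ELECTORAL_KEYWORDS)
        and has_keyword(text, UK_KEYWORDS)
    )
```

In this sketch, "UK electoral roll number AB1234" matches, but the same text without "UK" does not.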

UK National Health Service (NHS) Number


The UK National Health Service (NHS) Number is the personal identification number issued
by the U.K. National Health Service (NHS) for administration of medical care.
The UK National Health Service (NHS) Number data identifier detects a 10-digit number that
matches the UK National Health Service number format.
This data identifier provides two breadths of validation:
■ The medium breadth detects a 10-digit number with checksum validation. It also requires
the presence of related keywords.
See “UK National Health Service (NHS) Number medium breadth” on page 1529.
■ The narrow breadth detects a 10-digit number with checksum validation. It also requires
the presence of related keywords.
See “UK National Health Service (NHS) Number narrow breadth” on page 1529.

Note: This data identifier does not provide a wide breadth option.

UK National Health Service (NHS) Number medium breadth


The medium breadth implements patterns to detect numbers in the currently defined NHS
format, DDD-DDD-DDDD (where D is a digit), with various separators.

Table 45-1137 UK National Health Service (NHS) Number medium-breadth patterns

Pattern Description

\d{3}.\d{3}.\d{4} Pattern for detecting the format DDD-DDD-DDDD separated by periods.

\d{3} \d{3} \d{4} Pattern for detecting the format DDD-DDD-DDDD separated by spaces.

\d{3}-\d{3}-\d{4} Pattern for detecting the format DDD-DDD-DDDD separated by dashes.

The medium breadth implements three validators: one to validate the NHS checksum, another
to perform numerical validation using the final digit, and a third to check for the presence of
an NHS-related keyword.

Table 45-1138 UK National Health Service (NHS) Number medium-breadth validators

Validator Description

UK NHS Computes the checksum and validates the pattern against it.

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords: NHS-related At least one of the following keywords or key phrases must
be present for the data to match:

national health service, NHS
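
As an illustration of the checksum step, the following Python sketch applies the publicly documented NHS Number modulus-11 rule (weights 10 down to 2 over the first nine digits). The function name is hypothetical, and the product's UK NHS validator may be implemented differently.

```python
def nhs_checksum_ok(value: str) -> bool:
    """Validate the tenth digit; separators between groups are ignored."""
    digits = [int(ch) for ch in value if ch.isdigit()]
    if len(digits) != 10:
        return False
    # Weights 10, 9, ..., 2 applied to the first nine digits.
    total = sum(d * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - total % 11
    if check == 11:
        check = 0
    # A computed value of 10 means the number is not valid.
    return check != 10 and check == digits[9]
```

For example, `943 476 5919` and `943-476-5919` both pass, while `943 476 5918` fails.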

UK National Health Service (NHS) Number narrow breadth


The narrow breadth implements patterns to detect numbers in the currently defined format:
DDD-DDD-DDDD (where D is a digit), separated with dashes or spaces.

Table 45-1139 UK National Health Service (NHS) Number narrow-breadth patterns

Pattern Description

\d{3} \d{3} \d{4} Pattern for detecting the format DDD-DDD-DDDD separated by spaces.

\d{3}-\d{3}-\d{4} Pattern for detecting the format DDD-DDD-DDDD separated by dashes.

The narrow breadth implements four validators: one to validate the NHS checksum, another
to perform numerical validation using the final digit, a third to require the presence of an
NHS-related keyword, and a fourth to require the presence of a UK-related keyword.

Table 45-1140 UK National Health Service (NHS) Number narrow-breadth validators

Mandatory validator Description

UK NHS Computes the checksum and validates the pattern against it.

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords: NHS-related At least one of the following keywords or key phrases must
be present for the data to match.

Inputs:

national health service, NHS

Find keywords: UK-related At least one of the following keywords or key phrases must
be present for the data to match.

Inputs:

uk, united kingdom, britain, england, gb

UK National Insurance Number


The UK National Insurance Number is issued by the United Kingdom Department for Work
and Pensions (DWP) to identify an individual for the national insurance program. It is also
known as an NI number, NINO, or NINo.
The UK National Insurance Number data identifier detects a nine-character alphanumeric
pattern that matches the UK National Insurance Number format.
The UK National Insurance Number data identifier provides three breadths of validation:
■ The wide breadth detects nine-character alphanumeric patterns without validation.
See “UK National Insurance Number wide breadth” on page 1531.
■ The medium breadth detects nine-character alphanumeric patterns without validation.
See “UK National Insurance Number medium breadth” on page 1531.
■ The narrow breadth detects nine-character alphanumeric patterns without validation. It
requires the presence of related keywords.
See “UK National Insurance Number narrow breadth” on page 1531.

UK National Insurance Number wide breadth


The wide breadth detects nine-character alphanumeric patterns in the format LL DD DD DD
L (where L is a letter and D is a digit), separated by spaces, periods, dashes, or together in a
string.
The first and second letter cannot be D, F, I, Q, U and V. The second letter also cannot be O.

Table 45-1141 UK National Insurance Number wide-breadth patterns

Patterns Description

[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z].\d{2}.\d{2}.\d{2}-[ABCD] Separated by periods.

[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z]\d{2}\d{2}\d{2}[ABCD] Not separated.

[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z] \d{2} \d{2} \d{2} [ABCD] Separated by spaces.

[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z]-\d{2}-\d{2}-\d{2}-[ABCD] Separated by dashes.

[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z] \d{6} [ABCD] Digits in a string.
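
The letter restrictions are easiest to see by running the patterns. The following Python sketch copies the five wide-breadth patterns from the table and tests whole strings against them. The helper name is hypothetical, and `re.fullmatch` is used only to keep the illustration simple.

```python
import re

PREFIX = r"[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z]"

WIDE_PATTERNS = (
    PREFIX + r".\d{2}.\d{2}.\d{2}-[ABCD]",   # separated by periods
    PREFIX + r"\d{2}\d{2}\d{2}[ABCD]",       # not separated
    PREFIX + r" \d{2} \d{2} \d{2} [ABCD]",   # separated by spaces
    PREFIX + r"-\d{2}-\d{2}-\d{2}-[ABCD]",   # separated by dashes
    PREFIX + r" \d{6} [ABCD]",               # digits in a string
)

def matches_nino_wide(value: str) -> bool:
    """Return True if the whole string matches any wide-breadth pattern."""
    return any(re.fullmatch(p, value) for p in WIDE_PATTERNS)
```

For example, `AB123456C` and `AB 12 34 56 C` match, while `QQ123456C` (excluded first letter) and `AO123456C` (O as the second letter) do not.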

UK National Insurance Number medium breadth


The medium breadth detects nine-character alphanumeric patterns in the format LL DD DD
DD L (where L is a letter and D is a digit), separated by spaces or together in a string.
The first and second letter cannot be D, F, I, Q, U and V; the second letter cannot be O.

Table 45-1142 UK National Insurance Number medium-breadth patterns

Patterns Description

[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z]\d{2}\d{2}\d{2}[ABCD] Not delimited.

[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z] \d{2} \d{2} \d{2} [ABCD] Separated by spaces.

[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z] \d{6} [ABCD] Characters in a string.

UK National Insurance Number narrow breadth


The narrow breadth detects nine-character alphanumeric patterns in the format LL DD DD DD
L (where L is a letter and D is a digit), separated by spaces or together in a string.
The first and second letter cannot be D, F, I, Q, U and V. The second letter also cannot be O.

Table 45-1143 UK National Insurance Number narrow-breadth patterns

Pattern Description

[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z]\d{2}\d{2}\d{2}[ABCD] Not delimited.

[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z] \d{2} \d{2} \d{2} [ABCD] Separated by spaces.

[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z] \d{6} [ABCD] Characters in a string.

Table 45-1144 UK National Insurance Number narrow-breadth validator

Mandatory validator Description

Find keywords: Insurance-related At least one of the following keywords or key phrases must be present for the
data to match:

insurance no., insurance number, insurance#, insurancenumber, national insurance number,
nationalinsurance#, nationalinsurancenumber, nin, nino

UK Passport Number
The UK Passport Number identifies a United Kingdom passport using the current official
specification of the UK Government Standards of the UK Cabinet Office.
The UK Passport Number data identifier detects a nine-digit number that matches the UK
Passport Number format.
This data identifier provides three breadths of validation:
■ The wide breadth detects a nine-digit number without validation.
See “UK Passport Number wide breadth” on page 1532.
■ The medium breadth detects a nine-digit number without checksum validation. It requires
the presence of related keywords.
See “UK Passport Number medium breadth” on page 1533.
■ The narrow breadth detects a nine-digit number without checksum validation. It requires
the presence of related keywords.
See “UK Passport Number narrow breadth” on page 1533.

UK Passport Number wide breadth


The wide breadth detects a nine-digit number without validation.

Note: The wide breadth does not include any validators.

Table 45-1145 UK Passport Number wide-breadth pattern

Pattern Description

\d{9} Pattern for detecting 9-digit numbers.

UK Passport Number medium breadth


The medium breadth detects a nine-digit number without checksum validation. It requires the
presence of related keywords.

Table 45-1146 UK Passport Number medium-breadth pattern

Pattern Description

\d{9} Pattern for detecting 9-digit numbers.

Table 45-1147 UK Passport Number medium-breadth validators

Mandatory validator Description

Exclude beginning characters Data beginning with any of the following list of values is not matched:

123456789

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords: Passport-related At least one of the following keywords or key phrases must be present for the
data to match.

Inputs:

passport, passport#, passportID, passportno, passportnumber

UK Passport Number narrow breadth


The narrow breadth detects a nine-digit number without checksum validation. It requires the
presence of related keywords.

Table 45-1148 UK Passport Number narrow-breadth pattern

Pattern Description

\d{9} Pattern for detecting 9-digit numbers.



Table 45-1149 UK Passport Number narrow-breadth validators

Mandatory validator Description

Exclude beginning characters Data beginning with any of the following list of values is not matched:
123456789

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords: Passport-related At least one of the following keywords or key phrases must be present for the
data to match.

Inputs:

passport, passport#, passportID, passportno, passportnumber

Find keywords: UK-related At least one of the following keywords or key phrases must be present for the
data to match.

Inputs:

uk, united kingdom, britain, england, gb

UK Tax ID Number
The UK Tax ID Number is a personal identification number provided by the UK Government
Standards of the UK Cabinet Office.
The UK Tax ID Number data identifier detects a 10-digit number that matches the UK Tax ID
number format.
The UK Tax ID Number data identifier provides three breadths of validation:
■ The wide breadth detects a 10-digit number without validation.
See “UK Tax ID Number wide breadth” on page 1534.
■ The medium breadth detects a 10-digit number without checksum validation.
See “UK Tax ID Number medium breadth” on page 1535.
■ The narrow breadth detects a 10-digit number without checksum validation. It requires the
presence of related keywords.
See “UK Tax ID Number narrow breadth” on page 1535.

UK Tax ID Number wide breadth


The wide breadth detects a 10-digit number without validation.

Note: The wide breadth of the UK Tax ID Number data identifier does not include any validators.

Table 45-1150 UK Tax ID Number wide-breadth pattern

Pattern Description

\d{10} Pattern for detecting 10-digit numbers.

UK Tax ID Number medium breadth


The medium breadth detects a 10-digit number without checksum validation.

Table 45-1151 UK Tax ID Number medium-breadth pattern

Pattern Description

\d{10} Pattern for detecting 10-digit numbers.

Table 45-1152 UK Tax ID Number medium-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

0123456789, 1234567890, 9876543210, 0987654321
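
The effect of the two medium-breadth validators on a 10-digit candidate can be sketched in Python as follows. The function name is hypothetical, and the sketch checks a single isolated string rather than scanning a document.

```python
import re

# Sequences listed for the Exclude beginning characters validator.
EXCLUDED_PREFIXES = ("0123456789", "1234567890", "9876543210", "0987654321")

def uk_tax_id_candidate_ok(value: str) -> bool:
    """Apply the pattern, the Duplicate digits check, and the exclusion list."""
    if not re.fullmatch(r"\d{10}", value):
        return False
    if len(set(value)) == 1:  # all ten digits are the same
        return False
    return not value.startswith(EXCLUDED_PREFIXES)
```

For example, `1234567891` is kept, while `1234567890` (excluded sequence) and `5555555555` (duplicate digits) are rejected.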

UK Tax ID Number narrow breadth


The narrow breadth detects a 10-digit number without checksum validation. It requires the
presence of related keywords.

Table 45-1153 UK Tax ID Number narrow-breadth pattern

Pattern Description

\d{10} Pattern for detecting 10-digit numbers.

Table 45-1154 UK Tax ID Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Exclude beginning characters Data beginning with any of the following list of values is
not matched:

0123456789, 1234567890, 9876543210, 0987654321


Find keywords: Tax ID-related At least one of the following keywords or key phrases must
be present for the data to match:

tax id, tax id no., tax id number, tax identification, tax identification#, tax no., tax#,
taxid#

UK Value Added Tax (VAT) Number


VAT is a consumption tax that is borne by the end consumer. VAT is paid for each transaction
in the manufacturing and distribution process. For the United Kingdom, the VAT number is
issued by the VAT office for the region in which the business is established.
The UK Value Added Tax (VAT) Number data identifier detects a 7- to 14-character
alphanumeric pattern that matches the UK Value Added Tax (VAT) Number format.
The UK Value Added Tax (VAT) Number data identifier provides three breadths of detection:
■ The wide breadth detects a 7- to 14-character alphanumeric pattern beginning with GB
without checksum validation.
See “UK Value Added Tax (VAT) Number wide breadth” on page 1536.
■ The medium breadth detects a 7- to 14-character alphanumeric pattern beginning with GB
with checksum validation.
See “UK Value Added Tax (VAT) Number medium breadth” on page 1537.
■ The narrow breadth detects a 7- to 14-character alphanumeric pattern beginning with GB
with checksum validation. It also requires the presence of related keywords.
See “UK Value Added Tax (VAT) Number narrow breadth” on page 1538.

UK Value Added Tax (VAT) Number wide breadth


The wide breadth detects a 7- to 14-character alphanumeric pattern without checksum
validation.

Table 45-1155 UK Value Added Tax (VAT) Number wide-breadth patterns

Patterns

[Gg][Bb][Gg][Dd]\d{3}

[Gg][Bb][Hh][Aa]\d{3}

[Gg][Bb][Gg][Dd] \d{3}

[Gg][Bb][Hh][Aa] \d{3}

[Gg][Bb]\d{9}

[Gg][Bb]\d{12}

[Gg][Bb] \d{9}

[Gg][Bb] \d{12}

[Gg][Bb]\d{3} \d{4} \d{2}

[Gg][Bb]\d{3} \d{4} \d{2} \d{3}

Table 45-1156 UK Value Added Tax (VAT) Number wide-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333, 444444444, 555555555, 666666666, 777777777,
888888888, 999999999, 000000000000, 111111111111, 222222222222, 333333333333,
444444444444, 555555555555, 666666666666, 777777777777, 888888888888, 999999999999,
000, 111, 222, 333, 444, 555, 666, 777, 888, 999
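The wide-breadth patterns and the ending exclusions listed above can be sketched in Python as follows. This is only an illustration: the function name `is_wide_vat_candidate` is hypothetical, the use of `re.fullmatch` on an already isolated value stands in for the Number delimiter validator, and treating "ending with" as a plain suffix test is an assumption.

```python
import re

# Wide-breadth patterns copied from the table above.
VAT_PATTERNS = [
    r"[Gg][Bb][Gg][Dd]\d{3}",
    r"[Gg][Bb][Hh][Aa]\d{3}",
    r"[Gg][Bb][Gg][Dd] \d{3}",
    r"[Gg][Bb][Hh][Aa] \d{3}",
    r"[Gg][Bb]\d{9}",
    r"[Gg][Bb]\d{12}",
    r"[Gg][Bb] \d{9}",
    r"[Gg][Bb] \d{12}",
    r"[Gg][Bb]\d{3} \d{4} \d{2}",
    r"[Gg][Bb]\d{3} \d{4} \d{2} \d{3}",
]

# Exclude ending characters: each digit repeated 9, 12, or 3 times.
EXCLUDED_ENDINGS = tuple(str(d) * n for n in (9, 12, 3) for d in range(10))


def is_wide_vat_candidate(value: str) -> bool:
    """Check one isolated value against the patterns and the exclusion list."""
    if not any(re.fullmatch(p, value) for p in VAT_PATTERNS):
        return False
    return not value.endswith(EXCLUDED_ENDINGS)
```

The medium and narrow breadths add a checksum validator whose algorithm is not described in this section, so it is not sketched here.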

UK Value Added Tax (VAT) Number medium breadth


The medium breadth detects a 7- to 14-character alphanumeric pattern with checksum
validation.

Table 45-1157 UK Value Added Tax (VAT) Number medium-breadth patterns

Patterns

[Gg][Bb][Gg][Dd]\d{3}

[Gg][Bb][Hh][Aa]\d{3}

[Gg][Bb][Gg][Dd] \d{3}

[Gg][Bb][Hh][Aa] \d{3}

[Gg][Bb]\d{9}

[Gg][Bb]\d{12}

[Gg][Bb] \d{9}

[Gg][Bb] \d{12}

[Gg][Bb]\d{3} \d{4} \d{2}

[Gg][Bb]\d{3} \d{4} \d{2} \d{3}

Table 45-1158 UK Value Added Tax (VAT) Number medium-breadth validator

Mandatory validator Description

UK VAT Number Validation Check Computes the checksum and validates the pattern against
it.

UK Value Added Tax (VAT) Number narrow breadth


The narrow breadth detects a 7- to 14-character alphanumeric pattern with checksum validation.
It also requires the presence of related keywords.

Table 45-1159 UK Value Added Tax (VAT) Number narrow-breadth patterns

Pattern

[Gg][Bb][Gg][Dd]\d{3}

[Gg][Bb][Hh][Aa]\d{3}

[Gg][Bb][Gg][Dd] \d{3}

[Gg][Bb][Hh][Aa] \d{3}

[Gg][Bb]\d{9}

[Gg][Bb]\d{12}

[Gg][Bb] \d{9}

[Gg][Bb] \d{12}

[Gg][Bb]\d{3} \d{4} \d{2}

[Gg][Bb]\d{3} \d{4} \d{2} \d{3}

Table 45-1160 UK Value Added Tax (VAT) Number narrow-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333, 444444444, 555555555, 666666666, 777777777,
888888888, 999999999

000000000000, 111111111111, 222222222222, 333333333333, 444444444444, 555555555555,
666666666666, 777777777777, 888888888888, 999999999999

000, 111, 222, 333, 444, 555, 666, 777, 888, 999

UK VAT Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to match:

Inputs:

vat no., vat number, vat#, value added tax number, vat no

Ukraine Identity Card


A Ukraine Identity Card has a 15-digit record number issued to citizens of Ukraine. It is used
as a form of identification taking the place of Ukraine's domestic passport as of January 2016.
The Ukraine Identity Card data identifier detects a 15-digit number that matches the Ukraine
Identity Card format.
The Ukraine Identity Card data identifier provides three breadths of detection:
■ The wide breadth detects a 15-digit number without checksum validation.
See “Ukraine Identity Card wide breadth” on page 1540.
■ The medium breadth detects a 15-digit number with checksum validation.
See “Ukraine Identity Card medium breadth” on page 1540.
■ The narrow breadth detects a 15-digit number with checksum validation. It also requires
the presence of related keywords.
See “Ukraine Identity Card narrow breadth” on page 1541.

Ukraine Identity Card wide breadth


The wide breadth detects a 15-digit number without checksum validation.

Table 45-1161 Ukraine Identity Card wide-breadth patterns

Pattern

\d{4}[01]\d[0123]\d-\d{7}

\d{4}[01]\d[0123]\d{8}

\d{4}[01]\d[0123]\d \d{7}

Table 45-1162 Ukraine Identity Card wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.
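The three wide-breadth patterns differ only in how the first eight digits are joined to the last seven (hyphen, nothing, or a space), and each constrains the fifth digit to 0 or 1 and the seventh digit to 0 through 3. The following Python sketch applies them together with the Duplicate digits validator. The function name `is_wide_ukraine_id` is hypothetical, and matching an already isolated value with `re.fullmatch` is an assumption made for the example.

```python
import re

# Wide-breadth patterns copied from the table above.
UA_ID_PATTERNS = [
    r"\d{4}[01]\d[0123]\d-\d{7}",
    r"\d{4}[01]\d[0123]\d{8}",
    r"\d{4}[01]\d[0123]\d \d{7}",
]


def is_wide_ukraine_id(value: str) -> bool:
    """Pattern match plus the Duplicate digits validator."""
    if not any(re.fullmatch(p, value) for p in UA_ID_PATTERNS):
        return False
    digits = re.sub(r"\D", "", value)  # drop the hyphen or space
    return len(set(digits)) > 1        # reject when all digits are the same
```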

Ukraine Identity Card medium breadth


The medium breadth detects a 15-digit number with checksum validation.

Table 45-1163 Ukraine Identity Card medium-breadth patterns

Pattern

\d{4}[01]\d[0123]\d-\d{7}

\d{4}[01]\d[0123]\d{8}

\d{4}[01]\d[0123]\d \d{7}

Table 45-1164 Ukraine Identity Card medium-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.


Ukraine Identity Card Check Computes the checksum and validates the pattern against
it.

Ukraine Identity Card narrow breadth


The narrow breadth detects a 15-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-1165 Ukraine Identity Card narrow-breadth patterns

Pattern

\d{4}[01]\d[0123]\d-\d{7}

\d{4}[01]\d[0123]\d{8}

\d{4}[01]\d[0123]\d \d{7}

Table 45-1166 Ukraine Identity Card narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.

Ukraine Identity Card Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

Ukraine Identity Card, identity card

посвідчення особи України

Ukraine Passport (Domestic)


The Ukraine domestic passport is an identity document issued to citizens of Ukraine for
domestic use. It has been replaced by the Ukraine Identity Card as of 2016, but any existing
passports are still valid.
The Ukraine Passport (Domestic) data identifier detects a nine-digit number that matches the
Ukraine Passport (Domestic) format.
The Ukraine Passport (Domestic) data identifier provides two breadths of detection:
■ The wide breadth detects a nine-digit number without checksum validation.
See “Ukraine Passport (Domestic) wide breadth” on page 1542.
■ The narrow breadth detects a nine-digit number. It also requires the presence of related
keywords.
See “Ukraine Passport (Domestic) narrow breadth” on page 1542.

Ukraine Passport (Domestic) wide breadth


The wide breadth detects a nine-digit number without checksum validation.

Table 45-1167 Ukraine Passport (Domestic) wide-breadth pattern

Pattern

\d{9}

Table 45-1168 Ukraine Passport (Domestic) wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Ukraine Passport (Domestic) narrow breadth


The narrow breadth detects a nine-digit number. It also requires the presence of related
keywords.

Table 45-1169 Ukraine Passport (Domestic) narrow-breadth pattern

Pattern

\d{9}

Table 45-1170 Ukraine Passport (Domestic) narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.


Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

passport, Ukraine Passport, passport number, passport no

паспорт, паспорт України, номер паспорта, персональний

Ukraine Passport (International)


The Ukraine international passport is a document used by citizens of Ukraine to travel outside
of Ukraine.
The Ukraine Passport (International) data identifier detects an eight-character alphanumeric
pattern that matches the Ukraine Passport (International) format.
The Ukraine Passport (International) data identifier provides two breadths of detection:
■ The wide breadth detects an eight-character alphanumeric pattern without checksum
validation.
See “Ukraine Passport (International) wide breadth” on page 1543.
■ The narrow breadth detects an eight-character alphanumeric pattern without checksum
validation. It also requires the presence of related keywords.
See “Ukraine Passport (International) narrow breadth” on page 1544.

Ukraine Passport (International) wide breadth


The wide breadth detects an eight-character alphanumeric pattern without checksum validation.

Table 45-1171 Ukraine Passport (International) wide-breadth pattern

Pattern

\w{2}\d{6}

Table 45-1172 Ukraine Passport (International) wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Ukraine Passport (International) narrow breadth


The narrow breadth detects an eight-character alphanumeric pattern without checksum
validation. It also requires the presence of related keywords.

Table 45-1173 Ukraine Passport (International) narrow-breadth pattern

Pattern

\w{2}\d{6}

Table 45-1174 Ukraine Passport (International) narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding numbers.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

passport, Ukraine Passport, passport number, passport no

паспорт, паспорт України, номер паспорта

United Arab Emirates Personal Number


In the United Arab Emirates, every citizen or resident has a unique personal identification number.
The United Arab Emirates Personal Number is used for identity verification by the government
and some private entities.
The United Arab Emirates Personal Number data identifier detects a 15-digit number that
matches the United Arab Emirates Personal Number format.
The United Arab Emirates Personal Number data identifier provides three breadths of detection:
■ The wide breadth detects a 15-digit number without checksum validation.
See “United Arab Emirates Personal Number wide breadth” on page 1545.
■ The medium breadth detects a 15-digit number with checksum validation.
See “United Arab Emirates Personal Number medium breadth” on page 1545.
■ The narrow breadth detects a 15-digit number with checksum validation. It also requires
the presence of related keywords.
See “United Arab Emirates Personal Number narrow breadth” on page 1545.

United Arab Emirates Personal Number wide breadth


The wide breadth detects a 15-digit number without checksum validation.

Table 45-1175 United Arab Emirates Personal Number wide-breadth patterns

Pattern

\d{15}

\d{3}-\d{4}-\d{7}-\d{1}

Table 45-1176 United Arab Emirates Personal Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

United Arab Emirates Personal Number medium breadth


The medium breadth detects a 15-digit number with checksum validation.

Table 45-1177 United Arab Emirates Personal Number medium-breadth patterns

Pattern

\d{15}

\d{3}-\d{4}-\d{7}-\d{1}

Table 45-1178 United Arab Emirates Personal Number medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Luhn Check Computes the checksum and validates the pattern against
it.
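To make the checksum step concrete, here is a minimal Python sketch of the two medium-breadth patterns followed by a Luhn check. It assumes that the validator named Luhn Check applies the standard Luhn (mod 10) algorithm over all 15 digits; this section does not spell out the computation. The function names are hypothetical.

```python
import re

# Medium-breadth patterns copied from the table above.
UAE_PATTERNS = [r"\d{15}", r"\d{3}-\d{4}-\d{7}-\d{1}"]


def luhn_valid(digits: str) -> bool:
    """Standard Luhn check: double every second digit from the right."""
    if not digits or not digits.isdigit():
        return False
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # every second digit, counting from the right
            d *= 2
            if d > 9:       # same as summing the two digits of the product
                d -= 9
        total += d
    return total % 10 == 0


def is_medium_uae_personal_number(value: str) -> bool:
    """Pattern match on an isolated value, then the assumed Luhn check."""
    if not any(re.fullmatch(p, value) for p in UAE_PATTERNS):
        return False
    return luhn_valid(value.replace("-", ""))
```

The narrow breadth would additionally require the Duplicate digits validator and one of the listed keywords.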

United Arab Emirates Personal Number narrow breadth


The narrow breadth detects a 15-digit number with checksum validation. It also requires the
presence of related keywords.

Table 45-1179 United Arab Emirates Personal Number narrow-breadth patterns

Pattern

\d{15}

\d{3}-\d{4}-\d{7}-\d{1}

Table 45-1180 United Arab Emirates Personal Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Luhn Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

PID, Insurance Number, Personal ID Number, personal identification no., unique identification
no., personal identity no, personalidno#, insuranceno#, personalIdno#, uniqueIdno#

,‫ هوية فريدة‬,‫ التأمين رقم‬,‫ فريدة من نوعها هوية رقم‬,‫الهوية الشخصية رقم‬
‫التأمينرقم‬#

US Individual Tax Identification Number (ITIN)


The US Individual Tax Identification Number (ITIN) is used for tax processing and is issued
by the United States Internal Revenue Service (IRS). The IRS issues ITINs to track individuals
who are not eligible to obtain Social Security numbers.
The US Individual Tax Identification Number (ITIN) data identifier detects nine-digit numbers
that match the US ITIN format.
The US Individual Tax Identification Number (ITIN) data identifier provides three breadths of
validation:
■ The wide breadth detects nine-digit numbers without validation.
See “US Individual Tax Identification Number (ITIN) wide breadth” on page 1547.
■ The medium breadth detects nine-digit numbers without checksum validation.
See “US Individual Tax Identification Number (ITIN) medium breadth” on page 1547.
■ The narrow breadth detects nine-digit numbers without checksum validation. It requires
the presence of related keywords.
See “US Individual Tax Identification Number (ITIN) narrow breadth” on page 1548.

US Individual Tax Identification Number (ITIN) wide breadth


The wide breadth detects nine-digit numbers with the pattern DDD-DD-DDDD separated with
dashes, spaces, periods, slashes, or without separators.
The number must begin with a 9 and have a 7 or 8 as the fourth digit.

Note: The wide breadth of the US Individual Tax Identification Number (ITIN) data identifier
does not include any validators.

Table 45-1181 US Individual Tax Identification Number (ITIN) wide-breadth patterns

Patterns

9\d\d[78]\d\d\d\d\d

9\d\d[.- ][78]\d[.- ]\d\d\d\d

9\d\d[/][78]\d[/]\d\d\d\d

9\d\d[\\][78]\d[\\]\d\d\d\d
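For readers who want to try these patterns outside the product, the sketch below ports them to Python's `re` module. One adjustment is needed: the character class `[.- ]` is rejected by Python as an invalid range, so the hyphen is escaped as `[.\- ]`. The function name `is_wide_itin` is hypothetical, and the example assumes the candidate value has already been isolated from surrounding text.

```python
import re

# Wide-breadth patterns from the table above, adapted for Python.
ITIN_PATTERNS = [
    r"9\d\d[78]\d\d\d\d\d",                # no separators
    r"9\d\d[.\- ][78]\d[.\- ]\d\d\d\d",    # period, hyphen, or space (hyphen escaped)
    r"9\d\d[/][78]\d[/]\d\d\d\d",          # forward slashes
    r"9\d\d[\\][78]\d[\\]\d\d\d\d",        # backslashes
]

# Combine the alternatives into a single compiled expression.
ITIN_RE = re.compile("|".join(f"(?:{p})" for p in ITIN_PATTERNS))


def is_wide_itin(value: str) -> bool:
    """True when the whole value matches one of the wide-breadth patterns."""
    return ITIN_RE.fullmatch(value) is not None
```

Note that the second pattern, as written, accepts mixed separators such as a period followed by a hyphen.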

US Individual Tax Identification Number (ITIN) medium breadth


The medium breadth detects nine-digit numbers with the pattern DDD-DD-DDDD separated
with dashes, spaces, or periods.
The number must begin with a 9 and have a 7 or 8 as the fourth digit.

Table 45-1182 US Individual Tax Identification Number (ITIN) medium-breadth patterns

Patterns

9\d\d[78]\d\d\d\d\d

9\d\d[.- ][78]\d[.- ]\d\d\d\d

9\d\d[/][78]\d[/]\d\d\d\d

9\d\d[\\][78]\d[\\]\d\d\d\d

Table 45-1183 US Individual Tax Identification Number (ITIN) medium-breadth validator

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

US Individual Tax Identification Number (ITIN) narrow breadth


The narrow breadth detects nine-digit numbers with the pattern DDD-DD-DDDD separated
with dashes or spaces.
The number must begin with a 9 and have a 7 or 8 as the fourth digit.

Table 45-1184 US Individual Tax Identification Number (ITIN) narrow-breadth patterns

Patterns

9\d\d[78]\d\d\d\d\d

9\d\d[.- ][78]\d[.- ]\d\d\d\d

9\d\d[/][78]\d[/]\d\d\d\d

9\d\d[\\][78]\d[\\]\d\d\d\d

Table 45-1185 US Individual Tax Identification Number (ITIN) narrow-breadth validators

Mandatory validators Description

Number delimiter Validates a match by checking the surrounding characters.

Duplicate digits Ensures that a string of digits is not all the same.

Find keywords: ITIN-related At least one of the following keywords or key phrases must
be present for the data to be matched.

Inputs:

individual taxpayer identification number, itin, i.t.i.n.

US Passport Number
United States passports are passports issued to citizens and non-citizen nationals of the United
States of America. They are issued exclusively by the U.S. Department of State.
The US Passport Number data identifier detects an eight- or nine-digit number that matches
the US Passport Number format.
The US Passport Number data identifier provides two breadths of detection:
■ The wide breadth detects a valid US Passport Number pattern.
See “US Passport Number wide breadth” on page 1549.
■ The narrow breadth detects a valid US Passport Number pattern. It also requires the
presence of related keywords.
See “US Passport Number narrow breadth” on page 1549.

US Passport Number wide breadth


The wide breadth detects a valid US Passport Number pattern.

Table 45-1186 US Passport Number wide-breadth patterns

Patterns

\d{8}

\d{9}

Table 45-1187 US Passport Number wide-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

US Passport Number narrow breadth


The narrow breadth detects a valid US Passport Number pattern. It also requires the presence
of related keywords.

Table 45-1188 US Passport Number narrow-breadth patterns

Patterns

\d{8}

\d{9}

Table 45-1189 US Passport Number narrow-breadth validators

Mandatory validators Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.


Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

passport, Passport, U.S. Passport, u.s. passport, Passport Card, Passport Book, passport card,
passport book
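The way a keyword requirement narrows detection can be sketched as follows. This is an illustration only: the function name `narrow_passport_matches` is hypothetical, and it assumes that a keyword may appear anywhere in the inspected text and is compared case-sensitively, neither of which is stated in this section.

```python
import re

# Keywords copied from the Find keywords validator above.
KEYWORDS = [
    "passport", "Passport", "U.S. Passport", "u.s. passport",
    "Passport Card", "Passport Book", "passport card", "passport book",
]

# Eight or nine digits, with digit lookarounds standing in for the
# Number delimiter validator (an assumption).
PASSPORT_RE = re.compile(r"(?<!\d)(?:\d{9}|\d{8})(?!\d)")


def narrow_passport_matches(text: str) -> list:
    """Return candidate numbers only when at least one keyword is present."""
    if not any(keyword in text for keyword in KEYWORDS):
        return []
    # Duplicate digits validator: drop candidates made of one repeated digit.
    return [c for c in PASSPORT_RE.findall(text) if len(set(c)) > 1]
```

The same eight- or nine-digit string is therefore reported in a sentence that mentions a passport and ignored in one that does not.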

US Social Security Number (SSN)


Note: Starting with Symantec Data Loss Prevention version 12.5, the US Social Security
Number (SSN) data identifier is replaced by the Randomized US Social Security Number
(SSN) data identifier. Policy templates that use the US SSN data identifier are updated to use
the Randomized US SSN data identifier. Symantec recommends that you update your SSN
policies to use the Randomized US SSN data identifier. See “Randomized US Social Security
Number (SSN)” on page 1414.

The US Social Security Number (SSN) is a personal identification number issued
by the Social Security Administration of the United States government. Although primarily used
for administering the Social Security program, it is widely used as a personal identification
number for many purposes.
The US Social Security Number (SSN) data identifier detects nine-digit numbers that match
the US SSN format.
The US Social Security Number (SSN) data identifier provides three breadths of validation:
■ The wide breadth detects a nine-digit number without checksum validation.
See “US Social Security Number (SSN) wide breadth” on page 1551.
■ The medium breadth detects a nine-digit number without checksum validation.
See “US Social Security Number (SSN) medium breadth” on page 1551.
■ The narrow breadth detects a nine-digit number without checksum validation. It requires
the presence of related keywords.
See “US Social Security Number (SSN) narrow breadth” on page 1552.

US Social Security Number (SSN) wide breadth


The wide breadth detects nine-digit numbers with the pattern DDD-DD-DDDD separated with
dashes, spaces, periods, slashes, or without separators.

Table 45-1190 Social Security Number (SSN) wide-breadth patterns

Pattern Description

\d{3}-\d{2}-\d{4} Matches the standard SSN format, which is any three digits followed by a hyphen,
two digits, a hyphen, and any four digits.

\d{3}.\d{2}.\d{4} Matches the SSN format delimited by periods.

\d{3} \d{2} \d{4} Matches the SSN format delimited by spaces.

\d{3}\\\d{2}\\\d{4} Matches the SSN format delimited by backslashes.

\d{3}/\d{2}/\d{4} Matches the SSN format delimited by forward slashes.

\d{9} Matches any 9-digit number that is not delimited.

Table 45-1191 Social Security Number (SSN) wide-breadth validators

Validator Description

Number delimiter Validates a match by checking the surrounding characters.

Advanced SSN Checks whether SSN contains zeros in any group, the area number (first group)
is less than 773 and not 666, the delimiter between the groups is the same, the
number does not consist of all the same digits, and the number is not reserved
for advertising (123-45-6789, 987-65-432x).

SSN Area-Group number For a given area number (first group), not all group numbers (second group) might
have been assigned by the SSA. Validator eliminates SSNs with invalid group
numbers.
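The rules listed for the Advanced SSN validator can be expressed as a short sequence of checks. The Python sketch below is one possible reading of that description, not the product's implementation: it interprets "contains zeros in any group" as a group made up entirely of zeros, and the function name `advanced_ssn_check` is hypothetical. The SSN Area-Group number validator depends on SSA assignment data and is not modeled.

```python
import re

# Three groups with an optional separator captured after the first two.
SSN_RE = re.compile(r"(\d{3})([-. /\\]?)(\d{2})([-. /\\]?)(\d{4})")


def advanced_ssn_check(value: str) -> bool:
    """Apply the Advanced SSN rules to one isolated candidate value."""
    m = SSN_RE.fullmatch(value)
    if m is None:
        return False
    area, sep1, group, sep2, serial = m.groups()
    if sep1 != sep2:                       # delimiters must be the same
        return False
    if area == "000" or group == "00" or serial == "0000":
        return False                       # assumed meaning of "zeros in any group"
    if int(area) >= 773 or area == "666":  # area must be below 773 and not 666
        return False
    digits = area + group + serial
    if len(set(digits)) == 1:              # all digits identical
        return False
    # Numbers reserved for advertising: 123-45-6789 and 987-65-432x.
    if digits == "123456789" or digits.startswith("98765432"):
        return False
    return True
```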

US Social Security Number (SSN) medium breadth


The medium breadth detects nine-digit numbers with the pattern DDD-DD-DDDD separated
with dashes, spaces, or periods.

Table 45-1192 Social Security Number (SSN) medium-breadth patterns

Pattern Description

\d{3}-\d{2}-\d{4} Matches the standard SSN format, which is any three digits followed by a hyphen,
two digits, a hyphen, and any four digits.

\d{3}.\d{2}.\d{4} Matches the SSN format delimited by periods.

\d{3} \d{2} \d{4} Matches the SSN format delimited by spaces.

Table 45-1193 Social Security Number (SSN) medium-breadth validators

Validator Description

Number delimiter Validates a match by checking the surrounding characters.

Advanced SSN Checks whether SSN contains zeros in any group, the area number (first group)
is less than 773 and not 666, the delimiter between the groups is the same, the
number does not consist of all the same digits, and the number is not reserved
for advertising (123-45-6789, 987-65-432x).

SSN Area-Group number For a given area number (first group), not all group numbers (second group) might
have been assigned by the SSA. Validator eliminates SSNs with invalid group
numbers.

US Social Security Number (SSN) narrow breadth


The narrow breadth detects nine-digit numbers with the pattern DDD-DD-DDDD separated
with dashes or spaces or without separators.

Table 45-1194 US Social Security Number (SSN) narrow-breadth patterns

Pattern Description

\d{3}-\d{2}-\d{4} Matches the standard SSN format, which is any three digits followed by a hyphen,
two digits, a hyphen, and any four digits.

\d{3} \d{2} \d{4} Matches the SSN format delimited by spaces.

\d{9} Matches any 9-digit number not delimited.

Table 45-1195 Social Security Number (SSN) narrow-breadth validators

Mandatory Validator Description

Number Delimiter Validates a match by checking the surrounding characters.


Advanced SSN Checks whether SSN contains zeros in any group, the area number (first group)
is less than 773 and not 666, the delimiter between the groups is the same, the
number does not consist of all the same digits, and the number is not reserved
for advertising (123-45-6789, 987-65-432x).

SSN Area-Group number For a given area number (first group), not all group numbers (second group)
might have been assigned by the SSA. Validator eliminates SSNs with invalid
group numbers.

Find keywords: Social security-related At least one of the following keywords or key phrases
must be present for the data to be matched:

social security number, ssn, ss#

US ZIP+4 Postal Codes


In the United States, a ZIP+4 code uses the basic 5-digit code plus 4 additional digits to identify
a geographic segment within the 5-digit delivery area that could use an extra identifier to aid
in efficient mail sorting and delivery.
The US ZIP+4 Postal Codes data identifier detects valid US ZIP+4 Postal Code patterns.
The US ZIP+4 Postal Codes data identifier provides three breadths of detection:
■ The wide breadth detects a valid US ZIP+4 Postal Code pattern without checksum validation.
See “US ZIP+4 Postal Codes wide breadth” on page 1553.
■ The medium breadth detects a valid US ZIP+4 Postal Code pattern with checksum validation.
See “US ZIP+4 Postal Codes medium breadth” on page 1554.
■ The narrow breadth detects a valid US ZIP+4 Postal Code pattern with checksum validation.
It also requires the presence of related keywords.
See “US ZIP+4 Postal Codes narrow breadth” on page 1554.

US ZIP+4 Postal Codes wide breadth


The wide breadth detects a valid US ZIP+4 Postal Code pattern without checksum validation.

Table 45-1196 US ZIP+4 Postal Codes wide-breadth patterns

Pattern

\l{2}[ ]\d{5}[-]\d{4}

\l{2}[ ]\d{9}

Table 45-1197 US ZIP+4 Postal Codes wide-breadth validator

Mandatory validator Description

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333, 444444444, 555555555, 666666666, 777777777,
888888888, 999999999

US ZIP+4 Postal Codes medium breadth


The medium breadth detects a valid US ZIP+4 Postal Code pattern with checksum validation.

Table 45-1198 US ZIP+4 Postal Codes medium-breadth patterns

Patterns

\l{2}[ ]\d{5}[-]\d{4}

\l{2}[ ]\d{9}

Table 45-1199 US ZIP+4 Postal Codes medium-breadth validators

Mandatory validator Description

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333, 444444444, 555555555, 666666666, 777777777,
888888888, 999999999

Zip+4 Postal Codes Validation Check Computes the checksum and validates the pattern against
it.

US ZIP+4 Postal Codes narrow breadth


The narrow breadth detects a valid US ZIP+4 Postal Code pattern with checksum validation.
It also requires the presence of related keywords.

Table 45-1200 US ZIP+4 Postal Codes narrow-breadth patterns

Patterns

\l{2}[ ]\d{5}[-]\d{4}

\l{2}[ ]\d{9}

Table 45-1201 US ZIP+4 Postal Codes narrow-breadth validators

Mandatory validator Description

Exclude ending characters Data ending with any of the following list of values is not
matched:

000000000, 111111111, 222222222, 333333333, 444444444, 555555555, 666666666, 777777777,
888888888, 999999999

Zip+4 Postal Codes Validation Check Computes the checksum and validates the pattern against
it.

Find keywords With this option selected, at least one of the following
keywords or key phrases must be present for the data to
be matched.

Inputs:

US zip code, zip code, zip+4 code, US zip+4 code

Venezuela National Identification Number


In Venezuela, every citizen and resident has a unique Venezuela National Identification Number
(Venezuela Cédula de Identidad). The Venezuela National Identification Number is used on
identity documents, making it possible to match the number to a person.
The Venezuela National Identification Number data identifier detects a 10-character
alphanumeric pattern that matches the Venezuela National Identification Number format.
This data identifier provides the following breadths of detection:
■ The wide breadth detects a 10-character alphanumeric pattern without checksum validation.
See “Venezuela National Identification Number wide breadth” on page 1556.
■ The medium breadth detects a 10-character alphanumeric pattern with checksum validation.
See “Venezuela National Identification Number medium breadth” on page 1556.
■ The narrow breadth detects a 10-character alphanumeric pattern that passes checksum
validation. It also requires the presence of related keywords.
See “Venezuela National Identification Number narrow breadth” on page 1556.

Venezuela National Identification Number wide breadth


The wide breadth detects a 10-character alphanumeric pattern without checksum validation.

Table 45-1202 Venezuela National Identification Number wide-breadth patterns

Pattern

[VEJPGvejpg][-]\d{2}.\d{3}.\d{3}[-]\d

[VEJPGvejpg][-]\d{8}[-]\d

[VEJPGvejpg]\d{9}

Table 45-1203 Venezuela National Identification Number wide-breadth validator

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Venezuela National Identification Number medium breadth


The medium breadth detects a 10-character alphanumeric pattern with checksum validation.

Table 45-1204 Venezuela National Identification Number medium-breadth patterns

Pattern

[VEJPGvejpg][-]\d{2}.\d{3}.\d{3}[-]\d

[VEJPGvejpg][-]\d{8}[-]\d

[VEJPGvejpg]\d{9}

Table 45-1205 Venezuela National Identification Number medium-breadth validators

Mandatory validator Description

Number delimiter Validates a match by checking the surrounding characters.

Venezuela National ID Number Validation Check Computes the checksum and validates the pattern against
it.

Venezuela National Identification Number narrow breadth


The narrow breadth detects a 10-character alphanumeric pattern with checksum validation. It
also requires the presence of related keywords.

Table 45-1206 Venezuela National Identification Number narrow-breadth patterns

Pattern

[VEJPGvejpg][-]\d{2}.\d{3}.\d{3}[-]\d

[VEJPGvejpg][-]\d{8}[-]\d

[VEJPGvejpg]\d{9}

Table 45-1207 Venezuela National Identification Number narrow-breadth validators

Mandatory validator Description

Duplicate digits Ensures that a string of digits is not all the same.

Number delimiter Validates a match by checking the surrounding characters.

Venezuela National ID Number Validation Check Computes the checksum and validates the pattern against
it.

Find keywords At least one of the following keywords or key phrases must
be present for the data to be matched when you use this
option.

Inputs:

national ID number, NID, national identification number, national ID no, PID, insurance number,
personal ID number, personal identification no, unique identification no, personalidno#,
uniqueIDno#, nationalidno#, nationalidentityno#

cédula de identidad número, clave única de identidad, personal de identidad clave,
personal de identidad, número de identificación nacional, número ID nacional
Chapter 46
Library of policy templates
This chapter includes the following topics:

■ Caldicott Report policy template

■ Canadian Social Insurance Numbers policy template

■ CAN-SPAM Act policy template

■ Colombian Personal Data Protection Law 1581 policy template

■ Common Spyware Upload Sites policy template

■ Competitor Communications policy template

■ Confidential Documents policy template

■ Credit Card Numbers policy template

■ Customer Data Protection policy template

■ Data Protection Act 1998 policy template

■ Data Protection Directives (EU) policy template

■ Defense Message System (DMS) GENSER Classification policy template

■ Design Documents policy template

■ Employee Data Protection policy template

■ Encrypted Data policy template

■ Export Administration Regulations (EAR) policy template

■ FACTA 2003 (Red Flag Rules) policy template

■ Financial Information policy template



■ Forbidden Websites policy template

■ Gambling policy template

■ General Data Protection Regulation (Banking and Finance)

■ General Data Protection Regulation (Digital Identity)

■ General Data Protection Regulation (Government Identification)

■ General Data Protection Regulation (Healthcare and Insurance)

■ General Data Protection Regulation (Personal Profile)

■ General Data Protection Regulation (Travel)

■ Gramm-Leach-Bliley policy template

■ HIPAA and HITECH (including PHI) policy template

■ Human Rights Act 1998 policy template

■ Illegal Drugs policy template

■ Individual Taxpayer Identification Numbers (ITIN) policy template

■ International Traffic in Arms Regulations (ITAR) policy template

■ Media Files policy template

■ Medicare and Medicaid (including PHI)

■ Merger and Acquisition Agreements policy template

■ NASD Rule 2711 and NYSE Rules 351 and 472 policy template

■ NASD Rule 3010 and NYSE Rule 342 policy template

■ NERC Security Guidelines for Electric Utilities policy template

■ Network Diagrams policy template

■ Network Security policy template

■ Offensive Language policy template

■ Office of Foreign Assets Control (OFAC) policy template

■ OMB Memo 06-16 and FIPS 199 Regulations policy template

■ Password Files policy template

■ Payment Card Industry (PCI) Data Security Standard policy template



■ PIPEDA policy template

■ Price Information policy template

■ Project Data policy template

■ Proprietary Media Files policy template

■ Publishing Documents policy template

■ Racist Language policy template

■ Restricted Files policy template

■ Restricted Recipients policy template

■ Resumes policy template

■ Sarbanes-Oxley policy template

■ SEC Fair Disclosure Regulation policy template

■ Sexually Explicit Language policy template

■ Source Code policy template

■ State Data Privacy policy template

■ SWIFT Codes policy template

■ Symantec DLP Awareness and Avoidance policy template

■ UK Drivers License Numbers policy template

■ UK Electoral Roll Numbers policy template

■ UK National Health Service (NHS) Number policy template

■ UK National Insurance Numbers policy template

■ UK Passport Numbers policy template

■ UK Tax ID Numbers policy template

■ US Intelligence Control Markings (CAPCO) and DCID 1/7 policy template

■ US Social Security Numbers policy template

■ Violence and Weapons policy template

■ Webmail policy template

■ Yahoo Message Board Activity policy template



■ Yahoo and MSN Messengers on Port 80 policy template

Caldicott Report policy template


The UK Chief Medical Officer commissioned the Caldicott Report in December 1997 to improve
the way the National Health Service handles and protects patient information. The Caldicott
Committee reviewed the confidentiality of data throughout the NHS for purposes other than
direct care, medical research, or where there is a statutory requirement for information. Its
recommendations are now being put into practice throughout the NHS and in the Health
Protection Agency.
The Drug, Disease, and Treatment keyword lists are updated with recent keywords
based on information from the U.S. Food and Drug Administration (FDA) and other sources.
See “Keep the keyword lists for your HIPAA and Caldicott policies up to date” on page 850.

Table 46-1 Caldicott Report policy template rules

Rule Type Description

Rule: Patient Data and Drug Keywords
Type: Compound EDM and Keyword Rule
Description: This compound rule looks for a match among the following EDM data fields in
combination with a keyword from the "Prescription Drug Names" dictionary. Both conditions must
be satisfied for the rule to trigger an incident.

■ Account number
■ Email
■ ID card number
■ Last name
■ Phone
■ UK NHS (National Health Service) number
■ UK NIN (National Insurance Number)

Rule: Patient Data and Disease Keywords
Type: Compound EDM and Keyword Rule
Description: This compound rule looks for a match among the following EDM data fields in
combination with a keyword from the "Disease Names" dictionary. Both conditions must be
satisfied for the rule to trigger an incident.

■ Account number
■ Email
■ ID card number
■ Last name
■ Phone
■ UK NHS (National Health Service) number
■ UK NIN (National Insurance Number)

Rule: Patient Data and Treatment Keywords
Type: Compound EDM and Keyword Rule
Description: This compound rule looks for a match among the following EDM data fields in
combination with a keyword from the "Medical Treatment Keywords" dictionary. Both conditions
must be satisfied for the rule to trigger an incident:

■ Account number
■ Email
■ ID card number
■ Last name
■ Phone
■ UK NHS (National Health Service) number
■ UK NIN (National Insurance Number)

Rule: UK NHS Number and Drug Keywords
Type: Simple DCM Rule
Description: This rule looks for a keyword from "UK NIN Keywords" dictionary in combination
with a pattern matching the UK NIN data identifier and a keyword from the "Prescription Drug
Names" dictionary.

Rule: UK NHS Number and Disease Keywords
Type: Simple DCM Rule
Description: This rule looks for a keyword from "UK NIN Keywords" dictionary in combination
with a pattern matching the UK NIN data identifier and a keyword from the "Disease Names"
dictionary.

Rule: UK NHS Number and Treatment Keywords
Type: Simple DCM Rule
Description: This rule looks for a keyword from "UK NIN Keywords" dictionary in combination
with a pattern matching the UK NIN data identifier and a keyword from the "Medical Treatment
Keywords" dictionary.

See “Choosing an Exact Data Profile” on page 409.


See “Configuring policies” on page 413.
See “Exporting policy detection as a template” on page 442.

Canadian Social Insurance Numbers policy template


This policy detects patterns indicating Canadian social insurance numbers (SINs) at risk of
exposure.

DCM Rule Canadian Social Insurance Numbers

This rule looks for a match to the Canadian Social Insurance Number data identifier
and a keyword from the "Canadian Social Ins. No. Words" dictionary.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

CAN-SPAM Act policy template


The Controlling the Assault of Non-Solicited Pornography and Marketing Act (CAN-SPAM)
establishes requirements for those who send commercial email.
The CAN-SPAM Act template detects activity from an organization's bulk mailer to help ensure
compliance with the CAN-SPAM Act requirements.
The detection exception Exclude emails that contain the mandated keywords allows
messages to pass that have one or more keywords from the user-defined "CAN-SPAM
Exception Keywords" dictionary.

Table 46-2 Detection exception: Exclude emails that contain the mandated keywords

Method Condition Configuration

Method: Simple exception
Condition: Content Matches Keyword (DCM)
Configuration: Exclude emails that contain the mandated keywords (Keyword Match):
■ Match keyword from "[physical postal address]" or "advertisement".
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.
Note: After you define the keywords, you can choose to count all matches and require 2
keywords from the list to be matched.
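
The keyword settings above can be illustrated with a short Python sketch. It assumes that
"whole words only" means the keyword is not adjacent to another word character, and that
requiring 2 keywords means two distinct keywords from the list must appear. Both readings are
assumptions, the function names are hypothetical, and the postal address is a made-up stand-in
for the user-defined "[physical postal address]" entry.

```python
import re


def count_keyword_matches(text, keywords):
    """Count the distinct keywords found as whole words, ignoring case."""
    matched = 0
    for keyword in keywords:
        # Lookarounds keep the keyword from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            matched += 1
    return matched


def exception_applies(text, keywords, required=1):
    """Return True when enough keywords are present for the exception to apply."""
    return count_keyword_matches(text, keywords) >= required


# Hypothetical keyword list for illustration only.
keywords = ["advertisement", "123 Main Street"]
```

With `required=2`, a message must contain both the word "advertisement" and the postal address
before it is excluded from detection. A message that contains only "advertisements" is not
excluded, because the keyword does not appear as a whole word.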

The detection exception CAN-SPAM Compliant Emails excludes from detection any document
content that matches the selected IDM index by at least 90%.

Table 46-3 Detection exception: CAN-SPAM Compliant Emails

Method Condition Configuration

Method: Simple exception
Condition: Content Matches Document Profile (IDM)
Configuration: Exception for CAN-SPAM compliant emails (IDM):
■ Exact content match (90%)
■ Look in the message body and attachments.
■ Check for existence.

See “Choosing an Indexed Document Profile” on page 411.

If an exception is not met, the detection rule Monitor Email From Bulk Mailer looks for a
sender's email address that matches one from the "Bulk Mailer Email Address" list, which is
user-defined.

Table 46-4 Detection rule: Monitor Email From Bulk Mailer

Method Condition Configuration

Method: Simple rule
Condition: Sender/User Matches Pattern (DCM)
Configuration: Monitor Email From Bulk Mailer (Sender):
■ Match sender pattern(s): [[email protected]] (user defined)
■ Severity: High.

See “Creating a policy from a template” on page 397.


See “Exporting policy detection as a template” on page 442.

Colombian Personal Data Protection Law 1581 policy template

The Colombian Personal Data Protection Law 1581 policy template detects the personal data
of Colombian citizens at risk of exposure.

Table 46-5
Rule Type Description

Rule: Colombian Address Number (Data Identifiers)
Type: DCM Rule
Description: This rule detects Colombian street addresses using the Colombian Addresses data
identifier.

Rule: Colombian Cell Phone Number (Data Identifiers)
Type: DCM Rule
Description: This rule detects Colombian cell phone numbers using the Colombian Cell Phone
Number data identifier.

Rule: Colombian Personal Identification Number (Data Identifiers)
Type: DCM Rule
Description: This rule detects Colombian personal identification numbers using the Colombian
Personal Identification Number data identifier.

Rule: Colombian Tax Identification Number (Data Identifiers)
Type: DCM Rule
Description: This rule detects Colombian tax identification numbers using the Colombian Tax
Identification Number data identifier.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Common Spyware Upload Sites policy template


The Common Spyware Upload Sites policy detects access to common spyware upload Web
sites.

DCM Rule Forbidden Websites 1

This is a compound rule that looks for either specified IP addresses or URLs in the
"Forbidden Websites 1" dictionary.

DCM Rule Forbidden Websites 2

This rule looks for a match of a specified URL in the "Forbidden Websites 2"
dictionary.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Competitor Communications policy template


The Competitor Communications policy detects forbidden communications with competitors.

DCM Rule Competitor List

This rule looks for keywords (domains) from the "Competitor Domains" dictionary,
which is user-defined.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Confidential Documents policy template


This policy detects company-confidential documents at risk of exposure.

Table 46-6 Rules comprising the Confidential Documents template

Rule Type Description

Rule: Confidential Documents, Indexed
Type: Simple IDM Rule with one condition
Description: This rule looks for content from specific documents registered as confidential.
It returns a match if 80% or more of the source document is found. If you do not have an
Indexed Document Profile configured, this rule is dropped.

Rule: Confidential Documents
Type: Compound DCM Rule: Attachment/File Type and Keyword Match. Both conditions must match
for the rule to trigger an incident.
Description: This rule looks for a combination of keywords from the "Confidential Keywords"
list and the following file types:
■ Microsoft Excel Macro
■ Microsoft Excel
■ Microsoft Works Spreadsheet
■ SYLK Spreadsheet
■ Corel Quattro Pro
■ Multiplan Spreadsheet
■ Comma Separated Values
■ Applix Spreadsheets
■ Lotus 1-2-3
■ Microsoft Word
■ Adobe PDF
■ Microsoft PowerPoint

Rule: Proprietary Documents
Type: Compound DCM Rule: Attachment/File Type and Keyword Match
Description: This compound rule looks for a combination of keywords from the "Proprietary
Keywords" dictionary and the file types listed above.

Rule: Internal Use Only Documents
Type: Compound DCM Rule: Attachment/File Type and Keyword Match
Description: This compound rule looks for a combination of keywords from the "Internal Use
Only Keywords" dictionary and the file types listed above.

Rule: Documents Not For Distribution
Type: Compound DCM Rule: Attachment/File Type and Keyword Match
Description: This compound rule looks for a combination of keywords from the "Not For
Distribution Words" dictionary and the file types listed above.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Credit Card Numbers policy template


This policy detects patterns indicating credit card numbers at risk of exposure.

DCM Rule Credit Card Numbers, All

This rule looks for a match to the credit card number system pattern and a keyword
from the "Credit Card Number Keywords" dictionary.

See “Configuring policies” on page 413.



See “Exporting policy detection as a template” on page 442.

Customer Data Protection policy template


This policy detects customer data at risk of exposure.

Table 46-7 EDM conditions for the Customer Data Protection policy template

Rule name Type Description Details

Rule name: Username/Password Combinations
Type: EDM Rule
Description: This rule looks for usernames and passwords in combination with three or more of
the following fields:
■ SSN
■ Phone
■ Email
■ First Name
■ Last Name
■ Bank Card number
■ Account Number
■ ABA Routing Number
■ Canadian Social Insurance Number
■ UK National Insurance Number
Details: However, the following combinations are not a violation:
■ Phone, email, and last name
■ Email, first name, and last name
■ Phone, first name, and last name

Rule name: Date of Birth
Type: EDM Rule
Description: This rule looks for any three of the following data fields in combination:
■ SSN
■ Phone
■ Email
■ First Name
■ Last Name
■ Bank Card number
■ Account Number
■ ABA Routing Number
■ Canadian Social Insurance Number
■ UK National Insurance Number
■ Date of Birth
Details: However, the following combinations are not a violation:
■ Phone, email, and first name
■ Phone, email, and last name
■ Email, first name, and last name
■ Phone, first name, and last name

Exact SSN or CCN EDM Rule This rule looks for an exact social
security number or bank card number.

Customer Directory EDM Rule This rule looks for Phone or Email.
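
The "any three fields, except for certain combinations" logic of the Date of Birth rule can be
sketched in Python as follows. This is an illustration only, not the detection engine's
algorithm. It assumes that a record is a violation when at least one three-field combination
among the matched fields is not on the excluded list; how the product treats matches of more
than three fields is not stated in this guide. The function name is hypothetical.

```python
from itertools import combinations

# Fields listed for the Date of Birth rule in Table 46-7.
DATE_OF_BIRTH_RULE_FIELDS = frozenset({
    "SSN", "Phone", "Email", "First Name", "Last Name", "Bank Card number",
    "Account Number", "ABA Routing Number", "Canadian Social Insurance Number",
    "UK National Insurance Number", "Date of Birth",
})

# Combinations that the table lists as not being a violation.
EXCLUDED_COMBINATIONS = {
    frozenset({"Phone", "Email", "First Name"}),
    frozenset({"Phone", "Email", "Last Name"}),
    frozenset({"Email", "First Name", "Last Name"}),
    frozenset({"Phone", "First Name", "Last Name"}),
}


def is_violation(matched_fields, required=3):
    """Return True if some combination of `required` matched fields is not excluded."""
    candidates = sorted(set(matched_fields) & DATE_OF_BIRTH_RULE_FIELDS)
    for combo in combinations(candidates, required):
        if frozenset(combo) not in EXCLUDED_COMBINATIONS:
            return True
    return False
```

Under these assumptions, a match on Phone, Email, and First Name is not reported, but adding
Date of Birth to the same match is reported because the combination Phone, Email, and Date of
Birth is not excluded.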

Table 46-8 DCM conditions for the Customer Data Protection policy template

Rule name Type Description Details

Rule name: US Social Security Number Patterns
Type: Compound DCM Rule
Description: This rule looks for a match to the Randomized US Social Security number data
identifier and a keyword from the "US SSN Keywords" dictionary.
Details: See “Randomized US Social Security Number (SSN)” on page 1414.

Rule name: Credit Card Numbers, All
Type: Compound DCM Rule
Description: This rule looks for a match to the credit card number system pattern and a
keyword from the "Credit Card Number Keywords" dictionary.
Details: See “Credit Card Number” on page 1095.

Rule name: ABA Routing Numbers
Type: Compound DCM Rule
Description: This rule looks for a match to the ABA Routing number data identifier and a
keyword from the "ABA Routing Number Keywords" dictionary.
Details: See “ABA Routing Number” on page 1013.

See “About the Exact Data Profile and index” on page 528.
See “Configuring policies” on page 413.
See “Exporting policy detection as a template” on page 442.

Data Protection Act 1998 policy template


The Data Protection Act 1998, which replaced the Data Protection Act 1984, sets standards that
must be satisfied when obtaining, holding, using, or disposing of personal data in the UK. The
Data Protection Act 1998 covers anything with personally identifiable information (such as data
about personal health, employment, occupational health, finance, suppliers, and contractors).

Table 46-9 Data Protection Act 1998, Personal Data detection rule

Description

This EDM rule looks for three of the following columns of data:
■ NIN (National Insurance Number)
■ Account number
■ Pin
■ Bank card number
■ First name
■ Last name
■ Drivers license
■ Password
■ Tax payer ID
■ UK NHS number
■ Date of birth
■ Mother's maiden name
■ Email address
■ Phone number

However, the following combinations are not an incident:
■ First name, last name, pin
■ First name, last name, password
■ First name, last name, email
■ First name, last name, phone
■ First name, last name, mother's maiden name

Table 46-10 Additional detection rules in the Data Protection Act 1998 policy template

Description

The UK Electoral Roll Numbers rule implements the UK Electoral Roll Number data identifier.

See “UK Electoral Roll Number” on page 1527.

The UK National Insurance Numbers rule implements the narrow breadth edition of the UK National Insurance
Number data identifier.

See “UK National Insurance Number” on page 1530.

The UK Tax ID Numbers rule implements the narrow breadth edition of the UK Tax ID Number data identifier.

See “UK Tax ID Number” on page 1534.

The UK Drivers License Numbers rule implements the narrow breadth edition of the UK Driver's License number
data identifier.

See “UK Drivers Licence Number” on page 1525.

The UK Passport Numbers rule implements the narrow breadth edition of the UK Passport Number data identifier.

See “UK Passport Number” on page 1532.



The UK NHS Numbers rule implements the narrow breadth edition of the UK National Health Service (NHS) Number
data identifier.

See “UK National Health Service (NHS) Number” on page 1528.

See “Choosing an Exact Data Profile” on page 409.


See “Configuring policies” on page 413.
See “Exporting policy detection as a template” on page 442.

Data Protection Directives (EU) policy template


Directive 95/46/EC of the European Parliament deals with the protection of individuals with
regard to the processing and free movement of personal data. This policy detects personal
data specific to the EU directive.

Note: The General Data Protection Regulation (GDPR) replaces the EU Data Protection
Directives as of 25 May 2018.

Table 46-11 Data Protection Directives (EU) detection rule

Method Description

EDM Rule EU Data Protection Directives


This rule looks for any two of the following data columns:

■ Last Name
■ Bank Card number
■ Drivers license number
■ Account Number
■ PIN
■ Medical account number
■ Medical ID card number
■ User name
■ Password
■ ABA Routing Number
■ Email
■ Phone
■ Mother's maiden name
However, the following combinations do not create a match:

■ Last name, email
■ Last name, phone
■ Last name, account number
■ Last name, username

EDM Rule EU Data Protection, Contact Info

This rule looks for any two of the following data columns: last name, phone, account number,
username, and email.

Exception Except for email internal to the EU

This exception applies if the recipient is within the EU. It covers recipients with any of the
country codes from the "EU Country Codes" dictionary.

See “Choosing an Exact Data Profile” on page 409.


See “Configuring policies” on page 413.
See “Exporting policy detection as a template” on page 442.

Defense Message System (DMS) GENSER Classification policy template

The Defense Information Systems Agency has established guidelines for Defense Message
System (DMS) General Services (GENSER) message classifications, categories, and markings.
These standards specify how to mark classified and sensitive documents according to U.S.
standards. These standards also provide interoperability with NATO countries and other U.S.
allies.
The GENSER policy template enforces GENSER guidelines by detecting information that is
classified as confidential. The template contains four simple (single condition) keyword matching
(DCM) detection rules. If any rule condition matches, the policy reports an incident.
The detection rule Top Secret Information (Keyword Match) looks for any keywords in the
"Top Secret Information" dictionary.

Table 46-12 Detection rule: Top Secret Information (Keyword Match)

Method Condition Configuration

Method: Simple rule
Condition: Content Matches Keyword (DCM)
Configuration: Top Secret Information (Keyword Match):
■ Keyword dictionary: "TOP SECRET//"
■ Severity: High
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case sensitive.
■ Match on whole or partial words.

The detection rule Secret Information (Keyword Match) looks for any keywords in the "Secret
Information" dictionary.

Table 46-13 Detection rule: Secret Information (Keyword Match)

Method Condition Configuration

Method: Simple rule
Condition: Content Matches Keyword (DCM)
Configuration: Secret Information (Keyword Match):
■ Keyword dictionary: "SECRET//"
■ Severity: High
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case sensitive.
■ Match on whole or partial words.

The detection rule Classified or Restricted Information (Keyword Match) looks for any
keywords in the "Classified or Restricted Information" dictionary.

Table 46-14 Detection rule: Classified or Restricted Information (Keyword Match)

Method Condition Configuration

Method: Simple rule
Condition: Content Matches Keyword (DCM)
Configuration: Classified or Restricted Information (Keyword Match):
■ Keyword dictionary: "CLASSIFIED//,//RESTRICTED//"
■ Severity: High
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case sensitive.
■ Match on whole or partial words.

The detection rule Other Sensitive Information looks for any keywords in the "Other Sensitive
Information" dictionary.

Table 46-15 Other Sensitive Information detection rule

Method Condition Configuration

Method: Simple rule
Condition: Content Matches Keyword (DCM)
Configuration: Other Sensitive Information (Keyword Match):
■ Keyword dictionary: FOR OFFICIAL USE ONLY, SENSITIVE BUT UNCLASSIFIED, DOD UNCLASSIFIED
CONTROLLED NUCLEAR INFORMATION
■ Severity: High
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case sensitive.
■ Match on whole words only.
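
The GENSER rules differ mainly in their case and word-boundary settings. The following Python
sketch shows one way to model those two options. It is an assumption-laden illustration: the
function name is hypothetical, and "whole words only" is interpreted as the keyword not being
adjacent to another word character, which may differ from how the detection engine tokenizes
content.

```python
import re


def keyword_found(text, keyword, case_sensitive=True, whole_words_only=False):
    """Return True if the keyword occurs in the text under the given settings."""
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.escape(keyword)
    if whole_words_only:
        # Reject matches that are embedded in a longer word.
        pattern = r"(?<!\w)" + pattern + r"(?!\w)"
    return re.search(pattern, text, flags) is not None
```

With the default settings (case sensitive, whole or partial words), the text
"Marking: TOP SECRET//NOFORN" matches the keyword "TOP SECRET//", while "top secret//" does
not. With `whole_words_only=True`, as in the Other Sensitive Information rule, a keyword that
is embedded in a longer word is not matched.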

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Design Documents policy template


This policy detects various types of design documents, such as CAD/CAM, at risk of exposure.

IDM Rule Design Documents, Indexed

This rule looks for content from specific design documents registered as proprietary.
It returns a match if the engine detects 80% or more of the source document.

DCM Rule Design Document Extensions

This rule looks for the specified file name extensions found in the "Design Document
Extensions" dictionary.

DCM Rule Design Documents

This rule looks for the following specified file types:

■ cad_draw
■ dwg

Note: Both file types and file name extensions are used because the policy does not detect
the true file type for all the required documents.

See “Choosing an Indexed Document Profile” on page 411.


See “Configuring policies” on page 413.
See “Exporting policy detection as a template” on page 442.

Employee Data Protection policy template


This policy detects employee data at risk of exposure.

Table 46-16 EDM rules for Employee Data Protection

Name Type Description

Username/Password Combinations EDM Rule This rule looks for usernames and passwords in
combination with any three of the following data fields.

■ SSN
■ Phone
■ Email
■ First Name
■ Last Name
■ Bank Card Number
■ Account Number
■ ABA Routing Number
■ Canadian Social Insurance Number
■ UK National Insurance Number
■ Date of Birth

Employee Directory EDM Rule This rule looks for Phone or Email.

Table 46-17 DCM rules for Employee Data Protection

Name Type Description

US Social Security Number Patterns DCM Rule This rule looks for a match from the Randomized US Social
Security Number (SSN) data identifier and a keyword from
the "US SSN Keywords" dictionary.

See “Randomized US Social Security Number (SSN)” on page 1414.

Credit Card Numbers, All DCM Rule This rule looks for a match from the credit card number
system pattern and a keyword from the "Credit Card
Number Keywords" dictionary.

See “Credit Card Number” on page 1095.

ABA Routing Numbers DCM Rule This rule looks for a match from the ABA Routing number
data identifier and a keyword from the "ABA Routing
Number Keywords" dictionary.

See “ABA Routing Number” on page 1013.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Encrypted Data policy template


This policy detects the use of encryption by a variety of methods including S/MIME, PGP,
GPG, and file password protection.

DCM Rule Password Protected Files

This rule looks for the following file types: encrypted_zip, encrypted_doc,
encrypted_xls, or encrypted_ppt.

DCM Rule PGP Files

This rule looks for the following file type: pgp.

DCM Rule GPG Files

This rule looks for a keyword from the "GPG Encryption Keywords" dictionary.

DCM Rule S/MIME

This rule looks for a keyword from the "S/MIME Encryption Keywords" dictionary.

DCM Rule HushMail Transmissions

This rule looks for a match from a list of recipient URLs.



See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Export Administration Regulations (EAR) policy template

The U.S. Department of Commerce enforces the Export Administration Regulations (EAR).
These regulations primarily cover technologies and technical information with commercial and
military applicability. These technologies, also known as dual-use technologies, include
chemicals, satellites, software, and computers.
This Export Administration Regulations (EAR) template detects violations involving regulated
countries and controlled technologies.
The detection rule Indexed EAR Commerce Control List Items and Recipients looks for a
country code in the recipient from the "EAR Country Codes" dictionary and for a specific "SKU"
from an Exact Data Profile index (EDM). Both conditions must match to trigger an incident.

Table 46-18 Detection rule: Indexed EAR Commerce Control List Items and Recipients

Method Condition Configuration

Method: Compound rule
Condition: Content Matches Exact Data (EDM)
Configuration: See “Choosing an Exact Data Profile” on page 409.

Condition: Content Matches Keyword (DCM)
Configuration: See “Configuring the Content Matches Keyword condition” on page 844.

The detection rule EAR Commerce Control List and Recipients looks for a country code in
the recipient from the "EAR Country Codes" list and a keyword from the "EAR CCL Keywords"
dictionary. Both conditions must match to trigger an incident.

Table 46-19 Detection rule: EAR Commerce Control List and Recipients

Method Condition Configuration

Method: Compound rule
Condition: Recipient Matches Pattern (DCM)
Configuration: EAR Commerce Control List and Recipients (Recipient):
■ Match: Email address OR URL domain suffixes.
■ Severity: High.
■ Check for existence.
■ At least 1 recipient(s) must match.
■ Matches on entire message.

Condition: Content Matches Keyword (DCM)
Configuration: EAR Commerce Control List and Recipients (Keyword Match):
■ Match: EAR CCL Keywords
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.
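
The compound rule above reports an incident only when both the recipient condition and the
keyword condition match. A minimal Python sketch of that AND logic follows. The dictionary
contents are placeholders, because the actual "EAR Country Codes" and "EAR CCL Keywords"
entries are not reproduced in this section. The function names are hypothetical, and the
recipient check handles only email addresses and bare domain names.

```python
import re

# Placeholder values for illustration; not the contents of the product dictionaries.
EAR_COUNTRY_CODES = {"zz"}
EAR_CCL_KEYWORDS = ["placeholder item"]


def recipient_matches(recipients, country_codes):
    """At least one recipient must end in a listed domain suffix."""
    for recipient in recipients:
        domain = recipient.rsplit("@", 1)[-1]
        suffix = domain.rsplit(".", 1)[-1].lower()
        if suffix in country_codes:
            return True
    return False


def keyword_matches(text, keywords):
    """Case-insensitive, whole-word keyword check."""
    return any(
        re.search(r"(?<!\w)" + re.escape(k) + r"(?!\w)", text, re.IGNORECASE)
        for k in keywords
    )


def compound_rule_matches(recipients, text, country_codes, keywords):
    """Both conditions must match for the rule to trigger an incident."""
    return recipient_matches(recipients, country_codes) and keyword_matches(text, keywords)
```

In this sketch, a message that mentions a listed keyword but is addressed only to recipients
outside the listed country codes does not trigger an incident, and neither does a message to a
listed country code that contains no listed keyword.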

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

FACTA 2003 (Red Flag Rules) policy template


This policy helps to address sections 114 and 315 (or Red Flag Rules) of the Fair and Accurate
Credit Transactions Act (FACTA) of 2003. These rules specify that a financial institution or
creditor that offers or maintains covered accounts must develop and implement an identity
theft prevention program. The program must be designed to detect, prevent, and mitigate identity
theft in connection with the opening of a covered account or any existing covered account.
The Username/Password Combinations detection rule detects the presence of both a user
name and password from a profiled database index.

Table 46-20 Username/Password Combinations detection rule

Method Condition Configuration

Method: Simple rule
Condition: Content Matches Exact Data (EDM)
Configuration: This condition detects exact data containing both of the following data items:
■ User name
■ Password

See “Choosing an Exact Data Profile” on page 409.



The Exact SSN or CCN detection rule detects the presence of either a social security number
or a credit card number from a profiled database.

Table 46-21 Exact SSN or CCN detection rule

Method Condition Configuration

Method: Simple rule
Condition: Content Matches Exact Data (EDM)
Configuration: This condition detects exact data containing either of the following data
columns:
■ Social security number (Taxpayer ID)
■ Bank Card Number

See “Choosing an Exact Data Profile” on page 409.

The Customer Directory detection rule detects the presence of either an email address or a
phone number from a profiled database.

Table 46-22 Customer Directory detection rule

Method Condition Configuration

Method: Simple rule
Condition: Content Matches Exact Data (EDM)
Configuration: This condition detects exact data containing either of the following data
columns:
■ Email address
■ Phone number

See “Choosing an Exact Data Profile” on page 409.

The Three or More Data Columns detection rule detects exact data containing three or more
data items from a profiled database index.

Table 46-23 Three or More Data Columns detection rule

Method Condition Configuration

Simple rule Content Matches Detects exact data containing three or more of the following data items:
Exact Data (EDM)
■ ABA Routing Number
■ Account Number
■ Bank Card Number
■ Birth Date
■ Email address
■ First Name
■ Last Name
■ National Insurance Number
■ Password
■ Phone Number
■ Social Insurance Number
■ Social security number (Taxpayer ID)
■ User name

However, the following combinations are not a match:

■ Phone Number, Email, First Name
■ Phone Number, First Name, Last Name

See “Choosing an Exact Data Profile” on page 409.
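
The matching logic of the Three or More Data Columns rule can be sketched roughly as follows. This is an illustration only, not the product's implementation: the function name is hypothetical, and the sketch assumes that an excluded combination suppresses a match only when it is the entire set of matched columns.

```python
from typing import Iterable

# Combinations that the rule description lists as non-matches.
EXCLUDED_COMBINATIONS = [
    frozenset({"Phone Number", "Email", "First Name"}),
    frozenset({"Phone Number", "First Name", "Last Name"}),
]

def is_three_column_match(matched_columns: Iterable[str]) -> bool:
    """Return True when three or more distinct columns match and the
    matched set is not one of the excluded combinations (assumption:
    exclusions apply only to an exact three-column set)."""
    matched = frozenset(matched_columns)
    if len(matched) < 3:
        return False
    return matched not in EXCLUDED_COMBINATIONS
```

For example, a record that matches First Name, Last Name, and Password would count as a match under this sketch, while Phone Number, First Name, and Last Name alone would not.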

The US Social Security Number Patterns detection rule implements the narrow breadth
edition of the Randomized US Social Security Number (SSN) system data identifier.
See “Randomized US Social Security Number (SSN)” on page 1414.
This data identifier detects nine-digit numbers with the pattern DDD-DD-DDDD separated with
dashes or spaces or without separators. The number must be in valid assigned number ranges.
This condition eliminates common test numbers, such as 123456789 or all the same digit. It
also requires the presence of a Social Security keyword.

Table 46-24 US Social Security Number Patterns detection rule

Method Condition Configuration

Simple rule Content Matches ■ Data Identifier: Randomized US Social Security Number (SSN) narrow
Data Identifier (DCM) breadth
■ Severity: High.
■ Count all matches.
■ Look in envelope, subject, body, attachments.
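
The checks described for this data identifier can be approximated with the sketch below. It is not the product's validator. The keyword list and function name are hypothetical, the sketch assumes the same separator is used in both positions, and the range check uses the commonly published Social Security Administration rules (area not 000, 666, or 900-999; group not 00; serial not 0000) as an assumption about what "valid assigned number ranges" means.

```python
import re
from typing import List

# DDD-DD-DDDD with dashes, spaces, or no separators.
SSN_PATTERN = re.compile(r"\b(\d{3})([- ]?)(\d{2})\2(\d{4})\b")

# Hypothetical keyword list; the product's own list is not reproduced here.
SSN_KEYWORDS = ("social security", "ssn")

def find_ssn_candidates(text: str) -> List[str]:
    """Return number strings that pass the pattern, range, test-number,
    and keyword checks described above (approximation only)."""
    lowered = text.lower()
    if not any(keyword in lowered for keyword in SSN_KEYWORDS):
        return []  # narrow breadth requires a keyword
    results = []
    for match in SSN_PATTERN.finditer(text):
        area, _, group, serial = match.groups()
        digits = area + group + serial
        # Eliminate common test numbers.
        if digits == "123456789" or len(set(digits)) == 1:
            continue
        # Assumed range rules (see lead-in).
        if area in ("000", "666") or area[0] == "9":
            continue
        if group == "00" or serial == "0000":
            continue
        results.append(match.group(0))
    return results
```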

The Credit Card Numbers, All detection rule implements the narrow breadth edition of the
Credit Card Number system Data Identifier.
See “Credit Card Number” on page 1095.
This data identifier detects valid credit card numbers that are separated by spaces, dashes,
periods, or without separators. This condition performs Luhn check validation and includes
formats for American Express, Diner's Club, Discover, Japan Credit Bureau (JCB), MasterCard,
and Visa. It eliminates common test numbers, including those reserved for testing by credit
card issuers. It also requires the presence of a credit card keyword.

Table 46-25 Credit Card Numbers, All detection rule

Method Condition Configuration

Simple rule Content Matches ■ Data Identifier: Credit Card Number narrow breadth
Data Identifier (DCM) See “Credit Card Number narrow breadth” on page 1100.
■ Severity: High.
■ Count all matches.
■ Look in envelope, subject, body, attachments.
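
The Luhn check mentioned above is a public checksum algorithm. The following sketch shows only that step, under the assumption that separators are limited to spaces, dashes, and periods; the function name is illustrative, and the keyword, card-format, and test-number checks are omitted.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    # Strip the separators named in the rule description.
    cleaned = number.replace(" ", "").replace("-", "").replace(".", "")
    if not cleaned or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

Note that a number can pass the Luhn check and still be rejected by the data identifier, for example because it is a reserved test number or no credit card keyword is present.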

The ABA Routing Numbers detection rule implements the narrow breadth edition of the ABA
Routing Number system Data Identifier.
See “ABA Routing Number” on page 1013.
This data identifier detects nine-digit numbers. It validates the number using the final check
digit. This condition eliminates common test numbers, such as 123456789, number ranges
that are reserved for future use, and all the same digit. This condition also requires the presence
of an ABA keyword.

Table 46-26 ABA Routing Numbers detection rule

Method Condition Configuration

Simple rule Content Matches ■ Data Identifier: ABA Routing Number narrow breadth
Data Identifier (DCM) See “ABA Routing Number” on page 1013.
■ Severity: High.
■ Count all matches.
■ Look in envelope, subject, body, attachments.
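
The check-digit validation for ABA routing numbers uses the publicly documented 3-7-1 weighting. The sketch below illustrates that checksum together with the test-number elimination described above. The function name is illustrative, and the reserved-range and keyword checks are omitted because their details are not given here.

```python
def aba_checksum_valid(number: str) -> bool:
    """Return True if a nine-digit string passes the ABA 3-7-1 checksum
    and is not an obvious test number."""
    if len(number) != 9 or not (number.isascii() and number.isdigit()):
        return False
    # Eliminate common test numbers.
    if number == "123456789" or len(set(number)) == 1:
        return False
    d = [int(char) for char in number]
    checksum = (
        3 * (d[0] + d[3] + d[6])
        + 7 * (d[1] + d[4] + d[7])
        + (d[2] + d[5] + d[8])
    )
    return checksum % 10 == 0
```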

See “Creating a policy from a template” on page 397.


See “Exporting policy detection as a template” on page 442.

Financial Information policy template


The Financial Information policy detects financial data and information.

IDM Rule Financial Information, Indexed

This rule looks for content from specific financial information files registered as
proprietary. It returns a match if 80% or more of the source document is found.

DCM Rule Financial Information

This rule looks for the combination of specified file types, keywords from the
"Financial Keywords" dictionary, and keywords from the "Confidential/Proprietary
Words" dictionary.
The specified file types are as follows:

■ excel_macro
■ xls
■ works_spread
■ sylk
■ quattro_pro
■ mod
■ csv
■ applix_spread
■ 123
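
As a rough illustration of how the DCM rule combines its three conditions, consider the sketch below. It assumes that "combination" means all three conditions must hold. The function name and the sample keywords are hypothetical, since the contents of the two dictionaries are not listed here.

```python
from typing import Iterable

# File types listed for the Financial Information DCM rule.
SPREADSHEET_TYPES = {
    "excel_macro", "xls", "works_spread", "sylk", "quattro_pro",
    "mod", "csv", "applix_spread", "123",
}

def matches_financial_rule(
    file_type: str,
    text: str,
    financial_keywords: Iterable[str],
    confidential_keywords: Iterable[str],
) -> bool:
    """Return True only when the file type is listed and the text holds
    at least one keyword from each dictionary (assumed AND logic)."""
    if file_type not in SPREADSHEET_TYPES:
        return False
    lowered = text.lower()
    has_financial = any(k.lower() in lowered for k in financial_keywords)
    has_confidential = any(k.lower() in lowered for k in confidential_keywords)
    return has_financial and has_confidential
```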

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Forbidden Websites policy template


The Forbidden Websites policy template is designed to detect access to specified web sites.

Note: To process HTTP GET requests appropriately, you may need to configure the Network
Prevent for Web server. See “To enable a Forbidden Website policy to process GET requests
appropriately” on page 1582.

Table 46-27 Forbidden Websites policy template

DCM Keyword Rule Description

Forbidden Websites This rule looks for any keywords in the "Forbidden
Websites" dictionary, which is user-defined.

To enable a Forbidden Website policy to process GET requests appropriately


1 Configure your web proxy server to forward GET requests to the Network Prevent for Web
server.
2 Set the L7.processGets Advanced Server Setting on the Network Prevent for Web server
to "true" (which is the default).
3 Reduce the L7.minSizeofGetURL Advanced Server Setting on the Network Prevent for
Web server from the default of 100 to a number of bytes (characters) smaller than the
length of the shortest web site URL that the policy specifies.

Note: Reducing the minimum size of GETs increases the number of URLs that have to
be processed, which increases server traffic load. One approach is to calculate the number
of characters in the shortest URL specified in the list of forbidden URLs and set the
minimum size to that number. Another approach is to set the minimum URL size to 10 as
that should cover all cases.

4 You may need to adjust the "Ignore Requests Smaller Than" setting in the ICAP
configuration of the Network Prevent server from the default 4096 bytes. This value stops
processing of incoming web pages that contain fewer bytes than the number specified. If
a page of a forbidden web site URL might be smaller than that number, the setting should
be reduced appropriately.
See “Configuring policies” on page 413.
See “Exporting policy detection as a template” on page 442.
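
When choosing a value for L7.minSizeofGetURL, it can help to calculate the length of the shortest entry in your list of forbidden URLs. The following sketch does only that calculation. The function name and sample entries are hypothetical, and the script does not read or change any server setting.

```python
from typing import Iterable

def shortest_entry_length(entries: Iterable[str]) -> int:
    """Return the length in bytes of the shortest non-empty entry."""
    lengths = [
        len(entry.strip().encode("utf-8"))
        for entry in entries
        if entry.strip()
    ]
    if not lengths:
        raise ValueError("the list of forbidden URLs is empty")
    return min(lengths)

# Hypothetical entries from a "Forbidden Websites" dictionary.
forbidden = ["casino.example", "bets.example.com"]
shortest = shortest_entry_length(forbidden)
```

You would then set L7.minSizeofGetURL at or below the returned length, keeping in mind that smaller values increase the number of URLs that the server has to process.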

Gambling policy template


This policy detects any reference to gambling.

Table 46-28 Gambling policy template

DCM Keyword Rule Description

Suspicious Gambling Keywords This rule looks for five instances of keywords from the "Gambling
Keywords, Confirmed" dictionary.

Less Suspicious Gambling Keywords This rule looks for ten instances of keywords from the "Gambling
Keywords, Suspect" dictionary.
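
The two thresholds in this table can be illustrated with the sketch below. It is an approximation only. It assumes that "five instances" and "ten instances" mean at least that many case-insensitive whole-word occurrences in total, and the function names and dictionary contents are hypothetical.

```python
import re
from typing import Iterable, List

def count_keyword_instances(text: str, keywords: Iterable[str]) -> int:
    """Count case-insensitive whole-word occurrences of all keywords."""
    total = 0
    for keyword in keywords:
        pattern = r"\b" + re.escape(keyword) + r"\b"
        total += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return total

def triggered_gambling_rules(
    text: str,
    confirmed_keywords: Iterable[str],
    suspect_keywords: Iterable[str],
) -> List[str]:
    """Return the names of the rules whose thresholds are met."""
    triggered = []
    if count_keyword_instances(text, confirmed_keywords) >= 5:
        triggered.append("Suspicious Gambling Keywords")
    if count_keyword_instances(text, suspect_keywords) >= 10:
        triggered.append("Less Suspicious Gambling Keywords")
    return triggered
```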

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

General Data Protection Regulation (Banking and Finance)

This template focuses on General Data Protection Regulation (GDPR) banking and finance
related keywords, Data Identifiers and an EDM profile with related columns.
The GDPR is a regulation by which the European Commission intends to strengthen and unify
data protection for individuals within the EU. It also addresses export of personal data outside
the EU. The primary objectives of the GDPR are to give citizens back the control of their
personal data and to simplify the regulatory environment for international business by unifying
the regulation within the EU. The GDPR replaces the EU Data Protection Directives as of 25
May 2018.

Table 46-29 General Data Protection Regulations (Banking and Finance) detection rules

Name Type Description

GDPR Banking and Finance Related Keyword Match Matches a list of related keywords:
Keywords
account number, bank card number,
driver license number, ID card
number, Kontonummer,
Bankkartennummer,
Führerscheinnummer,
Ausweisnummer, Numéro de
compte, numéro carte bancaire,
numéro de permis de conduire,
numéro de carte d'identité, numero
di conto, banca carta numero, carta
d'identità numero, patente guida
numero, Número cuenta, número
tarjeta bancaria, número licencia
conducir, número tarjeta de
identificación, rekeningnummer,
bank kaart aantal, rijbewijs nummer,
ID-kaartnummer, bankkortnummer,
körkort nummer,
identitetskortnummer,
førerkortnummer, ID-kortnummer,
tilinumero, pankkikortin numero,
ajokortin numero, Henkilökortin
numero, uimhir chuntais, uimhir
chárta bainc, uimhir ceadúnas
tiomána, Uimhir chárta aitheantais,
Kontosnummer,
Identifikatiounskaart, número de
conta, número cartão bancário,
número licença motorista, Número
do cartão de identificação

Credit Card Number Data Identifiers Account number needed to process
credit card transactions. Often
abbreviated as CCN. Also known as
a Primary Account Number (PAN).

See “Credit Card Number” on page 1095.



UK Driver's Licence Number Data Identifiers The UK Drivers Licence Number is the
identification number for an individual's
driver's license issued by the Driver
and Vehicle Licensing Agency of the
United Kingdom.

See “UK Drivers Licence Number”


on page 1525.

UK Passport Number Data Identifiers The UK Passport Number identifies a


United Kingdom passport using the
current official specification of the UK
Government Standards of the UK
Cabinet Office.

See “UK Passport Number”


on page 1532.

UK Tax ID Number Data Identifiers The UK Tax ID Number is a personal


identification number provided by the
UK Government Standards of the UK
Cabinet Office.

See “UK Tax ID Number” on page 1534.

Credit Card Magnetic Stripe Data Data Identifiers The magnetic stripe of a credit card
contains information about the card.
Storage of the complete version of this
data is a violation of the Payment Card
Industry (PCI) Data Security Standard.

See “Credit Card Magnetic Stripe


Data” on page 1092.

French Passport Number Data Identifiers The French passport is an identity
document issued to French citizens.
Besides enabling the bearer to travel
internationally and serving as an
indication of French citizenship, the
passport facilitates the process of
securing assistance from French
consular officials abroad, or from other
European Union member states if no
French consulate is present.

See “French Passport Number” on page 1187.

Belgian National Number Data Identifiers All citizens of Belgium have a National
Number. Belgians 12 years of age and
older are issued a Belgian identity
card.

See “Belgian National Number”


on page 1039.

Czech Personal Identification Data Identifiers All citizens of the Czech Republic are
Number issued a unique personal identification
number by the Ministry of Interior.

See “Czech Republic Personal


Identification Number” on page 1114.

French INSEE code Data Identifiers The INSEE code in France is used as
a social insurance number, a national
identification number, and for taxation
and employment purposes.

See “French INSEE Code”


on page 1185.

French Social Security Number Data Identifiers The French Social Security Number
(FSSN) is a unique number assigned
to each French citizen or resident
foreign national. It serves as a national
identification number.

See “French Social Security Number”


on page 1188.

Greek Tax Identification Number Data Identifiers The Arithmos Forologikou Mitroou (AFM)
is a unique personal tax identification
number assigned to any individual
resident in Greece or person who
owns property in Greece.

See “Greek Tax Identification Number”


on page 1204.

Hungarian Social Security Number Data Identifiers The Hungarian Social Security
Number (TAJ) is a unique identifier
issued by the Hungarian government.

See “Hungarian Social Security


Number” on page 1221.

Hungarian Tax Identification Data Identifiers The Hungarian Tax Identification
Number Number is a 10-digit number that
always begins with the digit "8."

See “Hungarian Tax Identification Number” on page 1223.

Hungarian VAT Number Data Identifiers All Hungarian businesses (including


non-profit organizations) upon
registration at the court of Registry are
granted a value-added tax (VAT)
number.

See “Hungarian VAT Number”


on page 1225.

Irish Personal Public Service Data Identifiers The number is a unique
Number 8-character alphanumeric string
ending with a letter, such as
8765432A. It is assigned when a
child's birth is registered, is issued
on a Public Services Card, and is
unique to every person.

See “Irish Personal Public Service Number” on page 1274.

Luxembourg National Register of Data Identifiers The Luxembourg National Register of


Individuals Number Individuals Number is an 11-digit
identification number issued to all
Luxembourg citizens at age 15.

See “Luxembourg National Register


of Individuals Number” on page 1320.

Polish Identification Number Data Identifiers Every Polish citizen 18 years of age
or older residing permanently in
Poland must have an Identity Card,
with a unique personal number. The
number is used as identification for
almost all purposes.

See “Polish Identification Number”


on page 1394.

Polish REGON Number Data Identifiers Each national economy entity is


obligated to register in the register of
business entities called REGON in
Poland. It is the only integrated
register in Poland covering all of the
national economy entities. Each
company has a unique REGON
number.

See “Polish REGON Number”


on page 1396.

Polish Social Security Number Data Identifiers The Polish Social Security Number
(PESEL) (PESEL) is the national identification
number used in Poland. The PESEL
number is mandatory for all permanent
residents of Poland and for temporary
residents living in Poland. It uniquely
identifies a person and cannot be
transferred to another.

See “Polish Social Security Number


(PESEL)” on page 1398.

Polish Tax Identification Number Data Identifiers The Polish Tax Identification Number
(NIP) is a number the government
gives to every Polish citizen who
works or does business in Poland. All
taxpayers have a tax identification
number called NIP.

See “Polish Tax Identification Number”


on page 1400.

Romanian Numerical Personal Code Data Identifiers In Romania, each citizen has a unique
numerical personal code (Cod
Numeric Personal, or CNP). The
number is used by authorities, health
care, schools, universities, banks, and
insurance companies for customer
identification.

See “Romanian Numerical Personal


Code” on page 1425.

Spanish DNI ID Data Identifiers The Spanish DNI ID appears on the


Documento nacional de identidad
(DNI) and is issued by the Spanish
Hacienda Publica to every citizen of
Spain. It is the most important unique
identifier in Spain used for opening
accounts, signing contracts, taxes, and
elections.

See “Spanish DNI ID” on page 1481.

Spanish Social Security Number Data Identifiers The Spanish Social Security Number
is a 12-digit number assigned to
Spanish workers to allow access to
the Spanish healthcare system.

See “Spanish Social Security Number


” on page 1485.

Spanish Customer Account Number Data Identifiers The Spanish customer account
number is the standard customer bank
account number used across Spain.

See “Spanish Customer Account


Number” on page 1479.

Spanish Tax ID (CIF) Data Identifiers The Spanish corporate tax identifier
(CIF) is equivalent to the VAT number
and is required for running a business in
Spain. This identifier is a company's
identification for tax purposes and is
required for any legal transactions.

See “Spanish Tax Identification (CIF)” on page 1487.

German Passport Number Data Identifiers The German passport number is


issued to German nationals for the
purpose of international travel. A
German passport is an officially
recognized document that German
authorities accept as proof of identity
from German citizens.

See “German Passport Number”


on page 1190.

Bulgarian Uniform Civil Number Data Identifiers The uniform civil number (EGN) is a
unique number assigned to each
Bulgarian citizen or resident foreign
national. It serves as a national
identification number. An EGN is
assigned to Bulgarians at birth, or
when a birth certificate is issued.

See “Bulgarian Uniform Civil Number


- EGN” on page 1063.

Austrian Social Security Number Data Identifiers A social security number is allocated
to Austrian citizens who receive
available social security benefits. It is
allocated by the umbrella association
of the Austrian social security
authorities.

See “Austrian Social Security Number”


on page 1036.

Spanish Passport Number Data Identifiers Spanish passports are issued to


Spanish citizens for the purpose of
travel outside Spain.

See “Spanish Passport Number”


on page 1483.

Swedish Passport Number Data Identifiers Swedish passports are issued to
nationals of Sweden for the purpose
of international travel. Besides serving
as proof of Swedish citizenship, they
facilitate the process of securing
assistance from Swedish consular
officials abroad, or from other European
Union member states if no Swedish
consulate is present.

See “Swedish Passport Number” on page 1499.

German Personal ID Number Data Identifiers The German Personal ID Number is


issued to all German citizens.

See “German Personal ID Number”


on page 1192.

IBAN Central Data Identifiers The International Bank Account


Number (IBAN) is an international
standard for identifying bank accounts
across national borders.

The IBAN Central data identifier


detects IBAN numbers for Andorra,
Austria, Belgium, Germany, Italy,
Liechtenstein, Luxembourg, Malta,
Monaco, San Marino, and Switzerland.

See “IBAN Central” on page 1227.



IBAN East Data Identifiers The International Bank Account


Number (IBAN) is an international
standard for identifying bank accounts
across national borders.

The IBAN East data identifier detects


IBAN numbers for Bosnia, Bulgaria,
Croatia, Cyprus, Czech Republic,
Estonia, Greece, Hungary, Israel,
Latvia, Lithuania, Macedonia,
Montenegro, Poland, Romania, Serbia,
Slovakia, Slovenia, Turkey, and
Tunisia.

See “IBAN East” on page 1231.

IBAN West Data Identifiers The International Bank Account


Number (IBAN) is an international
standard for identifying bank accounts
across national borders.

The IBAN West data identifier detects


IBAN numbers for Denmark, Faroe
Islands, Finland, France, Gibraltar,
Greenland, Iceland, Ireland,
Netherlands, Norway, Portugal, Spain,
Sweden, and the United Kingdom.

See “IBAN West” on page 1237.

Burgerservicenummer Data Identifiers In the Netherlands, the


Burgerservicenummer is used to
uniquely identify citizens and is printed
on driving licenses, passports and
international ID cards under the
header Personal Number.

See “Burgerservicenummer”
on page 1066.

Codice Fiscale Data Identifiers The Codice Fiscale uniquely identifies
an Italian citizen or permanent resident
alien. Issuance of the code is
centralized at the Ministry of the Treasury.
The Codice Fiscale is issued to every
Italian at birth.

See “Codice Fiscale” on page 1081.

Finnish Personal Identification Data Identifiers The Finnish Personal Identification


Number Number or Personal Identity Code is
a unique personal identifier used for
identifying citizens in government and
many other transactions.

See “Finnish Personal Identification


Number” on page 1175.

Swedish Personal Identification Data Identifiers The Swedish Personal Identification
Number Number is the unique national
identification for every Swedish citizen.
The number is used by authorities,
health care, schools, universities,
banks, and insurance companies for
customer identification.

See “Sweden Personal Identification Number” on page 1501.

Austria Passport Number Data Identifiers Austrian passports are travel


documents issued to Austrian citizens
by the Austrian Passport Office of the
Department of Foreign Affairs and
Trade, both in Austria and overseas,
and enable the passport holder to
travel internationally.

See “Austria Passport Number”


on page 1030.

Austria Tax Identification Number Data Identifiers Austria issues tax identification
numbers to individuals based on their
area of residence to identify taxpayers
and facilitate national taxes.

See “Austria Tax Identification


Number” on page 1031.

Belgium Passport Number Data Identifiers Belgian passports are passports


issued by the Belgian state to its
citizens to facilitate international travel.
The Federal Public Service Foreign
Affairs, formerly known as the Ministry
of Foreign Affairs, is responsible for
issuing and renewing Belgian
passports.

See “Belgium Passport Number”


on page 1044.

Belgium Tax Identification Number Data Identifiers Belgium issues a tax identification
number for persons who have
obligations to declare taxes in
Belgium.

See “Belgium Tax Identification


Number” on page 1045.

Belgium Value Added Tax (VAT) Data Identifiers VAT is a consumption tax that is borne
Number by the end consumer. VAT is paid for
each transaction in the manufacturing
and distribution process. For Belgium,
the VAT number is issued by the VAT
office for the region in which the
business is established.

See “Belgium Value Added Tax (VAT)


Number” on page 1047.

Belgium Driver's License Number Data Identifiers Identification number for an individual's
driver's licence issued by the Driver
and Vehicle Licensing Agency of
Belgium.

See “Belgium Driver's Licence


Number” on page 1042.

Denmark Personal Identification Data Identifiers In Denmark, every citizen has a


Number national identification number. The
number serves as proof of
identification for almost all purposes.

See “Denmark Personal Identification


Number” on page 1126.

Netherlands Driver's License Data Identifiers Identification number for an individual's


Number driver's licence issued by the RDW
government agency of the
Netherlands.

See “Netherlands Driver's License


Number” on page 1362.

Netherlands Passport Number Data Identifiers Dutch passports are issued to


Netherlands citizens for the purpose
of international travel.

See “Netherlands Passport Number”


on page 1363.

Netherlands Value Added Tax (VAT) Data Identifiers VAT is a consumption tax that is borne
Number by the end consumer. VAT is paid for
each transaction in the manufacturing
and distribution process. For the
Netherlands, the VAT number is
issued by the VAT office for the region in
which the business is established.

See “Netherlands Value Added Tax


(VAT) Number” on page 1367.

France Driver's License Number Data Identifiers Identification number for an individual's
driver's licence issued by the Driver
and Vehicle Licensing Agency of
France.

See “France Driver's License Number”


on page 1177.

France Tax Identification Number Data Identifiers France issues a tax identification
number for anyone who has
obligations to declare taxes in France.

See “France Tax Identification


Number” on page 1181.

Germany Driver's License Number Data Identifiers Identification number for an individual's
driver's licence issued by the Driver
and Vehicle Licensing Agency of
Germany.

See “Germany Driver's License


Number” on page 1194.

Italy Passport Number Data Identifiers Italian passports are issued to Italian
citizens for the purpose of international
travel.

See “Italy Passport Number”


on page 1282.

Italy Value Added Tax (VAT) Data Identifiers VAT is a consumption tax that is borne
Number by the end consumer. VAT is paid for
each transaction in the manufacturing
and distribution process. For Italy, the
VAT number is issued by the VAT
office for the region in which the
business is established.

See “Italy Value Added Tax (VAT)


Number” on page 1283.

Italy Driver's License Number Data Identifiers Identification number for an individual's
driver's licence issued by the Driver
and Vehicle Licensing Agency of Italy.

See “Italy Driver's Licence Number”


on page 1278.

Netherlands Tax Identification Data Identifiers The Netherlands issues a tax


Number identification number at birth or at
registration at the municipality.

See “Netherlands Tax Identification


Number” on page 1364.

Spain Driver's License Number Data Identifiers Identification number for an individual's
driver's licence issued by the Driver
and Vehicle Licensing Agency of
Spain.

See “Spain Driver's Licence Number”


on page 1477.

Ukraine Identity Card Data Identifiers The Ukraine Identity Card has a
15-digit record number issued to
citizens of Ukraine. It is used as a form
of identification in place of Ukraine's
domestic passport as of January 2016.

See “Ukraine Identity Card”


on page 1539.

Ukraine Domestic Passport Number Data Identifiers An identity document issued to citizens
of Ukraine for domestic use. It has
been replaced by the Ukraine Identity
Card as of 2016, but any existing
passports are still valid.

See “Ukraine Passport (Domestic)”


on page 1541.

Ukraine International Passport Data Identifiers A document used by citizens of


Number Ukraine to travel outside of Ukraine.

See “Ukraine Passport (International)”


on page 1543.

Germany Value Added Tax (VAT) Data Identifiers VAT is a consumption tax that is borne
Number by the end consumer. VAT is paid for
each transaction in the manufacturing
and distribution process. For Germany,
the VAT number is issued by the VAT
office for the region in which the
business is established.

See “Germany Driver's License


Number” on page 1194.

France Value Added Tax (VAT) Data Identifiers The Value Added Tax (VAT) is a tax
Number levied on goods and services provided
in France and is collected from the
final customer. Companies must
register with the Register of
Commerce and Companies in France
to be allocated a VAT number.

See “France Value Added Tax (VAT)


Number” on page 1182.

Austria Value Added Tax (VAT) Data Identifiers VAT is a consumption tax that is borne
Number by the end consumer. VAT is paid for
each transaction in the manufacturing
and distribution process. For Austria,
the VAT number is issued by the tax
office for the region in which the
business is established.

See “Austria Value Added Tax (VAT)


Number” on page 1033.

Sweden Tax Identification Number Data Identifiers Sweden uses tax identification
numbers (TINs) to identify taxpayers
and facilitate the administration of their
national tax affairs. TINs are also
useful for identifying taxpayers who
invest in other EU countries and are
more reliable than other identifiers
such as name and address.

See “Sweden Tax Identification


Number” on page 1494.

Sweden Value Added Tax (VAT) Data Identifiers VAT is a consumption tax that is borne
Number by the end consumer. VAT is paid for
each transaction in the manufacturing
and distribution process.

See “Sweden Value Added Tax (VAT)


Number” on page 1496.

Denmark Value Added Tax (VAT) Data Identifiers VAT is a consumption tax that is borne
Number by the end consumer. VAT is paid for
each transaction in the manufacturing
and distribution process. For Denmark,
the VAT number is issued by the tax
office for the region in which the
business is established.

See “Denmark Value Added Tax (VAT)


Number” on page 1130.

Finland Passport Number Data Identifiers Finnish passports are issued to


nationals of Finland for the purpose of
international travel. They also facilitate
the process of securing assistance
from Finnish consular officials abroad.

See “Finland Passport Number”


on page 1169.

Finland Driver's Licence Number Data Identifiers Identification number for an individual's
driver's license issued in an EU or EEA
Member State for a Finnish license.

See “Finland Driver's Licence Number”


on page 1165.

Finland Value Added Tax (VAT) Data Identifiers VAT is a consumption tax that is borne
Number by the end consumer. VAT is paid for
each transaction in the manufacturing
and distribution process.

See “Finland Value Added Tax (VAT)


Number” on page 1173.

Ireland Passport Number Data Identifiers An Irish passport is the passport
issued to citizens of Ireland. An Irish
passport enables the bearer to travel
internationally and serves as evidence
of Irish citizenship and citizenship of
the European Union. It also facilitates
access to consular assistance from
both Irish embassies and any embassy
of other European Union member
states while abroad.

See “Ireland Passport Number”


on page 1266.

Ireland Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is borne
by the end consumer. VAT is paid for each transaction in the manufacturing and distribution
process. For Ireland, the VAT number is issued by the Irish tax authority.

See “Ireland Value Added Tax (VAT) Number” on page 1271.

Ireland Tax Identification Number Data Identifiers This number is issued by the Department of
Social Protection for natural persons and by the Revenue Commissioners for non-natural
persons. Non-natural persons can be companies, partnerships, trusts, and unincorporated
bodies.

See “Ireland Tax Identification Number” on page 1268.

Luxembourg Passport Number Data Identifiers A Luxembourg passport is an international travel
document issued to nationals of the Grand Duchy of Luxembourg, and may also serve as proof of
Luxembourgish citizenship.

See “Luxembourg Passport Number” on page 1322.

Luxembourg Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT is paid for each transaction in the manufacturing and
distribution process.

See “Luxembourg Value Added Tax (VAT) Number” on page 1327.

Portugal National Identification Number Data Identifiers The national identification number
is a unique identification number usually present on documents like citizen cards which are
issued by the Portuguese government to its citizens. It can be used as a travel document
within the EU and some other European countries.

See “Portugal National Identification Number” on page 1404.

Portugal Passport Number Data Identifiers Portuguese passports are issued to citizens of
Portugal for the purpose of international travel. The passport, along with the national
identity card, allows for free rights of movement and residence in any of the states of the
European Union and European Economic Area.

See “Portugal Passport Number” on page 1407.

Portugal Tax Identification Number Data Identifiers A fiscal number is a tax identification
number that is issued in Portugal to anyone who wishes to undertake any official matters in
Portugal.

See “Portugal Tax Identification Number” on page 1408.

Portugal Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is borne
by the end consumer. VAT is paid for each transaction in the manufacturing and distribution
process.

See “Portugal Value Added Tax (VAT) Number” on page 1411.

Portugal Driver's Licence Number Data Identifiers The Institute for Mobility and Land
Transport (IMTT) issues driver's licenses in Portugal.

See “Portugal Driver's Licence Number” on page 1402.

Denmark Tax Identification Number Data Identifiers Denmark issues a tax identification number
for persons who have obligations to declare taxes in Denmark. The tax identification number
also serves as a personal health insurance number.

See “Denmark Tax Identification Number” on page 1128.

Finland Tax Identification Number Data Identifiers Finland issues a tax identification number
for persons who have obligations to declare taxes in Finland.

See “Finland Tax Identification Number” on page 1171.

Luxembourg Tax Identification Number Data Identifiers This number is issued by the Luxembourg
inland revenue department (Administration des contributions directes, ACD) and is used for
tax-related purposes for natural and non-natural persons.

See “Luxembourg Tax Identification Number” on page 1324.

Germany Tax Identification Number Data Identifiers Germany issues a tax identification number
for persons who have obligations to declare taxes in Germany.

See “Germany Tax Identification Number” on page 1198.

UK Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is borne by
the end consumer. VAT is paid for each transaction in the manufacturing and distribution
process. For the United Kingdom, the VAT number is issued by the VAT office for the region in
which the business is established.

See “UK Value Added Tax (VAT) Number” on page 1536.

Spain Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is borne by
the end consumer. VAT is paid for each transaction in the manufacturing and distribution
process. VAT in Spain is overseen by the State Tax Administration Agency.

See “Spain Value Added Tax (VAT) Number” on page 1474.

UK Bank Account Number Sort Code Data Identifiers Sort codes are bank codes used to route
money transfers between banks within their respective countries via their respective
clearance organizations.

See “UK Bank Account Number Sort Code” on page 1523.

Greece Social Security Number (AMKA) Data Identifiers The AMKA (social security number) is
the work and insurance identification number of every worker, retired person and protected
family member in Greece.

See “Greece Social Security Number (AMKA)” on page 1202.

Romania National Identification Number Data Identifiers In Romania, each citizen has a
personal numerical code (Cod Numeric Personal, CNP) as a unique national identification
number. This number is also used as a tax identification number for financial purposes.

See “Romania National Identification Number” on page 1419.

Slovakia National Identification Number Data Identifiers In Slovakia, identification cards
are issued by the state authorities at 15 years of age for every citizen. This number is used
in the Slovak Republic as the primary unique identifier for every person by government
institutions, banks and so on.

See “Slovakia National Identification Number” on page 1453.

Slovenia Unique Master Citizen Number Data Identifiers The unique master citizen number is a
unique identification number assigned to every citizen of Slovenia at birth or on acquiring
citizenship.

See “Slovenia Unique Master Citizen Number” on page 1465.

Latvia Personal Identification Number Data Identifiers The Latvian personal identification
number is used for national identity and as a tax identification number for financial
purposes. It is issued by the Office of Citizenship and Migration Affairs of the Ministry of
the Interior.

See “Latvia Personal Identification Number” on page 1306.

Sweden Driver's Licence Number Data Identifiers In Sweden, a driving license is required when
operating a car, motorcycle or moped on public roads. Driving licenses are issued by the
Swedish Transport Agency (Transportstyrelsen).

See “Sweden Driver's Licence Number” on page 1492.

Greece Passport Number Data Identifiers Greek passports are issued to Greek citizens for the
purpose of international travel. The passport, along with the national identity card, allows
for free rights of movement and residence in any of the states of the European Union and
European Economic Area.

See “Greece Passport Number” on page 1200.

Greece Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a consumption
tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing
and distribution process. For Greece, VAT is administered by the VAT office for the region in
which the business is established.

See “Greece Value Added Tax (VAT) Number” on page 1206.

Poland Passport Number Data Identifiers A Polish passport is an international travel document
issued to nationals of Poland. It may also serve as proof of Polish citizenship.

See “Poland Passport Number” on page 1389.

Poland Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a consumption
tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing
and distribution process. For Poland, VAT is administered by the VAT office for the region in
which the business is established.

See “Poland Value Added Tax (VAT) Number” on page 1391.

Romania Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a consumption
tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing
and distribution process. In Romania, it is also called TVA or CIF.

See “Romania Value Added Tax (VAT) Number” on page 1420.

Hungary Passport Number Data Identifiers Hungarian passports are issued to Hungarian citizens
for international travel by the Central Data Processing, Registration, and Election Office of
the Hungarian Ministry of the Interior.

See “Hungary Passport Number” on page 1219.

Czech Republic Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a
consumption tax that is borne by the end consumer. VAT is paid for each transaction in the
manufacturing and distribution process. In the Czech Republic, it is also called DPH.

See “Czech Republic Value Added Tax (VAT) Number” on page 1121.

Slovakia Passport Number Data Identifiers Slovak passports are issued to citizens of Slovakia
to facilitate international travel.

See “Slovakia Passport Number” on page 1457.

Slovakia Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a consumption
tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing
and distribution process. For Slovakia, VAT is administered by the tax office for the region
in which the business is established.

See “Slovakia Value Added Tax (VAT) Number” on page 1459.

Slovenia Passport Number Data Identifiers Slovenian passports are issued to citizens of
Slovenia to facilitate international travel.

See “Slovenia Passport Number” on page 1461.

Slovenia Tax Identification Number Data Identifiers The Slovenia Tax Identification Number is
a unique identifier of individuals and legal entities for tax purposes. The Financial
Administration of the Republic of Slovenia issues and administers tax identification numbers
in Slovenia.

See “Slovenia Tax Identification Number” on page 1463.

Slovenia Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a consumption
tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing
and distribution process. For Slovenia, VAT is administered by the tax office for the region
in which the business is established.

See “Slovenia Value Added Tax (VAT) Number” on page 1467.

Croatia National Identification Number Data Identifiers The Croatian National Identification
number (Osobni identifikacijski broj or OIB) is the permanent personal and tax identifier for
Croatian citizens and residents.

See “Croatia National Identification Number” on page 1104.

Estonia Personal Identification Number Data Identifiers In Estonia, the personal
identification code is a number based on the sex and birth date of a person. This code is
used as a unique personal identifier by governmental and other systems where identification
is required, as well as for digital signatures using the national identity card and its
associated certificates. It also serves as a tax identification number.

See “Estonia Personal Identification Code” on page 1151.

Estonia Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a consumption
tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing
and distribution process. For Estonia, VAT is administered by the tax office for the region
in which the business is established.

See “Estonia Value Added Tax (VAT) Number” on page 1153.

Lithuania Personal Identification Number Data Identifiers In Lithuania, the personal
identification code is a number based on the sex and birth date of a person. This code is
used as a unique personal identifier by governmental and other systems where identification
is required, as well as for digital signatures using the national identity card and its
associated certificates.

See “Lithuania Personal Identification Number” on page 1312.

Lithuania Tax Identification Number Data Identifiers The Lithuanian Taxpayer Identification
Number is used to identify taxpayers and facilitate the administration of their national tax
affairs.

See “Lithuania Tax Identification Number” on page 1315.

Estonia Passport Number Data Identifiers The Estonian passport is an international travel
document issued to citizens of Estonia that also serves as proof of Estonian citizenship. The
Border Guard Board in Estonia and Estonian foreign representations abroad are responsible for
issuing Estonian passports.

See “Estonia Passport Number” on page 1149.

Lithuania Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a
consumption tax that is borne by the end consumer. VAT is paid for each transaction in the
manufacturing and distribution process. In Lithuania, VAT is administered by the State Tax
Inspectorate.

See “Lithuania Value Added Tax (VAT) Number” on page 1317.

Latvia Passport Number Data Identifiers Latvian passports are issued to citizens of Latvia
for identity and international travel purposes. The territorial section of the Office of
Citizenship and Migration Affairs issues passports.

See “Latvia Passport Number” on page 1305.

Latvia Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a consumption
tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing
and distribution process. In Latvia, VAT is administered by the State Revenue Service.

See “Latvia Value Added Tax (VAT) Number” on page 1308.

Bulgaria Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a consumption
tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing
and distribution process. In Bulgaria, VAT is administered by the National Revenue Agency,
which is overseen by the Bulgarian Ministry of Finance.

See “Bulgaria Value Added Tax (VAT) Number” on page 1060.

Malta National Identification Number Data Identifiers Every resident of Malta is assigned a
national number. National numbers for foreigners who are authorized to reside in Malta end
with the letter A. National numbers for Maltese citizens end with M, G, L, H or P.

See “Malta National Identification Number” on page 1337.

Malta Tax Identification Number Data Identifiers The Malta Tax Identification Number is
assigned by the Inland Revenue Department as a means of identification for income tax
purposes.

See “Malta Tax Identification Number” on page 1339.

Malta Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a consumption
tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing
and distribution process. In Malta, VAT is administered by the tax office for the region in
which the business is established.

See “Malta Value Added Tax (VAT) Number” on page 1342.

Iceland National Identification Number Data Identifiers The Iceland National Identification
Number is a unique national identifier used by the Icelandic government to identify
individuals and organizations. It is administered by Registers Iceland. Icelandic national
identification numbers are issued to Icelandic citizens at birth and to foreign nationals
resident in Iceland upon registration. They are also issued to corporations and institutions.

See “Iceland National Identification Number” on page 1241.

Serbia Unique Master Citizen Number Data Identifiers The Serbian Unique Master Citizen Number
is a unique identifier for Serbian citizens. It is assigned to every citizen of Serbia at
birth or upon acquiring citizenship.

See “Serbia Unique Master Citizen Number” on page 1445.

Switzerland Passport Number Data Identifiers Swiss passports are issued to citizens of
Switzerland to facilitate international travel.

See “Switzerland Passport Number” on page 1511.

Iceland Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a consumption
tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing
and distribution process. For Iceland, VAT is administered by the VAT office for the region
in which the business is established.

See “Iceland Value Added Tax (VAT) Number” on page 1247.

Iceland Passport Number Data Identifiers Icelandic passports are issued to citizens of
Iceland for the purpose of international travel and may also serve as proof of Icelandic
citizenship.

See “Iceland Passport Number” on page 1245.

Switzerland Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a
consumption tax that is borne by the end consumer. VAT is paid for each transaction in the
manufacturing and distribution process. For Switzerland, VAT is administered by the Federal
Statistical Office for the region in which the business is established.

See “Switzerland Value Added Tax (VAT) Number” on page 1513.

Serbia Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a consumption
tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing
and distribution process. In Serbia, VAT is administered by the Tax Administration department
of the Ministry of Finance.

See “Serbia Value Added Tax (VAT) Number” on page 1448.

Liechtenstein Passport Number Data Identifiers Liechtenstein passports are issued to
nationals of Liechtenstein for the purpose of international travel. The passport may also
serve as proof of Liechtensteiner citizenship.

See “Liechtenstein Passport Number” on page 1311.

Norway National Identification Number Data Identifiers The Norway National Identification
Number is assigned by the Norwegian state to all citizens of the country. It is administered
by the Tax Administration.

See “Norway National Identification Number” on page 1377.

Norway Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a consumption
tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing
and distribution process. For Norway, VAT is administered by the VAT office for the region in
which the business is established.

See “Norway Value Added Tax Number” on page 1379.

Romania Driver's Licence Number Data Identifiers A driving license in Romania is a document
confirming the rights of the holder to drive motor vehicles.

See “Romania Driver's Licence Number” on page 1416.

Czech Republic Driver's Licence Number Data Identifiers The Czech Republic Ministry of
Transport grants driver's licenses in the Czech Republic, confirming the rights of the holder
to drive motor vehicles.

See “Czech Republic Driver's Licence Number” on page 1112.

Slovakia Driver's Licence Number Data Identifiers A Slovak driver's license is a document
confirming the rights of the holder to drive motor vehicles. Slovak driver's licenses are
granted by the Ministry of Interior.

See “Slovakia Driver's Licence Number” on page 1451.

Poland Driver's Licence Number Data Identifiers Poland issues driving licenses confirming the
rights of the holder to drive motor vehicles.

See “Poland Driver's Licence Number” on page 1386.

Hungary Driver's Licence Number Data Identifiers A driving license in Hungary is a document
issued by the Ministry of Economics and Transport, confirming the rights of the holder to
drive motor vehicles.

See “Hungary Driver's Licence Number” on page 1217.

Latvia Driver's Licence Number Data Identifiers A driver's license in Latvia is a document
issued by the Road Traffic Safety Directorate, confirming the rights of the holder to drive
motor vehicles.

See “Latvia Driver's Licence Number” on page 1303.

Norway Driver's Licence Number Data Identifiers A driver's license is required in Norway
before a person is permitted to drive a motor vehicle of any description on a road in Norway.

See “Norway Driver's Licence Number” on page 1375.

Cyprus Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a consumption
tax that is borne by the end consumer. VAT is paid for each transaction in the manufacturing
and distribution process. For Cyprus, VAT is administered by the tax office for the region in
which the business is established.

See “Cyprus Value Added Tax (VAT) Number” on page 1111.

Cyprus Tax Identification Number Data Identifiers The Cyprus Tax Identification Number is a
unique identifier for Cypriot taxpayers.

See “Cyprus Tax Identification Number” on page 1109.

Estonia Driver's Licence Number Data Identifiers The Estonian Road Administration issues
driving licenses in Estonia, confirming the rights of the holder to drive motor vehicles.

See “Estonia Driver's Licence Number” on page 1147.

SEPA Creditor Identifier Number North Data Identifiers The Single Euro Payments Area (SEPA)
is a payments system created by the European Union that harmonizes the way cashless payments
transact between Euro countries. SEPA North is for the United Kingdom, Sweden, Denmark,
Finland, and Ireland. European consumers, businesses, and government agents who make payments
by direct debit, credit card, or through credit transfers use the SEPA architecture. The
Single Euro Payments Area is approved and regulated by the European Commission.

See “SEPA Creditor Identifier Number North” on page 1430.

SEPA Creditor Identifier Number South Data Identifiers The Single Euro Payments Area (SEPA)
is a payments system created by the European Union that harmonizes the way cashless payments
transact between Euro countries. SEPA South is for Italy, Spain, and Portugal. European
consumers, businesses, and government agents who make payments by direct debit, credit card,
or through credit transfers use the SEPA architecture. The Single Euro Payments Area is
approved and regulated by the European Commission.

See “SEPA Creditor Identifier Number South” on page 1437.

SEPA Creditor Identifier Number West Data Identifiers The Single Euro Payments Area (SEPA) is
a payments system created by the European Union that harmonizes the way cashless payments
transact between Euro countries. SEPA West is for Germany, France, Netherlands, Belgium,
Austria, and Luxembourg. European consumers, businesses, and government agents who make
payments by direct debit, credit card, or through credit transfers use the SEPA architecture.
The Single Euro Payments Area is approved and regulated by the European Commission.

See “SEPA Creditor Identifier Number West” on page 1441.

General Data Protection Regulation (Digital Identity)


This template focuses on General Data Protection Regulation (GDPR) keywords related to digital
identity, data identifiers, and an EDM profile with related columns.
The GDPR is a regulation by which the European Commission intends to strengthen and unify
data protection for individuals within the EU. It also addresses export of personal data outside
the EU. The primary objectives of the GDPR are to give citizens back the control of their
personal data and to simplify the regulatory environment for international business by unifying
the regulation within the EU. The GDPR replaces the EU Data Protection Directives as of 25
May 2018.

Table 46-30 General Data Protection Regulations (Digital Identity) detection rule

Name Type Description

International Mobile Equipment Identity Number Data Identifiers The International Mobile
Station Equipment Identity (IMEI) is a unique identifier for 3GPP (GSM, UMTS, and LTE) and
iDEN mobile phones and some satellite phones.

See “International Mobile Equipment Identity Number” on page 1257.
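
To illustrate what identifying an IMEI can involve, here is a minimal Python sketch. It assumes the common 15-digit IMEI form, whose last digit is a Luhn check digit. It is not the validation logic of the data identifier described in this guide, and the function names are hypothetical.

```python
def luhn_checksum_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn check."""
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, ch in enumerate(reversed(digits)):
        value = int(ch)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9  # same as summing the two digits of the product
        total += value
    return total % 10 == 0


def is_valid_imei(candidate: str) -> bool:
    """Hypothetical helper: accept a 15-digit IMEI, with optional spaces or hyphens."""
    digits = candidate.replace("-", "").replace(" ", "")
    if len(digits) != 15 or not (digits.isascii() and digits.isdigit()):
        return False
    return luhn_checksum_ok(digits)


if __name__ == "__main__":
    print(is_valid_imei("49-015420-323751-8"))
```

A check of this kind only confirms that a string is well formed; it does not confirm that the number belongs to a real device.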

General Data Protection Regulation (Government Identification)

This template focuses on General Data Protection Regulation (GDPR) keywords related to
government identification, data identifiers, and an EDM profile with related columns.
The GDPR is a regulation by which the European Commission intends to strengthen and unify
data protection for individuals within the EU. It also addresses export of personal data outside
the EU. The primary objectives of the GDPR are to give citizens back the control of their
personal data and to simplify the regulatory environment for international business by unifying
the regulation within the EU. The GDPR replaces the EU Data Protection Directives as of 25
May 2018.

Table 46-31 General Data Protection Regulations (Government Identification) detection rules

Name Type Description

GDPR Government Identification Keywords Keyword Match Matches a list of related keywords:

driver license number, id card number, electoral roll number, Führerscheinnummer,
ID-Kartennummer, Stimmzettel-Nummer, Numéro permis conduire, numéro carte d'identité, numéro
du rôle électoral, numero patente guida, numero carta d'identità, elettorale rotolo numero,
Número licencia conducir, número tarjeta de identificación, número boleta electoral, rijbewijs
nummer, ID-kaartnummer, kiezerslijst nummer, körkort nummer, identitetskort nummer,
førerkortnummer, ID-kortnummer, ajokortin numero, Henkilökortin numero, vaaliluettelon numero,
uimhir ceadúnas tiomána, Uimhir chárta aitheantais, uimhir rolla toghcháin,
Identifikatiounskaart, número licença motorista, Número do cartão de identificação, número
leitoral

UK Driver's Licence Number Data Identifiers The UK Drivers Licence Number is the
identification number for an individual's driver's license issued by the Driver and Vehicle
Licensing Agency of the United Kingdom.

See “UK Drivers Licence Number” on page 1525.

UK Electoral Roll Number Data Identifiers The Electoral Roll Number is the identification
number issued to an individual for UK election registration. The format of this number is
specified by the UK Government Standards of the UK Cabinet Office.

See “UK Electoral Roll Number” on page 1527.

UK National Health Service (NHS) Data Identifiers The UK National Health Service (NHS) Number
is the personal identification number issued by the U.K. National Health Service (NHS) for
administration of medical care.

See “UK National Health Service (NHS) Number” on page 1528.

UK National Insurance Number Data Identifiers The UK National Insurance Number is issued by
the United Kingdom Department for Work and Pensions (DWP) to identify an individual for the
national insurance program. It is also known as a NI number, NINO or NINo.

See “UK National Insurance Number” on page 1530.

UK Passport Number Data Identifiers The UK Passport Number identifies a United Kingdom
passport using the current official specification of the UK Government Standards of the UK
Cabinet Office.

See “UK Passport Number” on page 1532.

UK Tax ID Number Data Identifiers The UK Tax ID Number is a personal identification number
provided by the UK Government Standards of the UK Cabinet Office.

See “UK Tax ID Number” on page 1534.

French Passport Number Data Identifiers The French passport is an identity document issued to
French citizens. Besides enabling the bearer to travel internationally and serving as
indication of French citizenship, the passport facilitates the process of securing assistance
from French consular officials abroad or, if no French consular official is present, from
other European Union member states.

See “French Passport Number” on page 1187.

Belgian National Number Data Identifiers All citizens of Belgium have a National Number.
Belgians 12 years of age and older are issued a Belgian identity card.

See “Belgian National Number” on page 1039.

Czech Personal Identification Number Data Identifiers All citizens of the Czech Republic are
issued a unique personal identification number by the Ministry of Interior.

See “Czech Republic Personal Identification Number” on page 1114.

French INSEE code Data Identifiers The INSEE code in France is used as a social insurance
number, a national identification number, and for taxation and employment purposes.

See “French INSEE Code” on page 1185.

French Social Security Number Data Identifiers The French Social Security
Number (FSSN) is a unique
number assigned to each French
citizen or resident foreign national.
It serves as a national
identification number.

See “French Social Security


Number” on page 1188.

Greek Tax Identification Number Data Identifiers The Arithmo Forologiko Mitro
(AFM) is a unique personal tax
identification number assigned to
any individual resident in Greece
or person who owns property in
Greece.

See “Greek Tax Identification


Number” on page 1204.

Hungarian Social Security Number Data Identifiers The Hungarian Social Security
Number (TAJ) is a unique
identifier issued by the Hungarian
government.

See “Hungarian Social Security


Number” on page 1221.

Hungarian Tax Identification Number Data Identifiers The Hungarian Tax Identification
Number is a 10-digit number that
always begins with the digit "8."

See “Hungarian Tax Identification


Number” on page 1223.
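
The format stated in the Hungarian Tax Identification Number row (10 digits, always beginning with 8) can be illustrated with a minimal Python sketch. It shows only the stated shape and is not the pattern or validator set that the data identifier itself uses; the names HU_TAX_ID_CANDIDATE and find_hungarian_tax_id_candidates are hypothetical.

```python
import re
from typing import List

# Illustrative assumption: a candidate is exactly 10 digits, starts with 8,
# and is not part of a longer run of digits. Any further validation that the
# product's data identifier performs is not reproduced here.
HU_TAX_ID_CANDIDATE = re.compile(r"(?<!\d)8\d{9}(?!\d)")


def find_hungarian_tax_id_candidates(text: str) -> List[str]:
    """Return substrings of text that have the shape described in the row."""
    return HU_TAX_ID_CANDIDATE.findall(text)
```

A shape check of this kind over-matches on its own, so it is best read as a first filter that a validator would narrow further.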

Hungarian VAT Number Data Identifiers All Hungarian businesses
(including non-profit
organizations) are granted a
value-added tax (VAT) number
upon registration at the Court of
Registry.

See “Hungarian VAT Number”


on page 1225.

Irish Personal Public Service Number Data Identifiers The number is a
unique 8-character alphanumeric
string ending with a letter, such
as 8765432A. It is
assigned when a child's birth is
registered, is issued on
a Public Services Card, and is
unique to every person.

See “Irish Personal Public Service


Number” on page 1274.

Luxembourg National Register of Individuals Number Data Identifiers The Luxembourg National
Register of Individuals Number is
an 11-digit identification number
issued to all Luxembourg citizens
at age 15.

See “Luxembourg National


Register of Individuals Number”
on page 1320.

Polish Identification Number Data Identifiers Every Polish citizen 18 years of


age or older residing permanently
in Poland must have an Identity
Card, with a unique personal
number. The number is used as
identification for almost all
purposes.

See “Polish Identification Number”


on page 1394.
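
Two of the rows above state a concrete format: the Irish Personal Public Service Number (8 characters ending with a letter, as in 8765432A) and the Luxembourg National Register of Individuals Number (11 digits). The following Python sketch illustrates those shapes only. It assumes, based on the single example given, that the Irish number is seven digits followed by one letter; the pattern names and the shape_matches function are hypothetical and do not represent the product's detection logic.

```python
import re

# Assumption drawn from the example "8765432A": seven digits plus one letter,
# not embedded in a longer alphanumeric token.
IRISH_PPS_SHAPE = re.compile(r"(?<![0-9A-Za-z])\d{7}[A-Za-z](?![0-9A-Za-z])")

# The row describes an 11-digit number; reject longer digit runs.
LUX_NRI_SHAPE = re.compile(r"(?<!\d)\d{11}(?!\d)")


def shape_matches(text):
    """Map each identifier name to the candidate strings found in text."""
    return {
        "irish_pps": IRISH_PPS_SHAPE.findall(text),
        "luxembourg_nri": LUX_NRI_SHAPE.findall(text),
    }
```

Length-only shapes such as the 11-digit rule match many unrelated numbers, which is why the detailed sections referenced in each row matter more than the shape itself.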

Polish REGON Number Data Identifiers Each national economy entity is


obligated to register in the register
of business entities called
REGON in Poland. It is the only
integrated register in Poland
covering all of the national
economy entities. Each company
has a unique REGON number.

See “Polish REGON Number”


on page 1396.

Polish Social Security Number (PESEL) Data Identifiers The Polish Social Security
Number (PESEL) is the national
identification number used in
Poland. The PESEL number is
mandatory for all permanent
residents of Poland and for
temporary residents living in
Poland. It uniquely identifies a
person and cannot be transferred
to another.

See “Polish Social Security


Number (PESEL)” on page 1398.

Polish Tax Identification Number Data Identifiers The Polish Tax Identification
Number (NIP) is a number the
government gives to every Polish
citizen who works or does
business in Poland. All taxpayers
have a tax identification number
called NIP.

See “Polish Tax Identification


Number” on page 1400.

Romanian Numerical Personal Code Data Identifiers In Romania, each citizen has a
unique numerical personal code
(Cod Numeric Personal, or
CNP). The number is used by
authorities, health care, schools,
universities, banks, and insurance
companies for customer
identification.

See “Romanian Numerical


Personal Code” on page 1425.

Spanish DNI ID Data Identifiers The Spanish DNI ID appears on


the Documento nacional de
identidad (DNI) and is issued by
the Spanish Hacienda Publica to
every citizen of Spain. It is the
most important unique identifier
in Spain used for opening
accounts, signing contracts, taxes,
and elections.

See “Spanish DNI ID” on page 1481.

Spanish Social Security Number Data Identifiers The Spanish Social Security
Number is a 12-digit number
assigned to Spanish workers to
allow access to the Spanish
healthcare system.

See “Spanish Social Security


Number” on page 1485.

Spanish Customer Account Number Data Identifiers The Spanish customer account
number is the standard customer
bank account number used
across Spain.

See “Spanish Customer Account


Number” on page 1479.

Spanish Tax ID (CIF) Data Identifiers The Spanish Tax Identification


corporate tax identifier (CIF) is
equivalent to the VAT number,
required for running a business in
Spain. This identifier is a
company's identification for tax
purposes and is required for any
legal transactions.

See “Spanish Tax Identification


(CIF)” on page 1487.

German Passport Number Data Identifiers The German passport number is


issued to German nationals for
the purpose of international travel.
A German passport is an officially
recognized document that
German authorities accept as
proof of identity from German
citizens.

See “German Passport Number”


on page 1190.

Bulgarian Uniform Civil Number Data Identifiers The uniform civil number (EGN)
is a unique number assigned to
each Bulgarian citizen or resident
foreign national. It serves as a
national identification number. An
EGN is assigned to Bulgarians at
birth, or when a birth certificate is
issued.

See “Bulgarian Uniform Civil


Number - EGN” on page 1063.

Austrian Social Security Number Data Identifiers A social security number is
allocated to Austrian citizens who
receive available social security
benefits. It is allocated by the
umbrella association of the
Austrian social security
authorities.

See “Austrian Social Security


Number” on page 1036.

Spanish Passport Number Data Identifiers Spanish passports are issued to


Spanish citizens for the purpose
of travel outside Spain.

See “Spanish Passport Number”


on page 1483.

Swedish Passport Number Data Identifiers Swedish passports are issued to


nationals of Sweden for the
purpose of international travel.
Besides serving as proof of
Swedish citizenship, they facilitate
the process of securing
assistance from Swedish consular
officials abroad, or from other European
Union member states if no
Swedish consular official is available.

See “Swedish Passport Number”


on page 1499.

German Personal ID Number Data Identifiers The German Personal ID Number


is issued to all German citizens.

See “German Personal ID


Number” on page 1192.

Burgerservicenummer Data Identifiers In the Netherlands, the


Burgerservicenummer is used to
uniquely identify citizens and is
printed on driving licenses,
passports and international ID
cards under the header Personal
Number.

See “Burgerservicenummer”
on page 1066.

Codice Fiscale Data Identifiers The Codice Fiscale uniquely


identifies an Italian citizen or
permanent resident alien.
Issuance of the code is centralized
in the Ministry of the Treasury. The
Codice Fiscale is issued to every
Italian at birth.

See “Codice Fiscale” on page 1081.

Finnish Personal Identification Number Data Identifiers The Finnish Personal
Identification Number or Personal
Identity Code is a unique personal
identifier used for identifying
citizens in government and many
other transactions.

See “Finnish Personal


Identification Number”
on page 1175.

Swedish Personal Identification Number Data Identifiers The Swedish Personal
Identification Number is the
unique national identification for
every Swedish citizen. The
number is used by authorities,
health care, schools, universities,
banks, and insurance companies
for customer identification.

See “Sweden Personal


Identification Number”
on page 1501.

Austria Passport Number Data Identifiers Austrian passports are travel


documents issued to Austrian
citizens by the Austrian Passport
Office of the Department of
Foreign Affairs and Trade, both in
Austria and overseas, and enable
the passport holder to travel
internationally.

See “Austria Passport Number”


on page 1030.

Austria Tax Identification Number Data Identifiers Austria issues tax identification
numbers to individuals based on
their area of residence to identify
taxpayers and facilitate national
taxes.

See “Austria Tax Identification


Number” on page 1031.

Belgium Passport Number Data Identifiers Belgian passports are passports


issued by the Belgian state to its
citizens to facilitate international
travel. The Federal Public Service
Foreign Affairs, formerly known
as the Ministry of Foreign Affairs,
is responsible for issuing and
renewing Belgian passports.

See “Belgium Passport Number”


on page 1044.

Belgium Tax Identification Number Data Identifiers Belgium issues a tax identification
number for persons who have
obligations to declare taxes in
Belgium.

See “Belgium Tax Identification


Number” on page 1045.

Belgium Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT
is paid for each transaction in the
manufacturing and distribution
process. For Belgium, the VAT
number is issued by the VAT office
for the region in which the
business is established.

See “Belgium Value Added Tax


(VAT) Number” on page 1047.

Belgium Driver's License Number Data Identifiers Identification number for an
individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of Belgium.

See “Belgium Driver's Licence


Number” on page 1042.

Denmark Personal Identification Number Data Identifiers In Denmark, every citizen has a
national identification number. The
number serves as proof of
identification for almost all
purposes.

See “Denmark Personal


Identification Number”
on page 1126.

Netherlands Driver's License Number Data Identifiers Identification number for an
individual's driver's licence issued
by the RDW government agency
of the Netherlands.

See “Netherlands Driver's License


Number” on page 1362.

Netherlands Passport Number Data Identifiers Dutch passports are issued to


Netherlands citizens for the
purpose of international travel.

See “Netherlands Passport


Number” on page 1363.

Netherlands Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT
is paid for each transaction in the
manufacturing and distribution
process. For the Netherlands, the
VAT number is issued by the
VAT office for the region in which
the business is established.

See “Netherlands Value Added


Tax (VAT) Number” on page 1367.

France Driver's License Number Data Identifiers Identification number for an
individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of France.

See “France Driver's License


Number” on page 1177.

France Health Insurance Number Data Identifiers A Carte Vitale is a social insurance
card used in France that contains
medical information for the card
holder. It has a unique 21-digit
serial number.

See “France Health Insurance


Number” on page 1179.

France Tax Identification Number Data Identifiers France issues a tax identification
number for anyone who has
obligations to declare taxes in
France.

See “France Tax Identification


Number” on page 1181.

Germany Driver's License Number Data Identifiers Identification number for an
individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of Germany.

See “Germany Driver's License


Number” on page 1194.

Italy Passport Number Data Identifiers Italian passports are issued to


Italian citizens for the purpose of
international travel.

See “Italy Passport Number”


on page 1282.

Italy Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT
is paid for each transaction in the
manufacturing and distribution
process. For Italy, the VAT
number is issued by the VAT office
for the region in which the
business is established.

See “Italy Value Added Tax (VAT)


Number” on page 1283.

Italy Driver's License Number Data Identifiers Identification number for an


individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of Italy.

See “Italy Driver's Licence


Number” on page 1278.

Netherlands Tax Identification Number Data Identifiers The Netherlands issues a tax
identification number at birth or at
registration at the municipality.

See “Netherlands Tax


Identification Number”
on page 1364.

Spain Driver's License Number Data Identifiers Identification number for an


individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of Spain.

See “Spain Driver's Licence


Number” on page 1477.

Ukraine Identity Card Data Identifiers The Ukraine Identity Card has a
15-digit record number issued to
citizens of Ukraine. It is used as
a form of identification in place of
Ukraine's domestic passport as of
January 2016.

See “Ukraine Identity Card”


on page 1539.

Ukraine Domestic Passport Number Data Identifiers An identity document issued to
citizens of Ukraine for domestic
use. It has been replaced by the
Ukraine Identity Card as of 2016,
but any existing passports are still
valid.

See “Ukraine Passport


(Domestic)” on page 1541.

Ukraine International Passport Number Data Identifiers A document used by citizens of
Ukraine to travel outside of
Ukraine.

See “Ukraine Passport


(International)” on page 1543.

Germany Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT
is paid for each transaction in the
manufacturing and distribution
process. For Germany, the VAT
number is issued by the VAT office
for the region in which the
business is established.

See “Germany Driver's License


Number” on page 1194.

France Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a
tax levied on goods and services
provided in France and is
collected from the final customer.
Companies must register with the
Register of Commerce and
Companies in France to be
allocated a VAT number.

See “France Value Added Tax


(VAT) Number” on page 1182.

Ireland Passport Number Data Identifiers An Irish passport is the passport


issued to citizens of Ireland. An
Irish passport enables the bearer
to travel internationally and serves
as evidence of Irish citizenship
and citizenship of the European
Union. It also facilitates access
to consular assistance from both
Irish embassies and any embassy
of other European Union
member states while abroad.

See “Ireland Passport Number”


on page 1266.

Luxembourg Passport Number Data Identifiers A Luxembourg passport is an


international travel document
issued to nationals of the Grand
Duchy of Luxembourg, and may
also serve as proof of
Luxembourgish citizenship.

See “Luxembourg Passport


Number” on page 1322.

Portugal Passport Number Data Identifiers Portuguese passports are issued


to citizens of Portugal for the
purpose of international travel.
The passport, along with the
national identity card, allows for
free rights of movement and
residence in any of the states of
the European Union and
European Economic Area.

See “Portugal Passport Number”


on page 1407.

Finland Passport Number Data Identifiers Finnish passports are issued to


nationals of Finland for the
purpose of international travel.
They also facilitate the process of
securing assistance from Finnish
consular officials abroad.

See “Finland Passport Number”


on page 1169.

Finland Driver's Licence Number Data Identifiers Identification number for an
individual's driver's license issued
in an EU or EEA Member State
for a Finnish license.

See “Finland Driver's Licence


Number” on page 1165.

Austria Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT
is paid for each transaction in the
manufacturing and distribution
process. For Austria, the VAT
number is issued by the tax office
for the region in which the
business is established.

See “Austria Value Added Tax


(VAT) Number” on page 1033.

Sweden Tax Identification Number Data Identifiers Sweden uses tax identification
numbers (TINs) to identify
taxpayers and facilitate the
administration of their national tax
affairs. TINs are also useful for
identifying taxpayers who invest
in other EU countries and are
more reliable than other identifiers
such as name and address.

See “Sweden Tax Identification


Number” on page 1494.

Sweden Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT
is paid for each transaction in the
manufacturing and distribution
process.

See “Sweden Value Added Tax


(VAT) Number” on page 1496.

Denmark Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT
is paid for each transaction in the
manufacturing and distribution
process. For Denmark, the VAT
number is issued by the tax office
for the region in which the
business is established.

See “Denmark Value Added Tax


(VAT) Number” on page 1130.

Finland Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT
is paid for each transaction in the
manufacturing and distribution
process.

See “Finland Value Added Tax


(VAT) Number” on page 1173.

Ireland Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT
is paid for each transaction in the
manufacturing and distribution
process. For Ireland, the VAT
number is issued by the Irish tax
authority.

See “Ireland Value Added Tax


(VAT) Number” on page 1271.

Ireland Tax Identification Number Data Identifiers This number is issued by the
Department of Social Protection for
natural persons and by the Revenue
Commissioners for non-natural
persons. Non-natural persons can
be companies, partnerships,
trusts, and unincorporated bodies.

See “Ireland Tax Identification


Number” on page 1268.

Portugal Tax Identification Number Data Identifiers A fiscal number is a tax
identification number that is
issued in Portugal to anyone who
wishes to undertake any official
matters in Portugal.

See “Portugal Tax Identification


Number” on page 1408.

Portugal Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT
is paid for each transaction in the
manufacturing and distribution
process.

See “Portugal Value Added Tax


(VAT) Number” on page 1411.

Luxembourg Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT
is paid for each transaction in the
manufacturing and distribution
process.

See “Luxembourg Value Added


Tax (VAT) Number” on page 1327.

Portugal National Identification Number Data Identifiers The national identification number
is a unique identification number
usually present on documents like
citizen cards which are issued by
the Portuguese government to its
citizens. It can be used as a travel
document within the EU and some
other European countries.

See “Portugal National


Identification Number”
on page 1404.

Portugal Driver's Licence Number Data Identifiers The Institute for Mobility and Land
Transport (IMTT) issues driver's
licenses in Portugal.

See “Portugal Driver's Licence


Number” on page 1402.

Denmark Tax Identification Number Data Identifiers Denmark issues a tax
identification number for persons
who have obligations to declare
taxes in Denmark. The tax
identification number also serves
as a personal health insurance
number.

See “Denmark Tax Identification


Number” on page 1128.

Finland Tax Identification Number Data Identifiers Finland issues a tax identification
number for persons who have
obligations to declare taxes in
Finland.

See “Finland Tax Identification


Number” on page 1171.

Luxembourg Tax Identification Number Data Identifiers This number is issued by the
Luxembourg inland revenue
department (Administration des contributions
directes, ACD) and
is used for tax-related purposes
of natural and non-natural
persons.

See “Luxembourg Tax


Identification Number”
on page 1324.

Germany Tax Identification Number Data Identifiers Germany issues a tax
identification number for persons
who have obligations to declare
taxes in Germany.

See “Germany Tax Identification


Number” on page 1198.

UK Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT
is paid for each transaction in the
manufacturing and distribution
process. For the United Kingdom,
the VAT number is issued by the
VAT office for the region in which
the business is established.

See “UK Value Added Tax (VAT)


Number” on page 1536.

Spain Value Added Tax (VAT) Number Data Identifiers VAT is a consumption tax that is
borne by the end consumer. VAT
is paid for each transaction in the
manufacturing and distribution
process. VAT in Spain is overseen
by the State Tax Administration
Agency.

See “Spain Value Added Tax


(VAT) Number” on page 1474.

UK Bank Account Number Sort Code Data Identifiers Sort codes are bank codes used
to route money transfers between
banks within their respective
countries via their respective
clearance organizations.

See “UK Bank Account Number


Sort Code” on page 1523.

Greece Social Security Number (AMKA) Data Identifiers The AMKA (social security
number) is the work and
insurance identification number of
every worker, retired person and
protected family member in
Greece.

See “Greece Social Security


Number (AMKA)” on page 1202.

Romania National Identification Number Data Identifiers In Romania, each citizen has a
personal numerical code (Cod
Numeric Personal, CNP) as a
unique national identification
number. This number is also used
as a tax identification number for
financial purposes.

See “Romania National


Identification Number”
on page 1419.

Slovakia National Identification Number Data Identifiers In Slovakia, identification cards
are issued by the state authorities
at 15 years of age for every
citizen. This number is used in
the Slovak Republic as the primary
unique identifier for every person
by government institutions, banks
and so on.

See “Slovakia National


Identification Number”
on page 1453.

Slovenia Unique Master Citizen Number Data Identifiers The unique master citizen number
is a unique identification number
assigned to every citizen of
Slovenia at birth or on acquiring
citizenship.

See “Slovenia Unique Master


Citizen Number” on page 1465.

Latvia Personal Identification Number Data Identifiers The Latvian personal identification
number is used for national
identity and as a tax identification
number for financial purposes. It
is issued by the Office of
Citizenship and Migration Affairs
of the Ministry of Interior.

See “Latvia Personal Identification


Number” on page 1306.

Finland European Health Insurance Number Data Identifiers A unique 20-digit numeric
identifier that is assigned to every
person who uses health services
in Finland.

See “Finland European Health


Insurance Number” on page 1167.

Sweden Driver's Licence Number Data Identifiers In Sweden, a driving license is
required when operating a car,
motorcycle or moped on public
roads. Driving licenses are issued
by the prefectural governments
public safety commissions and
are overseen on a nationwide
basis by the National Police
Agency.

See “Sweden Driver's Licence


Number” on page 1492.

Greece Passport Number Data Identifiers Greek passports are issued to


Greek citizens for the purpose of
international travel. The passport,
along with the national identity
card, allows for free rights of
movement and residence in any
of the states of the European
Union and European Economic
Area.

See “Greece Passport Number”


on page 1200.

Greece Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a
consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. For Greece, VAT is
administered by the VAT office for
the region in which the business
is established.

See “Greece Value Added Tax


(VAT) Number” on page 1206.

Poland Passport Number Data Identifiers A Polish passport is an


international travel document
issued to nationals of Poland. It
may also serve as proof of Polish
citizenship.

See “Poland Passport Number”


on page 1389.

Poland Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a
consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. For Poland, VAT is
administered by the VAT office for
the region in which the business
is established.

See “Poland Value Added Tax


(VAT) Number” on page 1391.

Romania Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a
consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. In Romania, it is also
called TVA or CIF.

See “Romania Value Added Tax


(VAT) Number” on page 1420.

Hungary Passport Number Data Identifiers Hungarian passports are issued


to Hungarian citizens for
international travel by the Central
Data Processing, Registration,
and Election Office of the
Hungarian Ministry of the Interior.

See “Hungary Passport Number”


on page 1219.

Czech Republic Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a
consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. In the Czech Republic,
it is also called DPH.

See “Czech Republic Value


Added Tax (VAT) Number”
on page 1121.

Slovakia Passport Number Data Identifiers Slovak passports are issued to


citizens of Slovakia to facilitate
international travel.

See “Slovakia Passport Number”


on page 1457.

Slovakia Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a
consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. For Slovakia, VAT is
administered by the tax office for
the region in which the business
is established.

See “Slovakia Value Added Tax


(VAT) Number” on page 1459.

Slovenia Passport Number Data Identifiers Slovenian passports are issued


to citizens of Slovenia to facilitate
international travel.

See “Slovenia Passport Number”


on page 1461.

Slovenia Tax Identification Number Data Identifiers The Slovenia Tax Identification
Number is a unique identifier of
individuals and legal entities for
tax purposes. The Financial
Administration of the Republic of
Slovenia issues and administers
tax identification numbers in
Slovenia.

See “Slovenia Tax Identification


Number” on page 1463.

Slovenia Value Added Tax (VAT) Number Data Identifiers Value Added Tax (VAT) is a
consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. For Slovenia, VAT is
administered by the tax office for
the region in which the business
is established.

See “Slovenia Value Added Tax


(VAT) Number” on page 1467.

Croatia National Identification Data Identifiers The Croatian National


Number Identification number (Osobni
identifikacijski broj or OIB) is the
permanent personal and tax
identifier for Croatian citizens and
residents.

See “Croatia National


Identification Number”
on page 1104.

Estonia Personal Identification Data Identifiers In Estonia, the personal


Number identification code is a number
based on the sex and birth date
of a person. This code is used as
a unique personal identifier by
governmental and other systems
where identification is required,
as well as for digital signatures
using the national identity card
and its associated certificates. It
also serves as a tax identification
number.

See “Estonia Personal


Identification Code” on page 1151.

Estonia Value Added Tax (VAT) Data Identifiers Value Added Tax (VAT) is a
Number consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. For Estonia, VAT is
administered by the tax office for the
region in which the business is
established.

See “Estonia Value Added Tax


(VAT) Number” on page 1153.

Lithuania Personal Data Identifiers In Lithuania, the personal


Identification Number identification code is a number
based on the sex and birth date
of a person. This code is used as
a unique personal identifier by
governmental and other systems
where identification is required,
as well as for digital signatures
using the national identity card
and its associated certificates.

See “Lithuania Personal


Identification Number”
on page 1312.

Lithuania Tax Identification Data Identifiers The Lithuanian Taxpayer


Number Identification Number is used to
identify taxpayers and facilitate
the administration of their national
tax affairs.

See “Lithuania Tax Identification


Number” on page 1315.

Estonia Passport Number Data Identifiers The Estonian passport is an


international travel document
issued to citizens of Estonia that
also serves as proof of Estonian
citizenship. The Border Guard
Board in Estonia and Estonian
foreign representations abroad
are responsible for issuing
Estonian passports.

See “Estonia Passport Number”


on page 1149.

Lithuania Value Added Tax Data Identifiers Value Added Tax (VAT) is a
(VAT) Number consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. In Lithuania, VAT is
administered by the State Tax
Inspectorate.

See “Lithuania Value Added Tax


(VAT) Number” on page 1317.

Latvia Passport Number Data Identifiers Latvian passports are issued to


citizens of Latvia for identity and
international travel purposes. The
territorial section of the Office of
Citizenship and Migration Affairs
issues passports.

See “Latvia Passport Number”


on page 1305.

Latvia Value Added Tax (VAT) Data Identifiers Value Added Tax (VAT) is a
Number consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. In Latvia, VAT is
administered by the State
Revenue Service.

See “Latvia Value Added Tax


(VAT) Number” on page 1308.

Bulgaria Value Added Tax Data Identifiers Value Added Tax (VAT) is a
(VAT) Number consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. In Bulgaria, VAT is
administered by the National
Revenue Agency, which is
overseen by the Bulgarian
Ministry of Finance.

See “Bulgaria Value Added Tax


(VAT) Number” on page 1060.

Malta National Identification Data Identifiers Every resident of Malta is


Number assigned a national number.
National numbers for foreigners who
are authorized to reside in Malta
end with the letter A. National
numbers for Maltese citizens end
with M, G, L, H, or P.

See “Malta National Identification


Number” on page 1337.

Malta Tax Identification Number Data Identifiers The Malta Tax Identification
Number is assigned by the Inland
Revenue Department as a means
of identification for income tax
purposes.

See “Malta Tax Identification


Number” on page 1339.

Malta Value Added Tax (VAT) Data Identifiers Value Added Tax (VAT) is a
Number consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. In Malta, VAT is
administered by the tax office for the
region in which the business is
established.

See “Malta Value Added Tax


(VAT) Number” on page 1342.

Iceland National Identification Data Identifiers The Iceland National Identification


Number Number is a unique national
identifier used by the Icelandic
government to identify individuals
and organizations. It is
administered by Registers
Iceland. Icelandic national
identification numbers are issued
to Icelandic citizens at birth and
to foreign nationals resident in
Iceland upon registration. They
are also issued to corporations
and institutions.

See “Iceland National


Identification Number”
on page 1241.

Serbia Unique Master Citizen Data Identifiers The Serbian Unique Master
Number Citizen Number is a unique
identifier for Serbian citizens. It is
assigned to every citizen of Serbia
at birth or upon acquiring
citizenship.

See “Serbia Unique Master


Citizen Number” on page 1445.

Switzerland Passport Number Data Identifiers Swiss passports are issued to


citizens of Switzerland to facilitate
international travel.

See “Switzerland Passport


Number” on page 1511.

Iceland Value Added Tax (VAT) Data Identifiers Value Added Tax (VAT) is a
Number consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. For Iceland, VAT is
administered by the VAT office for
the region in which the business
is established.

See “Iceland Value Added Tax


(VAT) Number” on page 1247.

Iceland Passport Number Data Identifiers Icelandic passports are issued to


citizens of Iceland for the purpose
of international travel and may
also serve as proof of Icelandic
citizenship.

See “Iceland Passport Number”


on page 1245.

Switzerland Value Added Tax Data Identifiers Value Added Tax (VAT) is a
(VAT) Number consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. For Switzerland, VAT is
administered by the Federal
Statistical Office for the region in
which the business is established.

See “Switzerland Value Added


Tax (VAT) Number” on page 1513.

Serbia Value Added Tax (VAT) Data Identifiers Value Added Tax (VAT) is a
Number consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. In Serbia, VAT is
administered by the Tax
Administration department of the
Ministry of Finance.

See “Serbia Value Added Tax


(VAT) Number” on page 1448.

Liechtenstein Passport Number Data Identifiers Liechtenstein passports are


issued to nationals of
Liechtenstein for the purpose of
international travel. The passport
may also serve as proof of
Liechtensteiner citizenship.

See “Liechtenstein Passport


Number” on page 1311.

Norway National Identification Data Identifiers The Norway National identification


Number number is assigned by the
Norwegian state to all citizens of
the country. It is administered by
the Tax Administration.

See “Norway National


Identification Number”
on page 1377.

Norway Value Added Tax (VAT) Data Identifiers Value Added Tax (VAT) is a
Number consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. For Norway, VAT is
administered by the VAT office for
the region in which the business
is established.

See “Norway Value Added Tax


Number” on page 1379.

Romania Driver's Licence Data Identifiers A driving license in Romania is a


Number document confirming the rights of
the holder to drive motor vehicles.

See “Romania Driver's Licence


Number” on page 1416.

Czech Republic Driver's Data Identifiers The Czech Republic Ministry of


Licence Number Transport grants driver's licenses
in the Czech Republic, confirming
the rights of the holder to drive
motor vehicles.

See “Czech Republic Driver's


Licence Number” on page 1112.

Slovakia Driver's Licence Data Identifiers A Slovak driver's license is a


Number document confirming the rights of
the holder to drive motor vehicles.
Slovak driver's licenses are
granted by the Ministry of Interior.

See “Slovakia Driver's Licence


Number” on page 1451.

Poland Driver's Licence Data Identifiers Poland issues driving licenses


Number confirming the rights of the holder
to drive motor vehicles.

See “Poland Driver's Licence


Number” on page 1386.

Hungary Driver's Licence Data Identifiers A driving license in Hungary is a


Number document issued by the Ministry
of Economics and Transport,
confirming the rights of the holder
to drive motor vehicles.

See “Hungary Driver's Licence


Number” on page 1217.

Latvia Driver's Licence Number Data Identifiers A driver's license in Latvia is a


document issued by the Road
Traffic Safety Directorate,
confirming the rights of the holder
to drive motor vehicles.

See “Latvia Driver's Licence


Number” on page 1303.

Norway Driver's Licence Data Identifiers A driver's license is required in


Number Norway before a person is
permitted to drive a motor vehicle
of any description on a road in
Norway.

See “Norway Driver's Licence


Number” on page 1375.

Cyprus Value Added Tax (VAT) Data Identifiers Value Added Tax (VAT) is a
Number consumption tax that is borne by
the end consumer. VAT is paid for
each transaction in the
manufacturing and distribution
process. For Cyprus, VAT is
administered by the tax office for
the region in which the business
is established.

See “Cyprus Value Added Tax


(VAT) Number” on page 1111.

Cyprus Tax Identification Data Identifiers The Cyprus Tax Identification


Number Number is a unique identifier for
Cypriot taxpayers.

See “Cyprus Tax Identification


Number” on page 1109.

Switzerland Health Insurance Data Identifiers Swiss insurance providers issue


Card Number health insurance cards to their
customers. Swiss health
insurance cards can also be used
to access European health
services.

See “Switzerland Health


Insurance Card Number”
on page 1509.

Estonia Driver's Licence Data Identifiers The Estonian Road Administration


Number issues driving licenses in Estonia,
confirming the rights of the holder
to drive motor vehicles.

See “Estonia Driver's Licence


Number” on page 1147.

SEPA Creditor Identifier Data Identifiers The Single Euro Payment Area
Number North (SEPA) is a payments system
created by the European Union
that harmonizes the way cashless
payments transact between Euro
countries. SEPA North is for the
United Kingdom, Sweden,
Denmark, Finland, and Ireland.
European consumers,
businesses, and government
agents who make payments by
direct debit, credit card or through
credit transfers use the SEPA
architecture. The Single Euro
Payment Area is approved and
regulated by the European
Commission.

See “SEPA Creditor Identifier


Number North” on page 1430.

SEPA Creditor Identifier Data Identifiers The Single Euro Payment Area
Number South (SEPA) is a payments system
created by the European Union
that harmonizes the way cashless
payments transact between Euro
countries. SEPA South is for Italy,
Spain, and Portugal. European
consumers, businesses, and
government agents who make
payments by direct debit, credit
card or through credit transfers
use the SEPA architecture. The
Single Euro Payment Area is
approved and regulated by
the European Commission.

See “SEPA Creditor Identifier


Number South” on page 1437.

SEPA Creditor Identifier Data Identifiers The Single Euro Payment Area
Number West (SEPA) is a payments system
created by the European Union
that harmonizes the way cashless
payments transact between Euro
countries. SEPA West is for
Germany, France, the Netherlands,
Belgium, Austria, and
Luxembourg. European
consumers, businesses, and
government agents who make
payments by direct debit, credit
card, or through credit transfers
use the SEPA architecture. The
Single Euro Payment Area is
approved and regulated by
the European Commission.

See “SEPA Creditor Identifier


Number West” on page 1441.

European Health Insurance Data Identifiers The European Health Insurance


Number Card (EHIC) allows anyone
insured by or covered by a
statutory social security scheme
of the European Economic Area
countries and Switzerland to
receive medical treatment in
another member state free or at
a reduced cost.

See “European Health Insurance


Card Number” on page 1156.

General Data Protection Regulation (Healthcare and Insurance)
This template focuses on General Data Protection Regulation (GDPR) healthcare- and insurance-related
keywords, Data Identifiers, and an EDM profile with related columns.
The GDPR is a regulation by which the European Commission intends to strengthen and unify
data protection for individuals within the EU. It also addresses the export of personal data outside
the EU. The primary objectives of the GDPR are to give citizens back control of their
personal data and to simplify the regulatory environment for international business by unifying
the regulation within the EU. The GDPR replaces the EU Data Protection Directive as of 25
May 2018.

Table 46-32 General Data Protection Regulations (Healthcare and Insurance) detection rules

Name Type Description

GDPR Healthcare and Insurance Related Keywords    Keyword Match    Matches a list of related keywords:

account number, bank card number, ID card number, medical record number,
Kontonummer, Bankkartennummer, ID-Kartennummer, medizinische Datensatznummer,
Numéro compte, banque carte nombre, numéro de carte d'identité,
numéro d'enregistrement médical, numero conto, numero carta banca,
numero carta d'identità, numero cartella clinica, número cuenta,
Número cuenta bancaria, Numero de la tarjeta identificacion,
número registro médico, rekeningnummer, bank kaartnummer,
identiteitskaartnummer, medisch dossier nummer, bankkortnummer,
identitetskortnummer, ID-kortnummer, tilinumero, pankkikortin numero,
Henkilökortin numero, lääketieteellisen ennätysnumero, uimhir chuntais,
uimhir chárta bainc, Uimhir chárta aitheantais, uimhir taifead leighis,
Kontosnummer, Identifikatiounskaart, medizinescher Dateschutznummer,
número de conta, número cartão bancário, Número do cartão de identificação

UK Driver's Licence Number Data Identifiers The UK Drivers Licence Number


is the identification number for an
individual's driver's license issued
by the Driver and Vehicle
Licensing Agency of the United
Kingdom.

See “UK Drivers Licence Number”


on page 1525.

UK National Health Service Data Identifiers The UK National Health Service


(NHS) (NHS) Number is the personal
identification number issued by
the U.K. National Health Service
(NHS) for administration of
medical care.

See “UK National Health Service


(NHS) Number” on page 1528.

UK National Insurance Number Data Identifiers The UK National Insurance


Number is issued by the United
Kingdom Department for Work
and Pensions (DWP) to identify
an individual for the national
insurance program. It is also
known as a NI number, NINO or
NINo.

See “UK National Insurance


Number” on page 1530.

Belgian National Number Data Identifiers All citizens of Belgium have a


National Number. Belgians 12
years of age and older are issued
a Belgian identity card.

See “Belgian National Number”


on page 1039.

Czech Personal Identification Data Identifiers All citizens of the Czech Republic
Number are issued a unique personal
identification number by the
Ministry of Interior.

See “Czech Republic Personal


Identification Number”
on page 1114.

French INSEE code Data Identifiers The INSEE code in France is


used as a social insurance
number, a national identification
number, and for taxation and
employment purposes.

See “French INSEE Code”


on page 1185.

French Social Security Number Data Identifiers The French Social Security
Number (FSSN) is a unique
number assigned to each French
citizen or resident foreign national.
It serves as a national
identification number.

See “French Social Security


Number” on page 1188.

Hungarian Social Security Data Identifiers The Hungarian Social Security


Number Number (TAJ) is a unique
identifier issued by the Hungarian
government.

See “Hungarian Social Security


Number” on page 1221.

Irish Personal Public Service Data Identifiers The number is a unique
Number 8-character alphanumeric string
ending with a letter, such as
8765432A. It is assigned when a
child's birth is registered and is
issued on a Public Services Card.

See “Irish Personal Public Service


Number” on page 1274.

Luxembourg National Register Data Identifiers The Luxembourg National


of Individuals Number Register of Individuals Number is
an 11-digit identification number
issued to all Luxembourg citizens
at age 15.

See “Luxembourg National


Register of Individuals Number”
on page 1320.

Polish Identification Number Data Identifiers Every Polish citizen 18 years of


age or older residing permanently
in Poland must have an Identity
Card, with a unique personal
number. The number is used as
identification for almost all
purposes.

See “Polish Identification Number”


on page 1394.

Polish REGON Number Data Identifiers Each national economy entity is


obligated to register in the register
of business entities called
REGON in Poland. It is the only
integrated register in Poland
covering all of the national
economy entities. Each company
has a unique REGON number.

See “Polish REGON Number”


on page 1396.

Polish Social Security Number Data Identifiers The Polish Social Security
(PESEL) Number (PESEL) is the national
identification number used in
Poland. The PESEL number is
mandatory for all permanent
residents of Poland and for
temporary residents living in
Poland. It uniquely identifies a
person and cannot be transferred
to another.

See “Polish Social Security


Number (PESEL)” on page 1398.

Romanian Numerical Personal Data Identifiers In Romania, each citizen has a


Code unique numerical personal code
(Cod Numeric Personal, or
CNP). The number is used by
authorities, health care, schools,
universities, banks, and insurance
companies for customer
identification.

See “Romanian Numerical


Personal Code” on page 1425.

Spanish DNI ID Data Identifiers The Spanish DNI ID appears on


the Documento nacional de
identidad (DNI) and is issued by
the Spanish Hacienda Publica to
every citizen of Spain. It is the
most important unique identifier
in Spain used for opening
accounts, signing contracts, taxes,
and elections.

See “Spanish DNI ID” on page 1481.



Spanish Social Security Data Identifiers The Spanish Social Security


Number Number is a 12-digit number
assigned to Spanish workers to
allow access to the Spanish
healthcare system.

See “Spanish Social Security


Number” on page 1485.

Bulgarian Uniform Civil Number Data Identifiers The uniform civil number (EGN)
is a unique number assigned to
each Bulgarian citizen or resident
foreign national. It serves as a
national identification number. An
EGN is assigned to Bulgarians at
birth, or when a birth certificate is
issued.

See “Bulgarian Uniform Civil


Number - EGN” on page 1063.

Austrian Social Security Data Identifiers A social security number is


Number allocated to Austrian citizens who
receive available social security
benefits. It is allocated by the
umbrella association of the
Austrian social security
authorities.

See “Austrian Social Security


Number” on page 1036.

German Personal ID Number Data Identifiers The German Personal ID Number


is issued to all German citizens.

See “German Personal ID


Number” on page 1192.

Burgerservicenummer Data Identifiers In the Netherlands, the


Burgerservicenummer is used to
uniquely identify citizens and is
printed on driving licenses,
passports and international ID
cards under the header Personal
Number.

See “Burgerservicenummer”
on page 1066.

Codice Fiscale Data Identifiers The Codice Fiscale uniquely


identifies an Italian citizen or
permanent resident alien. Issuance
of the code is centralized at the
Ministry of the Treasury. The
Codice Fiscale is issued to every
Italian at birth.

See “Codice Fiscale” on page 1081.

Finnish Personal Identification Data Identifiers The Finnish Personal


Number Identification Number or Personal
Identity Code is a unique personal
identifier used for identifying
citizens in government and many
other transactions.

See “Finnish Personal


Identification Number”
on page 1175.

Swedish Personal Identification Data Identifiers The Swedish Personal


Number Identification Number is the
unique national identification for
every Swedish citizen. The
number is used by authorities,
health care, schools, universities,
banks, and insurance companies
for customer identification.

See “Sweden Personal


Identification Number”
on page 1501.

Belgium Driver's License Data Identifiers Identification number for an


Number individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of Belgium.

See “Belgium Driver's Licence


Number” on page 1042.

Denmark Personal Data Identifiers In Denmark, every citizen has a


Identification Number national identification number. The
number serves as proof of
identification for almost all
purposes.

See “Denmark Personal


Identification Number”
on page 1126.

Netherlands Driver's License Data Identifiers Identification number for an


Number individual's driver's licence issued
by the RDW government agency
of the Netherlands.

See “Netherlands Driver's License


Number” on page 1362.

France Driver's License Data Identifiers Identification number for an


Number individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of France.

See “France Driver's License


Number” on page 1177.

France Health Insurance Data Identifiers A Carte Vitale is a social insurance


Number card used in France that contains
medical information for the card
holder. It has a unique 21-digit
serial number.

See “France Health Insurance


Number” on page 1179.

Germany Driver's License Data Identifiers Identification number for an


Number individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of Germany.

See “Germany Driver's License


Number” on page 1194.

Italy Health Insurance Number Data Identifiers The Italian Health Insurance Card
is issued to every Italian citizen
by the Italian Ministry of Economy
and Finance in cooperation with
the Italian Agency of Revenue.
The objective of the card is to
improve the social security
services through expenditure
control and performance, and to
optimize the use of health services
by citizens.

See “Italy Health Insurance


Number” on page 1280.

Italy Driver's License Number Data Identifiers Identification number for an


individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of Italy.

See “Italy Driver's Licence


Number” on page 1278.

Spain Driver's License Number Data Identifiers Identification number for an


individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of Spain.

See “Spain Driver's Licence


Number” on page 1477.

Finland Driver's Licence Data Identifiers Identification number for an


Number individual's driver's license issued
in an EU or EEA Member State
for a Finnish license.

See “Finland Driver's Licence


Number” on page 1165.

Portugal National Identification Data Identifiers The national identification number


Number is a unique identification number
usually present on documents like
citizen cards which are issued by
the Portuguese government to its
citizens. It can be used as a travel
document within the EU and some
other European countries.

See “Portugal National


Identification Number”
on page 1404.

Portugal Driver's Licence Data Identifiers The Institute for Mobility and Land
Number Transport (IMTT) issues driver's
licenses in Portugal.

See “Portugal Driver's Licence


Number” on page 1402.

Greece Social Security Number Data Identifiers The AMKA (social security
(AMKA) number) is the work and
insurance identification number of
every worker, retired person and
protected family member in
Greece.

See “Greece Social Security


Number (AMKA)” on page 1202.

Romania National Identification Data Identifiers In Romania each citizen has a


Number personal numerical code (Cod
Numeric Personal, CNP) as a
unique national identification
number. This number is also used
as a tax identification number for
financial purposes.

See “Romania National


Identification Number”
on page 1419.

Slovakia National Identification Data Identifiers In Slovakia, identification cards


Number are issued by the state authorities
at 15 years of age for every
citizen. This number is used in
the Slovak Republic as the primary
unique identifier for every person
by government institutions, banks
and so on.

See “Slovakia National


Identification Number”
on page 1453.

Slovenia Unique Master Citizen Data Identifiers The unique master citizen number
Number is a unique identification number
assigned to every citizen of
Slovenia at birth or on acquiring
citizenship.

See “Slovenia Unique Master


Citizen Number” on page 1465.

Latvia Personal Identification Data Identifiers The Latvian personal identification


Number number is used for national
identity and as a tax identification
number for financial purposes. It
is issued by the office of
citizenship and migration affairs
of the Ministry of Interior.

See “Latvia Personal Identification


Number” on page 1306.

Finland European Health Data Identifiers The unique 20-digit numeric


Insurance Number identifier that is assigned to every
person who uses health services
in Finland.

See “Finland European Health


Insurance Number” on page 1167.

Sweden Driver's Licence Data Identifiers In Sweden, a driving license is


Number required when operating a car,
motorcycle or moped on public
roads. Driving licenses are issued
by the Swedish Transport Agency.

See “Sweden Driver's Licence


Number” on page 1492.

Croatia National Identification Data Identifiers The Croatian National


Number Identification number (Osobni
identifikacijski broj or OIB) is the
permanent personal and tax
identifier for Croatian citizens and
residents.

See “Croatia National


Identification Number”
on page 1104.

Estonia Personal Identification Data Identifiers In Estonia, the personal


Number identification code is a number
based on the sex and birth date
of a person. This code is used as
a unique personal identifier by
governmental and other systems
where identification is required,
as well as for digital signatures
using the national identity card
and its associated certificates. It
also serves as a tax identification
number.

See “Estonia Personal


Identification Code” on page 1151.

Lithuania Personal Data Identifiers In Lithuania, the personal


Identification Number identification code is a number
based on the sex and birth date
of a person. This code is used as
a unique personal identifier by
governmental and other systems
where identification is required,
as well as for digital signatures
using the national identity card
and its associated certificates.

See “Lithuania Personal


Identification Number”
on page 1312.

Name: Malta National Identification Number
Type: Data Identifiers
Description: Every resident of Malta is assigned a national number. National numbers for foreigners who are authorized to reside in Malta end with the letter A. National numbers for Maltese citizens end with M, G, L, H, or P.

See “Malta National Identification Number” on page 1337.

Iceland National Identification Data Identifiers The Iceland National Identification


Number Number is a unique national
identifier used by the Icelandic
government to identify individuals
and organizations. It is
administered by Registers
Iceland. Icelandic national
identification numbers are issued
to Icelandic citizens at birth and
to foreign nationals resident in
Iceland upon registration. They
are also issued to corporations
and institutions.

See “Iceland National


Identification Number”
on page 1241.

Serbia Unique Master Citizen Data Identifiers The Serbian Unique Master
Number Citizen Number is a unique
identifier for Serbian citizens. It is
assigned to every citizen of Serbia
at birth or upon acquiring
citizenship.

See “Serbia Unique Master


Citizen Number” on page 1445.

Norway National Identification Data Identifiers The Norway National identification


Number number is assigned by the
Norwegian state to all citizens of
the country. It is administered by
the Tax Administration.

See “Norway National


Identification Number”
on page 1377.

Romania Driver's Licence Data Identifiers A driving license in Romania is a


Number document confirming the rights of
the holder to drive motor vehicles.

See “Romania Driver's Licence


Number” on page 1416.

Czech Republic Driver's Data Identifiers The Czech Republic Ministry of


Licence Number Transport grants driver's licenses
in the Czech Republic, confirming
the rights of the holder to drive
motor vehicles.

See “Czech Republic Driver's


Licence Number” on page 1112.

Slovakia Driver's Licence Data Identifiers A Slovak driver's license is a


Number document confirming the rights of
the holder to drive motor vehicles.
Slovak driver's licenses are
granted by the Ministry of Interior.

See “Slovakia Driver's Licence


Number” on page 1451.

Poland Driver's Licence Data Identifiers Poland issues driving licenses


Number confirming the rights of the holder
to drive motor vehicles.

See “Poland Driver's Licence


Number” on page 1386.

Hungary Driver's Licence Data Identifiers A driving license in Hungary is a


Number document issued by the Ministry
of Economics and Transport,
confirming the rights of the holder
to drive motor vehicles.

See “Hungary Driver's Licence


Number” on page 1217.

Latvia Driver's Licence Number Data Identifiers A driver's license in Latvia is a


document issued by the Road
Traffic Safety Directorate,
confirming the rights of the holder
to drive motor vehicles.

See “Latvia Driver's Licence


Number” on page 1303.

Norway Driver's Licence Data Identifiers A driver's license is required in


Number Norway before a person is
permitted to drive a motor vehicle
of any description on a road in
Norway.

See “Norway Driver's Licence


Number” on page 1375.

Switzerland Health Insurance Data Identifiers Swiss insurance providers issue


Card Number health insurance cards to their
customers. Swiss health
insurance cards can also be used
to access European health
services.

See “Switzerland Health


Insurance Card Number”
on page 1509.

Estonia Driver's Licence Data Identifiers The Estonian Road Administration


Number issues driving licenses in Estonia,
confirming the rights of the holder
to drive motor vehicles.

See “Estonia Driver's Licence


Number” on page 1147.

European Health Insurance Data Identifiers The European Health Insurance


Number Card (EHIC) allows anyone
insured by or covered by a
statutory social security scheme
of the European Economic Area
countries and Switzerland to
receive medical treatment in
another member state free or at
a reduced cost.

See “European Health Insurance


Card Number” on page 1156.

General Data Protection Regulation (Personal Profile)


This template focuses on General Data Protection Regulation (GDPR) keywords related to
personal profiles, Data Identifiers, and an EDM profile with related columns.
The GDPR is a regulation by which the European Commission intends to strengthen and unify
data protection for individuals within the EU. It also addresses export of personal data outside
the EU. The primary objectives of the GDPR are to give citizens back the control of their
personal data and to simplify the regulatory environment for international business by unifying
the regulation within the EU. The GDPR replaces the EU Data Protection Directive as of 25
May 2018.

Table 46-33 General Data Protection Regulations (Personal Profile) detection rule

Name Type Description

Name: GDPR Personal Profile Keywords
Type: Keyword Match

Description: Matches a list of related keywords:

academic details, work history,
professional qualification,
summary of qualifications, bio
data, bio-data, CV, curriculum
vitae, Akademische Details,
Arbeitsgeschichte,
Berufsqualifikation,
Zusammenfassung der
Qualifikationen, Bio-Daten,
Lebenslauf, Bio Daten, Les
données académiques, la
qualification professionnelle, le
résumé des qualifications, Bio
données, le curriculum vitae,
dettagli accademici, storia del
lavoro, qualificazione
professionale, sintesi delle
qualifiche, i dati bio, bio-dati,
Datos académicos, historial de
trabajo, calificación profesional,
resumen de calificaciones,
datos bio, bio-datos,
academische informatie, werk
geschiedenis,
beroepskwalificatie,
samenvatting van kwalificaties,
bio gegevens, bio-gegevens,
leerplan vitae, akademiska
detaljer, Jobbhistorik,
professionell kvalifikation,
sammanfattning av
kvalifikationer,
meritförteckning, akademiske
detaljer, arbejdshistorie,
professionel kvalifikation,
Resumé af kvalifikationer,
Genoptag, akateemiset
yksityiskohdat, työhistoria,
ammattipätevyys, yhteenveto
tutkinnoist, sonraí acadúla,
stair oibre, cáilíocht ghairmiúil,
achoimre ar cháilíochtaí,
akademesch Detailer,
Aarbechtsgeschicht, berufflech
Qualifikatioun,
Zesummefaassung vu
Qualifikatiounen, Liewenslaf,
detalhes acadêmicos, histórico
de trabalho, qualificação
profissional, sumário de
qualificações, Currículo

General Data Protection Regulation (Travel)


This template focuses on General Data Protection Regulation (GDPR) travel-related keywords,
Data Identifiers, and an EDM profile with related columns.
The GDPR is a regulation by which the European Commission intends to strengthen and unify
data protection for individuals within the EU. It also addresses export of personal data outside
the EU. The primary objectives of the GDPR are to give citizens back the control of their
personal data and to simplify the regulatory environment for international business by unifying
the regulation within the EU. The GDPR replaces the EU Data Protection Directive as of 25
May 2018.

Table 46-34 General Data Protection Regulations (Travel) detection rules

Name Type Description

Name: GDPR Travel Related Keywords
Type: Keyword Match



Description: Matches a list of related keywords:

account number, bank card
number, driver license number,
ID card number, passenger
name, seat number, luggage
details, journey details,
purchase details, purchase
invoice, travel ticket, travel
invoice, passenger details,
tourist details, Kontonummer,
Bankkartennummer,
Führerscheinnummer,
Ausweisnummer,
Passagiername,
Sitzplatznummer,
Einkaufsdetails,
Kaufrechnungen,
Passagierdetails,
Touristendetails,
Gepäckdetails, Fahrtdetails,
ReiseFahrkarte,
ReiseRechnung, numéro
compte, numéro carte bancaire,
numéro de permis de conduire,
numéro de carte d'identité,
passager nom, numéro du
siège, bagage détails, détails
voyage, l'achat détails, la
facture d'achat, billet de
voyage, la facture voyage,
détails passager, détails
touristiques, numero di conto,
numero carta banca, numero
patente di guida, numero carta
d'identità, nome passeggero,
numero del posto, dettagli dei
bagagli, dettagli di viaggio,
dettagli acquisto, fattura
acquisto, biglietto viaggio,
fattura viaggio, dati passeggeri,
dettagli turistiche, Número
cuenta, número tarjeta
bancaria, número licencia de
conducir, número de tarjeta
identificación, nombre
pasajero, número asiento,
detalles equipaje, detalles de
viaje, detalles de compra, viaje
factura, viaje billete, factura de
viaje, pasajeros detalles,
detalles turísticos,
rekeningnummer, bankkaart
nummer, rijbewijs nummer,
ID-kaart nummer, naam
passagier, stoelnummer,
bagage-informatie, reis
informatie, aankoopgegevens,
aankoopfactuur,
reizenreisbiljet, reizen factuur,
passagiersgegevens,
toeristische informatie,
bankkortnummer, körkort
nummer, identitetskortnummer,
Passagerarens namn,
sitsnummer, reseinformation,
köp detaljer, inköpsfaktura,
resa biljett, resefaktura,
passagerare detaljer,
førerkortnummer,
ID-kortnummer, Passagernavn,
sæde nummer, bagage detaljer,
rejsedetaljer, købsoplysninger,
købsfaktura, rejse billet, rejse
faktura, passageroplysninger,
turist detaljer, tilinumero,
pankkikortin numero, ajokortin
numero, Henkilökortin numero,
matkustajan nimi, istumapaikan
numero, matkatavaran
yksityiskohdat, matk
yksityiskoh, ostotiedot,
matkustaalippu, matkustajan
yksityiskohdat, turisti
yksityiskohdat, uimhir chuntais,
uimhir chárta bainc, uimhir
ceadúnas tiomána, Uimhir
chárta aitheantais, ainm
phaisinéara, uimhir suíocháin,
sonraí turas, sonraí cheannach,
cheannach sonrasc, sonrasc
taistil, sonraí paisinéirí, sonraí
turasóireachta, Kontosnummer,
Identifikatiounskaart, Numm
Passagéier, Sitznummer,
Gepäckdetailer, Rees Detailer,
kaaft Detailer,
Passagéierdetailer, número de
conta, número cartão bancário,
número licença motorista,
Número do cartão de
identificação, Nome do
passageiro, Número do
assento, Detalhes bagagem,
detalhes viagem, detalhes da
compra, nota fiscal de compra,
bilhete de viagem, factura de
viagem, Detalhes do
passageiro, detalhes do turista

Name: UK Driver's Licence Number
Type: Data Identifiers
Description: The UK Driver's Licence Number is the identification number for an individual's driver's license issued by the Driver and Vehicle Licensing Agency of the United Kingdom.

See “UK Drivers Licence Number” on page 1525.

UK Passport Number Data Identifiers The UK Passport Number


identifies a United Kingdom
passport using the current official
specification of the UK
Government Standards of the UK
Cabinet Office.

See “UK Passport Number”


on page 1532.

Name: French Passport Number
Type: Data Identifiers
Description: The French passport is an identity document issued to French citizens. Besides enabling the bearer to travel internationally and serving as indication of French citizenship, the passport facilitates securing assistance from French consular officials abroad, or from other European Union member states where there is no French consular presence.

See “French Passport Number” on page 1187.

German Passport Number Data Identifiers The German passport number is


issued to German nationals for
the purpose of international travel.
A German passport is an officially
recognized document that
German authorities accept as
proof of identity from German
citizens.

See “German Passport Number”


on page 1190.

Spanish Passport Number Data Identifiers Spanish passports are issued to


Spanish citizens for the purpose
of travel outside Spain.

See “Spanish Passport Number”


on page 1483.

Name: Swedish Passport Number
Type: Data Identifiers
Description: Swedish passports are issued to nationals of Sweden for the purpose of international travel. Besides serving as proof of Swedish citizenship, they facilitate securing assistance from Swedish consular officials abroad, or from other European Union member states where there is no Swedish consular presence.

See “Swedish Passport Number” on page 1499.

Name: Austria Passport Number
Type: Data Identifiers
Description: Austrian passports are travel documents issued to Austrian citizens, both in Austria and abroad, that enable the passport holder to travel internationally.

See “Austria Passport Number” on page 1030.

Belgium Passport Number Data Identifiers Belgian passports are passports


issued by the Belgian state to its
citizens to facilitate international
travel. The Federal Public Service
Foreign Affairs, formerly known
as the Ministry of Foreign Affairs,
is responsible for issuing and
renewing Belgian passports.

See “Belgium Passport Number”


on page 1044.

Belgium Driver's License Data Identifiers Identification number for an


Number individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of Belgium.

See “Belgium Driver's Licence


Number” on page 1042.

Netherlands Driver's License Data Identifiers Identification number for an


Number individual's driver's licence issued
by the RDW government agency
of the Netherlands.

See “Netherlands Driver's License


Number” on page 1362.

Netherlands Passport Number Data Identifiers Dutch passports are issued to


Netherlands citizens for the
purpose of international travel.

See “Netherlands Passport


Number” on page 1363.

France Driver's License Data Identifiers Identification number for an


Number individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of France.

See “France Driver's License


Number” on page 1177.

Germany Driver's License Data Identifiers Identification number for an


Number individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of Germany.
See “Germany Driver's License
Number” on page 1194.

Italy Passport Number Data Identifiers Italian passports are issued to


Italian citizens for the purpose of
international travel.

See “Italy Passport Number”


on page 1282.

Italy Driver's License Number Data Identifiers Identification number for an


individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of Italy.

See “Italy Driver's Licence


Number” on page 1278.

Spain Driver's License Number Data Identifiers Identification number for an


individual's driver's licence issued
by the Driver and Vehicle
Licensing Agency of Spain.

See “Spain Driver's Licence


Number” on page 1477.

Ukraine Domestic Passport Data Identifiers An identity document issued to


Number citizens of Ukraine for domestic
use. It has been replaced by the
Ukraine Identity Card as of 2016,
but any existing passports are still
valid.

See “Ukraine Passport


(Domestic)” on page 1541.

Ukraine International Passport Data Identifiers A document used by citizens of


Number Ukraine to travel outside of
Ukraine.

See “Ukraine Passport


(International)” on page 1543.

Ireland Passport Number Data Identifiers An Irish passport is the passport


issued to citizens of Ireland. An
Irish passport enables the bearer
to travel internationally and serves
as evidence of Irish citizenship
and citizenship of the European
Union. It also facilitates access
to consular assistance from both
Irish embassies and any embassy
of other European Union
member states while abroad.

See “Ireland Passport Number”


on page 1266.

Luxembourg Passport Number Data Identifiers A Luxembourg passport is an


international travel document
issued to nationals of the Grand
Duchy of Luxembourg, and may
also serve as proof of
Luxembourgish citizenship.

See “Luxembourg Passport


Number” on page 1322.

Portugal Passport Number Data Identifiers Portuguese passports are issued


to citizens of Portugal for the
purpose of international travel.
The passport, along with the
national identity card, allows for
free rights of movement and
residence in any of the states of
the European Union and
European Economic Area.

See “Portugal Passport Number”


on page 1407.

Finland Passport Number Data Identifiers Finnish passports are issued to


nationals of Finland for the
purpose of international travel.
They also facilitate the process of
securing assistance from Finnish
consular officials abroad.

See “Finland Passport Number”


on page 1169.

Finland Driver's Licence Data Identifiers Identification number for an


Number individual's driver's license issued
in an EU or EEA Member State
for a Finnish license.

See “Finland Driver's Licence


Number” on page 1165.

Portugal Driver's Licence Data Identifiers The Institute for Mobility and Land
Number Transport (IMTT) issues driver's
licenses in Portugal.

See “Portugal Driver's Licence


Number” on page 1402.

Name: Sweden Driver's Licence Number
Type: Data Identifiers
Description: In Sweden, a driving license is required when operating a car, motorcycle, or moped on public roads. Driving licenses are issued by the Swedish Transport Agency (Transportstyrelsen).

See “Sweden Driver's Licence Number” on page 1492.

Greece Passport Number Data Identifiers Greek passports are issued to


Greek citizens for the purpose of
international travel. The passport,
along with the national identity
card, allows for free rights of
movement and residence in any
of the states of the European
Union and European Economic
Area.

See “Greece Passport Number”


on page 1200.

Poland Passport Number Data Identifiers A Polish passport is an


international travel document
issued to nationals of Poland. It
may also serve as proof of Polish
citizenship.

See “Poland Passport Number”


on page 1389.

Hungary Passport Number Data Identifiers Hungarian passports are issued


to Hungarian citizens for
international travel by the Central
Data Processing, Registration,
and Election Office of the
Hungarian Ministry of the Interior.

See “Hungary Passport Number”


on page 1219.

Slovakia Passport Number Data Identifiers Slovak passports are issued to


citizens of Slovakia to facilitate
international travel.

See “Slovakia Passport Number”


on page 1457.

Slovenia Passport Number Data Identifiers Slovenian passports are issued


to citizens of Slovenia to facilitate
international travel.

See “Slovenia Passport Number”


on page 1461.

Estonia Passport Number Data Identifiers The Estonian passport is an


international travel document
issued to citizens of Estonia that
also serves as proof of Estonian
citizenship. The Border Guard
Board in Estonia and Estonian
foreign representations abroad
are responsible for issuing
Estonian passports.

See “Estonia Passport Number”


on page 1149.

Latvia Passport Number Data Identifiers Latvian passports are issued to


citizens of Latvia for identity and
international travel purposes. The
territorial section of The Office of
Citizenship and Migration Affairs
issues passports.

See “Latvia Passport Number”


on page 1305.

Switzerland Passport Number Data Identifiers Swiss passports are issued to


citizens of Switzerland to facilitate
international travel.

See “Switzerland Passport


Number” on page 1511.

Iceland Passport Number Data Identifiers Icelandic passports are issued to


citizens of Iceland for the purpose
of international travel and may
also serve as proof of Icelandic
citizenship.

See “Iceland Passport Number”


on page 1245.

Liechtenstein Passport Number Data Identifiers Liechtenstein passports are


issued to nationals of
Liechtenstein for the purpose of
international travel. The passport
may also serve as proof of
Liechtensteiner citizenship.

See “Liechtenstein Passport


Number” on page 1311.

Romania Driver's Licence Data Identifiers A driving license in Romania is a


Number document confirming the rights of
the holder to drive motor vehicles.

See “Romania Driver's Licence


Number” on page 1416.

Czech Republic Driver's Data Identifiers The Czech Republic Ministry of


Licence Number Transport grants driver's licenses
in the Czech Republic, confirming
the rights of the holder to drive
motor vehicles.

See “Czech Republic Driver's


Licence Number” on page 1112.

Slovakia Driver's Licence Data Identifiers A Slovak driver's license is a


Number document confirming the rights of
the holder to drive motor vehicles.
Slovak driver's licenses are
granted by the Ministry of Interior.

See “Slovakia Driver's Licence


Number” on page 1451.

Poland Driver's Licence Data Identifiers Poland issues driving licenses


Number confirming the rights of the holder
to drive motor vehicles.

See “Poland Driver's Licence


Number” on page 1386.

Hungary Driver's Licence Data Identifiers A driving license in Hungary is a


Number document issued by the Ministry
of Economics and Transport,
confirming the rights of the holder
to drive motor vehicles.

See “Hungary Driver's Licence


Number” on page 1217.

Latvia Driver's Licence Number Data Identifiers A driver's license in Latvia is a


document issued by the Road
Traffic Safety Directorate,
confirming the rights of the holder
to drive motor vehicles.

See “Latvia Driver's Licence


Number” on page 1303.

Norway Driver's Licence Data Identifiers A driver's license is required in


Number Norway before a person is
permitted to drive a motor vehicle
of any description on a road in
Norway.

See “Norway Driver's Licence


Number” on page 1375.

Estonia Driver's Licence Data Identifiers The Estonian Road Administration


Number issues driving licenses in Estonia,
confirming the rights of the holder
to drive motor vehicles.

See “Estonia Driver's Licence


Number” on page 1147.

Gramm-Leach-Bliley policy template


The Gramm-Leach-Bliley (GLB) Act gives consumers the right to limit some sharing of their
information by financial institutions.
The Gramm-Leach-Bliley policy template detects transmittal of customer data.

Table 46-35 Gramm-Leach-Bliley policy template conditions

Detection method Type Description

Detection method: Username/Password Combinations
Type: Simple rule: EDM
Description: This rule looks for user names and passwords in combination.
See “Choosing an Exact Data Profile” on page 409.

Detection method: Exact SSN or CCN
Type: Simple rule: EDM
Description: This rule looks for SSN or Credit Card Number.

Detection method: Customer Directory
Type: Simple rule: EDM
Description: This rule looks for Phone or Email.

Detection method: 3 or more critical customer fields
Type: Simple rule: EDM
Description: This rule looks for a match among any three of the following fields:
■ Account number
■ Bank card number
■ Email address
■ First name
■ Last name
■ PIN number
■ Phone number
■ Social security number
■ ABA Routing Number
■ Canadian Social Insurance Number
■ UK National Insurance Number
■ Date of Birth
However, the following combinations are not a match:
■ Phone, email, and first name
■ Phone, email, and last name
■ Email, first name, and last name
■ Phone, first name, and last name

Detection method: ABA Routing Numbers
Type: Simple rule: DCM (DI)
Description: This condition detects nine-digit numbers. It validates the number using the final check digit. This condition eliminates common test numbers, such as 123456789, number ranges that are reserved for future use, and numbers in which all digits are the same. This condition also requires the presence of an ABA-related keyword.
See “ABA Routing Number” on page 1013.

Detection method: US Social Security Numbers
Type: Simple rule: DCM (DI)
Description: This rule looks for social security numbers. For this rule to match, there must be a number that fits the Randomized US SSN data identifier. There must also be a keyword or phrase from the "US SSN Keywords" dictionary that indicates the presence of a US SSN. The keyword condition is included to reduce false positives with any numbers that may match the SSN format.

See “Randomized US Social Security Number (SSN)” on page 1414.

Detection method: Credit Card Numbers
Type: Simple rule: DCM (DI)
Description: This condition detects valid credit card numbers that are separated by spaces, dashes, or periods, or that have no separators. This condition performs Luhn check validation and includes the following credit card formats:

■ American Express
■ Diners Club
■ Discover
■ Japan Credit Bureau (JCB)
■ MasterCard
■ Visa

This rule eliminates common test numbers, including those reserved for testing by credit card issuers, and also requires the presence of a credit card-related keyword.

See “Credit Card Number narrow breadth” on page 1100.
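
The check-digit validation that the ABA Routing Numbers and Credit Card Numbers conditions describe can be illustrated with a short Python sketch. The function names are hypothetical and are not part of Symantec Data Loss Prevention. The sketch covers only the checksum step and the all-same-digit exclusion; it omits the test-number, reserved-range, and keyword checks that the conditions also apply. The ABA weights (3, 7, 1) are the standard routing-number checksum and are assumed here to be what the condition uses.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn check.

    Separators such as spaces, dashes, and periods are ignored.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk the digits right to left and double every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def aba_checksum_valid(number: str) -> bool:
    """Return True if `number` is nine digits with a valid ABA check digit.

    Numbers made of one repeated digit are rejected, mirroring the
    "all the same digit" exclusion in the condition description.
    """
    if len(number) != 9 or not number.isdigit():
        return False
    digits = [int(ch) for ch in number]
    if len(set(digits)) == 1:
        return False
    weights = [3, 7, 1, 3, 7, 1, 3, 7, 1]
    return sum(w * d for w, d in zip(weights, digits)) % 10 == 0
```

A checksum alone accepts many random numbers, which is presumably why both conditions also require a related keyword near the match.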

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.
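
The "3 or more critical customer fields" rule and its excluded combinations can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's EDM implementation: the field names and the function `matches_three_fields` are hypothetical, and the sketch assumes that a record matches when at least one three-field combination of its matched fields is not on the excluded list.

```python
from itertools import combinations
from typing import Iterable

# Combinations that the template description says are not a match.
# The field names are hypothetical labels chosen for this sketch.
EXCLUDED_COMBINATIONS = [
    {"phone", "email", "first_name"},
    {"phone", "email", "last_name"},
    {"email", "first_name", "last_name"},
    {"phone", "first_name", "last_name"},
]


def matches_three_fields(matched_fields: Iterable[str]) -> bool:
    """Return True if some three-field combination is not excluded.

    Assumption: the rule fires when any non-excluded combination of
    three matched fields exists in the same record.
    """
    unique_fields = sorted(set(matched_fields))
    for combo in combinations(unique_fields, 3):
        if set(combo) not in EXCLUDED_COMBINATIONS:
            return True
    return False
```

Under this reading, a message that contains only a phone number, an email address, a first name, and a last name does not match, because every three-field subset of those four fields is excluded.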

HIPAA and HITECH (including PHI) policy template


The HIPAA and HITECH (including PHI) policy strictly enforces the US Health Insurance
Portability and Accountability Act (HIPAA). The Health Information Technology for Economic and
Clinical Health Act (HITECH) is the first national law that mandates breach notification for
protected health information (PHI).
This policy template detects data concerning prescription drugs, diseases, and treatments in
combination with PHI. Organizations that are not subject to HIPAA can also use this policy to
control PHI data.
The HIPAA and HITECH (including PHI) policy template is updated with recent Drug, Disease,
and Treatment keyword lists based on information from the U.S. Food and Drug
Administration (FDA) and other sources. The policy template is also updated to use the
Randomized US Social Security Number (SSN) data identifier, which detects both traditional
and randomized SSNs.
See “Keep the keyword lists for your HIPAA and Caldicott policies up to date” on page 850.
See “Updating policies to use the Randomized US SSN data identifier” on page 810.
Table 46-36 describes the TPO exception that is provided by the template. TPOs (Treatment,
Payment, or health care Operations) are service providers to health care organizations and
have an exception for HIPAA information restrictions. The template requires that you enter the
allowed email addresses. If implemented, the exception is evaluated before detection rules,
and the policy does not trigger an incident if the protected information is sent to one of the
allowed partners.

Table 46-36 TPO exception

Name Type Configuration

Name: TPO Exception
Type: Content Matches Keyword (DCM)
Configuration: Simple exception (single condition match). Looks for a recipient email address matching one from the "TPO Email Addresses" user-defined keyword dictionary.
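
The evaluation order described above (exception first, detection rules second) can be sketched in Python. This is a simplified illustration, not the product's policy engine: the function `should_create_incident` and the example addresses are hypothetical, and the sketch assumes that a single recipient found in the allowed list is enough to suppress the incident.

```python
from typing import Iterable


def should_create_incident(
    recipients: Iterable[str],
    detection_rule_matched: bool,
    tpo_email_addresses: Iterable[str],
) -> bool:
    """Decide whether a message produces an incident.

    The exception is checked before the detection result is consulted,
    mirroring the order stated in the template description.
    """
    allowed = {address.strip().lower() for address in tpo_email_addresses}
    # Exception: a recipient is one of the allowed TPO partners.
    if any(recipient.strip().lower() in allowed for recipient in recipients):
        return False
    # No exception applied, so the detection rules decide.
    return detection_rule_matched


# Hypothetical allowed-partner list for illustration only.
TPO_ADDRESSES = ["claims@partner.example", "billing@lab.example"]
```

Comparing addresses in lower case is a choice made for this sketch; how the product normalizes entries in the keyword dictionary is not stated here.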

Table 46-37 is a rule that looks for an exact data match against any single column from a
profiled Patient Data database record.

Table 46-37 Patient Data detection rule

Name Type Configuration

Name: Patient Data
Type: Content Matches Exact Data (EDM)
Configuration: Match data from any single field:
■ Last name
■ Tax payer ID (SSN)
■ Email address
■ Account number
■ ID card number
■ Phone number

See “Choosing an Exact Data Profile” on page 409.

Table 46-38 is a compound detection rule that requires a Patient Data exact match and a
match from the National Drug Code (NDC) data identifier.

Table 46-38 Patient Data and Drug Codes detection rule

Name Condition types Configuration

Name: Patient Data and Drug Codes
Condition types: Content Matches Exact Data (EDM) AND Content Matches Data Identifier
Configuration: Looks for a match against any single column from a profiled Patient Data database record and a match from the National Drug Code data identifier.
See Table 46-37 on page 1691.
See “National Drug Code (NDC)” on page 1355.

Table 46-39 is a compound detection rule that requires a Patient Data exact match and a
keyword match from the "Prescription Drug Names" dictionary.

Table 46-39 Patient Data and Prescription Drug Names detection rule

Name: Patient Data and Prescription Drug Names
Condition type: Content Matches Exact Data (EDM) And Content Matches Keyword (DCM)
Configuration: Looks for a match against any single column from a profiled Patient Data
database record and a keyword match from the Prescription Drug Names dictionary.

See Table 46-37 on page 1691.
See “Updating policies after upgrading to the latest version” on page 447.

Table 46-40 is a compound detection rule that requires a Patient Data exact match and keyword
match from the "Medical Treatment Keywords" dictionary.

Table 46-40 Patient Data and Treatment Keywords detection rule

Name: Patient Data and Treatment Keywords
Condition type: Content Matches Exact Data (EDM) And Content Matches Keyword (DCM)
Configuration: Looks for a match against any single column from a profiled Patient Data
database record and a keyword match from the Medical Treatment Keywords dictionary.

See Table 46-37 on page 1691.
See “Updating policies after upgrading to the latest version” on page 447.

Table 46-41 is a compound detection rule that requires a Patient Data exact match and a
keyword match from the "Disease Names" dictionary.

Table 46-41 Patient Data and Disease Keywords detection rule

Name: Patient Data and Disease Keywords
Condition type: Content Matches Exact Data (EDM) And Content Matches Keyword (DCM)
Configuration: Looks for a match against any single column from a profiled Patient Data
database record and a keyword match from the Disease Names dictionary.

See Table 46-37 on page 1691.
See “Updating policies after upgrading to the latest version” on page 447.

Table 46-42 is a compound detection rule that looks for SSNs using the Randomized US Social
Security Number (SSN) data identifier and for a keyword from the "Prescription Drug Names"
dictionary.

Table 46-42 SSN and Drug Keywords detection rule

Name: SSN and Drug Keywords

Condition type: Content Matches Data Identifier
Configuration: Randomized US Social Security Number (SSN) data identifier (narrow breadth)
See “Randomized US Social Security Number (SSN)” on page 1414.

And

Condition type: Content Matches Keyword
Configuration: Prescription Drug Names keyword dictionary
See “Updating policies after upgrading to the latest version” on page 447.

Table 46-43 is a compound detection rule that looks for SSNs using the Randomized US Social
Security Number (SSN) data identifier and for a keyword match from the "Medical Treatment
Keywords" dictionary.

Table 46-43 SSN and Treatment Keywords detection rule

Name: SSN and Treatment Keywords

Condition type: Content Matches Data Identifier
Configuration: Randomized US Social Security Number (SSN) data identifier (narrow breadth)
See “Randomized US Social Security Number (SSN)” on page 1414.

And

Condition type: Content Matches Keyword
Configuration: Medical Treatment Keywords keyword dictionary
See “Updating policies after upgrading to the latest version” on page 447.

Table 46-44 is a compound detection rule that looks for SSNs using the Randomized US Social
Security Number (SSN) data identifier and for a keyword match from the "Disease Names"
dictionary.

Table 46-44 SSN and Disease Keywords detection rule

Name: SSN and Disease Keywords

Condition type: Content Matches Data Identifier
Configuration: Randomized US Social Security Number (SSN) data identifier (narrow breadth)
See “Randomized US Social Security Number (SSN)” on page 1414.

And

Condition type: Content Matches Keyword
Configuration: Disease Names keyword dictionary
See “Updating policies after upgrading to the latest version” on page 447.

Table 46-45 is a compound detection rule that looks for SSNs using the Randomized US Social
Security Number (SSN) data identifier and for a drug code using the Drug Code data identifier.

Table 46-45 SSN and Drug Code detection rule

Name: SSN and Drug Code

Condition type: Content Matches Data Identifier
Configuration: Randomized US Social Security Number (SSN) data identifier (narrow breadth)
See “Randomized US Social Security Number (SSN)” on page 1414.

And

Condition type: Content Matches Data Identifier
Configuration: Drug Code data identifier (narrow breadth)
See “National Drug Code (NDC)” on page 1355.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Human Rights Act 1998 policy template


The Human Rights Act 1998 allows UK citizens to assert their rights under the European
Convention on Human Rights in UK courts and tribunals. The Act states that "so far as possible
to do so, legislation must be read and given effect in a way which is compatible with convention
rights." The Human Rights Act 1998 policy enforces Article 8 by ensuring that the private lives
of British citizens stay private.

EDM Rule UK Data Protection Act, Personal Data

This compound rule looks for two data types, last name and electoral roll number,
in combination with a keyword from the "UK Personal Data Keywords" dictionary.

DCM Rule UK Electoral Roll Numbers


This rule looks for a single compound condition with four parts:

■ A single keyword from the "UK Keywords" dictionary
■ A pattern matching that of the UK Electoral Roll Number data identifier
■ A single keyword from the "UK Electoral Roll Number Words" dictionary
■ A single keyword from the "UK Personal Data Keywords" dictionary

See “Choosing an Exact Data Profile” on page 409.


See “Configuring policies” on page 413.
See “Exporting policy detection as a template” on page 442.

Illegal Drugs policy template


This policy detects conversations about illegal drugs and controlled substances.

DCM Rule Street Drugs

This rule looks for five instances of keywords from the "Street Drug Names"
dictionary.

DCM Rule Mass Produced Controlled Substances

This rule looks for five instances of keywords from the "Manufactured Controlled
Substances" dictionary.
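
A threshold rule of this kind ("five instances of keywords") can be sketched as follows. This is an illustration only: the helper names and the placeholder keywords are hypothetical, and the sketch assumes that "instances" means total whole-word, case-insensitive occurrences rather than distinct keywords.

```python
import re


def count_keyword_instances(text, keywords):
    """Count whole-word, case-insensitive occurrences of any keyword."""
    if not keywords:
        return 0
    # Longer keywords first so that multi-word phrases win over their prefixes.
    ordered = sorted(keywords, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in ordered)
    pattern = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return len(pattern.findall(text))


def rule_matches(text, keywords, minimum=5):
    """Assumed semantics: the rule fires once `minimum` instances are found."""
    return count_keyword_instances(text, keywords) >= minimum


# Placeholder dictionary; the real dictionaries ship with the template.
PLACEHOLDER_KEYWORDS = {"alpha", "bravo"}
```

For example, `rule_matches("alpha bravo ALPHA alpha bravo", PLACEHOLDER_KEYWORDS)` counts five instances and matches, while a message with three instances does not.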

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Individual Taxpayer Identification Numbers (ITIN) policy template

An Individual Taxpayer Identification Number (ITIN) is a tax-processing number issued by the
US Internal Revenue Service (IRS). The IRS issues ITINs to track individuals who are not eligible
to obtain Social Security Numbers (SSNs).

Table 46-46 ITIN policy template conditions

DCM Keyword Rule Description

ITIN This rule looks for a match to the US ITIN data identifier and a keyword from the
"US ITIN Keywords" dictionary.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

International Traffic in Arms Regulations (ITAR) policy template

The International Traffic in Arms Regulations (ITAR) are enforced by the US Department of
State. Exporters of defense services or related technical data are required to register with the
federal government and may need export licenses. This policy detects potential violations
based on countries and controlled assets designated by the ITAR.
The Indexed ITAR Munition Items and Recipients detection rule looks for a country code in
the recipient from the "ITAR Country Codes" dictionary and for a specific "SKU" from an indexed
EDM file.

Table 46-47 Indexed ITAR Munition Items and Recipients detection rule

Method: Compound rule (both conditions must match)

Condition: Recipient Matches Pattern (DCM)
Configuration: Match recipient email or URL domain from ITAR Country Codes list:
■ Severity: High.
■ Check for existence.
■ At least 1 recipient(s) must match.

Condition: Content Matches Exact Data (EDM)
Configuration: See “Choosing an Exact Data Profile” on page 409.

The ITAR Munitions List and Recipients detection rule looks for both a country code in the
recipient from the "ITAR Country Codes" dictionary and a keyword from the "ITAR Munition
Names" dictionary.

Table 46-48 ITAR Munitions List and Recipients detection rule

Method: Compound rule (both conditions must match)

Condition: Recipient Matches Pattern (DCM)
Configuration: Match recipient email or URL domain from ITAR Country Codes list:
■ Severity: High.
■ Check for existence.
■ At least 1 recipient pattern must match.

Condition: Content Matches Keyword (DCM)
Configuration: Match any keyword from the ITAR Munitions List:
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Media Files policy template


The Media Files policy detects various types of video and audio files (including mp3).

DCM Rule Media Files

This rule looks for the following media file types:

■ qt
■ riff
■ macromedia_dir
■ midi
■ mp3
■ mpeg_movie
■ quickdraw
■ realaudio
■ wav
■ video_win
■ vrml

DCM Rule Media Files Extensions

This rule looks for file name extensions from the "Media Files Extensions" dictionary.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Medicare and Medicaid (including PHI)


This policy detects protected health information (PHI) associated with the United States
Medicare and Medicaid programs, including Medicare Beneficiary Numbers, Health Insurance
Claim Numbers and Current Procedural Terminology codes used by the Healthcare Common
Procedure Coding System.

Table 46-49 Medicare and Medicaid (including PHI) detection rules

Name: Healthcare Common Procedure Coding System (HCPCS CPT Codes)
Condition type: Data Identifiers and Keywords
Description: These three rules match the medium breadth of the Healthcare Common Procedure
Coding System (HCPCS CPT Codes) data identifier. They match all unique occurrences in the
message envelope, subject line, body, or attachments. Matches are given High severity. They
also require the presence of related keywords.
See “Healthcare Common Procedure Coding System (HCPCS CPT Code)” on page 1208.

Name: Medicare Beneficiary Identifier
Condition type: Data Identifiers
Description: This rule matches the narrow breadth of the Medicare Beneficiary Identifier data
identifier. It matches all unique occurrences in the message envelope, subject line, body, or
attachments. Matches are given High severity.
See “Medicare Beneficiary Identifier” on page 1344.

Name: Health Insurance Claim Number
Condition type: Data Identifiers
Description: This rule matches the narrow breadth of the Health Insurance Claim Number data
identifier. It matches all unique occurrences in the message envelope, subject line, body, or
attachments. Matches are given High severity.
See “Health Insurance Claim Number” on page 1212.

Merger and Acquisition Agreements policy template


The Merger and Acquisition Agreements policy template detects contracts and official
documentation concerning merger and acquisition activity.
You can modify this template with company-specific code words to detect specific deals.
The Merger and Acquisition Agreements template provides a single compound detection rule.
All conditions in the rule must match for the rule to trigger an incident.

Table 46-50 Merger and Acquisition Agreements compound detection rule

Condition: Contract Specific Keywords (Keyword Match)
Configuration:
■ Match any keyword: merger, agreement, contract, letter of intent, term sheet, plan of
reorganization
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

Condition: Acquisition Corporate Structure Keywords (Keyword Match)
Configuration:
■ Match any keyword: subsidiary, subsidiaries, affiliate, acquiror, merger sub, covenantor,
acquired company, acquiring company, surviving corporation, surviving company
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

Condition: Merger Consideration Keywords (Keyword Match)
Configuration:
■ Match any keyword: merger stock, merger consideration, exchange shares, capital stock,
dissenting shares, capital structure, escrow fund, escrow account, escrow agent, escrow
shares, escrow cash, escrow amount, stock consideration, break-up fee, goodwill
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

Condition: Legal Contract Keywords (Keyword Match)
Configuration:
■ Match any keyword: recitals, in witness whereof, governing law, Indemnify, Indemnified,
indemnity, signature page, best efforts, gross negligence, willful misconduct, authorized
representative, severability, material breach
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.
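
The compound logic of this rule (every condition must match, each condition matching on any one of its keywords, case insensitive, whole words only) can be sketched as follows. The keyword lists are taken from the table; the function names and the use of regular expressions are assumptions made for illustration, not a description of how the detection engine works.

```python
import re

# Keyword lists as given in the Merger and Acquisition Agreements table.
CONDITIONS = {
    "Contract Specific Keywords": [
        "merger", "agreement", "contract", "letter of intent", "term sheet",
        "plan of reorganization",
    ],
    "Acquisition Corporate Structure Keywords": [
        "subsidiary", "subsidiaries", "affiliate", "acquiror", "merger sub",
        "covenantor", "acquired company", "acquiring company",
        "surviving corporation", "surviving company",
    ],
    "Merger Consideration Keywords": [
        "merger stock", "merger consideration", "exchange shares",
        "capital stock", "dissenting shares", "capital structure",
        "escrow fund", "escrow account", "escrow agent", "escrow shares",
        "escrow cash", "escrow amount", "stock consideration",
        "break-up fee", "goodwill",
    ],
    "Legal Contract Keywords": [
        "recitals", "in witness whereof", "governing law", "indemnify",
        "indemnified", "indemnity", "signature page", "best efforts",
        "gross negligence", "willful misconduct",
        "authorized representative", "severability", "material breach",
    ],
}


def condition_matches(text, keywords):
    """True if any keyword occurs as a whole word, ignoring case."""
    return any(
        re.search(r"\b" + re.escape(k) + r"\b", text, re.IGNORECASE)
        for k in keywords
    )


def compound_rule_matches(text):
    """True only if every condition in the compound rule matches."""
    return all(condition_matches(text, kws) for kws in CONDITIONS.values())
```

A message that mentions only "merger agreement" satisfies the first condition but not the other three, so the rule does not trigger.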

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

NASD Rule 2711 and NYSE Rules 351 and 472 policy template

This policy protects the name(s) of any companies involved in an upcoming stock offering,
internal project names for the offering, and the stock ticker symbols for the offering companies.
The NASD Rule 2711 Documents, Indexed detection rule looks for content from specific
documents registered as sensitive and known to be subject to NASD Rule 2711 or NYSE
Rules 351 and 472. This rule returns a match if 80% or more of the source document is found.

Table 46-51 NASD Rule 2711 Documents, Indexed detection rule

Method: Simple rule
Condition: Content Matches Document Signature (IDM)
Configuration: NASD Rule 2711 Documents, Indexed (IDM):
■ Detect documents in selected Indexed Document Profile
■ Require at least 80% content match.
■ Severity: High.
■ Check for existence.
■ Look in body, attachments.

See “Choosing an Indexed Document Profile” on page 411.

The NASD Rule 2711 and NYSE Rules 351 and 472 detection rule is a compound rule that
contains a sender condition and a keyword condition. The sender condition is based on a
user-defined list of email addresses of research analysts at the user's company ("Analysts'
Email Addresses" dictionary). The keyword condition looks for any upcoming stock offering,
internal project names for the offering, and the stock ticker symbols for the offering companies
("NASD 2711 Keywords" dictionary). Like the sender condition, it requires editing by the user.

Table 46-52 NASD Rule 2711 and NYSE Rules 351 and 472 detection rule

Method: Compound rule

Condition: Sender/User Matches Pattern (DCM)
Configuration: NASD Rule 2711 and NYSE Rules 351 and 472 (Sender):
■ Match sender pattern(s) [[email protected]] (user defined)
■ Severity: High.
■ Matches on entire message.

Condition: Content Matches Keyword (DCM)
Configuration: NASD Rule 2711 and NYSE Rules 351 and 472 (Keyword Match):
■ Match "[company stock symbol]", "[name of offering company]", "[offering name (internal
name)]".
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

NASD Rule 3010 and NYSE Rule 342 policy template


NASD Rule 3010 and NYSE Rule 342 require broker-dealers to supervise certain brokerage
employees' communications. The NASD Rule 3010 and NYSE Rule 342 policy monitors the
communications of registered principals who are subject to these regulations.
The Stock Recommendation detection rule looks for a keyword from the "NASD 3010 Stock
Keywords" dictionary and the "NASD 3010 Buy/Sell Keywords" dictionary. In addition, this rule
requires evidence of a stock recommendation in combination with a buy or sell action.

Table 46-53 Stock Recommendation detection rule

Method: Compound rule (all conditions must match)

Condition: Content Matches Keyword (DCM)
Configuration: Match keyword: "recommend"
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

Condition: Content Matches Keyword (DCM)
Configuration: Match keyword: "buy" or "sell"
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

Condition: Content Matches Keyword (DCM)
Configuration: Match keyword: "stock, stocks, security, securities, share, shares"
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

The NASD Rule 3010 and NYSE Rule 342 Keywords detection rule looks for a keyword from the
"NASD 3010 General Keywords" dictionary, which covers general stock broker activity, in
combination with a stock keyword.

Table 46-54 NASD Rule 3010 and NYSE Rule 342 Keywords detection rule

Method: Compound rule (both conditions must match)

Condition: Content Matches Keyword (DCM)
Configuration: Match keyword: "authorize", "discretion", "guarantee", "options"
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

Condition: Content Matches Keyword (DCM)
Configuration: Match keyword: "stock, stocks, security, securities, share, shares"
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

NERC Security Guidelines for Electric Utilities policy template

The North American Electric Reliability Council (NERC) Guideline for Protecting Potentially
Sensitive Information describes how to protect and secure data about critical electricity
infrastructure.
This policy detects the information outlined in the NERC security guidelines for the electricity
sector.

Table 46-55 Key Response Personnel detection rule

Detection method: Simple rule
Match condition: Content Matches Exact Data (EDM)
Configuration: Match any three of the following data items:
■ First name
■ Last name
■ Phone
■ Email

See “Choosing an Exact Data Profile” on page 409.



Table 46-56 Network Infrastructure Maps detection rule

Detection method: Simple rule
Match condition: Content Matches Indexed Documents (IDM)
Configuration: This rule requires a 90% binary match.

See “Choosing an Indexed Document Profile” on page 411.

The Sensitive Keywords and Vulnerability Keywords detection rule looks for any keyword
matches from the "Sensitive Keywords" dictionary and the "Vulnerability Keywords" dictionary.

Table 46-57 Sensitive Keywords and Vulnerability Keywords detection rule

Detection method: Compound rule

Match condition: Content Matches Keyword (DCM)
Configuration: Match any Sensitive Keyword:
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

Match condition: Content Matches Keyword (DCM)
Configuration: Match any Vulnerability Keyword:
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Network Diagrams policy template


The Network Diagrams policy detects computer network diagrams at risk of exposure.

IDM Rule Network Diagrams, Indexed

This rule looks for content from specific network diagrams that are registered as
confidential. This rule returns a match if 80% or more of the source document is
detected.

DCM Rule Network Diagrams with IP Addresses

This rule looks for a Visio file type in combination with an IP address data identifier.

DCM Rule Network Diagrams with IP Address Keyword

This rule looks for a Visio file type in combination with phrase variations of "IP
address" with a data identifier.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Network Security policy template


The Network Security policy detects evidence of hacking tools and attack planning.

DCM Rule GoToMyPC Activity

This rule looks for a GoToMyPC command format with a data identifier.

DCM Rule Hacker Keywords

This rule looks for a keyword from the "Hacker Keywords" dictionary.

DCM Rule KeyLoggers Keywords

This rule looks for a keyword from the "Keylogger Keywords" dictionary.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Offensive Language policy template


The Offensive Language policy detects the use of offensive language.

DCM Rule Offensive Language, Explicit

This rule looks for any single keyword in the "Offensive Language, Explicit" dictionary.

DCM Rule Offensive Language, General

This rule looks for any three instances of keywords in the "Offensive Language,
General" dictionary.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Office of Foreign Assets Control (OFAC) policy template

The Office of Foreign Assets Control of the U.S. Department of the Treasury administers and
enforces economic and trade sanctions. These sanctions are based on US foreign policy and
national security goals against certain countries, individuals, and organizations. The Office of
Foreign Assets Control (OFAC) policy detects communications involving these targeted groups.
The OFAC policy has two primary parts. The first deals with the Specially Designated Nationals
(SDN) list, and the second deals with general OFAC policy restrictions.
The SDN list refers to specific people or organizations that are subject to trade restrictions.
The U.S. Treasury Department provides text files with specific names, last known addresses,
and known aliases for these individuals and entities. The Treasury Department stipulates that
the addresses may not be correct or current, and different locations do not change the
restrictions on people and organizations.
In the OFAC policy template, Symantec Data Loss Prevention has scrubbed the list to make
it more usable and practical. This includes extracting keywords and key phrases from the list
of names and aliases, since names do not always appear in the same format as the list. Also,
common names have been removed to reduce false positives. For example, one organization
on the SDN list is known as "SARA." Leaving this on the list would generate a high false positive
rate. "SARA Properties" is another entry on the list. It is used as a key phrase in the template
because the incidence of this phrase is much lower than "SARA" alone. The list of names and
organizations is considered in combination with the commonly found countries in the SDN
address list. The top 12 countries on the list are considered, after again removing more
commonly occurring countries. The template looks for recipients with any of the listed countries
as the designated country code. This SDN list minimizes false positives while still detecting
transactions or communications with known restricted parties.
The OFAC policy also provides guidance around the restrictions the U.S. Treasury Department
has placed on general trade with specific countries. This is distinct from the SDN list, since
individuals and organizations are not specified. The list of general sanctions can be found
here: http://www.treasury.gov/offices/enforcement/ofac/programs/index.shtml
The Office of Foreign Assets Control (OFAC) template looks for recipients in OFAC-listed
countries by designated country code.
The OFAC Special Designated Nationals List and Recipients detection rule looks for a recipient
with a country code matching entries in the "OFAC SDN Country Codes" specification in
combination with a match on a keyword from the "Specially Designated Nationals List"
dictionary.

Table 46-58 OFAC Special Designated Nationals List and Recipients detection rule

Method: Compound rule

Condition: Recipient Matches Pattern (DCM)
Configuration: OFAC Special Designated Nationals List and Recipients (Recipient):
■ Match email or URL domain by OFAC SDN Country Code.
■ Severity: High.
■ Check for existence.
■ At least 1 recipient(s) must match.
■ Matches on the entire message.

Condition: Content Matches Keyword (DCM)
Configuration: Specially Designated Nationals List (Keyword Match):
■ Match keyword from the Specially Designated Nationals List.
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

The Communications to OFAC countries detection rule looks for a recipient with a country
code matching entries from the "OFAC Country Codes" list.

Table 46-59 Communications to OFAC countries detection rule

Method: Simple rule
Condition: Recipient Matches Pattern (DCM)
Configuration: Communications to OFAC countries (Recipient):
■ Match email or URL domain by OFAC Country Code.
■ Severity: High.
■ Check for existence.
■ At least 1 recipient(s) must match.
■ Matches on the entire message.
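
Matching a recipient's email or URL domain by country code can be sketched as follows. This is an illustration under stated assumptions: the country code is taken to be the last label of the domain, the codes `xx` and `yy` are placeholders rather than entries from the "OFAC Country Codes" list, and the function names are hypothetical.

```python
from urllib.parse import urlsplit

# Placeholder codes only; the real list ships with the template.
OFAC_COUNTRY_CODES = {"xx", "yy"}


def country_code_of(recipient):
    """Return the last domain label of an email address or URL, lowercased."""
    if "://" in recipient:
        host = urlsplit(recipient).hostname or ""
    else:
        # Everything after the last "@" is treated as the mail domain.
        host = recipient.rsplit("@", 1)[-1]
    host = host.strip().rstrip(".").lower()
    return host.rsplit(".", 1)[-1] if "." in host else ""


def recipient_rule_matches(recipients, codes=OFAC_COUNTRY_CODES, minimum=1):
    """True if at least `minimum` recipients fall under a listed code."""
    hits = sum(1 for r in recipients if country_code_of(r) in codes)
    return hits >= minimum
```

With `minimum=1`, a single recipient under a listed code is enough, which mirrors the "At least 1 recipient(s) must match" setting.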

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

OMB Memo 06-16 and FIPS 199 Regulations policy template

This policy detects information classified as confidential according to the guidelines established
in the Federal Information Processing Standards (FIPS) Publication 199 from the National
Institute of Standards and Technology (NIST). NIST is responsible for establishing standards
and guidelines for data security under the Federal Information Security Management Act
(FISMA).
This template contains three simple detection rules. If any rule reports a match, the policy
triggers an incident.
The High Confidentiality Indicators detection rule looks for any keywords in the "High
Confidentiality" dictionary.

Table 46-60 High Confidentiality Indicators detection rule

Method: Simple rule
Condition: Content Matches Keyword
Configuration: High Confidentiality Indicators (Keyword Match):
■ Match "(confidentiality, high)", "(confidentiality,high)"
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

The Moderate Confidentiality Indicators detection rule looks for any keywords in the "Moderate
Confidentiality" dictionary.

Table 46-61 Moderate Confidentiality Indicators detection rule

Method: Simple rule
Condition: Content Matches Keyword
Configuration: Moderate Confidentiality Indicators (Keyword Match):
■ Match "(confidentiality, moderate)", "(confidentiality,moderate)"
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

The Low Confidentiality Indicators detection rule looks for any keywords in the "Low
Confidentiality" dictionary.

Table 46-62 Low Confidentiality Indicators detection rule

Method: Simple rule
Condition: Content Matches Keyword
Configuration: Low Confidentiality Indicators (Keyword Match):
■ Match "(confidentiality, low)", "(confidentiality,low)"
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Password Files policy template


The Password Files policy detects password file formats, such as SAM, password, and shadow.

DCM Rule Password Filenames

This rule looks for the file names "passwd" or "shadow."

DCM Rule /etc/passwd Format

This rule looks for a regular expression pattern with the /etc/passwd format.

DCM Rule /etc/shadow Format

This rule looks for a regular expression pattern with the /etc/shadow format.

DCM Rule SAM Passwords

This rule looks for a regular expression pattern with the SAM format.
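
To illustrate what a format-based rule such as /etc/passwd Format looks for, the following sketch matches the conventional seven colon-separated fields of a passwd entry (name, password, UID, GID, comment, home directory, shell). The pattern is an assumption written for this example; the guide does not publish the regular expression that the template uses.

```python
import re

# One passwd entry per line: name:password:UID:GID:comment:home:shell
# Illustrative pattern only, not the expression shipped with the template.
PASSWD_LINE = re.compile(
    r"^[^:\s]+:[^:\n]*:\d+:\d+:[^:\n]*:[^:\n]*:[^:\n]*$",
    re.MULTILINE,
)


def looks_like_passwd(text, minimum_lines=1):
    """True if at least `minimum_lines` lines have the passwd layout."""
    return len(PASSWD_LINE.findall(text)) >= minimum_lines


SAMPLE = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n"
)
```

Raising `minimum_lines` is one way to reduce false positives on text that happens to contain a single colon-delimited line.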

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Payment Card Industry (PCI) Data Security Standard policy template

The Payment Card Industry (PCI) data security standards are jointly determined by Visa and
MasterCard to protect cardholders by safeguarding personally identifiable information. Visa's
Cardholder Information Security Program (CISP) and MasterCard's Site Data Protection (SDP)
program both work toward enforcing these standards. The Payment Card Industry (PCI) Data
Security Standards policy detects Visa and MasterCard credit card number data.
The Credit Card Numbers, Exact detection rule detects exact credit card numbers profiled from a
database or other data source.

Table 46-63 Credit Card Numbers, Exact detection rule

Method: Simple rule
Condition: Content Matches Exact Data (EDM)
Configuration: This rule detects credit card numbers.

See “Choosing an Exact Data Profile” on page 409.

The Credit Card Numbers, All detection rule detects credit card numbers using the Credit Card
Number system Data Identifier.

Table 46-64 Credit Card Numbers, All detection rule

Method: Simple rule
Condition: Content Matches Data Identifier (DCM)
Configuration: Credit Card Numbers, All (Data Identifiers):
■ Data Identifier: Credit Card Number (narrow)
See “Credit Card Number” on page 1095.
■ Severity: High.
■ Count all matches.
■ Look in envelope, subject, body, attachments.
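
Payment card numbers end in a Luhn check digit, so a pattern match is commonly paired with a checksum test to discard random digit strings. The sketch below shows that general technique with a "count all matches" result. It is not a description of the Credit Card Number data identifier, whose patterns and validators are documented on the referenced page; the candidate pattern and function names are assumptions.

```python
import re


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


# 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def count_card_numbers(text):
    """Count candidate digit runs that also pass the Luhn check."""
    return sum(1 for m in CANDIDATE.finditer(text) if luhn_valid(m.group()))
```

The widely used test number 4111 1111 1111 1111 passes the checksum, while changing its last digit to 2 makes it fail.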

The Magnetic Stripe Data for Credit Cards detection rule detects raw data from the credit card
magnetic stripe using the Credit Card Magnetic Stripe system Data Identifier.

Table 46-65 Magnetic Stripe Data for Credit Cards detection rule

Method: Simple rule
Condition: Content Matches Data Identifier (DCM)
Configuration: Magnetic Stripe Data for Credit Cards (Data Identifiers):
■ Data Identifier: Credit Card Magnetic Stripe (medium)
See “Credit Card Number” on page 1095.
■ Severity: High.
■ Count all matches.
■ Look in envelope, subject, body, attachments.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

PIPEDA policy template


Canada's Personal Information Protection and Electronic Documents Act (PIPEDA) protects
personal information in the hands of private sector organizations. This act provides guidelines
for the collection, use, and disclosure of personal information.
The PIPEDA policy detects customer data that PIPEDA regulations protect.
The PIPEDA detection rule looks for a match of two data items, with certain data combinations
excluded from matching.

Table 46-66 PIPEDA detection rule

Detection method: EDM Rule
Description: The PIPEDA detection rule matches any two of the following data items:
■ Last name
■ Bank card
■ Medical account number
■ Medical record
■ Agency number
■ Account number
■ PIN
■ User name
■ Password
■ SIN
■ ABA routing number
■ Email
■ Phone
■ Mother's maiden name
See “Choosing an Exact Data Profile” on page 409.
Excluded combinations: However, the following combinations do not create a match:
■ Last name, email
■ Last name, phone
■ Last name, account number
■ Last name, user name
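
The matching logic of the PIPEDA detection rule (Table 46-66) can be sketched as follows. This is only an illustration in Python, not the product's EDM implementation. The function name `pipeda_rule_matches` and the lowercase field labels are assumptions made for this example, and the sketch assumes that a message matches as soon as any pair of matched fields is not one of the excluded combinations.

```python
from itertools import combinations

# Field labels are illustrative; a real Exact Data Profile defines its own columns.
PIPEDA_FIELDS = {
    "last name", "bank card", "medical account number", "medical record",
    "agency number", "account number", "pin", "user name", "password",
    "sin", "aba routing number", "email", "phone", "mother's maiden name",
}

# Pairs that Table 46-66 lists as not creating a match.
EXCLUDED_PAIRS = {
    frozenset({"last name", "email"}),
    frozenset({"last name", "phone"}),
    frozenset({"last name", "account number"}),
    frozenset({"last name", "user name"}),
}


def pipeda_rule_matches(matched_fields):
    """Return True if any two matched fields form a pair that is not excluded."""
    relevant = sorted(set(matched_fields) & PIPEDA_FIELDS)
    return any(
        frozenset(pair) not in EXCLUDED_PAIRS
        for pair in combinations(relevant, 2)
    )
```

Under these assumptions, a message containing a last name and a SIN from the same indexed record matches, while a last name together with only an email address does not.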

The PIPEDA Contact Info detection rule looks for a match of two data items, with certain data
combinations excepted from matching.

Table 46-67 PIPEDA Contact Info detection rule

Detection method: EDM Rule
Description: This rule looks for any two of the following data columns:
■ Last name
■ Phone
■ Account number
■ User name
■ Email
See “Choosing an Exact Data Profile” on page 409.

Table 46-68 Canadian Social Insurance Numbers detection rule

Detection method: DCM Rule
Description: This rule implements the narrow breadth edition of the Canadian Social Insurance
Number data identifier.
See “Canadian Social Insurance Number” on page 1074.

Table 46-69 ABA Routing Numbers detection rule

Detection method: DCM Rule
Description: This rule implements the narrow breadth edition of the ABA Routing Number data
identifier.
See “ABA Routing Number” on page 1013.

Table 46-70 Credit Card Numbers, All detection rule

Detection method: DCM Rule
Description: This rule implements the narrow breadth edition of the Credit Card Number data
identifier.
See “Credit Card Number narrow breadth” on page 1100.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Price Information policy template


The Price Information policy detects specific SKU and pricing information at risk of exposure.

EDM Rule Price Information

This rule looks for the combination of user-specified Stock Keeping Unit (SKU)
numbers and the price for that SKU number.

Note: This template contains one EDM detection rule. If you do not have an EDM profile
configured, or you are using Symantec Data Loss Prevention Standard, this policy template
is empty and contains no rule to configure.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.
See “About the Exact Data Profile and index” on page 528.

Project Data policy template


The Project Data policy detects discussions of sensitive projects.

IDM Rule Project Documents, Indexed

This rule looks for content from specific project data files registered as proprietary.
It returns a match if the engine detects 80% or more of the source document.

DCM Rule Project Activity

This rule looks for any keywords in the "Sensitive Project Code Names" dictionary,
which is user-defined.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Proprietary Media Files policy template


The Proprietary Media Files policy detects various types of video and audio files that can be
proprietary intellectual property of your organization at risk for exposure.

IDM Rule Media Files, Indexed

This rule looks for content from specific media files registered as proprietary.

DCM Rule Media Files

This rule looks for the following media file types:

■ qt
■ riff
■ macromedia_dir
■ midi
■ mp3
■ mpeg_movie
■ quickdraw
■ realaudio
■ wav
■ video_win
■ vrml

DCM Rule Media Files Extensions

This rule looks for file name extensions from the "Media Files Extensions" dictionary.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Publishing Documents policy template


The Publishing Documents policy detects various types of publishing documents, such as
Adobe FrameMaker files, at risk of exposure.

IDM Rule Publishing Documents, Indexed

This rule looks for content from specific publishing documents registered as
proprietary. It returns a match if the engine detects 80% or more of the source
document.

DCM Rule Publishing Documents

This rule looks for the specified file types:

■ qxpress
■ frame
■ aldus_pagemaker
■ publ

DCM Rule Publishing Documents, extensions

This rule looks for specified file name extensions found in the "Publishing Document
Extensions" dictionary.

Note: Both file types and file name extensions are required for this policy because the detection
engine does not detect the true file type for all the required documents. As such, the file name
extension must be used with the file type.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Racist Language policy template


The Racist Language policy detects the use of racist language.

DCM Rule Racist Language

This rule looks for any single keyword in the "Racist Language" dictionary.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Restricted Files policy template


The Restricted Files policy detects various file types that are generally inappropriate to send
out of the company, such as Microsoft Access and executable files.

DCM Rule MSAccess Files and Executables


This rule looks for files of the specified types: access, exe, and exe_unix.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Restricted Recipients policy template


The Restricted Recipients policy detects communications with specified recipients, such as
former employees.

DCM Rule Restricted Recipients

This rule looks for messages to recipients with email addresses in the "Restricted
Recipients" dictionary.

See “Configuring policies” on page 413.



See “Exporting policy detection as a template” on page 442.

Resumes policy template


The Resumes policy detects active job searches.

EDM Rule Resumes, Employee

This rule is a compound rule with two conditions; both must match to trigger an
incident. This rule contains an EDM condition for first and last names of employees
provided by the user. This rule also looks for a specific file type attachment (.doc)
that is less than 50 KB and contains at least one keyword from each of the following
dictionaries:

■ Job Search Keywords, Education
■ Job Search Keywords, Work
■ Job Search Keywords, General

DCM Rule Resumes, All

This rule looks for files of a specified type (.doc) that are less than 50 KB and match
at least one keyword from each of the following dictionaries:

■ Job Search Keywords, Education
■ Job Search Keywords, Work
■ Job Search Keywords, General
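
The conditions of the Resumes, All rule can be sketched as a single predicate. The following Python is a hypothetical illustration only: the dictionary contents are placeholders (the shipped dictionaries are not reproduced here), the file name extension check stands in for the product's file type detection, and treating 1 KB as 1024 bytes is an assumption.

```python
from pathlib import PurePath

MAX_SIZE_BYTES = 50 * 1024  # "less than 50 KB"; 1 KB = 1024 bytes is an assumption

# Placeholder dictionary contents for illustration only.
JOB_SEARCH_DICTIONARIES = {
    "Job Search Keywords, Education": {"bachelor", "university"},
    "Job Search Keywords, Work": {"work experience", "employment history"},
    "Job Search Keywords, General": {"resume", "references"},
}


def looks_like_resume(file_name, size_bytes, text):
    """True if the file is a small .doc with a keyword from every dictionary."""
    lowered = text.lower()
    is_doc = PurePath(file_name).suffix.lower() == ".doc"
    small_enough = size_bytes < MAX_SIZE_BYTES
    all_dictionaries_hit = all(
        any(keyword in lowered for keyword in keywords)
        for keywords in JOB_SEARCH_DICTIONARIES.values()
    )
    return is_doc and small_enough and all_dictionaries_hit
```

Requiring a hit in every dictionary, rather than in any one of them, is what keeps a document that merely mentions "references" from being reported as a resume.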

DCM Rule Job Search Websites

This rule looks for URLs of Web sites that are used in job searches.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.
See “About the Exact Data Profile and index” on page 528.

Sarbanes-Oxley policy template


The US Sarbanes-Oxley Act (SOX) imposes requirements on financial accounting, including
the preservation of data integrity and the ability to create an audit trail. The Sarbanes-Oxley
policy detects sensitive financial data.
The Sarbanes-Oxley Documents, Indexed detection rule looks for content from specific
documents registered as being subject to the Sarbanes-Oxley Act. This rule returns a match if
80% or more of the source document is found.

Table 46-71 Sarbanes-Oxley Documents, Indexed detection rule

Method: Simple rule
Condition: Content Matches Indexed Document Profile
Configuration: See “Choosing an Indexed Document Profile” on page 411.

The SEC Fair Disclosure Regulation compound detection rule looks for the following conditions;
all must be satisfied for the rule to trigger an incident:
■ A keyword from the "SEC Fair Disclosure Keywords" dictionary, which indicates possible
disclosure of advance financial information.
■ An attachment or file type that is a commonly used document or spreadsheet format. The
detected file types are Microsoft Word, Excel Macro, Excel, Works Spreadsheet, SYLK
Spreadsheet, Corel Quattro Pro, WordPerfect, Lotus 123, Applix Spreadsheets, CSV,
Multiplan Spreadsheet, and Adobe PDF.
■ A keyword from the company name keyword list. You must edit this list to include any name,
alternate name, or abbreviation that might indicate a reference to the company.

Table 46-72 SEC Fair Disclosure Regulation detection rule

Method: Compound rule

Condition: Content Matches Keyword
Configuration: SEC Fair Disclosure Regulation (Keyword Match):
■ Match keyword: earnings per share, forward guidance
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.
■ Match on same component.
  The keyword must be in the attachment or file type detected by that condition.

Condition: Message Attachment or File Type Match
Configuration: SEC Fair Disclosure Regulation (Attachment/File Type):
■ File type detected: excel_macro, xls, works_spread, sylk, quattro_pro, mod, csv,
  applix_spread, 123, doc, wordperfect, and pdf.
■ Severity: High.
■ Match on: Attachments and same component.

Condition: Content Matches Keyword
Configuration: SEC Fair Disclosure Regulation (Keyword Match):
■ Match "[company name]"
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.
■ Match on same component.
  The keyword must be in the attachment or file type detected by that condition.
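
As a rough sketch of how the three conditions in Table 46-72 combine, the following Python checks each attachment on its own, so that the file type and both keyword matches must occur in the same component. The function names and the `(file_type, text)` input shape are assumptions made for this example; this is not the product's detection engine.

```python
import re

SEC_KEYWORDS = ["earnings per share", "forward guidance"]
FILE_TYPES = {"excel_macro", "xls", "works_spread", "sylk", "quattro_pro",
              "mod", "csv", "applix_spread", "123", "doc", "wordperfect", "pdf"}


def contains_whole_word(text, phrase):
    """Case-insensitive match on whole words only."""
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def sec_fd_rule_matches(attachments, company_names):
    """attachments: iterable of (file_type, text) pairs, one per attachment."""
    for file_type, text in attachments:
        if file_type not in FILE_TYPES:
            continue
        has_sec_keyword = any(contains_whole_word(text, k) for k in SEC_KEYWORDS)
        has_company = any(contains_whole_word(text, n) for n in company_names)
        if has_sec_keyword and has_company:
            return True  # all three conditions met in one component
    return False
```

In this sketch, a spreadsheet that mentions "earnings per share" and a separate document that names the company do not trigger the rule together, because the matches fall in different components.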

The Financial Information detection rule looks for a specific file type containing a word from
the "Financial Keywords" dictionary and a word from the "Confidential/Proprietary Words"
dictionary. The spreadsheet file types detected are Microsoft Excel Macro, Microsoft Excel,
Microsoft Works Spreadsheet, SYLK Spreadsheet, Corel Quattro Pro, and more.

Table 46-73 Financial Information detection rule

Method: Compound rule

Condition: Content Matches Indexed Document Profile
Configuration: Financial Information (Attachment/File Type):
■ Match file type: excel_macro, xls, works_spread, sylk, quattro_pro, mod, csv,
  applix_spread, Lotus 1-2-3
■ Severity: High.
■ Match on attachments, same component.

Condition: Content Matches Keyword
Configuration: Financial Information (Keyword Match):
■ Match "accounts receivable turnover", "adjusted gross margin", "adjusted operating
  expenses", "adjusted operating margin", "administrative expenses", ....
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.
■ Keyword must be detected in the attachment (same component).

Condition: Content Matches Keyword
Configuration: Financial Information (Keyword Match):
■ Match "confidential", "internal use only", "proprietary".
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.
■ Keyword must be detected in the attachment (same component).

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

SEC Fair Disclosure Regulation policy template


The US SEC Selective Disclosure and Insider Trading Rules prohibit public companies from
selectively divulging material information to analysts and institutional investors before its general
release to the public.
The SEC Fair Disclosure Regulation template detects data indicating disclosure of material
financial information.

The SEC Fair Disclosure Regulation Documents, Indexed (IDM) detection rule looks for content
from specific documents subject to SEC Fair Disclosure regulation. This rule returns a match
if 80% or more of the source document content is found.

Table 46-74 SEC Fair Disclosure Regulation Documents, Indexed (IDM) detection rule

Method: Simple rule
Condition: Content Matches Document Signature (IDM)
Configuration: SEC Fair Disclosure Regulation Documents, Indexed (IDM):
■ Detect documents from the selected Indexed Document Profile.
  See “Choosing an Indexed Document Profile” on page 411.
■ Match documents with at least 80% content match.
■ Severity: High.
■ Check for existence.
■ Look in body, attachments.

The SEC Fair Disclosure Regulation detection rule looks for a keyword match from the
"SEC Fair Disclosure Keywords" dictionary, an attachment or file type that is a commonly used
document or spreadsheet, and a keyword match from the "Company Name Keywords"
dictionary.
All three conditions must be satisfied for the rule to trigger an incident:
■ The SEC Fair Disclosure keywords indicate possible disclosure of advance financial
information.
■ The file types detected are Microsoft Word, Excel Macro, Excel, Works Spreadsheet, SYLK
Spreadsheet, Corel Quattro Pro, WordPerfect, Lotus 123, Applix Spreadsheets, CSV,
Multiplan Spreadsheet, and Adobe PDF.
■ The company name keyword list requires editing by the user, which can include any name,
alternate name, or abbreviation that might indicate a reference to the company.

Table 46-75 SEC Fair Disclosure Regulation detection rule

Method: Compound rule

Condition: Content Matches Keyword (DCM)
Configuration: SEC Fair Disclosure Regulation (Keyword Match):
■ Match "earnings per share", "forward guidance".
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case insensitive.
■ Match on whole words only.

Condition: Message Attachment or File Type Match (DCM)
Configuration: SEC Fair Disclosure Regulation (Attachment/File Type):
■ Match file type: excel_macro, xls, works_spread, sylk, quattro_pro, mod, csv,
  applix_spread, 123, doc, wordperfect, pdf
■ Severity: High.
■ Match on attachments.
■ Require content match to be in the same component (attachment).

Condition: Content Matches Keyword (DCM)
Configuration: SEC Fair Disclosure Regulation (Keyword Match):
■ Match "[company name]" (user defined)
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments, same component.
■ Case insensitive.
■ Match on whole words only.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Sexually Explicit Language policy template


The Sexually Explicit Language policy detects vulgar, sexually explicit, and pornographic
language content.

DCM Rule Sexually Explicit Keywords, Confirmed

This rule looks for any single keyword in the "Sex. Explicit Keywords, Confirmed"
dictionary.

DCM Rule Sexually Explicit Keywords, Suspected

This rule looks for any three instances of keywords in the "Sex. Explicit Words,
Suspect" dictionary.

DCM Rule Sexually Explicit Keywords, Possible

This rule looks for any three instances of keywords in the "Sex. Explicit Words,
Possible" dictionary.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Source Code policy template


The Source Code policy template provides match conditions for detecting various types of
source code at risk of exposure, including C, Java, Perl, and Visual Basic (VB).

Table 46-76 Source code policy template match conditions

Name Type Description

Source Code Documents IDM This rule looks for specific user-provided source code from a
Document Profile.

This rule returns a match if it detects 80% or more of the source document.

This rule is not available if you do not select a profile when creating the policy.

Source Code Extensions File Name Match This rule looks for a match among file name extensions from
the "Source Code Extensions" dictionary.

Java Source Code Regular Expressions This compound rule looks for matches on two different regular
expression patterns: Java Import Statements and Java Class
Files.

C Source Code Regular Expression This rule looks for matches on the C Source Code regular
expression pattern.

VB Source Code Regular Expression This rule looks for matches on the VB Source Code regular
expression pattern.

Perl Source Code Regular Expressions This compound rule looks for matches on three different
Perl-related regular expressions patterns.

See “Configuring policies” on page 413.



See “Exporting policy detection as a template” on page 442.

State Data Privacy policy template


Many states in the US have adopted statutes mandating data protection and public disclosure
of information security breaches in which confidential data of individuals is compromised. The
State Data Privacy policy template is designed to address these types of confidential data
breaches.
The State Data Privacy policy template provides several individual detection rules and produces
an incident if any of these rules are violated. This policy template also provides a configurable
exception condition that allows one or more authorized email recipients to receive otherwise
confidential data.
Table 46-77 describes the acceptable use condition implemented by the State Data Privacy
policy. You must configure the exception for it to apply.

Table 46-77 Email to Affiliates policy exception

Name: Email to Affiliates (Recipient)
Type: Described identity (DCM), Recipient Matches Pattern
Description: Email to Affiliates is a policy exception that allows email messages to be sent
to affiliates who are legitimately allowed to receive information covered under the State Data
Privacy regulations. Policy exceptions are evaluated before detection match conditions. If
there is an exception, in this case an affiliate email address that you have entered, the entire
message is discarded and not available for evaluation by detection.
Configuration details:
■ Simple exception (single condition)
■ Match email recipient: [affiliate1], [affiliate2].
■ Edit the "Affiliate Domains" list and enter the email address for each recipient who may
  make acceptable use of the confidential data.
■ At least 1 recipient(s) must match for the exception to trigger.
■ Matches on the entire message.
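
The evaluation order described for the Email to Affiliates exception can be sketched in a few lines. The Python below is a hypothetical illustration, not product code: the affiliate addresses, the `evaluate_message` function, and the representation of detection rules as callables are all assumptions made for this example.

```python
# Hypothetical affiliate addresses; in the product these are entered by the policy author.
AFFILIATE_RECIPIENTS = {"partner@affiliate1.example", "audit@affiliate2.example"}


def evaluate_message(recipients, detection_rules, message_text):
    """Return the names of violated rules, or [] if the exception applies."""
    normalized = {r.strip().lower() for r in recipients}
    # Exceptions are evaluated first; at least 1 recipient must match.
    if normalized & AFFILIATE_RECIPIENTS:
        return []  # the entire message is excepted; detection never runs
    return [name for name, rule in detection_rules.items() if rule(message_text)]
```

Because the exception matches on the entire message, a single affiliate recipient is enough in this sketch to exempt a message that is also addressed to other recipients.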

The State Data Privacy policy template implements Exact Data Matching (Table 46-78). If you
do not select an Exact Data profile when you first create a policy based on this template, the
EDM condition is not available for use.
See “Choosing an Exact Data Profile” on page 409.

Table 46-78 State Data Privacy EDM rule

Rule name: State Data Privacy, Consumer Data
Condition type: Content matches Exact Data (EDM)
Description: This rule looks for an exact data match on three of the following:
■ ABA Routing Number
■ Account Number
■ Bank Card Number (credit card number)
■ Birth Date
■ Driver License Number
■ First Name
■ Last Name
■ Password
■ PIN Number
■ Social Security Number
■ State ID Card Number
Exception conditions: the following combinations do not match:
■ First Name, Last Name, PIN
■ First Name, Last Name, Password
Configuration details: When you are creating the EDM profile, you should validate it against
the State Data Privacy template to ensure that the resulting index includes expected fields.
■ Simple rule (single match condition)
■ Severity: High
■ Report incident if 1 match
■ Look in envelope, body, attachments

Table 46-79 lists and describes the DCM detection rules implemented by the State Data Privacy
policy. If any one of these rules is violated the policy produces an incident, unless you have
configured the exception condition and the message recipient is an acceptable use affiliate.

Table 46-79 State Data Privacy detection rules

Rule name: US Social Security Number Patterns
Condition type: Content Matches Data Identifier (DCM)
Description: The US Social Security Number Patterns rule is designed to detect US social
security numbers (SSNs). The Randomized US SSN data identifier detects SSN patterns, both
traditional and those issued under the new randomization scheme.
See “Randomized US Social Security Number (SSN)” on page 1414.
Configuration details:
■ Simple rule (single match condition)
■ Severity: High.
■ Count all matches.
■ Look in envelope, subject, body, attachments.

Rule name: ABA Routing Numbers
Condition type: Content Matches Data Identifier (DCM)
Description: The ABA Routing Numbers rule is designed to detect ABA Routing Numbers. The
ABA Routing Numbers data identifier detects ABA routing numbers.
See “ABA Routing Number” on page 1013.
Configuration details:
■ Simple rule (single match condition)
■ Severity: High.
■ Count all matches.
■ Look in envelope, subject, body, attachments.

Rule name: Credit Card Numbers, All
Condition type: Content Matches Data Identifier (DCM)
Description: The Credit Card Numbers rule is designed to match on credit card numbers. To
detect credit card numbers, this rule implements the Credit Card Number narrow breadth system
data identifier.
See “Credit Card Number narrow breadth” on page 1100.
Configuration details:
■ Simple rule (single condition)
■ Severity: High.
■ Count all matches.
■ Look in envelope, subject, body, attachments

Rule name: CA Drivers License Numbers
Condition type: Content Matches Data Identifier (DCM)
Description: The CA Drivers License Numbers rule looks for a match for the CA drivers license
number pattern, a match for a data identifier for terms relating to "drivers license," and a
keyword from the "California Keywords" dictionary.
See “Driver's License Number – CA State ” on page 1133.
Configuration details:
■ Simple rule (single condition)
■ Severity: High.
■ Count all matches.
■ Look in envelope, subject, body, attachments

Rule name: NY Drivers License Numbers
Condition type: Content Matches Data Identifier (DCM)
Description: The NY Drivers License Numbers rule looks for a match for the NY drivers license
number pattern, a match for a regular expression for terms relating to "drivers license," and a
keyword from the "New York Keywords" dictionary.
See “Driver's License Number - NY State” on page 1139.
Configuration details:
■ Simple rule (single condition)
■ Severity: High.
■ Count all matches.
■ Look in envelope, subject, body, attachments

Rule name: FL, MI, and MN Drivers License Numbers
Condition type: Content Matches Data Identifier (DCM)
Description: The FL, MI, and MN Drivers License Numbers rule looks for a match for the stated
drivers license number pattern, a match for a regular expression for terms relating to "drivers
license," and a keyword from the "Letter/12 Num. DLN State Words" dictionary (namely, Florida,
Minnesota, and Michigan).
See “Driver's License Number - FL, MI, MN States” on page 1134.
Configuration details:
■ Simple rule (single condition)
■ Severity: High.
■ Count all matches.
■ Look in envelope, subject, body, attachments

Rule name: IL Drivers License Numbers
Condition type: Content Matches Data Identifier (DCM)
Description: The IL Drivers License Numbers detection rule looks for a match for the IL drivers
license number pattern, a match for a regular expression for terms relating to "drivers license,"
and a keyword from the "Illinois Keywords" dictionary.
See “Driver's License Number - IL State” on page 1136.
Configuration details:
■ Simple rule (single condition)
■ Severity: High.
■ Count all matches.
■ Look in envelope, subject, body, attachments

Rule name: NJ Drivers License Numbers
Condition type: Content Matches Data Identifier (DCM)
Description: The NJ Drivers License Numbers detection rule looks for a match for the NJ drivers
license number pattern, a match for a regular expression for terms relating to "drivers license,"
and a keyword from the "New Jersey Keywords" dictionary. This condition implements the Driver's
License Number- NJ State medium breadth system Data Identifier.
See “Driver's License Number- NJ State medium breadth” on page 1138.
Configuration details:
■ Simple rule (single condition)
■ Severity: High.
■ Count all matches.
■ Look in envelope, subject, body, attachments

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

SWIFT Codes policy template


The Society for Worldwide Interbank Financial Telecommunication (SWIFT) is a cooperative
organization under Belgian law and is owned by its member financial institutions. The SWIFT
code (also known as a Bank Identifier Code, BIC, or ISO 9362) has a standard format to identify
a bank, location, and the branch involved. These codes are used when transferring money
between banks, particularly across international borders.

DCM Rule SWIFT Code Regular Expression

This rule looks for a match to the SWIFT code regular expression and a keyword
from the "SWIFT Code Keywords" dictionary.
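
To illustrate the kind of check this rule performs, the following Python sketch pairs a regular expression for the commonly described BIC layout (four-letter bank code, two-letter country code, two-character location code, and an optional three-character branch code) with a keyword test. The pattern and the keyword list are assumptions written for this example; they are not the regular expression or the "SWIFT Code Keywords" dictionary shipped with the product.

```python
import re

# Commonly described BIC layout: 4-letter bank code, 2-letter country code,
# 2-character location code, optional 3-character branch code.
SWIFT_PATTERN = re.compile(r"\b[A-Z]{4}[A-Z]{2}[A-Z0-9]{2}(?:[A-Z0-9]{3})?\b")

# Placeholder entries; the shipped dictionary is not reproduced here.
SWIFT_KEYWORDS = ("swift", "bic", "bank identifier code")


def swift_rule_matches(text):
    """True only if both a code-like token and a keyword are present."""
    has_code = SWIFT_PATTERN.search(text) is not None
    lowered = text.lower()
    has_keyword = any(keyword in lowered for keyword in SWIFT_KEYWORDS)
    return has_code and has_keyword
```

Requiring a keyword alongside the pattern reduces false positives, because any eight-letter uppercase word would otherwise satisfy the pattern on its own. The simple substring keyword test used here is cruder than a dictionary match on whole words.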

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Symantec DLP Awareness and Avoidance policy template

The Symantec DLP Awareness & Avoidance policy detects any communications that refer to
Symantec Data Loss Prevention or data loss prevention systems and possible avoidance of
detection. The Symantec DLP Awareness & Avoidance policy is most useful for the deployments
that are not widely known among monitored users.

DCM Rule Symantec DLP Awareness

Checks for a keyword match from the "Symantec DLP Awareness" dictionary.

DCM Rule Symantec DLP Avoidance

This rule is a compound rule with two conditions; both must be matched to trigger
an incident. This rule looks for a keyword match from the "Symantec DLP Awareness"
dictionary and a keyword from the "Symantec DLP Avoidance" dictionary.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

UK Drivers License Numbers policy template


The UK Drivers License Numbers policy detects UK Drivers License Numbers using the official
specification of the UK Government Standards of the UK Cabinet Office.

DCM Rule UK Drivers License Numbers


This rule is a compound rule with the following conditions:

■ A single keyword from the "UK Keywords" dictionary


■ The pattern matching that of the UK drivers license data identifier
■ Different combinations of the phrase "drivers license" using a data identifier

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

UK Electoral Roll Numbers policy template


The UK Electoral Roll Numbers policy detects UK Electoral Roll Numbers using the official
specification of the UK Government Standards of the UK Cabinet Office.

DCM Rule UK Electoral Roll Numbers


This rule is a compound rule with the following conditions:

■ A single keyword from the "UK Keywords" dictionary


■ A pattern matching the UK Electoral Roll Number data identifier
■ A single keyword from the "UK Electoral Roll Number Words" dictionary

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

UK National Health Service (NHS) Number policy template

The UK National Health Service (NHS) Number policy detects the personal identification
number issued by the U.K. National Health Service (NHS) for administration of medical care.

DCM Rule UK NHS Numbers

This rule looks for a single compound condition with two parts: either new or old
style National Health Service numbers and a single keyword from the "UK NHS
Keywords" dictionary.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

UK National Insurance Numbers policy template


The National Insurance Number is issued to individuals by the UK Department for Work and
Pensions and Inland Revenue (DWP/IR) for administering the national insurance system. The
UK National Insurance Numbers policy detects these insurance policy numbers.

DCM Rule UK National Insurance Numbers

This rule looks for a match to the UK National Insurance number data identifier and
a keyword from the dictionary "UK NIN Keywords."

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

UK Passport Numbers policy template


The UK Passport Numbers policy detects valid UK passports using the official specification of
the UK Government Standards of the UK Cabinet Office.

DCM Rule UK Passport Numbers (Old Type)

This rule looks for a keyword from the "UK Passport Keywords" dictionary and a
pattern matching the regular expression for UK Passport Numbers (Old Type).

DCM Rule UK Passport Numbers (New Type)

This rule looks for a keyword from the "UK Passport Keywords" dictionary and a
pattern matching the regular expression for UK Passport Numbers (New Type).

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

UK Tax ID Numbers policy template


The UK Tax ID Numbers policy detects UK Tax ID Numbers using the official specification of
the UK Government Standards of the UK Cabinet Office.

DCM Rule UK Tax ID Numbers

This rule looks for a match to the UK Tax ID number data identifier and a keyword
from the dictionary "UK Tax ID Number Keywords."

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

US Intelligence Control Markings (CAPCO) and DCID 1/7 policy template

The US Intelligence Control Markings (CAPCO) & DCID 1/7 policy detects authorized terms
to identify classified information in the US Federal Intelligence community as defined in the
Control Markings Register, which is maintained by the Controlled Access Program Coordination
Office (CAPCO) of the Community Management Staff (CMS). The register was created in
response to the Director of Central Intelligence Directive (DCID) 1/7.
The Top Secret Information detection rule looks for a keyword match on the phrase "TOP SECRET."

Table 46-80 Top Secret Information detection rule

Method: Simple rule
Condition: Content Matches Keyword (DCM)
Configuration: Match "TOP SECRET//"
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case sensitive.
■ Match on whole or partial words.

The Secret Information detection rule looks for a keyword match on the phrase "SECRET."



Table 46-81 Secret Information detection rule

Method: Simple rule
Condition: Content Matches Keyword (DCM)
Configuration: Match "SECRET//"
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case sensitive.
■ Match on whole or partial words.

The Classified or Restricted Information detection rule looks for a keyword match on the
keywords "CLASSIFIED" or "RESTRICTED."

Table 46-82 Classified or Restricted Information (Keyword Match) detection rule

Method: Simple rule
Condition: Content Matches Keyword (DCM)
Configuration: Match "CLASSIFIED//,//RESTRICTED//"
■ Severity: High.
■ Check for existence.
■ Look in envelope, subject, body, attachments.
■ Case sensitive.
■ Match on whole or partial words.
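
The three keyword rules above share the same settings: case-sensitive matching on whole or partial words. A minimal Python sketch of that behavior follows. It assumes that partial-word matching behaves like a plain substring search and that the keyword string in Table 46-82 is a comma-separated list of two keywords; both are assumptions for illustration, not statements about the product's matching engine.

```python
# Keywords taken from Tables 46-80 through 46-82.
RULE_KEYWORDS = {
    "Top Secret Information": ["TOP SECRET//"],
    "Secret Information": ["SECRET//"],
    "Classified or Restricted Information": ["CLASSIFIED//", "//RESTRICTED//"],
}


def matching_rules(text):
    """Case-sensitive substring search, i.e. whole or partial words."""
    return [
        rule for rule, keywords in RULE_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]
```

Under this substring interpretation, a "TOP SECRET//" marking also satisfies the Secret Information rule, and a string such as "UNCLASSIFIED//" contains "CLASSIFIED//". Consider this overlap when you review incidents generated from this template.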

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

US Social Security Numbers policy template


The US Social Security Numbers policy detects patterns indicating social security numbers at
risk of exposure.

Table 46-83 US Social Security Numbers policy template

Rule name: US Social Security Number Patterns
Rule type: DCM Rule
Description: This rule looks for a match to the social security number regular expression and
a keyword from the dictionary "US SSN Keywords."
Details: See “Randomized US Social Security Number (SSN)” on page 1414.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Violence and Weapons policy template


The Violence and Weapons policy detects violent language and discussions about weapons.

Table 46-84 Violence and Weapons policy template

Name Type Description

Violence and Weapons DCM Rule This rule is a compound rule with two conditions; both must match to
trigger an incident. This rule looks for a keyword from the "Violence Keywords" dictionary and a
keyword from the "Weapons Keywords" dictionary.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.
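
A compound rule with two keyword conditions, as used by the Violence and Weapons policy, can be sketched as below. The two keyword lists are hypothetical stand-ins for the "Violence Keywords" and "Weapons Keywords" dictionaries, and the case-insensitive, whole-word matching is an assumption made for the example.

```python
import re
from typing import Iterable

# Hypothetical stand-ins for the two product dictionaries.
VIOLENCE_KEYWORDS = ("attack", "hurt")
WEAPONS_KEYWORDS = ("rifle", "knife")


def contains_any(text: str, keywords: Iterable[str]) -> bool:
    """Case-insensitive whole-word search for any keyword in the list."""
    return any(
        re.search(rf"(?<!\w){re.escape(keyword)}(?!\w)", text, re.IGNORECASE)
        for keyword in keywords
    )


def violates_compound_rule(text: str) -> bool:
    # Both conditions must match for the rule to trigger an incident.
    return contains_any(text, VIOLENCE_KEYWORDS) and contains_any(text, WEAPONS_KEYWORDS)
```

A message that matches only one of the two dictionaries does not trigger the rule in this sketch.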

Webmail policy template


The Webmail policy detects the use of a variety of Webmail services, including Yahoo, Google,
and Hotmail.

Table 46-85 Webmail policy template rules

Name Type Condition(s) Description

Yahoo Compound detection rule Recipient Matches Pattern (DCM) This condition checks for the URL
domain mail.yahoo.com.

Content Matches Keyword (DCM) This condition checks for the keyword ym/compose.

Hotmail Compound detection rule Recipient Matches Pattern (DCM) This condition checks for the URL
domain hotmail.msn.com.

Content Matches Keyword (DCM) This condition checks for the keyword compose?&curmbox.

Go Compound detection rule Recipient Matches Pattern (DCM) This condition checks for the URL
gomailus.go.com.

Content Matches Keyword (DCM) This condition checks for the keyword compose.

AOL Compound detection rule Recipient Matches Pattern (DCM) This condition checks for the URL
domain aol.com.

Content Matches Keyword (DCM) This condition checks for the keyword compose.

Gmail Compound detection rule Recipient Matches Pattern (DCM) This condition checks for the URL
domain gmail.google.com.

Content Matches Keyword (DCM) This condition checks for the keyword gmail.

See “Configuring policies” on page 413.


See “Exporting policy detection as a template” on page 442.

Yahoo Message Board Activity policy template


The Yahoo Message Board policy template detects Yahoo message board activity.
The Yahoo Message Board detection rule is a compound method that looks for messages
posted to the Yahoo message board you specify.
Table 46-86 describes its configuration details.

Table 46-86 Yahoo Message Board detection rule

Method Condition Configuration

Compound rule Content Matches Keyword (DCM) Yahoo Message Board (Keyword Match):
■ Case insensitive.
■ Match Keyword: post.messages.yahoo.com/bbs.
■ Match on whole words only.
■ Check for existence (do not count multiple matches).
■ Look in envelope, subject, body, attachments.
■ Match must occur in the same component for both conditions.

AND

Content Matches Keyword (DCM) Yahoo Message Board (Keyword Match):
■ Case insensitive.
■ Match Keyword: board=<enter board number>.
■ Match on whole words only.
■ Check for existence (do not count multiple matches).
■ Look in envelope, subject, body, attachments.
■ Match must occur in the same component for both conditions.
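
The "same component" requirement in Table 46-86 can be illustrated with a short sketch. It assumes a message is represented as a dictionary of component name to text, and it uses 12345 as a placeholder board number; both are assumptions for the example, not product behavior.

```python
import re

# Keywords from Table 46-86; the board number 12345 is a placeholder.
KEYWORDS = ("post.messages.yahoo.com/bbs", "board=12345")


def whole_word(keyword: str):
    """Case-insensitive pattern that rejects matches embedded in longer words."""
    return re.compile(rf"(?<!\w){re.escape(keyword)}(?!\w)", re.IGNORECASE)


def matching_components(message: dict) -> list:
    """Return the names of components in which every keyword occurs."""
    patterns = [whole_word(keyword) for keyword in KEYWORDS]
    return [
        name
        for name, text in message.items()
        if all(pattern.search(text) for pattern in patterns)
    ]
```

In this sketch, a message whose subject contains one keyword and whose body contains the other yields no matching component, so the compound rule would not fire.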

The Finance Message Board URL detection rule detects messages posted to the Yahoo
Finance message board.
Table 46-87 describes its configuration.

Table 46-87 Finance Message Board URL detection rule

Method Condition Configuration

Simple rule Content Matches Keyword (DCM) Finance Message Board URL (Keyword Match):
■ Case insensitive.
■ Match Keyword: messages.finance.yahoo.com.
■ Match on whole words only.
■ Check for existence (do not count multiple matches).
■ Look in envelope, subject, body, attachments.

The Board URLs detection rule detects messages posted to the Yahoo or Yahoo Finance
message boards by the URL of either.
Table 46-88 describes its configuration details.

Table 46-88 Board URLs detection rule

Method Condition Configuration

Simple rule Recipient Matches Pattern (DCM) Board URLs (Recipient):
■ Recipient URL: messages.yahoo.com,messages.finance.yahoo.com.
■ At least 1 recipient(s) must match.
■ Matches on the entire message (not configurable).
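
The recipient condition in Table 46-88 can be sketched as a count of matching recipients compared against a minimum. The exact-host comparison below is an assumption; the product's own URL pattern matching may be more permissive.

```python
from urllib.parse import urlsplit

# Recipient URL patterns from Table 46-88.
BOARD_DOMAINS = {"messages.yahoo.com", "messages.finance.yahoo.com"}
MIN_MATCHING_RECIPIENTS = 1  # "At least 1 recipient(s) must match."


def host_of(recipient: str) -> str:
    """Extract the lowercase host from a URL, tolerating a missing scheme."""
    if "//" not in recipient:
        recipient = "//" + recipient
    return (urlsplit(recipient).hostname or "").lower()


def rule_matches(recipients) -> bool:
    """Assumed semantics: a recipient matches when its host equals a listed domain."""
    matched = sum(1 for recipient in recipients if host_of(recipient) in BOARD_DOMAINS)
    return matched >= MIN_MATCHING_RECIPIENTS
```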

See “Creating a policy from a template” on page 397.


See “Exporting policy detection as a template” on page 442.

Yahoo and MSN Messengers on Port 80 policy template

The Yahoo and MSN Messengers on Port 80 policy detects Yahoo and MSN Messenger
activity over port 80.
The Yahoo IM detection rule looks for keyword matches on both ymsg and
shttp.msg.yahoo.com.

Table 46-89 Yahoo IM detection rule

Method Condition Configuration

Compound rule Content Matches Keyword (DCM) Yahoo IM (Keyword Match):
■ Case insensitive.
■ Match keyword: ymsg.
■ Match on whole words only.
■ Count all matches and report an incident for each match.
■ Look for matches in the envelope, subject, body, and attachments.
■ Match must occur in the same component for both conditions in the rule.

AND

Content Matches Keyword (DCM) Yahoo IM (Keyword Match):
■ Case insensitive.
■ Match keyword: shttp.msg.yahoo.com.
■ Match on whole words only.
■ Count all matches and report an incident for each match.
■ Look for matches in the envelope, subject, body, and attachments.
■ Match must occur in the same component for both conditions in the rule.

The MSN IM detection rule looks for matches on three keywords in the same message
component.

Table 46-90 MSN IM detection rule

Method Condition Configuration

Compound rule Content Matches Keyword (DCM) MSN IM (Keyword Match):
■ Case insensitive.
■ Match keyword: msg.
■ Match on whole words only.
■ Count all matches and report an incident for each match.
■ Look for matches in the envelope, subject, body, and attachments.
■ Match must occur in the same component for all conditions in the rule.

AND

Content Matches Keyword (DCM) MSN IM (Keyword Match):
■ Case insensitive.
■ Match keyword: x-msn.
■ Match on whole words only.
■ Count all matches and report an incident for each match.
■ Look for matches in the envelope, subject, body, and attachments.
■ Match must occur in the same component for all conditions in the rule.

AND

Content Matches Keyword (DCM) MSN IM (Keyword Match):
■ Case insensitive.
■ Match keyword: charset=utf-8.
■ Match on whole words only.
■ Count all matches and report an incident for each match.
■ Look for matches in the envelope, subject, body, and attachments.
■ Match must occur in the same component for all conditions in the rule.

See “Creating a policy from a template” on page 397.


See “Exporting policy detection as a template” on page 442.
Section 5
Configuring policy response rules

■ Chapter 47. Responding to policy violations

■ Chapter 48. Configuring and managing response rules

■ Chapter 49. Response rule conditions

■ Chapter 50. Response rule actions


Chapter 47
Responding to policy violations
This chapter includes the following topics:

■ About response rules

■ About response rule actions

■ Response rule actions for all detection servers

■ Response rule actions for endpoint detection

■ Response rule actions for Network Prevent detection

■ Response rule actions for Network Protect detection

■ Response rule actions for Cloud Storage detection

■ Response rule actions for Cloud Applications and API appliance detectors

■ About response rule execution types

■ About Automated Response rules

■ About Smart Response rules

■ About response rule conditions

■ About response rule action execution priority

■ About response rule authoring privileges

■ Implementing response rules

■ Response rule best practices



About response rules


You can implement one or more response rules in a policy to remedy, escalate, resolve, or
dismiss incidents when a violation occurs. For example, when a policy is violated, a response rule
can block the transmission of a file that contains sensitive content.
See “About response rule actions” on page 1738.
You create, modify, and manage response rules separately from the policies that declare them.
This decoupling allows response rules to be updated and reused across policies.
See “Implementing response rules” on page 1758.
The detection server executes Automated Response rules automatically. Alternatively, you can
configure Smart Response rules that an incident remediator executes manually.
See “About response rule execution types” on page 1750.
You can implement conditions to control how and when response rules execute.
See “About response rule conditions” on page 1752.
You can sequence the order of execution for response rules of the same type.
See “About response rule action execution priority” on page 1753.
You must have response rule authoring privileges to create and manage response rules.
See “About response rule authoring privileges” on page 1757.

About response rule actions


Response rule actions are the components that take action when a policy violation occurs.
Response rule actions are mandatory components of response rules. If you create a response
rule, you must define at least one action for the response rule to be valid.
Symantec Data Loss Prevention provides several response rule actions. Many are available
for all types of detection servers. Others are available for specific detection servers.
See “Implementing response rules” on page 1758.
The detection server where a policy is deployed executes a response rule action any time a
policy violation occurs. Or, you can configure a response rule condition to dictate when the
response rule action executes.
See “About response rule conditions” on page 1752.
For example, any time a policy is violated, you can send an email to the user who violated the
policy and to that user's manager. Or, if the severity of the violation is medium, you can present
the user with an on-screen warning. Or, if the severity is high, you can block a file from being
copied to an external device.
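
The three examples above can be combined into one illustrative sketch of severity-driven action selection. The action names come from the tables in this chapter; the Severity type and the select_actions function are hypothetical and do not correspond to any product API.

```python
from enum import Enum


class Severity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def select_actions(severity: Severity) -> list:
    """Return the action names to run for one violation, in order."""
    actions = ["Send Email Notification"]  # runs for every violation
    if severity is Severity.MEDIUM:
        actions.append("Endpoint Prevent: Notify")  # on-screen warning
    elif severity is Severity.HIGH:
        actions.append("Endpoint Prevent: Block")  # stop the copy to the device
    return actions
```

In the product, each branch would be a separate response rule with its own severity condition rather than a single function.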

Table 47-1 Response rule actions by server type

Server type Description

All detection servers See “Response rule actions for all detection servers” on page 1739.

Endpoint detection servers See “Response rule actions for endpoint detection” on page 1740.

Network Prevent detection servers See “Response rule actions for Network Prevent detection” on page 1741.

Network Protect detection servers See “Response rule actions for Network Protect detection” on page 1742.

Cloud Storage detection servers and detectors See “Response rule actions for Cloud Storage
detection” on page 1743.

Cloud Detection Service REST detectors and API Detection for Developer Apps Appliances See
“Response rule actions for Cloud Applications and API appliance detectors” on page 1744.

Response rule actions for all detection servers


Symantec Data Loss Prevention provides several response rule actions for Endpoint Prevent,
Endpoint Discover, Network Prevent for Web, Network Prevent for Email, and Network Protect.

Table 47-2 Available response rule actions for all detection servers

Response rule action Description

Add Note Add a field to the incident record that the remediator can annotate at the
Incident Snapshot screen.

See “Configuring the Add Note action” on page 1782.

Limit Incident Data Retention Discard or retain matched data with the incident record.

See “Configuring the Limit Incident Data Retention action” on page 1783.

Log to a Syslog Server Log the incident to a syslog server.

See “Configuring the Log to a Syslog Server action” on page 1785.

Send Email Notification Send an email you compose to recipients you specify.

See “Configuring the Send Email Notification action” on page 1786.


Server FlexResponse Execute a custom Server FlexResponse action.


See “Configuring the Server FlexResponse action” on page 1788.
Note: This response rule action is available only if you deploy one or more
custom Server FlexResponse plug-ins to Symantec Data Loss Prevention.

See “Deploying a Server FlexResponse plug-in” on page 2143.

Set Attribute Add a custom value to the incident record.

See “Configuring the Set Attribute action” on page 1789.

Set Status Change the incident status to the specified value.

See “Configuring the Set Status action” on page 1790.

See “About response rules” on page 1738.


See “Implementing response rules” on page 1758.

Response rule actions for endpoint detection


Symantec Data Loss Prevention provides several response rule actions for Endpoint Prevent
and Endpoint Discover.

Table 47-3 Available Endpoint response rule actions

Response rule action Description

Endpoint: FlexResponse Take custom action using the FlexResponse API.

See “Configuring the Endpoint: FlexResponse action” on page 1813.

Endpoint: ICT Classification And Tagging Apply the appropriate ICT classification to content that
violates a policy, or as part of a baseline Classification Scan.

See “Configuring the Endpoint: ICT Classification And Tagging action” on page 1814.

Endpoint Discover: Information Centric Defense The Endpoint Discover: Information Centric Defense
response rule action flags sensitive files for Symantec Endpoint Protection (SEP) monitoring.

Endpoint Discover: Quarantine File Quarantine a discovered sensitive file.

See “Configuring the Endpoint Discover: Quarantine File action” on page 1815.

Endpoint Prevent: Block Block the transfer of data that violates the policy.
For example, block the copy of confidential data from an endpoint to a USB
flash drive.

See “Configuring the Endpoint Prevent: Block action” on page 1817.

Endpoint Prevent: Notify Display an on-screen notification to the endpoint user when confidential
data is transferred.

See “Configuring the Endpoint Prevent: Notify action” on page 1825.

Endpoint Prevent: User Cancel Allow the user to cancel the transfer of a confidential file. The override is
time sensitive.

See “Configuring the Endpoint Prevent: User Cancel action” on page 1828.

See “About response rules” on page 1738.


See “Implementing response rules” on page 1758.
See “Endpoint Prevent on Mac response rule features” on page 2285.

Response rule actions for Network Prevent detection


Symantec Data Loss Prevention provides several response rule actions for Network Prevent
for Web and Network Prevent for Email.

Table 47-4 Available Network response rule actions

Response rule action Description

Network Prevent: Block FTP Request Block FTP transmissions.

See “Configuring the Network Prevent for Web: Block FTP Request action”
on page 1831.
Note: Only available with Network Prevent for Web.

Network Prevent: Block HTTP/S Block Web postings.

See “Configuring the Network Prevent for Web: Block HTTP/S action”
on page 1831.
Note: Only available with Network Prevent for Web.

Network Prevent: Block SMTP Message Block email that causes an incident.
See “Configuring the Network Prevent: Block SMTP Message action”
on page 1832.

Network Prevent: ICE Encryption Encrypt emails and their attachments, or attachments only.

See “Encrypting cloud email with Symantec Information Centric Encryption” on page 2518.

Network Prevent: Modify SMTP Message Modify sensitive email messages.

For example, change the email subject to include information about the violation.

See “Configuring the Network Prevent: Modify SMTP Message action” on page 1833.

Network Prevent: Remove HTTP/S Content Remove confidential content from Web posts.

See “Configuring the Network Prevent for Web: Remove HTTP/S Content action” on page 1835.
Note: Only available with Network Prevent for Web.

See “About response rules” on page 1738.


See “Implementing response rules” on page 1758.

Response rule actions for Network Protect detection


Symantec Data Loss Prevention provides several response rule actions for Network Protect
(Discover).

Table 47-5 Available Network Protect response rule actions

Response rule action Description

Network Protect: Copy File Copy sensitive files to a location you specify.

See “Configuring the Network Protect: Copy File action” on page 1836.
Note: Only available with Network Protect.

Network Protect: Quarantine File Quarantine sensitive files.


See “Configuring the Network Protect: Quarantine File action” on page 1837.
Note: Only available with Network Protect.

Network Protect: Encrypt File Encrypt sensitive files using Symantec ICE.

See “Configuring the Network Protect: Encrypt File action” on page 1838.
Note: This action is available only if you have installed the Network Protect
ICE license and configured the Enforce Server to connect to the Symantec
ICE Cloud. For information about how Symantec Data Loss Prevention
interacts with Symantec ICE, refer to the Symantec Information Centric
Encryption Deployment Guide at http://www.symantec.com/docs/DOC9707.

See “About response rules” on page 1738.


See “Implementing response rules” on page 1758.

Response rule actions for Cloud Storage detection


Symantec Data Loss Prevention provides two response rule actions for Cloud Storage detection,
from either on-premises detection servers or on cloud detectors.

Table 47-6 Available Cloud Storage response rule actions

Response rule action Description

Cloud Storage: Add Visual Tag Add a text tag to Box cloud storage content that violates a policy.

See “Configuring the Cloud Storage: Add Visual Tag action” on page 1791.

Cloud Storage: Quarantine Quarantine sensitive files from a cloud storage user account to a
quarantine user account. For on-premises Box scanning, you can also use an on-premises quarantine
location.

See “Configuring the Cloud Storage: Quarantine action” on page 1791.

See “About response rules” on page 1738.


See “Implementing response rules” on page 1758.

Response rule actions for Cloud Applications and API appliance detectors

The Symantec Data Loss Prevention Cloud Detection Service enables you to connect Symantec
Data Loss Prevention to your cloud access security broker (CASB) solution. You can use the
public REST API to send sensitive data from your CASB solution to Symantec Data Loss
Prevention for inspection. Symantec Data Loss Prevention responds with policy violation
information and recommendations for remediation action where appropriate.
The API Detection for Developer Apps Appliance enables you to connect with on-premises
applications. You can use the REST API to submit data from your applications to Symantec
Data Loss Prevention for inspection. Symantec Data Loss Prevention responds with policy
violation information and recommendations for remediation action where appropriate.
These Cloud Applications and API appliance response rules let you configure the remediation
recommendation messages that Symantec Data Loss Prevention includes in the detection
responses it sends back to the REST client in the customResponsePayload or message
parameters.
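
On the REST client side, reading the remediation recommendation might look like the sketch below. Only the parameter names customResponsePayload and message come from this guide. The flat JSON object used here is purely an assumption for illustration, because the actual response schema is not described in this section; consult the REST API documentation for the real structure.

```python
import json


def remediation_hints(response_body: str) -> list:
    """Collect non-empty remediation text from a detection response.

    Assumes, hypothetically, that both parameters are top-level JSON fields.
    """
    data = json.loads(response_body)
    hints = []
    for field in ("customResponsePayload", "message"):
        value = data.get(field)
        if value:
            hints.append((field, value))
    return hints
```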

Table 47-7 Available Cloud Applications and API appliance Smart Response rule actions

Response rule action Description

Encrypt The Encrypt Smart Response action lets you encrypt sensitive files in cloud applications
through the Symantec Data Loss Prevention Cloud Detection Service.

See “Configuring the Encrypt Smart Response action” on page 1783.

Remove Collaborator Access The Remove Collaborator Access Smart Response action removes
collaborator access from shared files in cloud applications through the Cloud Detection Service.

See “Configuring the Remove Collaborator Access Smart Response action” on page 1797.

Remove Shared Links The Remove Shared Links Smart Response action removes shared links from files
in cloud applications through the Cloud Detection Service.

See “Configuring the Remove Shared Links Smart Response action” on page 1797.

Table 47-8 Available Cloud Applications and API appliance (Data-at-Rest) automated
response rule actions

Response rule action Description

Custom Action on Data-at-Rest The Custom Action on Data-at-Rest action returns a recommendation to
perform some custom action on the sensitive data with the detection result.

See “Configuring the Custom Action on Data-at-Rest action” on page 1798.

Delete Data-at-Rest The Delete Data-at-Rest action deletes sensitive data in the following cloud
applications through the Cloud Detection Service:

■ Dropbox
■ Gmail
■ Office 365 Email

See “Configuring the Delete Data-at-Rest action” on page 1799.

Encrypt Data-at-Rest The Encrypt Data-at-Rest action encrypts sensitive data in the following
applications through the Cloud Detection Service:

■ Office 365 OneDrive
■ Office 365 SharePoint

See “Configuring the Encrypt Data-at-Rest action” on page 1799.

Perform DRM on Data-at-Rest The Perform DRM on Data-at-Rest action applies Digital Rights
Management (DRM) to the sensitive data.

See “Configuring the Perform DRM on Data-at-Rest action” on page 1800.

Quarantine Data-at-Rest The Quarantine Data-at-Rest action quarantines sensitive data in the
following cloud applications through the Cloud Detection Service:

■ Box
■ Office 365 OneDrive
■ Office 365 SharePoint
■ Salesforce
■ Slack

See “Configuring the Quarantine Data-at-Rest action” on page 1801.

Remove Shared Links in Data-at-Rest The Remove Shared Links in Data-at-Rest action removes shared
links to sensitive data in the following cloud applications through the Cloud Detection Service:

■ Box
■ Dropbox
■ Google Drive
■ Office 365 OneDrive
■ Salesforce

See “Configuring the Remove Shared Links in Data-at-Rest action” on page 1802.

Tag Data-at-Rest The Tag Data-at-Rest action tags the sensitive data.

See “Configuring the Tag Data-at-Rest action” on page 1802.

Table 47-9 Available Cloud Applications and API appliance (Additional Data-at-Rest Actions)
automated response rule actions

Response rule action Description

Prevent download, copy, print The Prevent download, copy, print action prevents download, copy,
and print options for the sensitive data.

See “Configuring the Prevent download, copy, print action” on page 1803.

Remove Collaborator Access The Remove Collaborator Access action removes access from collaborators
to sensitive data files in the following cloud applications through the Cloud Detection Service:

■ Box
■ Dropbox
■ Google Drive
■ Office 365 SharePoint
■ Salesforce

See “Configuring the Remove Collaborator Access action” on page 1804.

Set Collaborator Access to 'Edit' The Set Collaborator Access to 'Edit' action grants
collaborators edit access to sensitive data files in the following cloud applications through the
Cloud Detection Service:

■ Box
■ Dropbox
■ Google Drive
■ Office 365 SharePoint
■ Salesforce

See “Configuring the Set Collaborator Access to 'Edit' action” on page 1804.

Set Collaborator Access to 'Preview' The Set Collaborator Access to 'Preview' action grants
collaborators preview access to sensitive data files in the Box cloud application through the
Cloud Detection Service.

See “Configuring the Set Collaborator Access to 'Preview' action” on page 1805.

Set Collaborator Access to 'Read' The Set Collaborator Access to 'Read' action grants
collaborators read access to sensitive data files in the following cloud applications through the
Cloud Detection Service:

■ Box
■ Dropbox
■ Google Drive
■ Office 365 SharePoint
■ Salesforce

See “Configuring the Set Collaborator Access to 'Read' action” on page 1805.

Set File Access to 'All Read' The Set File Access to 'All Read' action grants public read access
to sensitive data files in the following cloud applications through the Cloud Detection Service:

■ Google Drive
■ Office 365 OneDrive
■ Office 365 SharePoint

See “Configuring the Set File Access to 'All Read' action” on page 1806.

Set File Access to 'Internal Edit' The Set File Access to 'Internal Edit' action grants edit
access to all members of your organization to sensitive files in the following cloud applications
through the Cloud Detection Service:

■ Box
■ Google Drive
■ Office 365 OneDrive
■ Office 365 SharePoint
■ Salesforce

See “Configuring the Set File Access to 'Internal Edit'” on page 1806.

Set File Access to 'Internal Read' The Set File Access to 'Internal Read' action grants read
access to all members of your organization to sensitive data files in the following cloud
applications through the Cloud Detection Service:

■ Box
■ Google Drive
■ Office 365 SharePoint
■ Salesforce

See “Configuring the Set File Access to 'Internal Read' action” on page 1807.

Table 47-10 Available Cloud Applications and API appliance (Data-in-Motion) automated
response rule actions

Response rule action Description

Add two-factor authentication The Add two-factor authentication action adds two-factor
authentication to the sensitive data.

See “Configuring the Add two-factor authentication action” on page 1808.

Block Data-in-Motion The Block Data-in-Motion action blocks the sensitive data.

See “Configuring the Block Data-in-Motion action” on page 1808.

Custom Action on Data-in-Motion The Custom Action on Data-in-Motion action returns a
recommendation to take some custom action on the sensitive data with the detection result.

See “Configuring the Custom Action on Data-in-Motion action” on page 1809.

Encrypt Data-in-Motion The Encrypt Data-in-Motion action encrypts the sensitive data.

See “Configuring the Encrypt Data-in-Motion action” on page 1810.

Perform DRM on Data-in-Motion The Perform DRM on Data-in-Motion action applies Digital Rights
Management (DRM) to the sensitive data.

See “Configuring the Perform DRM on Data-in-Motion action” on page 1810.

Quarantine Data-in-Motion The Quarantine Data-in-Motion action quarantines the sensitive data.

See “Configuring the Quarantine Data-in-Motion action” on page 1811.

Redact Data-in-Motion The Redact Data-in-Motion action redacts the sensitive data.

See “Configuring the Redact Data-in-Motion action” on page 1812.

About response rule execution types


Symantec Data Loss Prevention provides two types of policy response rules: Automated and
Smart.
The detection server that reports a policy violation executes Automated Response rules. Users
such as incident remediators execute Smart Response rules on demand from the Enforce
Server administration console.
See “About recommended roles for your organization” on page 111.

Table 47-11 Response rule types

Response rule execution type Description

Automated Response rules When a policy violation occurs, the detection server automatically executes
response rule actions.

See “About Automated Response rules” on page 1751.

Smart Response rules When a policy violation occurs, an authorized user manually triggers the
response rule.

See “About Smart Response rules” on page 1751.

See “About response rule actions” on page 1738.


See “Implementing response rules” on page 1758.

About Automated Response rules


The system executes Automated Response rules when the detection engine reports a policy
violation. However, if you implement a response rule condition, the condition must be met for
the system to execute the response rule. Conditions let you control the automated execution
of response rule actions.
See “About response rule conditions” on page 1752.
For example, the system can automatically block certain policy violating actions, such as the
attempted transfer of high value customer data or sensitive design documents. Or, the system
can escalate an incident to a workflow management system for immediate attention. Or, you
can set a different severity level for an incident involving 1000 customer records than for one
involving only 10 records.
See “Implementing response rules” on page 1758.

About Smart Response rules


Users execute Smart Response rules on demand, in response to policy violations, from the Incident
Snapshot screen of the Enforce Server administration console.
See “About response rule actions” on page 1738.
You create Smart Response rules for situations that require human remediation. For example, you
might create a Smart Response rule to dismiss false positive incidents. An incident remediator can
review the incident, identify the match as a false positive, and dismiss it.
See “About configuring Smart Response rules” on page 1764.
Only some response rules are available for manual execution.

Table 47-12 Available Smart Response rules for manual execution

Smart response rule Description

Add Note Add a field to the incident record that the remediator can annotate at the
Incident Snapshot screen.

See “Configuring the Add Note action” on page 1782.

Log to a Syslog Server Log the incident to a syslog server for workflow remediation.

See “Configuring the Log to a Syslog Server action” on page 1785.

Quarantine Quarantine sensitive data in cloud applications.


Restore File Restore a previously quarantined cloud application file.

Send Email Notification Send an email you compose to recipients you specify.

See “Configuring the Send Email Notification action” on page 1786.

Server FlexResponse Execute a custom Server FlexResponse action.

See “Configuring the Server FlexResponse action” on page 1788.


Note: This response rule action is available only if you deploy one or more
custom Server FlexResponse plug-ins to Symantec Data Loss Prevention.

See “Deploying a Server FlexResponse plug-in” on page 2143.

Set Status Set the incident status to the specified value.

See “Configuring the Set Status action” on page 1790.

Network Protect SharePoint Quarantine Quarantine sensitive data stored on a Microsoft SharePoint
server.

See “Configuring the Network Protect: SharePoint Quarantine smart response action” on page 1793.

Network Protect SharePoint Release from Quarantine Release sensitive files that were quarantined
from a Microsoft SharePoint server.

See “Configuring the Network Protect: SharePoint Release from Quarantine smart response action”
on page 1795.

See “Implementing response rules” on page 1758.

About response rule conditions


Response rule conditions are optional response rule components. Conditions define how and
when the system triggers response rule actions. Conditions give you multiple ways to prioritize
incoming incidents to focus remediation efforts and take appropriate response.
See “Implementing response rules” on page 1758.
Response rule conditions trigger action based on detection match criteria. For example, you
can configure a condition to trigger action for high severity incidents, certain types of incidents,
or after a specified number of incidents.
See “Configuring response rule conditions” on page 1764.
Conditions are not required. If a response rule does not declare a condition, the response rule
action always executes each time an incident occurs. If a condition is declared, it must be met
for the action to trigger. If more than one condition is declared, all must be met for the system
to take action.
See “Configuring response rules” on page 1763.
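
The condition logic described above can be modeled in a few lines. The following Python sketch is illustrative only and is not part of Symantec Data Loss Prevention: the `ResponseRule` class, the incident dictionary keys (`severity`, `match_count`), and the helper functions are hypothetical names chosen for this example. It shows that a rule with no conditions always triggers, and that a rule with several conditions triggers only when every condition is met.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical incident representation; the keys are illustrative only.
Incident = Dict[str, Any]
Condition = Callable[[Incident], bool]


@dataclass
class ResponseRule:
    """Hypothetical model of a response rule and its optional conditions."""
    name: str
    conditions: List[Condition] = field(default_factory=list)

    def should_trigger(self, incident: Incident) -> bool:
        # all() over an empty list is True, so a rule without
        # conditions triggers for every incident.
        return all(condition(incident) for condition in self.conditions)


def severity_is(level: str) -> Condition:
    """Build a condition that matches a single severity level."""
    return lambda incident: incident.get("severity") == level


def match_count_greater_than(threshold: int) -> Condition:
    """Build a condition that matches when the match count exceeds threshold."""
    return lambda incident: incident.get("match_count", 0) > threshold


always = ResponseRule("Always notify")
strict = ResponseRule("Escalate", [severity_is("High"), match_count_greater_than(15)])

print(always.should_trigger({"severity": "Low", "match_count": 1}))    # True
print(strict.should_trigger({"severity": "High", "match_count": 16}))  # True
print(strict.should_trigger({"severity": "High", "match_count": 15}))  # False
```

The last call returns False because only one of the two declared conditions is met.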

Table 47-13 Available response rule conditions

Condition type Description

Endpoint Location Triggers a response action when the endpoint is on or off the corporate network.

See “Configuring the Endpoint Location response condition” on page 1771.

Endpoint Device Triggers a response action when an event occurs on a configured endpoint
device.

See “Configuring the Endpoint Device response condition” on page 1772.

Incident Type Triggers a response action when the specified type of detection server reports
a match.

See “Configuring the Incident Type response condition” on page 1773.

Incident Match Count Triggers a response action when the volume of policy violations exceeds a
threshold or range.

See “Configuring the Incident Match Count response condition” on page 1774.

Protocol or Endpoint Monitoring Triggers a response action when an incident is detected on a specified network
communications protocol (such as HTTP) or endpoint destination (such as
CD/DVD).

See “Configuring the Protocol or Endpoint Monitoring response condition” on page 1775.

Severity Triggers a response action when the policy violation is a certain severity level.

See “Configuring the Severity response condition” on page 1778.

About response rule action execution priority


A Symantec Data Loss Prevention server executes response rule actions according to a
system-defined prioritized order. You cannot modify the order of execution among response
rules of different types.
In all cases, when a server executes two or more different response rules for the same policy,
the higher priority response action takes precedence.
Consider the following examples:
■ One endpoint response rule lets a user cancel an attempted file copy and another rule
blocks the attempt.
The detection server blocks the file copy.


■ One network response rule action copies a file and another action quarantines it.
The detection server quarantines the file.
■ One network response rule action modifies the content of an email message and another
action blocks the transmission.
The detection server blocks the email transmission.
You cannot change the execution priority among different response rule action types. However,
you can modify the order of execution for response rule actions of the same type that have
conflicting instructions.
See “Modifying response rule ordering” on page 1769.
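
The precedence behavior in the examples above can be pictured as a lookup against a fixed ranking. The following Python sketch is a simplified illustration, not the product's implementation: the `EXECUTION_PRIORITY` list is only a subset of the action names in Table 47-14 (in the same relative order), and the `winning_action` function is a hypothetical helper.

```python
# Subset of the action types in Table 47-14, from highest to lowest priority.
EXECUTION_PRIORITY = [
    "Endpoint Prevent: Block",
    "Endpoint Prevent: Encrypt",
    "Endpoint Prevent: User Cancel",
    "Endpoint Prevent: Notify",
    "Network Prevent: Block SMTP Message",
    "Network Prevent: Modify SMTP Message",
    "Network Protect: Quarantine File",
    "Network Protect: Encrypt File",
    "Network Protect: Copy File",
]

# Lower rank number means higher priority.
RANK = {action: rank for rank, action in enumerate(EXECUTION_PRIORITY)}


def winning_action(actions):
    """Return the action with the highest system-defined priority."""
    return min(actions, key=RANK.__getitem__)


print(winning_action(["Endpoint Prevent: User Cancel", "Endpoint Prevent: Block"]))
# Endpoint Prevent: Block
print(winning_action(["Network Protect: Copy File", "Network Protect: Quarantine File"]))
# Network Protect: Quarantine File
print(winning_action(["Network Prevent: Modify SMTP Message",
                      "Network Prevent: Block SMTP Message"]))
# Network Prevent: Block SMTP Message
```

The three calls mirror the three examples above: block wins over user cancel, quarantine wins over copy, and blocking an email wins over modifying it.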

Table 47-14 System-defined response rule execution priority

Execution priority (from highest to lowest) Description

Endpoint Prevent: Block See “Configuring the Endpoint Prevent: Block action”
on page 1817.

Endpoint Prevent: Encrypt See “Configuring the Endpoint Prevent: Encrypt action”
on page 1821.

Endpoint Prevent: User Cancel See “Configuring the Endpoint Prevent: User Cancel action”
on page 1828.

Endpoint: FlexResponse See “Configuring the Endpoint: FlexResponse action” on page 1813.

Endpoint Prevent: Notify See “Configuring the Endpoint Prevent: Notify action”
on page 1825.

Endpoint Discover: Quarantine File See “Configuring the Endpoint Discover: Quarantine File action”
on page 1815.

All: Limit Incident Data Retention See “Configuring the Limit Incident Data Retention action”
on page 1783.

Network Prevent: Block SMTP Message See “Configuring the Network Prevent: Block SMTP Message
action” on page 1832.

Network Prevent: Modify SMTP Message See “Configuring the Network Prevent: Modify SMTP Message action” on page 1833.

Network Prevent for Web: Remove HTTP/HTTPS Content See “Configuring the Network Prevent for Web: Remove HTTP/S Content action” on page 1835.

Network Prevent for Web: Block HTTP/HTTPS See “Configuring the Network Prevent for Web: Block HTTP/S action” on page 1831.

Network Prevent for Web: Block FTP Request See “Configuring the Network Prevent for Web: Block FTP Request action” on page 1831.

Network Protect: Quarantine File See “Configuring the Network Protect: Quarantine File action”
on page 1837.

Network Protect: Encrypt File See “Configuring the Network Protect: Encrypt File action”
on page 1838.

Network Protect: Copy File See “Configuring the Network Protect: Copy File action”
on page 1836.

All: Set Status See “Configuring the Set Status action” on page 1790.

All: Set Attribute See “Configuring the Set Attribute action” on page 1789.

All: Add Note See “Configuring the Add Note action” on page 1782.

All: Log to a Syslog Server See “Configuring the Log to a Syslog Server action” on page 1785.

All: Send Email Notification See “Configuring the Send Email Notification action”
on page 1786.

Cloud Storage: Add Visual Tag See “Configuring the Cloud Storage: Add Visual Tag action”
on page 1791.

Cloud Storage: Quarantine See “Configuring the Cloud Storage: Quarantine action”
on page 1791.

Server FlexResponse See “Configuring the Server FlexResponse action” on page 1788.
Note: Server FlexResponse actions that are part of Automated
Response rules execute on the Enforce Server, rather than the
detection server.

Cloud Applications and API appliance (Data-in-Motion): Block Data-in-Motion See “Configuring the Block Data-in-Motion action” on page 1808.

Cloud Applications and API appliance (Data-in-Motion): Redact Data-in-Motion See “Configuring the Redact Data-in-Motion action” on page 1812.

Cloud Applications and API appliance (Data-in-Motion): Encrypt Data-in-Motion See “Configuring the Encrypt Data-in-Motion action” on page 1810.

Cloud Applications and API appliance (Data-in-Motion): Quarantine Data-in-Motion See “Configuring the Quarantine Data-in-Motion action” on page 1811.

Cloud Applications and API appliance (Data-in-Motion): Perform DRM on Data-in-Motion See “Configuring the Perform DRM on Data-in-Motion action” on page 1810.

Cloud Applications and API appliance (Data-in-Motion): Custom Action on Data-in-Motion See “Configuring the Custom Action on Data-in-Motion action” on page 1809.

Cloud Applications and API appliance (Data-at-Rest): Encrypt Data-at-Rest See “Configuring the Encrypt Data-at-Rest action” on page 1799.

Cloud Applications and API appliance (Data-at-Rest): Delete Data-at-Rest See “Configuring the Delete Data-at-Rest action” on page 1799.

Cloud Applications and API appliance (Data-at-Rest): Quarantine Data-at-Rest See “Configuring the Quarantine Data-at-Rest action” on page 1801.

Cloud Applications and API appliance (Data-at-Rest): Tag Data-at-Rest See “Configuring the Tag Data-at-Rest action” on page 1802.

Cloud Applications and API appliance (Data-at-Rest): Perform DRM on Data-at-Rest See “Configuring the Perform DRM on Data-at-Rest action” on page 1800.

Cloud Applications and API appliance (Data-at-Rest): Break Links in Data-at-Rest See “Configuring the Remove Shared Links in Data-at-Rest action” on page 1802.

Cloud Applications and API appliance (Data-at-Rest): Custom Action on Data-at-Rest See “Configuring the Custom Action on Data-at-Rest action” on page 1798.

Cloud Applications and API appliance (Additional Data-at-Rest Actions): Set File Access to 'All Read' See “Configuring the Set File Access to 'All Read' action” on page 1806.

Cloud Applications and API appliance (Additional Data-at-Rest Actions): Prevent download, copy, print See “Configuring the Prevent download, copy, print action” on page 1803.

Cloud Applications and API appliance (Additional Data-at-Rest Actions): Set File Access to 'Internal Read' See “Configuring the Set File Access to 'Internal Read' action” on page 1807.

Cloud Applications and API appliance (Additional Data-at-Rest Actions): Set File Access to 'Internal Edit' See “Configuring the Set File Access to 'Internal Edit'” on page 1806.

Cloud Applications and API appliance (Additional Data-at-Rest Actions): Set Collaborator Access to 'Read' See “Configuring the Set Collaborator Access to 'Read' action” on page 1805.

Cloud Applications and API appliance (Additional Data-at-Rest Actions): Set Collaborator Access to 'Edit' See “Configuring the Set Collaborator Access to 'Edit' action” on page 1804.

Cloud Applications and API appliance (Additional Data-at-Rest Actions): Remove Collaborator Access See “Configuring the Remove Collaborator Access action” on page 1804.

Cloud Applications and API appliance (Additional Data-at-Rest Actions): Set Collaborator Access to 'Preview' See “Configuring the Set Collaborator Access to 'Preview' action” on page 1805.

Cloud Applications and API appliance (Data-in-Motion): Add two-factor authentication See “Configuring the Add two-factor authentication action” on page 1808.

See “Implementing response rules” on page 1758.


See “Manage response rules” on page 1761.

About response rule authoring privileges


To manage and create response rules, you must be assigned to a role with response rule
authoring privileges. To add a response rule to a policy, you must have policy authoring
privileges.
See “Policy authoring privileges” on page 375.
For business reasons, you may want to grant response rule authoring and policy authoring
privileges to the same role. Or, you may want to keep these roles separate.
See “About recommended roles for your organization” on page 111.
If you log on to the system as a user without response rule authoring privileges, the Manage
> Policies > Response Rules screen is not available.
See “About role-based access control” on page 109.

Implementing response rules


You define response rules independent of policies.
See “About response rules” on page 1738.
You must have response rule authoring privileges to create and manage response rules.
See “About response rule authoring privileges” on page 1757.

Table 47-15 Workflow for implementing policy response rules

Step Action Description

1 Review the available response rules. The Manage > Policies > Response Rules screen displays all configured response rules.

See “Manage response rules” on page 1761.

The solution pack for your system provides configured response rules. You can use these response rules in your policies as they exist, or you can modify them.

See “Solution packs” on page 372.

2 Decide the type of response rule to implement: Smart, Automated, or both. Decide the type of response rules based on your business requirements.

See “About response rule execution types” on page 1750.

3 Determine the type of actions you want to implement and any triggering conditions. See “About response rule conditions” on page 1752.

See “About response rule actions” on page 1738.

4 Understand the order of precedence among response rule actions of different types and of the same type. See “About response rule action execution priority” on page 1753.

See “Modifying response rule ordering” on page 1769.

5 Integrate the Enforce Server with an external system (if required for the response rule). Some response rules may require integration with external systems.
These may include:

■ A SIEM system for the Log to a Syslog Server response rule.
■ An SMTP email server for the Send Email Notification response rule.
■ A Web proxy host for Network Prevent for Web response rules.
■ An MTA for Network Prevent for Email response rules.

6 Add a new response rule. See “Adding a new response rule” on page 1762.

7 Configure response rules. See “Configuring response rules” on page 1763.

8 Configure one or more response rule conditions (optional). See “Configuring response rule conditions” on page 1764.

9 Configure one or more response rule actions (required). You must define at least one action for a valid response rule.

See “Configuring response rule actions” on page 1765.

The action executes when a policy violation is reported or when a response rule condition is matched.

10 Add response rules to policies. You must have policy authoring privileges to add response rules to policies.

See “Adding an automated response rule to a policy” on page 442.

Response rule best practices


When implementing response rules, consider the following:
■ Response rules are not required for policy execution. In general it is best to implement and
fine-tune your policy rules and exceptions before you implement response rules. Once you
achieve the desired policy detection results, you can then implement and refine response
rules.
■ Response rules require at least one rule action; a condition is optional. If you do not
implement a condition, the action always executes when an incident is reported. If you
configure more than one response rule condition, all conditions must match for the response
rule action to trigger.
See “About response rule actions” on page 1738.


■ Response rule conditions are derived from policy rules. Understand the type of rule and
exception conditions that the policy implements when you configure response rule conditions.
The system evaluates the response rule condition based on how the policy rule counts
matches.
See “Policy matching conditions” on page 386.
■ The system displays only the response rule name for policy authors to select when they
add response rules to policies. Be sure to provide a descriptive name that helps policy
authors identify the purpose of the response rule.
See “Configuring policies” on page 413.
■ You cannot combine an Endpoint Prevent: Notify or Endpoint Prevent: Block response rule
action with EDM, IDM, or DGM detection methods. If you do, the system displays a warning
that the policy is misconfigured.
See “Manage and add policies” on page 432.
■ If you combine multiple response rules in a single policy, make sure that you understand
the order of precedence among response rules.
See “About response rule action execution priority” on page 1753.
■ Use Smart Response rules only where it is appropriate for human intervention.
See “About configuring Smart Response rules” on page 1764.
■ When sensitive files are encrypted using Symantec Information Centric Encryption, the
original file is replaced with an HTML file of the same name. You must update all existing
links and references so that they point to the new HTML file.
■ Microsoft SharePoint enables users to upload HTML files that are no larger than 256 MB
in size. To ensure that sensitive files in SharePoint can be encrypted successfully, do not
upload files that are 256 MB in size or greater.
See “Configuring the Server FlexResponse action” on page 1788.
■ If you configure multiple Server FlexResponse response rule actions for Microsoft SharePoint
scan targets, the response rule actions could be executed in order of response rule action
priority.
See “About response rule action execution priority” on page 1753.
Chapter 48
Configuring and managing
response rules
This chapter includes the following topics:

■ Manage response rules

■ Adding a new response rule

■ Configuring response rules

■ About configuring Smart Response rules

■ Configuring response rule conditions

■ Configuring response rule actions

■ Modifying response rule ordering

■ About removing response rules

Manage response rules


The Manage > Policies > Response Rules screen is the home page for managing response
rules, and the starting point for adding new ones.
See “About response rules” on page 1738.
You must have response rule authoring privileges to manage and add response rules.
See “About response rule authoring privileges” on page 1757.

Table 48-1 Response Rules screen actions

Action Description

Add Response Rule Click Add Response Rule to define a new response rule.
See “Adding a new response rule” on page 1762.

Modify Response Rule Order Click Modify Response Rule Order to modify the response rule order of precedence.

See “Modifying response rule ordering” on page 1769.

Edit an existing response rule Click the response rule to modify it.

See “Configuring response rules” on page 1763.

Delete an existing response rule Click the red X icon at the far right of the response rule to delete it.

You must confirm the operation before deletion occurs.

See “About removing response rules” on page 1770.

Refresh the list Click the refresh arrow icon at the upper right of the Response Rules screen to fetch the latest status of the rule.

Table 48-2 Response Rules screen display

Display column Description

Order The Order of precedence when more than one response rule is configured.

See “Modifying response rule ordering” on page 1769.

Rule The Name of the response rule.

See “Configuring response rules” on page 1763.

Actions The type of Action the response rule can take to respond to an incident (required).

See “Configuring response rule actions” on page 1765.

Conditions The Condition that triggers the response rule (if any).

See “Configuring response rule conditions” on page 1764.

See “Implementing response rules” on page 1758.

Adding a new response rule


Add a new response rule from the Manage > Policies > Response Rules > New Response
Rule screen.
See “About response rules” on page 1738.

To add a new response rule


1 Click Add Response Rule at the Manage > Policies > Response Rules screen.
See “Manage response rules” on page 1761.
2 At the New Response Rule screen, select one of the following options:
■ Automated Response
The system automatically executes the response action as the server evaluates
incidents (default option).
See “About Automated Response rules” on page 1751.
■ Smart Response
An authorized user executes the response action from the Incident Snapshot screen
in the Enforce Server administration console.
See “About Smart Response rules” on page 1751.

3 Click Next to configure the response rule.


See “Configuring response rules” on page 1763.
See “Implementing response rules” on page 1758.

Configuring response rules


You configure response rules at the Manage > Policies > Response Rules > Configure
Response Rule screen.
See “About response rules” on page 1738.
To configure a response rule
1 Add a new response rule, or modify an existing one.
See “Adding a new response rule” on page 1762.
See “Manage response rules” on page 1761.
2 Enter a response Rule Name and Description.
3 Optionally, define one or more Conditions to dictate when the response rule executes.
See “Configuring response rule conditions” on page 1764.
If no condition is declared, the response rule action always executes when there is a
match (assuming that the detection rule is set the same).
Skip this step if you selected the Smart Response rule option.
See “About configuring Smart Response rules” on page 1764.
4 Select and configure one or more Actions. You must define at least one action.
See “Configuring response rule actions” on page 1765.
5 Click Save to save the response rule definition.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

About configuring Smart Response rules


When implementing Smart Response rules, consider the following:
■ Smart Response rules are best suited for the incidents that warrant user review to determine
if any response action is required.
If you do not want user involvement in triggering a response rule action, use Automated
Response rules instead.
■ You cannot configure any triggering conditions with Smart Response rules.
Authorized users decide when a detection incident warrants a response.
■ You are limited in the actions you can take with Smart Response rules (note, log, email,
status).
If you need to block or modify an action, use Automated Response rules.
See “About Smart Response rules” on page 1751.
See “Implementing response rules” on page 1758.

Configuring response rule conditions


You can add one or more conditions to a response rule. An incident must meet all response
rule conditions before the system executes any response rule actions.
See “About response rule conditions” on page 1752.
To configure a response rule condition
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Click Add Condition to add a new condition.
Conditions are optional and based on detection rule matches. Each type of response rule
condition performs a different function.
See “About response rule conditions” on page 1752.
3 Choose the condition type from the Conditions list.
For example, select the condition Incident Match Count and Is Greater Than and enter
15 in the text box. This condition triggers the response rule action when an incident has
more than 15 policy violation matches.
4 To add another condition, click Add Condition and repeat the process.
If any condition does not match, no action is taken.
5 Click Save to save the condition.
Click Cancel to discard the condition and return to the previous screen.
Click the red X icon beside the condition to delete it from the response rule.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring response rule actions


You must configure at least one action for the response rule to be valid. You can configure
multiple response rule actions. Each action is evaluated independently.
See “Implementing response rules” on page 1758.
To define a response rule action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Choose an action type from the Actions list and click Add Action.
For example, add the All: Add Note action to the response rule. This action lets the
remediator annotate the incident.
3 Configure the action type by specifying the expected parameters for the chosen action
type.
See Table 48-3 on page 1766.
4 Repeat these steps for each action you want to add.
If you add additional actions, consider the execution order and possible modification of
similar types.
See “Modifying response rule ordering” on page 1769.
5 Click Save to save the response rule.
See “Manage response rules” on page 1761.

Table 48-3 Configure a response rule action

Incident type Response rule Description

All Add Note See “Configuring the Add Note action” on page 1782.

All Limit Incident Data Retention See “Configuring the Limit Incident Data Retention action” on page 1783.

All Log to a Syslog Server See “Configuring the Log to a Syslog Server action” on page 1785.

All Send Email Notification See “Configuring the Send Email Notification action” on page 1786.

All Server FlexResponse See “Configuring the Server FlexResponse action” on page 1788.

All Set Attribute See “Configuring the Set Attribute action” on page 1789.

All Set Status See “Configuring the Set Status action” on page 1790.

Cloud Storage Add Visual Tag See “Configuring the Cloud Storage: Add Visual Tag action”
on page 1791.

Cloud Storage Quarantine See “Configuring the Cloud Storage: Quarantine action” on page 1791.

Applications: Data-at-Rest (DAR) Break Links in Data-at-Rest See “Configuring the Remove Shared Links in Data-at-Rest action” on page 1802.

Applications: Data-at-Rest (DAR) Custom Action on Data-at-Rest See “Configuring the Custom Action on Data-at-Rest action” on page 1798.

Applications: Data-at-Rest (DAR) Delete Data-at-Rest See “Configuring the Delete Data-at-Rest action” on page 1799.

Applications: Data-at-Rest (DAR) Encrypt Data-at-Rest See “Configuring the Encrypt Data-at-Rest action” on page 1799.

Applications: Data-at-Rest (DAR) Perform DRM on Data-at-Rest See “Configuring the Perform DRM on Data-at-Rest action” on page 1800.

Applications: Data-at-Rest (DAR) Quarantine Data-at-Rest See “Configuring the Quarantine Data-at-Rest action” on page 1801.

Applications: Data-at-Rest (DAR) Tag Data-at-Rest See “Configuring the Tag Data-at-Rest action” on page 1802.

Applications: Data-in-Motion (DIM) Add two-factor authentication See “Configuring the Add two-factor authentication action” on page 1808.

Applications: Data-in-Motion (DIM) Block Data-in-Motion See “Configuring the Block Data-in-Motion action” on page 1808.

Applications: Data-in-Motion (DIM) Custom Action on Data-in-Motion See “Configuring the Custom Action on Data-in-Motion action” on page 1809.

Applications: Data-in-Motion (DIM) Encrypt Data-in-Motion See “Configuring the Encrypt Data-in-Motion action” on page 1810.

Applications: Data-in-Motion (DIM) Perform DRM on Data-in-Motion See “Configuring the Perform DRM on Data-in-Motion action” on page 1810.

Applications: Data-in-Motion (DIM) Quarantine Data-in-Motion See “Configuring the Quarantine Data-in-Motion action” on page 1811.

Applications: Data-in-Motion (DIM) Redact Data-in-Motion See “Configuring the Redact Data-in-Motion action” on page 1812.

Applications: Data-at-Rest (DAR) Prevent download, copy, print See “Configuring the Prevent download, copy, print action” on page 1803.

Applications: Data-at-Rest (DAR) Remove Collaborator Access See “Configuring the Remove Collaborator Access action” on page 1804.

Applications: Data-at-Rest (DAR) Set Collaborator Access to 'Edit' See “Configuring the Set Collaborator Access to 'Edit' action” on page 1804.

Applications: Data-at-Rest (DAR) Set Collaborator Access to 'Preview' See “Configuring the Set Collaborator Access to 'Preview' action” on page 1805.

Applications: Data-at-Rest (DAR) Set Collaborator Access to 'Read' See “Configuring the Set Collaborator Access to 'Read' action” on page 1805.

Applications: Data-at-Rest (DAR) Set File Access to 'All Read' See “Configuring the Set File Access to 'All Read' action” on page 1806.

Applications: Data-at-Rest (DAR) Set File Access to 'Internal Edit' See “Configuring the Set File Access to 'Internal Edit'” on page 1806.

Applications: Data-at-Rest (DAR) Set File Access to 'Internal Read' See “Configuring the Set File Access to 'Internal Read' action” on page 1807.

Endpoint FlexResponse See “Configuring the Endpoint: FlexResponse action” on page 1813.

Endpoint ICT Classification And Tagging See “Configuring the Endpoint: ICT Classification And Tagging action” on page 1814.

Endpoint Discover Information Centric Defense

Endpoint Discover Quarantine File See “Configuring the Endpoint Discover: Quarantine File action” on page 1815.

Endpoint Prevent Block See “Configuring the Endpoint Prevent: Block action” on page 1817.

Endpoint Prevent Encrypt See “Configuring the Endpoint Prevent: Encrypt action” on page 1821.

Endpoint Prevent Notify See “Configuring the Endpoint Prevent: Notify action” on page 1825.

Endpoint Prevent User Cancel See “Configuring the Endpoint Prevent: User Cancel action” on page 1828.

Network Prevent for Web Block FTP Request See “Configuring the Network Prevent for Web: Block FTP Request action” on page 1831.

Network Prevent for Web Block HTTP/S See “Configuring the Network Prevent for Web: Block HTTP/S action” on page 1831.

Network Prevent for Email Block SMTP Message See “Configuring the Network Prevent: Block SMTP Message action” on page 1832.

Network Prevent for Email Modify SMTP Message See “Configuring the Network Prevent: Modify SMTP Message action” on page 1833.

Network Prevent for Web Remove HTTP/S Content See “Configuring the Network Prevent for Web: Remove HTTP/S Content action” on page 1835.

Network Protect Copy File See “Configuring the Network Protect: Copy File action” on page 1836.

Network Protect Quarantine File See “Configuring the Network Protect: Quarantine File action”
on page 1837.

Network Protect Encrypt File See “Configuring the Network Protect: Encrypt File action” on page 1838.

See “Implementing response rules” on page 1758.

Modifying response rule ordering


You cannot change the system-defined execution priority for different types of response rule
actions. However, you can modify the order of execution for response rule actions of the same
type that have conflicting instructions.
See “About response rule action execution priority” on page 1753.
For example, consider a scenario where you include two response rules in a policy. Each
response rule implements a Limit Incident Data Retention action. One action discards all
attachments and the other action discards only those attachments that are not violations. In
this case, when the policy is violated, the detection server looks to the response rule order
priority to determine which action takes precedence. This type of ordering is configurable.
To modify response rule action ordering
1 Navigate to the Manage > Policies > Response Rules screen.
See “Manage response rules” on page 1761.
2 Note the Order column and number beside each configured response rule.
By default, the system sorts the list of response rules by the Order column, from highest
priority (1) to lowest. Initially, the system orders the response rules in the order in
which they are created. You can modify this order.
3 To enable modification mode, click Modify Response Rule Order.
The Order column now displays a drop-down menu for each response rule.
4 To modify the ordering, for each response rule you want to reorder, select the desired
order priority from the drop-down menu.
For example, for a response rule with order priority of 2, you can modify it to be 1 (highest
priority).
Modifying an order number moves that response rule to its modified position in the list
and updates all other response rules.
5 Click Save to save the modifications to the response rule ordering.
6 Repeat these steps as necessary to achieve the desired results.
See “Implementing response rules” on page 1758.
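
The effect of reordering described in the steps above can be sketched in a few lines of Python. This is an illustrative model only, not the Enforce Server implementation: the function names and rule names are hypothetical, and the assumption that the other rules simply shift to make room is inferred from the statement that changing an order number updates all other response rules.

```python
def move_rule(ordered_names, name, new_order):
    """Return a new priority-ordered list with `name` moved to 1-based position `new_order`."""
    if name not in ordered_names:
        raise ValueError(f"unknown response rule: {name}")
    if not 1 <= new_order <= len(ordered_names):
        raise ValueError("order must be between 1 and the number of rules")
    # Remove the rule, then reinsert it; every other rule keeps its relative order.
    remaining = [n for n in ordered_names if n != name]
    remaining.insert(new_order - 1, name)
    return remaining


def winning_rule(ordered_names, conflicting):
    """Return the conflicting rule that appears first, that is, with the highest priority."""
    for n in ordered_names:
        if n in conflicting:
            return n
    return None


# Hypothetical rule names, listed in their initial (creation) order.
rules = ["Discard all attachments", "Discard non-violating attachments", "Notify manager"]
reordered = move_rule(rules, "Discard non-violating attachments", 1)
for order, rule in enumerate(reordered, start=1):
    print(order, rule)
```

In this model, the rule moved to order 1 takes precedence over the other Limit Incident Data Retention rule whenever both apply to the same violation.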

About removing response rules


You can delete response rules at the Manage > Policies > Response Rules screen.
See “Manage response rules” on page 1761.
When deleting a response rule, consider the following:
■ A user must have response rule authoring privileges to delete an existing response rule.
■ A response rule author cannot delete an existing response rule while another user modifies
it.
■ A response rule author cannot delete a response rule if a policy declares that response
rule. In this case you must remove the response rule from all policies that declare the
response rule before you can delete it.
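
The three deletion constraints listed above can be expressed as a simple pre-check. The following Python sketch is a hypothetical model for reasoning about those constraints; the `Policy` class, the function name, and the sample policy and rule names are assumptions and do not correspond to any product API.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical stand-in for a policy and the response rules it declares."""
    name: str
    response_rules: set = field(default_factory=set)


def deletion_blockers(rule, user_is_author, being_edited_by, policies):
    """Return the reasons a response rule cannot be deleted (empty list if none)."""
    blockers = []
    if not user_is_author:
        blockers.append("user lacks response rule authoring privileges")
    if being_edited_by is not None:
        blockers.append(f"rule is being modified by {being_edited_by}")
    declaring = sorted(p.name for p in policies if rule in p.response_rules)
    if declaring:
        blockers.append("rule is declared by policies: " + ", ".join(declaring))
    return blockers


policies = [Policy("PCI", {"Block SMTP"}), Policy("HIPAA", {"Notify"})]
print(deletion_blockers("Block SMTP", True, None, policies))
```

An empty result means that none of the listed constraints applies, so the rule can be deleted.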
Chapter 49
Response rule conditions
This chapter includes the following topics:

■ Configuring the Endpoint Location response condition

■ Configuring the Endpoint Device response condition

■ Configuring the Incident Type response condition

■ Configuring the Incident Match Count response condition

■ Configuring the Protocol or Endpoint Monitoring response condition

■ Configuring the SEP Intensity Level response condition

■ Configuring the Severity response condition

Configuring the Endpoint Location response condition


The Endpoint Location condition triggers response rule action based on the connection status
of the DLP Agent when an endpoint policy is violated.
See “About response rule conditions” on page 1752.

Note: This condition is specific to endpoint incidents. You should not implement this condition
for Network or Discover incidents. If you do, the response rule action does not execute.

To configure the Endpoint Location condition


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Select the Endpoint Location condition from the Conditions list.
See “Configuring response rule conditions” on page 1764.
3 Select the endpoint location requirements to trigger actions.
See Table 49-1 on page 1772.

Table 49-1 Endpoint Location condition parameters

Qualifier Condition Description

Is Any Of Off the corporate This combination triggers a response rule action if an incident occurs when the
network endpoint is off the corporate network.

Is None Of Off the corporate This combination does not trigger a response rule action if an incident occurs
network when the endpoint is off the corporate network.

Is Any Of On the corporate This combination triggers a response rule action if an incident occurs when the
network endpoint is on the corporate network.

Is None Of On the corporate This combination does not trigger a response rule action if an incident occurs
network when the endpoint is on the corporate network.

See “Implementing response rules” on page 1758.


See “Manage response rules” on page 1761.
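
The qualifiers in Table 49-1 behave like set membership tests, and the same Is Any Of and Is None Of pattern recurs in the other condition tables in this chapter. The Python sketch below illustrates that logic under the assumption that a condition compares one observed value against the values selected in the rule; the function name is hypothetical and the code is not taken from the product.

```python
def condition_allows_action(qualifier, selected_values, actual_value):
    """Return True if the response rule action should trigger for `actual_value`."""
    if qualifier == "Is Any Of":
        return actual_value in selected_values
    if qualifier == "Is None Of":
        return actual_value not in selected_values
    raise ValueError(f"unsupported qualifier: {qualifier}")


selected = {"Off the corporate network"}
for location in ("Off the corporate network", "On the corporate network"):
    for qualifier in ("Is Any Of", "Is None Of"):
        print(qualifier, "|", location, "|", condition_allows_action(qualifier, selected, location))
```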

Configuring the Endpoint Device response condition


The Endpoint Device condition triggers response rule action when an incident is detected from
one or more configured endpoint devices.
See “About response rule conditions” on page 1752.
You configure endpoint devices at the System > Agents > Endpoint Devices screen.
See “About endpoint device detection” on page 917.

Note: This condition is specific to endpoint incidents. You should not implement this condition
for Network or Discover incidents. If you do, the response rule action does not execute.

To configure the Endpoint Device response condition


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Select the Endpoint Device condition from the Conditions list.
See “Configuring response rule conditions” on page 1764.
3 Choose whether to detect or exclude specific endpoint devices.
See Table 49-2 on page 1773.

Table 49-2 Endpoint Device condition parameters

Qualifier Condition Description

Is Any Of Configured Triggers a response rule action when an incident is detected on a configured
device endpoint device.

Is None Of Configured Does not trigger (excludes from executing) a response rule action when an incident
device is detected on a configured endpoint device.

See “Implementing response rules” on page 1758.


See “Manage response rules” on page 1761.

Configuring the Incident Type response condition


The Incident Type condition triggers a response rule action based on the type of detection
server that reports the incident.
See “About response rule conditions” on page 1752.
To configure the Incident Type condition
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Choose the Incident Type condition from the Conditions list.
See “Configuring response rule conditions” on page 1764.
3 Select one or more incident types.
Use the Ctrl key to select multiple types.
See Table 49-3 on page 1774.

Table 49-3 Incident Type condition parameters

Parameter   Server                        Description

Is Any Of   Cloud Detection Service or    Triggers a response rule action for any incident detected by the Cloud
            API Detection for Developer   Detection Service or API Detection for Developer Apps Appliance.
            Apps Appliance

Is None Of  Cloud Detection Service or    Does not trigger a response rule action for any incident detected by the
            API Detection for Developer   Cloud Detection Service or API Detection for Developer Apps Appliance.
            Apps Appliance

Is Any Of   Discover                      Triggers a response rule action for any incident that Network Discover detects.

Is None Of  Discover                      Does not trigger a response rule action for any incident that Network Discover
                                          detects.

Is Any Of   Endpoint                      Triggers a response rule action for any incident that Endpoint Prevent detects.

Is None Of  Endpoint                      Does not trigger a response rule action for any incident that Endpoint Prevent
                                          detects.

Is Any Of   Network                       Triggers a response rule action for any incident that Network Prevent detects.

Is None Of  Network                       Does not trigger a response rule action for any incident that Network Prevent
                                          detects.

See “Implementing response rules” on page 1758.


See “Manage response rules” on page 1761.

Configuring the Incident Match Count response condition


The Incident Match Count condition triggers a response rule action based on the number of
policy violations reported.
See “About response rule conditions” on page 1752.

To configure the Incident Match Count condition


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Choose the Incident Match Count condition from the Conditions list.
See “Configuring response rule conditions” on page 1764.
3 In the text field, enter a numeric value that sets the threshold for triggering the
response rule.
For example, if you enter 15 with the Is Greater Than or Equals parameter, the response
rule triggers once 15 policy violations have been detected.
See Table 49-4 on page 1775.

Table 49-4 Incident Match Count condition parameters

Parameter                  Input                           Description

Is Greater Than            User-specified number           Triggers a response rule action if the number of incidents exceeds
                                                           the threshold.

Is Greater Than or Equals  User-specified number           Triggers a response rule action if the number of incidents meets or
                                                           exceeds the threshold.

Is Between                 User-specified pair of numbers  Triggers a response rule action when the number of incidents falls
                                                           within the specified range.

Is Less Than               User-specified number           Triggers a response rule action if the number of incidents is less
                                                           than the specified number.

Is Less Than or Equals     User-specified number           Triggers a response rule action when the number of incidents is equal
                                                           to or less than the specified number.

See “Implementing response rules” on page 1758.


See “Manage response rules” on page 1761.
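
The comparisons in Table 49-4 map directly onto ordinary numeric operators. The following Python sketch shows one way to model them; the function name is hypothetical, and treating Is Between as inclusive of both endpoints is an assumption, because the table does not say whether the boundary values count.

```python
import operator

# Single-threshold parameters from Table 49-4 and the comparison each one implies.
_COMPARATORS = {
    "Is Greater Than": operator.gt,
    "Is Greater Than or Equals": operator.ge,
    "Is Less Than": operator.lt,
    "Is Less Than or Equals": operator.le,
}


def match_count_triggers(parameter, count, threshold, upper=None):
    """Return True if `count` incidents satisfy the Incident Match Count condition."""
    if parameter == "Is Between":
        if upper is None:
            raise ValueError("Is Between needs a pair of numbers")
        # Assumption: both ends of the range are included.
        return threshold <= count <= upper
    try:
        compare = _COMPARATORS[parameter]
    except KeyError:
        raise ValueError(f"unsupported parameter: {parameter}") from None
    return compare(count, threshold)


for parameter in _COMPARATORS:
    print(parameter, "with threshold 15 and 15 matches:", match_count_triggers(parameter, 15, 15))
```

The loop makes the boundary behavior visible: with a threshold of 15 and exactly 15 matches, only the two "or Equals" parameters trigger.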

Configuring the Protocol or Endpoint Monitoring response condition


The Protocol or Endpoint Monitoring condition triggers action based on the protocol or the
endpoint destination, device, or application where the policy violation occurred.
See “About response rule conditions” on page 1752.

To configure the Protocol or Endpoint Monitoring condition


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Choose the Protocol or Endpoint Monitoring condition from the Conditions list.
See “Configuring response rule conditions” on page 1764.
3 Use the Ctrl key to select multiple conditions. Use the Shift key to select a range.
See Table 49-5 on page 1776.
The system lists any additional network protocols that you configure at the System >
Settings > Protocols screen.

Table 49-5 Protocol or Endpoint Destination condition parameters

Qualifier   Condition                          Description

Is Any Of   Endpoint Application File Access   Triggers an action if an endpoint application file has been accessed.

Is None Of  Endpoint Application File Access   Does not trigger an action if an endpoint application file has been accessed.

Is Any Of   Endpoint CD/DVD                    Triggers an action if an endpoint CD/DVD has been written to.

Is None Of  Endpoint CD/DVD                    Does not trigger an action if an endpoint CD/DVD has been written to.

Is Any Of   Endpoint Clipboard                 Triggers an action if the endpoint clipboard has been copied or pasted to.

Is None Of  Endpoint Clipboard                 Does not trigger an action if the endpoint clipboard has been copied or pasted to.

Is Any Of   Endpoint Copy to Network Share     Triggers an action if sensitive information is copied to or from a network share.

Is None Of  Endpoint Copy to Network Share     Does not trigger an action if sensitive information is copied to or from a
                                               network share.

Is Any Of   Endpoint Local Drive               Triggers an action if sensitive files are discovered on the local drive.

Is None Of  Endpoint Local Drive               Does not trigger an action if sensitive files are discovered on the local drive.

Is Any Of   Endpoint Printer/Fax               Triggers an action if content has been sent to an endpoint printer or fax.

Is None Of  Endpoint Printer/Fax               Does not trigger an action if content has been sent to an endpoint printer or fax.

Is Any Of   Endpoint Removable Storage Device  Triggers an action if sensitive data is copied to a removable storage device.

Is None Of  Endpoint Removable Storage Device  Does not trigger an action if sensitive data is copied to a removable storage
                                               device.

Is Any Of   FTP                                Triggers an action if sensitive data is copied through FTP.

Is None Of  FTP                                Does not trigger an action if sensitive data is copied through FTP.

Is Any Of   HTTP                               Triggers an action if sensitive data is sent through HTTP.

Is None Of  HTTP                               Does not trigger an action if sensitive data is sent through HTTP.

Is Any Of   HTTPS                              Triggers an action if sensitive data is sent through HTTPS.

Is None Of  HTTPS                              Does not trigger an action if sensitive data is sent through HTTPS.

Is Any Of   NNTP                               Triggers an action if sensitive data is sent through NNTP.

Is None Of  NNTP                               Does not trigger an action if sensitive data is sent through NNTP.

Is Any Of   SMTP                               Triggers an action if sensitive data is sent through SMTP.

Is None Of  SMTP                               Does not trigger an action if sensitive data is sent through SMTP.

See “Implementing response rules” on page 1758.


See “Manage response rules” on page 1761.

Configuring the SEP Intensity Level response condition


The SEP Intensity Level condition triggers a response rule action based on the SEP Intensive
Protection intensity level.
See “About the SEP Intensive Protection file reputation service” on page 2501.
See “About response rule conditions” on page 1752.

To configure the SEP Intensity Level condition


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Select the SEP Intensity Level condition from the Conditions list.
See “Configuring response rule conditions” on page 1764.
3 Select one or more SEP intensity levels.
See Table 49-6 on page 1778.
Use the Ctrl key to select multiple levels. Use the Shift key to select a range.

Table 49-6 SEP Intensity Level condition parameters

Qualifier Condition Description

Is Any Of Malicious Triggers a response rule action if the application requesting file access is flagged Malicious.

Is None Of Malicious Does not trigger a response rule action if the application requesting file access is flagged
Malicious.

Is Any Of Suspicious Triggers a response rule action if the application requesting file access is flagged Suspicious.

Is None Of Suspicious Does not trigger a response rule action if the application requesting file access is flagged
Suspicious.

Is Any Of Unproven Triggers a response rule action if the application requesting file access is flagged Unproven.

Is None Of Unproven Does not trigger a response rule action if the application requesting file access is flagged
Unproven.

See “Implementing response rules” on page 1758.


See “Manage response rules” on page 1761.

Configuring the Severity response condition


The Severity condition triggers a response rule action based on the severity of the policy rule
violation.
See “About response rule conditions” on page 1752.

To configure the Severity condition


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Select the Severity condition from the Conditions list.
See “Configuring response rule conditions” on page 1764.
3 Select one or more severity levels.
Use the Ctrl key to select multiple levels. Use the Shift key to select a range.
See Table 49-7 on page 1779.

Table 49-7 Severity condition parameters

Parameter Severity Description

Is Any Of High Triggers a response rule action when a detection rule with
severity set to high is matched.

Is None Of High Does not trigger a response rule action when a detection rule
with severity set to high is matched.

Is Any Of Medium Triggers a response rule action when a detection rule with
severity set to medium is matched.

Is None Of Medium Does not trigger a response rule action when a detection rule
with severity set to medium is matched.

Is Any Of Low Triggers a response rule action when a detection rule with
severity set to low is matched.

Is None Of Low Does not trigger a response rule action when a detection rule
with severity set to low is matched.

Is Any Of Info Triggers a response rule action when a detection rule with
severity set to info is matched.

Is None Of Info Does not trigger a response rule action when a detection rule
with severity set to info is matched.

See “Implementing response rules” on page 1758.


See “Manage response rules” on page 1761.
Chapter 50
Response rule actions
This chapter includes the following topics:

■ Configuring the Add Note action

■ Configuring the Encrypt Smart Response action

■ Configuring the Limit Incident Data Retention action

■ Configuring the Log to a Syslog Server action

■ Configuring the Send Email Notification action

■ Configuring the Server FlexResponse action

■ Configuring the Set Attribute action

■ Configuring the Set Status action

■ Configuring the Cloud Storage: Add Visual Tag action

■ Configuring the Cloud Storage: Quarantine action

■ Configuring the Quarantine Smart Response action

■ Configuring the Network Protect: SharePoint Quarantine smart response action

■ Configuring the Network Protect: SharePoint Release from Quarantine smart response
action

■ Configuring the Remove Collaborator Access Smart Response action

■ Configuring the Remove Shared Links Smart Response action

■ Configuring the Restore File Smart Response action

■ Configuring the Custom Action on Data-at-Rest action

■ Configuring the Delete Data-at-Rest action


■ Configuring the Encrypt Data-at-Rest action

■ Configuring the Perform DRM on Data-at-Rest action

■ Configuring the Quarantine Data-at-Rest action

■ Configuring the Remove Shared Links in Data-at-Rest action

■ Configuring the Tag Data-at-Rest action

■ Configuring the Prevent download, copy, print action

■ Configuring the Remove Collaborator Access action

■ Configuring the Set Collaborator Access to 'Edit' action

■ Configuring the Set Collaborator Access to 'Preview' action

■ Configuring the Set Collaborator Access to 'Read' action

■ Configuring the Set File Access to 'All Read' action

■ Configuring the Set File Access to 'Internal Edit'

■ Configuring the Set File Access to 'Internal Read' action

■ Configuring the Add two-factor authentication action

■ Configuring the Block Data-in-Motion action

■ Configuring the Custom Action on Data-in-Motion action

■ Configuring the Encrypt Data-in-Motion action

■ Configuring the Perform DRM on Data-in-Motion action

■ Configuring the Quarantine Data-in-Motion action

■ Configuring the Redact Data-in-Motion action

■ Configuring the Endpoint: FlexResponse action

■ Configuring the Endpoint: ICT Classification And Tagging action

■ Configuring the Endpoint Discover: Quarantine File action

■ Configuring the Endpoint Prevent: Block action

■ Configuring the Endpoint Prevent: Encrypt action

■ Configuring the Endpoint Prevent: Notify action

■ Configuring the Endpoint Prevent: User Cancel action


■ Configuring the Network Prevent for Web: Block FTP Request action

■ Configuring the Network Prevent for Web: Block HTTP/S action

■ Configuring the Network Prevent: Block SMTP Message action

■ Configuring the Network Prevent: Modify SMTP Message action

■ Configuring the Network Prevent for Web: Remove HTTP/S Content action

■ Configuring the Network Protect: Copy File action

■ Configuring the Network Protect: Quarantine File action

■ Configuring the Network Protect: Encrypt File action

Configuring the Add Note action


The Add Note response rule action lets an incident responder enter a note about a particular
incident.
The limit for the Add Note field is 4000 bytes.
See “About response rule actions” on page 1738.
The Add Note response rule action is available for all types of detection servers.
See “Response rule actions for all detection servers” on page 1739.
To configure the Add Note action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the All: Add Note action type from the Actions list.
The system displays a Note field. Generally you leave the field blank and allow remediators
to add comments when they evaluate incidents. However, you can add comments at this
level of configuration as well.
The limit for the Add Note field is 4000 bytes.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.
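
Because the Add Note limit is expressed in bytes rather than characters, notes that contain non-ASCII text reach the limit sooner than their character count suggests. The following Python sketch illustrates the difference. It assumes UTF-8 encoding, which the guide does not specify, and the function names are hypothetical.

```python
NOTE_LIMIT_BYTES = 4000  # limit stated for the Add Note field


def note_size_in_bytes(text, encoding="utf-8"):
    """Return the encoded size of a note. UTF-8 is an assumption, not a documented fact."""
    return len(text.encode(encoding))


def fits_note_limit(text, encoding="utf-8"):
    """Return True if the encoded note is within the 4000-byte limit."""
    return note_size_in_bytes(text, encoding) <= NOTE_LIMIT_BYTES


ascii_note = "a" * 4000
accented_note = "é" * 2001  # each "é" occupies two bytes in UTF-8
print(len(ascii_note), note_size_in_bytes(ascii_note), fits_note_limit(ascii_note))
print(len(accented_note), note_size_in_bytes(accented_note), fits_note_limit(accented_note))
```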

Configuring the Encrypt Smart Response action


The Encrypt Smart Response action lets you encrypt sensitive files in cloud applications
through the Symantec Data Loss Prevention Cloud Detection Service.
See “About response rule actions” on page 1738.
This response rule is available for Cloud Applications and API appliance detectors.
See “Response rule actions for Cloud Applications and API appliance detectors” on page 1744.
To configure the Encrypt Smart Response action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Encrypt action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Limit Incident Data Retention action


The Limit Incident Data Retention response rule action lets you modify the default incident
data retention behavior of the detection server.
See “About response rule actions” on page 1738.
This response rule is available for all types of detection servers except Endpoint Discover. If
existing policies use this response rule, policy violations do not trigger an incident.
See “Response rule actions for all detection servers” on page 1739.
To configure incident data retention
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the action type All: Limit Incident Data Retention from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Choose to retain Endpoint Incident data by selecting this option.
By default, the agent discards the original message and any attachments for endpoint
incidents.
See “Retaining data for endpoint incidents” on page 1784.
4 Choose to discard Network Incident data by selecting this option.
By default, the system retains the original message and any attachments for network
incidents.
See “Discarding data for network incidents” on page 1785.
5 Click Save to save the response rule configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Retaining data for endpoint incidents


By default, the system discards original messages (including files and attachments) for endpoint
incidents. You can implement the Limit Incident Data Retention response rule action to override
this default behavior and retain original email attachments for endpoint incidents.

Note: Limit Incident Data Retention does not apply to Endpoint Print or Clipboard incidents
and is not supported for Endpoint Discover.

See “Configuring the Limit Incident Data Retention action” on page 1783.
Select All Endpoint Incidents to retain the original file attachment for Endpoint Prevent
incidents.
If you combine a server-side detection rule (EDM/IDM/DGM) with a Limit Incident Data Retention
response rule action on the endpoint, consider the network bandwidth implications. When an
Endpoint Agent sends content to an Endpoint Server for analysis, it sends text or binary data
according to detection requirements. If possible, Symantec DLP Agents send text to reduce
bandwidth use. When you retain the original messages for endpoint incidents, in every case
the system requires agents to send binary data to the Endpoint Server. As such, make sure
that your network can handle the increased traffic between Endpoint Agents and Endpoint
Servers without degrading performance.
See “Two-tier detection for DLP Agents” on page 395.
Also consider the system behavior for any policies that combine an agent-side detection rule
(any DCM rule, such as a keyword rule) with the Limit Incident Data Retention response rule
action. In this case, the increase in bandwidth use depends on the number of incidents the
detection engine matches. For such policies, the DLP Agent does not send all original files to
the Endpoint Server, but only those associated with confirmed incidents. If there are not many
incidents, the effect is small.

Discarding data for network incidents


For network incidents, by default the detection server retains the original message and any
attachments that trigger an incident.
You can implement the Limit Incident Data Retention response rule action to override the
default behavior and discard original messages and some or all attachments.
See “Configuring the Limit Incident Data Retention action” on page 1783.

Note: The default data retention behavior for network incidents applies to Network Prevent for
Web and Network Prevent for Email incidents. The default behavior does not apply to Network
Discover incidents. For Network Discover incidents, the system provides a link in the Incident
Snapshot that points to the offending file at its original location. Incident data retention for
Network Discover is not configurable.

Table 50-1 Discarding data from network incidents

Parameter                  Description

Discard Original Message   Check this option to discard the original message.
                           Use this configuration to save disk space when you are only interested in statistical data.

Discard Attachment         Select All to discard all message attachments.
                           Select Attachments with no Violations to save only relevant message attachments, that is,
                           those that trigger a policy violation.
Note: You must select something other than None for this action option. If you leave None
selected and do not check the box next to Discard Original Message, the action has no effect.
Such a configuration duplicates the default incident data retention behavior for network servers.
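
The combinations in Table 50-1 can be modeled as a small function that reports what remains stored with a network incident. The Python sketch below is illustrative only: the `Attachment` class, the function name, and the file names are hypothetical, while the option labels are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attachment:
    """Hypothetical attachment record; `has_violation` marks a policy match."""
    name: str
    has_violation: bool


def retained_data(attachments, discard_original_message=False, discard_attachment="None"):
    """Return what is kept for a network incident under the chosen options."""
    if discard_attachment == "None":
        kept = list(attachments)  # default behavior: keep everything
    elif discard_attachment == "All":
        kept = []
    elif discard_attachment == "Attachments with no Violations":
        kept = [a for a in attachments if a.has_violation]
    else:
        raise ValueError(f"unsupported option: {discard_attachment}")
    return {
        "original_message": not discard_original_message,
        "attachments": [a.name for a in kept],
    }


files = [Attachment("cards.xlsx", True), Attachment("logo.png", False)]
print(retained_data(files))
print(retained_data(files, discard_attachment="Attachments with no Violations"))
print(retained_data(files, discard_original_message=True, discard_attachment="All"))
```

Calling the function with its defaults reproduces the configuration that the note describes as having no effect: the original message and all attachments are kept.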

Configuring the Log to a Syslog Server action


The Log to a Syslog Server response rule action logs the incident to a syslog server. These
logs can be useful if you use a security information and events management (SIEM) system.
See “About response rule actions” on page 1738.
This response rule action is available for all types of detection servers.
See “Response rule actions for all detection servers” on page 1739.

Note: You use this response rule in conjunction with a syslog server. See “Enabling a syslog
server” on page 174.

To configure the Log to a Syslog Server response rule action


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Log to a Syslog Server action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Enter the Host name of the syslog server.
4 Edit the Port for the syslog server, if necessary.
The default port is 514.
5 Enter the text of the Message to log on the syslog server.
You can include response action variables in your syslog server messages.
See “Response action variables” on page 1847.

6 Select the Level to apply to the log message from the drop-down list.
The following options are available:
■ 0 - Kernel panic
■ 1 - Needs immediate attention
■ 2 - Critical condition
■ 3 - Error
■ 4 - Warning
■ 5 - May need attention
■ 6 - Informational
■ 7 - Debugging

7 Save the response rule.


See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.
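
The level numbers offered in step 6 line up with the standard syslog severity values 0 through 7. When a syslog receiver or SIEM shows a numeric priority (PRI) instead of a level, the standard syslog formula is PRI = facility × 8 + severity. The Python sketch below applies that formula. The facility that the product uses is not stated in this guide, so facility 1 (user-level) appears here purely as an assumed example, and the function names are hypothetical.

```python
# Guide label for each level, paired with the standard syslog severity name.
LEVELS = {
    0: ("Kernel panic", "Emergency"),
    1: ("Needs immediate attention", "Alert"),
    2: ("Critical condition", "Critical"),
    3: ("Error", "Error"),
    4: ("Warning", "Warning"),
    5: ("May need attention", "Notice"),
    6: ("Informational", "Informational"),
    7: ("Debugging", "Debug"),
}


def priority_value(facility, level):
    """Combine a facility (0-23) and a level (0-7) into a syslog PRI value."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be between 0 and 23")
    if level not in LEVELS:
        raise ValueError("level must be between 0 and 7")
    return facility * 8 + level


def decode_priority(pri):
    """Split a PRI value back into (facility, level)."""
    return divmod(pri, 8)


# Facility 1 (user-level) is an assumed example, not a documented product setting.
pri = priority_value(facility=1, level=4)
facility, level = decode_priority(pri)
print(f"<{pri}> facility={facility} level={level} ({LEVELS[level][0]})")
```

Decoding the PRI value in a captured message is a quick way to confirm that the level selected in the response rule is the level the syslog server receives.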

Configuring the Send Email Notification action


The Send Email Notification action enables you to compose an email and send it to recipients
you specify.
See “About response rule actions” on page 1738.
This response rule action is available for all types of detection servers.
See “Response rule actions for all detection servers” on page 1739.
You must integrate the Enforce Server with an SMTP email server to implement this response
rule action.
See “Configuring the Enforce Server to send email alerts” on page 176.
To configure the Send Email Notification response rule action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the All: Send Email Notification action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Configure the recipient(s), sender, format, incident inclusion, and messages per day.
See Table 50-2 on page 1787.
4 Configure the Notification Content of the email notification: language, subject, body.
See Table 50-3 on page 1788.
5 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-2 Sender and recipient information

Parameter                  Description

To: Sender                 Select this option to send the email notification to the email sender. This recipient only
                           applies to email message violations.

To: Data Owner             Select this option to send email notification to the data owner that the system identifies
                           by email address in the incident.
                           See “Discover incident snapshot” on page 1882.

To: Other Email Address    This option can include any custom attributes designated as email addresses (such as
                           "manager@email"). For example, if you define a custom attribute that is an email address,
                           or retrieve one via a lookup plug-in, that address will appear in the "To" field for
                           selection, to the right of "To: Sender" and "To: Data Owner."
                           See “Configuring custom attributes” on page 1970.

Custom To                  Enter one or more specific email addresses separated by commas.

CC                         Enter one or more specific email addresses separated by commas for people you want to copy
                           on the notification.

Custom From                You can specify the sender of the message.
                           If this field is blank, the message appears to come from the system email address.

Notification Format        Select either HTML or plain-text format.

Include Original Message   Select this option to include the message that generated the incident with the
                           notification email.

Max Per Day                Enter a number to restrict the maximum number of notifications that the system sends in a day.

Table 50-3 Notification content

Parameter Description

Language Select the language for the message from the drop-down menu.

Add Language Click the icon to add multiple language(s) for the message.

See “About Endpoint Prevent response rules in different locales” on page 2314.

Subject Enter a subject for the message that indicates what the message is about.

Body Enter the body of the message.

Insert Variables You can add one or more variables to the subject or body of the email message by selecting
the desired value(s) from the Insert Variables list.

Variables can be used to include the file name, policy name, recipients, and sender in both the
subject and the body of the email message. For example, to include the policy and rules violated,
you would insert the following variables.

A message has violated the following rules in $POLICY$: $RULES$

See “Response action variables” on page 1847.

See “Implementing response rules” on page 1758.
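
To preview how a notification template reads once variables such as $POLICY$ and $RULES$ are filled in, you can mimic the substitution outside the product. The Python sketch below is a hypothetical stand-in, not the Enforce Server logic: the function name, the sample values, and the choice to leave unrecognized variables untouched are all assumptions.

```python
import re

# Matches variables written as $NAME$, for example $POLICY$ or $RULES$.
_VARIABLE = re.compile(r"\$([A-Z_]+)\$")


def expand_variables(template, values):
    """Replace each $NAME$ in `template` with values[NAME]; leave unknown names as-is."""
    def substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))
    return _VARIABLE.sub(substitute, template)


template = "A message has violated the following rules in $POLICY$: $RULES$"
# Sample values are invented for illustration.
sample = {"POLICY": "PCI Compliance", "RULES": "Credit Card Numbers"}
print(expand_variables(template, sample))
```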

Configuring the Server FlexResponse action


The All: Server FlexResponse action enables you to remediate any incident type using a
custom, server-side FlexResponse plug-in. You can configure a Server FlexResponse response
action for either automated response rules or smart response rules.
The All: Server FlexResponse action is available only if you have licensed Network Protect
and you have deployed one or more Server FlexResponse plug-ins to Symantec Data Loss
Prevention.
See “Deploying a Server FlexResponse plug-in” on page 2143.

To configure a Server FlexResponse action


1 Log on to the Enforce Server administration console.
2 Create a new Response Rule for each custom Server FlexResponse plug-in.
Click Manage > Policies > Response Rules.
3 Click Add Response Rule.
4 Select either Automated Response or Smart Response. Click Next.
5 Enter a name for the rule in the Rule Name field. (For Smart Response rules, this name
appears as the label on the button that incident responders select during remediation.)
6 Enter an optional description for the rule in the Description field.
7 In the Actions (executed in the order shown) menu, select the action All: Server
FlexResponse.
8 Click Add Action.
9 In the FlexResponse Plugin menu, select a deployed Server FlexResponse plug-in to
execute with this Response Rule action.
The name that appears in this drop-down menu is the value specified in the display-name
property from either the configuration properties file or the plug-in metadata class.
See “Deploying a Server FlexResponse plug-in” on page 2143.

Note: If you have installed the Network Protect ICE license and configured the Enforce
Server to connect to the Symantec ICE Cloud, you can use the SharePoint Encrypt
response rule action which is made available through a Server FlexResponse plug-in for
encryption that is installed automatically with Symantec Data Loss Prevention. No additional
configuration or customization is required for the encryption plug-in.

10 Click Save.
11 Repeat this procedure, adding a Response Rule for any additional Server FlexResponse
plug-ins that you have deployed.
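To check which label a plug-in will show in the FlexResponse Plugin menu, you can read the display-name property from its configuration properties file. The following Python sketch is a simplified reader written for illustration; it handles only key=value lines and comments, the function name read_display_name is hypothetical, and the sample property values are invented.

```python
def read_display_name(properties_text):
    """Return the display-name value from Java-style properties text.

    Simplification: only 'key=value' lines are understood. Blank lines
    and comment lines starting with '#' or '!' are skipped.
    """
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if separator and key.strip() == "display-name":
            return value.strip()
    return None


if __name__ == "__main__":
    # Hypothetical file content for illustration only.
    sample = "# plug-in settings\ndisplay-name = Example Remediation Plug-in\n"
    print(read_display_name(sample))
```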

Configuring the Set Attribute action


The Set Attribute response rule action sets a custom incident attribute to the specified value.
See “About response rule actions” on page 1738.
This response rule action is available for all detection servers.
See “Response rule actions for all detection servers” on page 1739.

The Set Attribute action is based on custom attributes you define at the System > Incident
Data > Attributes screen.
See “About custom attributes” on page 1968.
To configure the Set Attribute action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the All: Set Attribute action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Select the Attribute from the drop-down list (if more than one custom attribute is defined).
4 Enter a Value for the selected custom attribute.
5 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Set Status action


The Set Status response rule action sets the incident status to the specified value.
See “About response rule actions” on page 1738.
This response rule action is available for all detection servers.
See “Response rule actions for all detection servers” on page 1739.
This response rule action is based on the incident Status Values you configure at the System
> Incident Data > Attributes screen.
See “About incident status attributes” on page 1962.
To configure the Set Status response rule action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the All: Set Status action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Select the Status to assign to the incident from the list.
The following are some example incident statuses you might configure and select from:
■ New
■ Escalated
■ Investigation
■ Resolved
■ Dismissed

4 Click Save to save the configuration.


See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Cloud Storage: Add Visual Tag action


The Add Visual Tag rule action lets an incident responder apply visual tags as metadata to
sensitive content stored in your Box cloud storage target. The visual tag helps your Box cloud
storage users search for and self-remediate sensitive data. For example, you might want the
tag to read "This content is considered confidential." You can also remind them of additional
security features of Box, such as adding password protection to any download links.
To configure the Cloud Storage: Add Visual Tag action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Cloud Storage: Add Visual Tag action type from the Actions list.
The system displays the Add Visual Tag field. Enter the text you want to display in the
tag for your users.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Cloud Storage: Quarantine action


The Cloud Storage: Quarantine response rule action quarantines content that the detection
server identifies as sensitive or protected.

To configure the Cloud Storage: Quarantine action


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Cloud Storage: Quarantine action type from the Actions list.
The system displays the Cloud Storage: Quarantine field.
See “Configuring response rule actions” on page 1765.
3 Configure the Cloud Storage: Quarantine parameters.
See Table 50-4 on page 1792.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-4 Cloud Storage: Quarantine File configuration parameters

Parameter Description

Marker File Select Leave marker file in place of remediated file to create a marker text file to replace the original
file. This action notifies the user what happened to the file instead of quarantining or deleting the file
without any explanation.
Note: The marker file is the same type and has the same name as the original file, as long as it is a
text file. An example of such a file type is Microsoft Word. If the original file is a PDF or image file, the
system creates a plain text marker file. The system then gives the file the same name as the original
file with .txt appended to the end. For example, if the original file name is accounts.pdf, the marker file
name is accounts.pdf.txt.

Marker Text Specify the text to appear in the marker file. If you selected the option to leave the marker file in place
of the remediated file, you can use variables in the marker text.

To specify marker text, select the variable from the Insert Variable list.

For example, for Marker Text you might enter:

A message has violated the following rules in $POLICY$: $RULES$

Or, you might enter:

$FILE_NAME$ has been moved to $QUARANTINE_PARENT_PATH$

Add visual tag to marker file Select this option to add a visual tag to the marker file. The visual tag helps your Box cloud storage
users search for marker files for quarantined sensitive data.

Tags Enter the visual tag text in this field.

See “Implementing response rules” on page 1758.
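The marker file naming rule in Table 50-4 can be sketched in Python as follows. This is a minimal illustration, not the product's logic. The guide names Microsoft Word as an example of a type that keeps its name; the exact set of extensions treated this way is an assumption here, and the function name marker_file_name is hypothetical.

```python
import os

# Assumption: extensions treated as "text" types whose marker file keeps
# the original name. Only Microsoft Word is named in the guide; the rest
# of this set is illustrative.
TEXT_LIKE_EXTENSIONS = {".txt", ".doc", ".docx"}


def marker_file_name(original_name):
    """Return the marker file name for a quarantined file."""
    _, extension = os.path.splitext(original_name)
    if extension.lower() in TEXT_LIKE_EXTENSIONS:
        # Same type and same name as the original file.
        return original_name
    # Other types (for example PDF or image files) get a plain text
    # marker with .txt appended to the original name.
    return original_name + ".txt"


if __name__ == "__main__":
    print(marker_file_name("accounts.pdf"))
```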



Configuring the Quarantine Smart Response action


The Quarantine Smart Response action quarantines files in the Salesforce, Box, and OneDrive
cloud applications through the Cloud Detection Service. The quarantine path is relative to the
user's root folder.
To configure the Quarantine Smart Response action
1 Configure a Smart Response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Quarantine action type from the Actions list.
The system displays the Quarantine field.
See “Configuring response rule actions” on page 1765.
3 Configure the Quarantine parameters.
See Table 50-5 on page 1793.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-5 Quarantine (Smart Response) configuration parameters

Parameter Description

File Path Enter the file path for the quarantine location. This file path is relative to the user's root folder.

Use Marker File Select Use Marker File to create a marker text file to replace the original file. This action notifies the
user what happened to the file instead of quarantining or deleting the file without any explanation.

See “Implementing response rules” on page 1758.
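Because the File Path value is interpreted relative to the user's root folder, it can help to picture how a relative value resolves. The Python sketch below assumes forward-slash paths and a hypothetical root folder; the cloud applications may represent folders differently, and the function name resolve_quarantine_path is invented for this example.

```python
import posixpath


def resolve_quarantine_path(user_root, file_path):
    """Join a relative quarantine File Path onto the user's root folder."""
    relative = file_path.strip("/")
    resolved = posixpath.normpath(posixpath.join(user_root, relative))
    root = posixpath.normpath(user_root)
    # Reject values that climb out of the root folder (for example "../x").
    if resolved != root and not resolved.startswith(root.rstrip("/") + "/"):
        raise ValueError("quarantine path escapes the user's root folder")
    return resolved


if __name__ == "__main__":
    # "/users/alice" is a made-up root folder for illustration.
    print(resolve_quarantine_path("/users/alice", "Quarantine/DLP"))
```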

Configuring the Network Protect: SharePoint Quarantine smart response action


The SharePoint Quarantine smart response action quarantines files that are stored in Microsoft
SharePoint repositories. You can quarantine files to either a SharePoint repository or to a file
share (File System) location.

Note: Upon quarantine, file metadata is not saved for attachment-type SharePoint items such
as lists, announcements, tasks, and so on.

To configure the SharePoint Quarantine smart response action


1 Configure a Smart Response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Network Protect SharePoint Quarantine action type from the Actions list.
The system displays the Network Protect SharePoint Quarantine field.
See “Configuring response rule actions” on page 1765.
3 Configure the Network Protect SharePoint Quarantine parameters.
See Table 50-6 on page 1794.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-6 Network Protect SharePoint Quarantine parameters

Parameter Description

Source

Use Saved Credentials Select Use Saved Credentials to choose a named credential from the credential
store in the Use Saved Credentials drop-down menu if you don't want to enter it
manually.

To move the files for quarantine during remediation, the specified SharePoint user
account must have write access for the original file location.

Use These Credentials Select Use These Credentials to manually enter the write-access credential for
the original location of the scanned file. Then, enter the following parameters:

■ Name - The user name of the account with write access for the location of the
scanned file.
■ Password - The password of the account with write access for the location of
the scanned file.
■ Confirm Password - Confirm the password of the account with write access for
the location of the scanned file.

To move the files for quarantine during remediation, the specified SharePoint user
account must have write access for the original file location.

Destination

Target Repository Specify whether the files are to be quarantined in a SharePoint repository or in a
file share (File System).

Quarantine Path Enter the SharePoint path where the confidential files are to be quarantined.

Use Saved Credentials Select Use Saved Credentials to choose a named credential for the quarantine
location from the credential store in the Use Saved Credentials drop-down menu
if you don't want to enter it manually.

To move the files for quarantine during remediation, the specified SharePoint user
account must have write access for the quarantine location.

Use These Credentials Select Use These Credentials to manually enter the write-access credential for
the quarantine location. Then, enter the following parameters:

■ Name - The user name of the account with write access for the quarantine
location.
■ Password - The password of the account with write access for the quarantine
location.
■ Confirm Password - Confirm the password of the account with write access for
the quarantine location.

To move the files for quarantine during remediation, the specified SharePoint user
account must have write access for the quarantine location.

Marker File

(Optional) Leave marker file in place of remediated file Select Leave marker file in place of remediated file to create a marker text file
to replace the original file. This action notifies the user about what happened to the
file instead of moving the file without any explanation.

(Optional) Marker Text Specify the text that appears in the marker file to notify users about what happened
to the file that was quarantined. The marker text can contain substitution variables.
Click inside the Marker Text box to see a list of insertion variables.

See “Implementing response rules” on page 1758.

Configuring the Network Protect: SharePoint Release from Quarantine smart response action


The SharePoint Release from Quarantine smart response action releases files that were
quarantined from SharePoint repositories. When you execute the SharePoint Release from
Quarantine smart response action, you can release files back to their original location in
SharePoint from either a SharePoint location or a file share location. Marker files that were
created when the file was originally quarantined are not deleted upon release from quarantine.
You can release files that were previously quarantined using the deprecated SharePoint
Quarantine FlexResponse Plug-in. If you have installed the SharePoint solution and if a
SharePoint file was quarantined using Symantec Data Loss Prevention 15.1, file metadata is
restored when you release the file from quarantine. If the file was quarantined using a version
earlier than 15.1, the file is released without restoring its metadata.
When you attempt to release a quarantined file, if a file with the same name exists at the
destination location, the released file is named using the following format:
FileName.<N>Released.FileExtension, wherein <N> is a number in the range of 1 to 10.
Therefore, you can release a file that shares a name with another file in the destination directory
up to ten times before the release fails.
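The naming rule for released files can be sketched in Python as follows. The format FileName.<N>Released.FileExtension and the range 1 to 10 come from this guide; choosing the lowest unused number is an assumption, and the function name released_file_name is hypothetical.

```python
import os

MAX_RELEASE_SUFFIX = 10  # the guide states <N> ranges from 1 to 10


def released_file_name(original_name, existing_names):
    """Pick the name a released file would get at the destination.

    existing_names is the set of file names already present at the
    destination. Returns None when every candidate name is taken,
    which corresponds to the release failing.
    """
    if original_name not in existing_names:
        return original_name
    stem, extension = os.path.splitext(original_name)
    # Assumption: the lowest free number is used first.
    for n in range(1, MAX_RELEASE_SUFFIX + 1):
        candidate = f"{stem}.{n}Released{extension}"
        if candidate not in existing_names:
            return candidate
    return None


if __name__ == "__main__":
    print(released_file_name("report.docx", {"report.docx"}))
```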

Note: Network Protect does not access file metadata for inline attachments during the quarantine
process. As a result, file metadata for inline attachments cannot be restored upon release from
quarantine.

To configure the SharePoint Release from Quarantine smart response action


1 Configure a Smart Response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Network Protect SharePoint Release from Quarantine action type from the
Actions list.
The system displays the Network Protect SharePoint Release from Quarantine field.
See “Configuring response rule actions” on page 1765.
3 Configure the Network Protect SharePoint Release from Quarantine parameters.
See Table 50-7 on page 1796.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-7 Network Protect SharePoint Release from Quarantine parameters

Parameter Description

Add Row Click Add Row to start mapping a new file path. The file path could be either the
location to which files are quarantined, or the original SharePoint location to which
files should be released.

Path Specify the location to which files are quarantined, or the original SharePoint location
to which files should be released.

Credentials Specify the write-access credentials for the file path that you want to map.

Delete Delete the corresponding file path.

See “Implementing response rules” on page 1758.



Configuring the Remove Collaborator Access Smart Response action


The Remove Collaborator Access Smart Response action removes collaborator access
from shared files in cloud applications through the Cloud Detection Service.
To configure the Remove Collaborator Access Smart Response action
1 Configure a Smart Response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Remove Collaborator Access action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Remove Shared Links Smart Response action


The Remove Shared Links Smart Response action removes shared links from files in cloud
applications through the Cloud Detection Service.
To configure the Remove Shared Links Smart Response action
1 Configure a Smart Response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Remove Shared Links action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Restore File Smart Response action


The Restore File Smart Response action restores a quarantined file in the Salesforce, Box,
and OneDrive cloud applications through the Cloud Detection Service.

To configure the Restore File Smart Response action


1 Configure a Smart Response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Restore File action type from the Actions list.
The system displays the Restore File field.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Custom Action on Data-at-Rest action


The Custom Action on Data-at-Rest action returns a recommendation to perform some
custom action on the sensitive data with the detection result.
You can configure a custom payload with additional details about this recommendation. The
custom payload appears in the customResponsePayload parameter of the detection response.
To configure the Custom Action on Data-at-Rest action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Custom Action on Data-at-Rest action type from the Actions list.
The system displays the Custom Action on Data-at-Rest field.
See “Configuring response rule actions” on page 1765.
3 Configure the Custom Action on Data-at-Rest parameter.
See Table 50-8 on page 1798.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-8 Custom Action on Data-at-Rest configuration parameter

Parameter Description

Custom payload Enter details about the Custom Action on Data-at-Rest action in the custom payload field. These
details are returned in the customResponsePayload parameter of the detection result.

See “Implementing response rules” on page 1758.
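An application that consumes the detection result can read the custom payload from the customResponsePayload parameter. This guide does not show the layout of the detection result, so the Python sketch below assumes only that the result is available as parsed JSON and walks it to collect every value stored under that key. The sample body and its responseAction key are hypothetical.

```python
import json


def find_custom_payloads(node):
    """Collect every value stored under a 'customResponsePayload' key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "customResponsePayload":
                found.append(value)
            else:
                found.extend(find_custom_payloads(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_custom_payloads(item))
    return found


if __name__ == "__main__":
    # Hypothetical response body; the real schema may differ.
    sample = json.loads(
        '{"responseAction": [{"customResponsePayload": "ticket=EXAMPLE-1"}]}'
    )
    print(find_custom_payloads(sample))
```

Searching by key name keeps the sketch independent of where the parameter sits in the result, at the cost of not distinguishing which response action produced each payload.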



Configuring the Delete Data-at-Rest action


The Delete Data-at-Rest action deletes sensitive data in the following cloud applications
through the Cloud Detection Service:
■ Dropbox
■ Gmail
■ Microsoft Office 365 Email
To configure the Delete Data-at-Rest action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Delete Data-at-Rest action type from the Actions list.
The system displays the Delete Data-at-Rest field.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Encrypt Data-at-Rest action


The Encrypt Data-at-Rest action encrypts sensitive data in the following applications through
the Cloud Detection Service:
■ OneDrive
■ SharePoint
You can configure a custom payload with additional details about this recommendation. The
custom payload appears in the customResponsePayload parameter of the detection response.
To configure the Encrypt Data-at-Rest action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Encrypt Data-at-Rest action type from the Actions list.
The system displays the Encrypt Data-at-Rest field.
See “Configuring response rule actions” on page 1765.
3 Configure the Encrypt Data-at-Rest parameter.
See Table 50-9 on page 1800.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-9 Encrypt Data-at-Rest configuration parameter

Parameter Description

Custom payload Enter details about the Encrypt Data-at-Rest action in the Custom payload field. These details are
returned in the customResponsePayload parameter of the detection result.

See “Implementing response rules” on page 1758.

Configuring the Perform DRM on Data-at-Rest action


The Perform DRM on Data-at-Rest action applies Digital Rights Management (DRM) to
sensitive data in applications through the Cloud Detection Service or API Detection for
Developer Apps Appliance.
You can configure a custom payload with additional details about this recommendation. The
custom payload appears in the customResponsePayload parameter of the detection response.
To configure the Perform DRM on Data-at-Rest action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Perform DRM on Data-at-Rest action type from the Actions list.
The system displays the Perform DRM on Data-at-Rest field.
See “Configuring response rule actions” on page 1765.
3 Configure the Perform DRM on Data-at-Rest parameter.
See Table 50-10 on page 1800.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-10 Perform DRM on Data-at-Rest configuration parameter

Parameter Description

Custom payload Enter details about the Perform DRM on Data-at-Rest action in the Custom payload field. These details
are returned in the customResponsePayload parameter of the detection result.

See “Implementing response rules” on page 1758.

Configuring the Quarantine Data-at-Rest action


The Quarantine Data-at-Rest action quarantines sensitive data in the following cloud
applications through the Cloud Detection Service:
■ Box
■ OneDrive
■ Salesforce
■ SharePoint
■ Slack
To configure the Quarantine Data-at-Rest action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Quarantine Data-at-Rest action type from the Actions list.
The system displays the Quarantine Data-at-Rest field.
See “Configuring response rule actions” on page 1765.
3 Configure the Quarantine Data-at-Rest parameter.
See Table 50-11 on page 1801.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-11 Quarantine Data-at-Rest configuration parameter

Parameter Description

File Path Enter the file path for the quarantine location. This file path is relative to the user's root folder.

Use Marker File Select Use Marker File to create a marker text file to replace the original file. This action notifies the
user what happened to the file instead of quarantining or deleting the file without any explanation.

Marker Text Enter the text you want to display in the marker file. You can select and insert variables from the Insert
Variable list.

See “Implementing response rules” on page 1758.



Configuring the Remove Shared Links in Data-at-Rest action


The Remove Shared Links in Data-at-Rest action breaks links to sensitive data in the following
cloud applications through the Cloud Detection Service:
■ Box
■ Dropbox
■ Google Drive
■ OneDrive
■ Salesforce
You can configure a custom payload with additional details about this recommendation. The
custom payload appears in the customResponsePayload parameter of the detection response.
To configure the Remove Shared Links in Data-at-Rest action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Remove Shared Links in Data-at-Rest action type from the Actions list.
The system displays the Remove Shared Links in Data-at-Rest field.
See “Configuring response rule actions” on page 1765.
3 Configure the Remove Shared Links in Data-at-Rest parameter.
See Table 50-12 on page 1802.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-12 Remove Shared Links in Data-at-Rest configuration parameter

Parameter Description

Custom payload Enter details about the Remove Shared Links in Data-at-Rest action in the custom payload field.
These details are returned in the customResponsePayload parameter of the detection result.

See “Implementing response rules” on page 1758.

Configuring the Tag Data-at-Rest action


The Tag Data-at-Rest action tags sensitive data in applications through the Cloud Detection
Service or API Detection for Developer Apps Appliance.
You can configure a custom payload with additional details about this recommendation. The
custom payload appears in the customResponsePayload parameter of the detection response.
To configure the Tag Data-at-Rest action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Tag Data-at-Rest action type from the Actions list.
The system displays the Tag Data-at-Rest field.
See “Configuring response rule actions” on page 1765.
3 Configure the Tag Data-at-Rest parameter.
See Table 50-13 on page 1803.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-13 Tag Data-at-Rest configuration parameter

Parameter Description

Custom payload Enter details about the Tag Data-at-Rest action in the Custom payload field. These details are returned
in the customResponsePayload parameter of the detection result.

See “Implementing response rules” on page 1758.

Configuring the Prevent download, copy, print action


The Prevent download, copy, print action prevents sensitive data files from being downloaded,
copied, or printed from the Google Drive cloud application through the Cloud Detection Service.
To configure the Prevent download, copy, print action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Prevent download, copy, print action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Remove Collaborator Access action


The Remove Collaborator Access action removes access from collaborators to sensitive
data files in the following cloud applications through the Cloud Detection Service:
■ Box
■ Dropbox
■ Google Drive
■ Salesforce
■ SharePoint
To configure the Remove Collaborator Access action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Remove Collaborator Access action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Set Collaborator Access to 'Edit' action


The Set Collaborator Access to 'Edit' action grants collaborators edit access to sensitive
data files in the following cloud applications through the Cloud Detection Service:
■ Box
■ Dropbox
■ Google Drive
■ Salesforce
■ SharePoint

To configure the Set Collaborator Access to 'Edit' action


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Set Collaborator Access to 'Edit' action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Set Collaborator Access to 'Preview' action


The Set Collaborator Access to 'Preview' action grants collaborators preview access to
sensitive data files in the Box cloud application through the Cloud Detection Service.
To configure the Set Collaborator Access to 'Preview' action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Set Collaborator Access to 'Preview' action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Set Collaborator Access to 'Read' action


The Set Collaborator Access to 'Read' action grants collaborators read access to sensitive
data files in the following cloud applications through the Cloud Detection Service:
■ Box
■ Dropbox
■ Google Drive
■ Salesforce
■ SharePoint
To configure the Set Collaborator Access to 'Read' action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Set Collaborator Access to 'Read' action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Set File Access to 'All Read' action


The Set File Access to 'All Read' action grants public read access to sensitive data files in
the following cloud applications through the Cloud Detection Service:
■ Google Drive
■ OneDrive
■ SharePoint
To configure the Set File Access to 'All Read' action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Set File Access to 'All Read' action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Set File Access to 'Internal Edit' action


The Set File Access to 'Internal Edit' action grants edit access to all members of your
organization to sensitive files in the following cloud applications through the Cloud Detection
Service:
■ Box
■ Google Drive
■ OneDrive
■ Salesforce
■ SharePoint
To configure the Set File Access to 'Internal Edit' action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Set File Access to 'Internal Edit' action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Set File Access to 'Internal Read' action


The Set File Access to 'Internal Read' action grants read access to all members of your
organization to sensitive data files in the following cloud applications through the Cloud Detection
Service:
■ Box
■ Google Drive
■ Salesforce
■ SharePoint
To configure the Set File Access to 'Internal Read' action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Set File Access to 'Internal Read' action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.
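When you plan response rules for a particular cloud application, it can be convenient to see the sharing and access actions side by side. The Python sketch below transcribes the application lists from the preceding sections into a lookup table; the variable and function names are illustrative and are not part of the product.

```python
# Supported applications per action, transcribed from the sections above.
COLLABORATION_APPS = ["Box", "Dropbox", "Google Drive", "Salesforce", "SharePoint"]

SUPPORTED_APPS = {
    "Prevent download, copy, print": ["Google Drive"],
    "Remove Collaborator Access": COLLABORATION_APPS,
    "Set Collaborator Access to 'Edit'": COLLABORATION_APPS,
    "Set Collaborator Access to 'Preview'": ["Box"],
    "Set Collaborator Access to 'Read'": COLLABORATION_APPS,
    "Set File Access to 'All Read'": ["Google Drive", "OneDrive", "SharePoint"],
    "Set File Access to 'Internal Edit'": [
        "Box", "Google Drive", "OneDrive", "Salesforce", "SharePoint",
    ],
    "Set File Access to 'Internal Read'": [
        "Box", "Google Drive", "Salesforce", "SharePoint",
    ],
}


def actions_for(application):
    """Return the listed actions that support the given cloud application."""
    return [
        action
        for action, applications in SUPPORTED_APPS.items()
        if application in applications
    ]


if __name__ == "__main__":
    for action in actions_for("OneDrive"):
        print(action)
```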

Configuring the Add two-factor authentication action


The Add two-factor authentication action adds two-factor authentication to sensitive data
files in applications through the Cloud Detection Service or API Detection for Developer Apps
Appliance.
To configure the Add two-factor authentication action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Add two-factor authentication action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Block Data-in-Motion action


The Block Data-in-Motion action blocks sensitive data in applications through the Cloud
Detection Service or API Detection for Developer Apps Appliance.
You can configure a message for your users to inform them why the sensitive data was blocked.
The message appears in the message parameter of the detection response.
To configure the Block Data-in-Motion action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Block Data-in-Motion action type from the Actions list.
The system displays the Block Data-in-Motion field.
See “Configuring response rule actions” on page 1765.
3 Configure the Block Data-in-Motion parameter.
See Table 50-14 on page 1809.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-14 Block Data-in-Motion configuration parameter

Parameter Description

Message Enter a user-facing message for the Block Data-in-Motion action in the message field. These details
are returned in the message parameter of the detection result.

See “Implementing response rules” on page 1758.
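A client that receives the detection response might read the configured message as in the following minimal sketch. Only the message parameter name comes from this guide. The JSON layout, the responseAction field, the sample values, and the helper name extract_block_message are assumptions for illustration and are not the documented response schema.

```python
import json
from typing import Optional

# Hypothetical detection response. Only the "message" parameter name is
# taken from this guide; every other field and value is assumed.
SAMPLE_RESPONSE = (
    '{"responseAction": "BLOCK", '
    '"message": "Upload blocked: file contains customer data."}'
)


def extract_block_message(raw_response: str) -> Optional[str]:
    """Return the user-facing message from a detection response, if any."""
    response = json.loads(raw_response)
    message = response.get("message")
    # Treat a missing or empty message as "no message configured".
    return message if message else None


print(extract_block_message(SAMPLE_RESPONSE))
```

Returning None for a missing message lets the calling application fall back to its own default text.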

Configuring the Custom Action on Data-in-Motion action

The Custom Action on Data-in-Motion action returns a recommendation to take some custom
action on the sensitive data with the detection result.
You can configure a custom payload with additional details about this recommendation. The
custom payload appears in the customResponsePayload parameter of the detection response.
To configure the Custom Action on Data-in-Motion action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Custom Action on Data-in-Motion action type from the Actions list.
The system displays the Custom Action on Data-in-Motion field.
See “Configuring response rule actions” on page 1765.
3 Configure the parameter.
See Table 50-15 on page 1809.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-15 Custom Action on Data-in-Motion configuration parameter

Parameter Description

Custom payload Enter details about the Custom Action on Data-in-Motion action in the custom payload
field. These details are returned in the customResponsePayload parameter of the detection result.

See “Implementing response rules” on page 1758.
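Because the custom payload is free-form text, a consuming application has to decide how to interpret it. The sketch below assumes the detection result has already been decoded into a Python dictionary and that your organization may choose to store JSON in the payload field. Both are assumptions. Only the customResponsePayload parameter name comes from this guide, and read_custom_payload is a hypothetical helper.

```python
import json
from typing import Any


def read_custom_payload(detection_result: dict) -> Any:
    """Return the custom payload, decoded as JSON when possible."""
    payload = detection_result.get("customResponsePayload")
    if payload is None:
        return None
    try:
        # Assumed convention: administrators may enter JSON in the field.
        return json.loads(payload)
    except (TypeError, ValueError):
        # Not JSON (or not a string): hand back the value unchanged.
        return payload
```

For example, a payload of {"ticket": "open"} is returned as a dictionary, while plain text such as notify owner is returned as-is.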



Configuring the Encrypt Data-in-Motion action


The Encrypt Data-in-Motion action encrypts sensitive data in the Box cloud application through
the Cloud Detection Service.
You can configure a custom payload with additional details about this recommendation. The
custom payload appears in the customResponsePayload parameter of the detection response.
To configure the Encrypt Data-in-Motion action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Encrypt Data-in-Motion action type from the Actions list.
The system displays the Encrypt Data-in-Motion field.
See “Configuring response rule actions” on page 1765.
3 Configure the Encrypt Data-in-Motion parameter.
See Table 50-16 on page 1810.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-16 Encrypt Data-in-Motion configuration parameter

Parameter Description

Custom payload Enter details about the Encrypt Data-in-Motion action in the custom payload field.
These details are returned in the customResponsePayload parameter of the detection result.

See “Implementing response rules” on page 1758.

Configuring the Perform DRM on Data-in-Motion action

The Perform DRM on Data-in-Motion action applies Digital Rights Management (DRM) to
sensitive data in cloud applications through the Cloud Detection Service.
You can configure a custom payload with additional details about this recommendation. The
custom payload appears in the customResponsePayload parameter of the detection response.
To configure the Perform DRM on Data-in-Motion action


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Perform DRM on Data-in-Motion action type from the Actions list.
The system displays the Perform DRM on Data-in-Motion field.
See “Configuring response rule actions” on page 1765.
3 Configure the Perform DRM on Data-in-Motion parameter.
See Table 50-17 on page 1811.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-17 Perform DRM on Data-in-Motion configuration parameter

Parameter Description

Custom payload Enter details about the Perform DRM on Data-in-Motion action in the custom payload
field. These details are returned in the customResponsePayload parameter of the detection result.

See “Implementing response rules” on page 1758.

Configuring the Quarantine Data-in-Motion action


The Quarantine Data-in-Motion action quarantines sensitive data in the Salesforce, Box, and
OneDrive cloud applications through the Cloud Detection Service.
You can configure a custom payload with additional details about this recommendation. The
custom payload appears in the customResponsePayload parameter of the detection response.
To configure the Quarantine Data-in-Motion action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Quarantine Data-in-Motion action type from the Actions list.
The system displays the Quarantine Data-in-Motion field.
See “Configuring response rule actions” on page 1765.
3 Configure the Quarantine Data-in-Motion parameter.
See Table 50-18 on page 1812.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-18 Quarantine Data-in-Motion configuration parameter

Parameter Description

Custom payload Enter details about the Quarantine Data-in-Motion action in the custom payload field.
These details are returned in the customResponsePayload parameter of the detection result.

See “Implementing response rules” on page 1758.

Configuring the Redact Data-in-Motion action


The Redact Data-in-Motion action redacts sensitive data in applications through the Cloud
Detection Service or API Detection for Developer Apps Appliance.
You can configure a message for your users to inform them why the sensitive data was
redacted. The message appears in the message parameter of the detection response.
To configure the Redact Data-in-Motion action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Redact Data-in-Motion action type from the Actions list.
The system displays the Redact Data-in-Motion field.
See “Configuring response rule actions” on page 1765.
3 Configure the Redact Data-in-Motion parameter.
See Table 50-19 on page 1812.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-19 Redact Data-in-Motion configuration parameter

Parameter Description

Message Enter a user-facing message for the Redact Data-in-Motion action in the message field. These details
are returned in the message parameter of the detection result.

See “Implementing response rules” on page 1758.

Configuring the Endpoint: FlexResponse action


The Endpoint: FlexResponse response rule action lets you implement one or more custom
responses you have developed using the FlexResponse API.
See “About Endpoint FlexResponse” on page 2479.
This response rule is available for Endpoint Discover.

Note: This feature is not available for agents running on Mac endpoints.

See “Response rule actions for endpoint detection” on page 1740.


To configure the Endpoint: FlexResponse response rule action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Endpoint: FlexResponse action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Enter the FlexResponse plug-in Name and configure its Parameters.
See Table 50-20 on page 1813.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-20 Endpoint: FlexResponse response rule action parameters

Parameter Description

FlexResponse Python Plugin Enter the script module name with packages separated by a period (.).

Plugin parameters Click Add Parameter to add one or more parameters to the script.

Enter the Key/Value pair for each parameter.

Credentials You can add credentials for accessing the plugin.

You can add and store credentials at the System > Settings > Credentials screen.

See “About the credential store” on page 160.

See “Implementing response rules” on page 1758.
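Before you enter values on the Configure Response Rule screen, you might check them with a small script such as the following sketch. It is not part of the FlexResponse API. The helper names and the sample module name are hypothetical, and the rule that each package segment must be a valid Python identifier is an assumption based on the period-separated format described in Table 50-20.

```python
import keyword
from typing import Dict, Iterable, Tuple


def is_valid_module_name(name: str) -> bool:
    """Check a period-separated module name such as 'pkg.subpkg.module'."""
    parts = name.split(".")
    # Assumption: every segment must be a usable Python identifier.
    return all(p.isidentifier() and not keyword.iskeyword(p) for p in parts)


def build_parameters(pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Collect Key/Value pairs, rejecting empty or duplicate keys."""
    params: Dict[str, str] = {}
    for key, value in pairs:
        key = key.strip()
        if not key:
            raise ValueError("parameter key must not be empty")
        if key in params:
            raise ValueError(f"duplicate parameter key: {key}")
        params[key] = value
    return params


# Hypothetical values for illustration only.
print(is_valid_module_name("com.example.quarantine_plugin"))
print(build_parameters([("target_dir", "C:\\Secure"), ("notify", "true")]))
```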



Configuring the Endpoint: ICT Classification And Tagging action

The Endpoint: ICT Classification And Tagging response rule action lets you define a rule
by selecting from imported Information Centric Tagging (ICT) tags. Tags are then applied
according to Data Loss Prevention policy, either during an Endpoint Discover scan or by end
users. If scans apply the tags, the tags can be inserted in response to a policy violation or
solely for a baseline Classification Scan. If users are applying the tags, you can further define
alerts.

Note: Scans can apply tags on both Windows and Mac systems on which the DLP Agent is
installed. End users can apply tags only on Windows systems on which the ICT agent is
installed.

See “Response rule actions for endpoint detection” on page 1740.


To configure the Endpoint: ICT Classification And Tagging response rule action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Endpoint: ICT Classification And Tagging action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Enter a Rule Name and configure the tag to be applied, as well as alerts, if appropriate.
See Table 50-21 on page 1814.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-21 Endpoint: ICT Classification And Tagging parameters

Parameter Description

Classification To select a classification that you want applied to content, build a tag. ICT tags
have three parts: Organization-Scope-Sensitivity Level. For example: SYMC-MKT-CONFID.

You are building the tag using the imported ICT classification taxonomy. Select the three parts
from the three drop-down menus for Organization, Scope, and Sensitivity Level.

Language (For end-user application of tags) Select the language in which to display alerts. The
default is English (United States). Alerts are displayed by the ICT plugin, which provides the
tagging user interface for Microsoft Office or Microsoft Outlook.

Display an alert when the classification is applied. (For end-user application of tags) To provide
an explanation to a user about why a particular tag was applied, select this check box. The user
sees the alert when they open the file or email.

Allow the user to change the classification. (For end-user application of tags) To allow a user to
change the classification that you have applied, select this check box. An alert notifies the user
that they can change the classification, when they open the file or email.

Note: A file may violate multiple policies and you may have associated each policy with a
different classification response rule. While there is a system-defined execution priority for
different types of response rule actions, you can affect the order of execution for response
rule actions of the same type that contain conflicting instructions. To affect the order of
execution, use the Order column on the Manage > Policies > Response Rules screen.
Defining the order is especially important to ensure that the response rule with the highest
level of classification is assigned the highest order.

See “Implementing response rules” on page 1758.


See “Modifying response rule ordering” on page 1769.
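The three-part tag format described in Table 50-21 (Organization-Scope-Sensitivity Level) can be modeled as in the following sketch. The IctTag type and the parse_ict_tag helper are hypothetical names used only for illustration. The sketch also assumes that none of the three parts contains a hyphen, which this guide does not state.

```python
from typing import NamedTuple


class IctTag(NamedTuple):
    """An ICT tag made of Organization, Scope, and Sensitivity Level."""
    organization: str
    scope: str
    sensitivity_level: str

    def __str__(self) -> str:
        # Join the three parts in the documented order.
        return "-".join(self)


def parse_ict_tag(text: str) -> IctTag:
    """Split a tag such as 'SYMC-MKT-CONFID' into its three parts."""
    parts = text.split("-")
    # Assumption: parts never contain a hyphen themselves.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected Organization-Scope-Sensitivity, got {text!r}")
    return IctTag(*parts)


tag = parse_ict_tag("SYMC-MKT-CONFID")
print(tag.organization, tag.scope, tag.sensitivity_level)
print(str(tag))
```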

Configuring the Endpoint Discover: Quarantine File action

The Endpoint Discover: Quarantine File response rule action removes a file containing sensitive
information from a non-secure location and places it in a secure location.
See “About Endpoint Quarantine” on page 2325.
This response rule action is specific to Endpoint Discover incidents. This response rule is not
applicable to two-tiered detection methods requiring a Data Profile.
See “Setting up and configuring Endpoint Discover” on page 2325.
If you use multiple endpoint response rules in a single policy, make sure that you understand
the order of precedence for such rules.
See “About response rule action execution priority” on page 1753.

Note: This feature is not available for agents running on Mac endpoints.

To configure the Endpoint Discover: Quarantine File response rule action


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Endpoint Discover: Quarantine File action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Enter the Quarantine Path and the Marker File settings.
See Table 50-22 on page 1816.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-22 Endpoint Discover: Quarantine File response rule action parameters

Parameter Description

Quarantine Path Enter the path to the secured location where you want files to be placed. The
secure location can either be on the local drive of the endpoint, or can be on a remote file
share. EFS folders can also be used as the quarantine location.

Access Mode If your secure location is on a remote file share, you must select how the Symantec DLP Agent
accesses that file share.
Select one of the following credential access types:

■ Anonymous Access
■ Use Saved Credentials

In anonymous mode, the Symantec DLP Agent runs as the LocalSystem user to move the confidential
file. You can use anonymous mode to move files to a secure location on a local drive or to a remote
share if the share allows anonymous access.
Note: EFS folders cannot accept anonymous users.

A specified credential lets the Symantec DLP Agent impersonate the specified user to access the
secure location. The credentials must be in the following format:

domain\user

You must enter the specified credentials you want to use through the System Credentials page.

See “Configuring endpoint credentials” on page 161.

Marker File Select the Leave marker in place of the remediated file check box to create a placeholder file
that replaces the confidential file.

Marker Text Specify the text to appear in the marker file. If you selected the option to leave the marker file in
place of the remediated file, you can use variables in the marker text.

To specify the marker text, select the variable from the Insert Variable list.

For example, for Marker Text you might enter:

A message has violated the following rules in $POLICY$: $RULES$

Or, you might enter:

$FILE_NAME$ has been moved to $QUARANTINE_PARENT_PATH$

See “About response rule actions” on page 1738.


See “Response rule actions for endpoint detection” on page 1740.
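To preview how marker text reads once its variables are filled in, you can use a sketch like the following. It only imitates the substitution for preview purposes; the DLP Agent performs the real substitution. The function name, the sample values, and the choice to leave unknown variables untouched are assumptions for illustration.

```python
import re
from typing import Dict

# Variables are written as an upper-case name between dollar signs,
# for example $FILE_NAME$.
VARIABLE_PATTERN = re.compile(r"\$([A-Z_]+)\$")


def render_marker_text(template: str, values: Dict[str, str]) -> str:
    """Replace each $NAME$ variable with its value from `values`."""
    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        # Assumption: leave unknown variables in place so they stay visible.
        return values.get(name, match.group(0))

    return VARIABLE_PATTERN.sub(substitute, template)


# Hypothetical sample values.
preview = render_marker_text(
    "$FILE_NAME$ has been moved to $QUARANTINE_PARENT_PATH$",
    {"FILE_NAME": "report.xlsx", "QUARANTINE_PARENT_PATH": r"\\server\quarantine"},
)
print(preview)
```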

Configuring the Endpoint Prevent: Block action


The Endpoint Prevent: Block response rule action blocks the movement of confidential data
on the endpoint and optionally displays an on-screen notification to the endpoint user.
See “About response rule actions” on page 1738.
This response rule action is specific to Endpoint Prevent incidents. This response rule is not
applicable to two-tiered detection methods requiring a Data Profile.
See “Setting up and configuring Endpoint Discover” on page 2325.
If you combine multiple endpoint response rules in a single policy, make sure that you
understand the order of precedence for such rules.
See “About response rule action execution priority” on page 1753.

Note: The block action is not triggered for a copy of sensitive data to a local drive.

To configure the Endpoint Prevent: Block response rule action


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Endpoint Prevent: Block action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Enter the Endpoint Notification Content settings.
See Table 50-23 on page 1818.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-23 Endpoint Prevent: Block response rule action parameters

Parameter Configuration

Language Select the language you want the response rule to execute on. Click Add Language to add more
than one language.

See “About Endpoint Prevent response rules in different locales” on page 2314.

See “Setting Endpoint Prevent response rules for different locales” on page 2315.

Display Alert Box with this message This field is optional for Endpoint Block actions. Select an
Endpoint Block action to display an on-screen notification to the endpoint user when the system
blocks an attempt to copy confidential data.

Enter the notification message in the text box. You can add variables to the message by selecting
the appropriate value(s) from the Insert Variable box.

Optionally, you can configure the on-screen notification to include user justifications as well as an
option for users to enter their own justification.

You can also add hyperlinks to refer users to URLs that contain company security information. To
add hyperlinks you use standard HTML syntax, tags, and URLs. Tags are case-sensitive. You can
include hyperlinked text between regular text. For example, you would enter:

The $CONTENT_TYPE$ "$CONTENT_NAME$" contains sensitive information. <a
href="http://www.company.com">Click here for information</a>. Contact the <a
href="mailto:[email protected]">administrator</a> if you have questions.

Insert Variable Select the variables to include in the on-screen notification to the endpoint when the system blocks
an attempt to copy confidential data.
You can select variables based on the following types:

■ Application
■ Content Name
■ Content Type
■ Matching Attachments
■ Matching Recipient Domains
■ Device Type
■ Matching Recipients
■ Policy Names
■ Protocol

Allow user to choose explanation Select this option to display up to four user justifications in
the on-screen notification. When the notification appears on the endpoint, the user is required to
choose one of the justifications. (If you select Allow user to enter text explanation, the user can
enter a justification.) Symantec Data Loss Prevention provides four default justifications, which
you can modify or remove as needed.
Justification:

■ User Education
■ Broken Business Process
■ Manager Approved
■ False positive
Each justification entry consists of the following options:

■ Check box
This option indicates whether to include the associated justification in the notification. To remove
a justification, clear the check box next to it. To include a justification, select the check box next
to it.
■ Justification
The system label for the justification. This value appears in reports (for ordering and filtering
purposes), but the user does not see it. You can select the desired option from the drop-down
list.
■ Option Presented to End User
The justification text the system displays in the notification. This value appears in reports with the
justification label. You can modify the default text as desired.
To add a new justification, select New Justification from the drop-down list. In the Enter new
justification text box that appears, enter the justification name. When you save the rule, Symantec
Data Loss Prevention includes it as an option (in alphabetical order) in all Justification drop-down
lists.
Note: You should be selective when adding new justifications. Deleting new justifications is not
currently supported.

Allow user to enter text explanation Select this option to include a text box into which users can
enter their own justification.

See “Response rule actions for endpoint detection” on page 1740.


See “Recovering sensitive files on Mac endpoints” on page 2368.
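The justification options in Table 50-23 can be pictured as a list of entries, each with a check box, a system label, and the text presented to the end user. The following sketch shows that structure and enforces the limit of four displayed justifications. The data layout, the function name, and the sample end-user texts are assumptions for illustration and do not reflect how the Enforce Server stores these settings.

```python
from typing import List, Tuple

MAX_JUSTIFICATIONS = 4  # The notification displays up to four justifications.

# (system label, option presented to end user, check box selected)
Entry = Tuple[str, str, bool]


def select_justifications(entries: List[Entry]) -> List[Tuple[str, str]]:
    """Return the (label, text) pairs whose check box is selected."""
    selected = [(label, text) for label, text, enabled in entries if enabled]
    if len(selected) > MAX_JUSTIFICATIONS:
        raise ValueError("at most four justifications can be displayed")
    return selected


# Labels are the documented defaults; the end-user texts are hypothetical.
defaults: List[Entry] = [
    ("User Education", "I did not know this was against policy.", True),
    ("Broken Business Process", "This is part of an established process.", True),
    ("Manager Approved", "My manager approved this transfer.", False),
    ("False positive", "This data is not confidential.", True),
]
print(select_justifications(defaults))
```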

Configuring the Endpoint Prevent: Encrypt action


The Endpoint Prevent: Encrypt response rule action automatically encrypts a sensitive file and
displays a notification when a user attempts to do any of the following:
■ Transfer a sensitive file to a removable storage device
A user can copy a sensitive file to the removable storage device through Windows Explorer,
Command Line, or PowerShell. The DLP Agent blocks the Save As operation for an
encrypted file on a removable storage device. The ICE Utility decrypts the encrypted files,
and opens them in the native applications on removable storage devices. The ICE Utility
allows the Save operation when a user updates an encrypted file on a removable storage
device.
■ Transfer a sensitive file to a cloud storage application
Examples of commonly used cloud storage applications are Box, Google Drive, Microsoft
OneDrive, and so on.
■ Upload a sensitive file or folder with encrypted files with browsers using HTTPS on Windows
endpoints
A user can upload ICE encrypted files or folders from a local disk, network share, or a
removable storage device using a browser. When a user uploads a sensitive file or folder
using a browser, the DLP Agent blocks the user action, automatically encrypts the file
with an .html extension, and replaces the original file at the source location. The user is
then prompted to upload this encrypted file or folder using the browser to protect sensitive
information.
The maximum supported file size for the Endpoint Prevent: Encrypt response action is 150
MB.
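The size limit can be expressed as a simple check, for example in a script that audits files before a policy rollout. This is a minimal sketch: it assumes that 1 MB means 1024 × 1024 bytes and that a file of exactly 150 MB is still supported, neither of which this guide specifies. The constant and function names are hypothetical.

```python
# Assumption: 1 MB = 1024 * 1024 bytes; the guide does not define the unit.
MAX_ENCRYPT_FILE_SIZE_BYTES = 150 * 1024 * 1024


def within_encrypt_size_limit(size_in_bytes: int) -> bool:
    """Return True if a file of this size is within the 150 MB limit."""
    if size_in_bytes < 0:
        raise ValueError("file size cannot be negative")
    # Assumption: the limit is inclusive.
    return size_in_bytes <= MAX_ENCRYPT_FILE_SIZE_BYTES


print(within_encrypt_size_limit(10 * 1024 * 1024))
print(within_encrypt_size_limit(200 * 1024 * 1024))
```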
See “About response rule actions” on page 1738.
For information about the Endpoint Prevent: Encrypt response rule action, see “Response
rule best practices” on page 1759.
This response rule action is available after you apply the Endpoint Prevent ICE license. If the
ICE license is not applied or the ICE settings are not enabled on the System > Agents >
Agent Configuration > Settings page, then the encrypt response action blocks the file.
Before you use this response action, complete the following:
■ Install the ICE Utility on the endpoints.

Note: If the ICE Utility is not installed, then the DLP Agent does not block the file.

■ Apply the Endpoint Prevent ICE license and configure the Enforce Server to connect to
the Symantec Information Centric Encryption Cloud.
For information about how Symantec Data Loss Prevention interacts with Symantec ICE,
refer to the Symantec Information Centric Encryption Deployment Guide.
See “Configuring the Enforce Server to connect to the Symantec ICE Cloud” on page 224.
■ Enable Information Centric Encryption settings for DLP Agents on the System > Agents
> Agent Configuration > Settings page.
See “Agent settings” on page 2364.
See “Information Centric Encryption settings for DLP Agents” on page 2371.
When a violation is detected, the DLP Agent encrypts the file, the data transfer completes,
and an incident is created. You can provide a reason for the notification as well as options for
the endpoint user to enter a justification for the action. This response rule action is available
for Endpoint Prevent on Windows and Mac endpoints.
See “How to implement Endpoint Prevent” on page 2312.
To configure the Endpoint Prevent: Encrypt action
1 Navigate to Policies > Response Rules, click Add Response Rule, and select the type
of response rule to add: Automated Response or Smart Response.
2 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
3 Add the Endpoint Prevent: Encrypt action type from the Actions list.
See “Configuring response rule actions” on page 1765.
4 Configure the Endpoint Prevent: Encrypt parameters.
See Table 50-24 on page 1822.
5 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-24 Endpoint Prevent: Encrypt parameters

Parameter Description

Language Select the language you want the response rule to apply to. Click Add
Language to add more than one language.

See “About Endpoint Prevent response rules in different locales” on page 2314.

See “Setting Endpoint Prevent response rules for different locales” on page 2315.

Display Block Alert Box with this message This field is required to notify users that the data
transfer was blocked.

Enter the notification message in the text box. You can add variables to the
message by selecting the appropriate value(s) from the Insert Variable box.

A user must click OK to acknowledge the alert and dismiss the pop-up dialog.

Display Encrypt Alert Box with this message This field is required to notify users that the file
that they tried to transfer was encrypted.

Enter the notification message in the text box. You can add variables to the
message by selecting the appropriate value(s) from the Insert Variable box.

A user must click OK to acknowledge the alert and dismiss the pop-up dialog.

Display Retry Alert with this message This field is required to notify users that the file they
tried to upload using the browser was encrypted at the source location, and the original file was
deleted. The users should upload this encrypted file using the browser.

Enter the notification message in the text box. You can add variables to the
message by selecting the appropriate value(s) from the Insert Variable box.

A user must click OK to acknowledge the alert and dismiss the pop-up dialog.

Insert Variable Select the variables that you want to include in the on-screen notification to
the endpoint user.

You can select variables based on the following types:

■ Application
■ Content Name
■ Content Type
■ Device Type
■ Policy Name
■ Protocol

Allow user to choose explanation Select this option to display up to four user justifications in
the on-screen notification. When the notification appears on the endpoint, the user is required
to choose one of the justifications. (If you select Allow user to enter text explanation, the user
can enter a justification.) Symantec Data Loss Prevention provides four default justifications,
which you can modify or remove as needed.

Available justifications:

■ Broken Business Process
■ False positive
■ Manager Approved
■ User Education
■ New justification (custom)

Each justification entry consists of the following options:

■ Check box
This option indicates whether to include the associated justification in the
notification. To remove a justification, clear the check box next to it. To
include a justification, select the check box next to it.
■ Justification
The system label for the justification. This value appears in reports (for
ordering and filtering purposes), but the user does not see it. You can select
the desired option from the drop-down list.
■ Option Presented to End User
The justification text Symantec Data Loss Prevention displays in the
notification. This value appears in reports with the justification label. You
can modify the default text as desired.

To add a new justification, select New justification from the appropriate
drop-down list. In the Enter new justification text box that appears, type the
justification name. When you save the rule, the system includes the new
justification as an option (in alphabetical order) in all Justification drop-down
lists.
Note: You should be selective in adding new justifications. Deleting new
justifications is not currently supported.

Allow user to enter text explanation Select this option to include a text box into which users can
enter their own justification.

See “Implementing response rules” on page 1758.



Configuring the Endpoint Prevent: Notify action


The Endpoint Prevent: Notify response rule action displays an on-screen notification to the
endpoint user when the user attempts to copy or send a sensitive file. You can provide a reason
for the notification as well as options for the endpoint user to give a justification for the action.
See “About response rule actions” on page 1738.
This response rule action is available for Endpoint Prevent.
See “How to implement Endpoint Prevent” on page 2312.

Note: The notify action is not triggered for a copy of sensitive data to a local drive.

To configure the Endpoint Prevent: Notify action


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Endpoint Prevent: Notify action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Configure the action parameters.
See Table 50-25 on page 1825.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-25 Endpoint Prevent: Notify response rule action parameters

Parameter Description

Language Select the language you want the response rule to execute on.

Click Add Language to add more than one language.

See “About Endpoint Prevent response rules in different locales” on page 2314.

See “Setting Endpoint Prevent response rules for different locales” on page 2315.

Display Alert Box with this message This field is required for Endpoint Notify actions. Select this
option to display an on-screen notification to the endpoint user.

Enter the notification message in the text box. You can add variables to the message by selecting
the appropriate value(s) from the Insert Variable box.

Optionally, you can configure the on-screen notification to include user justifications as well as
the option for users to enter their own justifications.

You can also add hyperlinks to refer users to URLs that contain company security information.
To add hyperlinks you use standard HTML syntax, tags, and URLs. Tags are case-sensitive.
You can insert hyperlinked text between regular text. For example, you would enter:

The $CONTENT_TYPE$ "$CONTENT_NAME$" contains sensitive information. <a
href="http://www.company.com">Click here for information</a>. Contact the <a
href="mailto:[email protected]">administrator</a> if you have questions.

Insert Variable Select the variables that you want to include in the on-screen notification to the endpoint user.
You can select variables based on the following types:

■ Application
■ Content Name
■ Content Type
■ Device Type
■ Policy Names
■ Protocol

Allow user to choose explanation Select this option to display up to four user justifications in the on-screen notification.
When the notification appears on the endpoint, the user is required to choose one of the justifications.
(If you select Allow user to enter text explanation, the user can enter a justification.) Symantec
Data Loss Prevention provides four default justifications, which you can modify or remove as
needed.
Available Justifications:

■ Broken Business Process
■ False positive
■ Manager Approved
■ User Education
■ Custom (new justification)
Each justification entry consists of the following options:

■ Check box
This option indicates whether to include the associated justification in the notification. To
remove a justification, clear the check box next to it. To include a justification, select the
check box next to it.
■ Justification
The system label for the justification. This value appears in reports (for ordering and filtering
purposes), but the user does not see it. You can select the desired option from the drop-down
list.
■ Option Presented to End User
The justification text Symantec Data Loss Prevention displays in the notification. This value
appears in reports with the justification label. You can modify the default text as desired.

To add a new justification, select New Justification from the appropriate drop-down list. In the
Enter new justification text box that appears, type the justification name. When you save the
rule, the system includes the new justification as an option (in alphabetical order) in all
Justification drop-down lists.
Note: You should be selective in adding new justifications. Deleting new justifications is not
currently supported.

Allow user to enter text explanation Select this option to include a text box into which users can enter their own justification.

See “Response rule actions for endpoint detection” on page 1740.
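The way variables are filled into the notification message can be illustrated with a short sketch. The following Python is not part of Symantec Data Loss Prevention. The function name render_notification and the sample values are hypothetical, and the sketch assumes only that variables use the $NAME$ form shown in the example message above.

```python
import re


def render_notification(template: str, values: dict) -> str:
    """Replace $NAME$ placeholders with the supplied values.

    Placeholders without a supplied value are left unchanged.
    This is an illustrative assumption, not documented product behavior.
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return values.get(name, match.group(0))

    return re.sub(r"\$([A-Z_]+)\$", substitute, template)


message = 'The $CONTENT_TYPE$ "$CONTENT_NAME$" contains sensitive information.'
rendered = render_notification(
    message,
    {"CONTENT_TYPE": "file", "CONTENT_NAME": "accounts.xlsx"},  # sample values
)
```

In this sketch, rendered contains the message text with both placeholders replaced. Any HTML tags in the message pass through untouched.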


Configuring the Endpoint Prevent: User Cancel action


The Endpoint Prevent: User Cancel response rule action displays a time-sensitive notification
to the user when a policy is violated.
See “About response rule actions” on page 1738.
Users have a limited amount of time to decide whether to ignore the policy violation. If the violation
is ignored, the data transfer completes and an incident is created. If the violation is not ignored,
the data transfer is stopped and an incident is created. If the user does not make a decision
in the allotted time, the data transfer is automatically blocked and an incident is created. You
can provide a reason for the notification as well as options for the endpoint user to enter a
justification for the action.
This response rule action is available for Endpoint Prevent on Windows endpoints only.
See “How to implement Endpoint Prevent” on page 2312.
To configure the Endpoint Prevent: User Cancel action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
Add the Endpoint Prevent: User Cancel action type from the Actions list.
See “Configuring response rule actions” on page 1765.
2 Configure the Endpoint Prevent: User Cancel parameters.
See Table 50-26 on page 1828.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-26 Endpoint Prevent: User Cancel parameters

Parameter Description

Language Select the language you want the response rule to execute on.

Click Add Language to add more than one language.

See “About Endpoint Prevent response rules in different locales” on page 2314.

See “Setting Endpoint Prevent response rules for different locales” on page 2315.

Pre-timeout warning This field is required to notify users that they have a limited amount of time to respond to the
incident.

Enter the notification message in the text box. You can add variables to the message by selecting
the appropriate value(s) from the Insert Variable box.

Post-timeout message This field notifies users that the amount of time to override the policy has expired and that
the data transfer was blocked.

Enter the notification message in the text box. You can add variables to the message by selecting
the appropriate value(s) from the Insert Variable box.

Display Alert Box with this message This field is required for Endpoint User Cancel actions. Select this option to display an
on-screen notification to the endpoint user.

Enter the notification message in the text box. You can add variables to the message by selecting
the appropriate value(s) from the Insert Variable box.

Optionally, you can configure the on-screen notification to include user justifications as well as
the option for users to enter their own justifications.

You can also add hyperlinks to refer users to URLs that contain company security information.
To add hyperlinks you use standard HTML syntax, tags, and URLs. Tags are case-sensitive.
You can insert hyperlinked text between regular text. For example, you would enter:

The $CONTENT_TYPE$ "$CONTENT_NAME$" contains sensitive information. <a
href="http://www.company.com">Click here for information</a>. Contact the <a
href="mailto:[email protected]">administrator</a> if you have questions.

Insert Variable Select the variables that you want to include in the on-screen notification to the endpoint user.
You can select variables based on the following types:

■ Application
■ Content Name
■ Content Type
■ Device Type
■ Policy Name
■ Protocol
■ Timeout Counter

Note: You must use the Timeout Counter variable to display how much time remains before
blocking the data transfer.

Allow user to choose explanation. Select this option to display up to four user justifications in the on-screen notification.
When the notification appears on the endpoint, the user is required to choose one of the justifications.
(If you select Allow user to enter text explanation, the user can enter a justification.) Symantec
Data Loss Prevention provides four default justifications, which you can modify or remove as
needed.
Available Justifications:

■ Broken Business Process
■ False positive
■ Manager Approved
■ User Education
■ Custom (new justification)
Each justification entry consists of the following options:

■ Check box
This option indicates whether to include the associated justification in the notification. To
remove a justification, clear the check box next to it. To include a justification, select the
check box next to it.
■ Justification
The system label for the justification. This value appears in reports (for ordering and filtering
purposes), but the user does not see it. You can select the desired option from the drop-down
list.
■ Option Presented to End User
The justification text Symantec Data Loss Prevention displays in the notification. This value
appears in reports with the justification label. You can modify the default text as desired.

To add a new justification, select New Justification from the appropriate drop-down list. In the
Enter new justification text box that appears, type the justification name. When you save the
rule, the system includes the new justification as an option (in alphabetical order) in all
Justification drop-down lists.
Note: You should be selective in adding new justifications. Deleting new justifications is not
currently supported.

Allow user to enter text explanation. Select this option to include a text box into which users can enter their own justification.

See “Implementing response rules” on page 1758.
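The three possible outcomes of a User Cancel notification, as described at the start of this topic, can be summarized in a small sketch. The Python below is illustrative only. The names UserDecision and user_cancel_outcome are hypothetical, and a decision of None stands for a user who did not respond within the allotted time.

```python
from enum import Enum
from typing import Optional


class UserDecision(Enum):
    IGNORE_VIOLATION = "ignore"   # the user chooses to continue the transfer
    CANCEL_TRANSFER = "cancel"    # the user chooses not to ignore the violation


def user_cancel_outcome(decision: Optional[UserDecision]) -> dict:
    """Return the outcome for a user decision, or for a timeout (None)."""
    return {
        # The transfer completes only if the user ignores the violation.
        "transfer_completes": decision is UserDecision.IGNORE_VIOLATION,
        # An incident is created in every case.
        "incident_created": True,
        # No decision within the allotted time means the transfer is blocked.
        "timed_out": decision is None,
    }
```

In all three cases an incident is created. Only the fate of the data transfer differs.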


Configuring the Network Prevent for Web: Block FTP Request action
The Network Prevent for Web: Block FTP Request response rule action blocks any file transfer
by FTP on your network device.
See “About response rule actions” on page 1738.
This response rule is available only for Network Prevent for Web integrated with a proxy server.
See “Configuring Network Prevent for Web Server” on page 2064.
To configure the Network Prevent for Web: Block FTP Request response rule action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Network Prevent for Web: Block FTP Request action type from the Actions
list.
The Block FTP Request response rule action does not require any further configuration.
Once the response rule is deployed to a policy, this action blocks any FTP attempt.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Network Prevent for Web: Block HTTP/S action
The Network Prevent for Web: Block HTTP/S response rule action blocks the transmission of
Web content that Network Prevent for Web detects. This action also blocks Web-based email
messages and attachments.
See “About response rule actions” on page 1738.
This response rule action blocks the transmission of Web content using the Internet Content
Adaptation Protocol (ICAP). To implement this response rule action you must integrate the
detection server with a Web proxy server.
See “Configuring Network Prevent for Web Server” on page 2064.
To configure the Network Prevent for Web: Block HTTP/S response rule action


1 Integrate Network Prevent for Web with a proxy server and, if necessary, a VPN server.
See “Network Prevent for Web Server—basic configuration” on page 259.
2 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
3 Add the Network Prevent for Web: Block HTTP/S action type from the Actions list.
See “Configuring response rule actions” on page 1765.
4 Edit the Rejection Message, as necessary.
The system presents this message to the user's browser when the action blocks content.
For example, you might include some HTML-coded text to display in a browser.

Note: If the requesting client does not expect an HTML response, the Rejection Message
may not be displayed in the client browser. For example, a client expecting an XML
response to a Web post may only indicate a JavaScript error.

5 Click Save to save the configuration of the response rule.


Certain applications may not provide an adequate response to the Network Prevent for Web:
Block HTTP/S response action. This behavior has been observed with the Yahoo! Mail
application when a detection server blocks a file upload. If a user tries to upload an email
attachment and the attachment triggers a Network Prevent for Web: Block HTTP/S response
action, Yahoo! Mail does not respond or display an error message to indicate that the file is
blocked. Instead, Yahoo! Mail appears to continue uploading the selected file, but the upload
never completes. The user must manually cancel the upload at some point by pressing Cancel.
Other applications may also exhibit this behavior, depending on how they handle the block
request. In these cases a detection server incident is created and the file upload is blocked
even though the application provides no such indication.
See “Implementing response rules” on page 1758.

Configuring the Network Prevent: Block SMTP Message action
The Network Prevent: Block SMTP Message response rule action blocks SMTP email messages
that cause an incident on the Network Prevent for Email detection server and the Cloud Service
for Email.
See “About response rule actions” on page 1738.
See “Response rule actions for Network Prevent detection” on page 1741.
You must integrate the Network Prevent for Email detection server with a Mail Transfer Agent
(MTA) to implement this response rule action. Refer to the Symantec Data Loss Prevention
MTA Integration Guide for Network Prevent for Email for details.
To configure the Block SMTP Message response rule action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Network Prevent: Block SMTP Message action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Configure the Block SMTP Message action parameters.
See Table 50-27 on page 1833.
4 Click Save to save the response rule.
See “Manage response rules” on page 1761.

Table 50-27 Network Prevent: Block SMTP Message parameters

Parameter Description

Bounce Message to Sender Enter the text that you want to appear in the SMTP error that Network Prevent
for Email returns to the MTA. Some MTAs display this text in the message that
is bounced to the sender.

If you leave this field blank, the message does not bounce to the sender but
the MTA sends its own message.

Redirect Message to this Address If you want to redirect blocked messages to a particular address (such as the
Symantec Data Loss Prevention administrator), enter that address in this field.

If you leave this field blank, the bounced message goes to the sender only.

See “Implementing response rules” on page 1758.

Configuring the Network Prevent: Modify SMTP Message action
The Network Prevent: Modify SMTP Message response rule action lets you modify a sensitive
email. For example, you can use this action to change an email subject header to include
information about the policy violation type. The Modify SMTP Message response rule also
works with Cloud Service for Email.
See “About response rule actions” on page 1738.
See “Response rule actions for Network Prevent detection” on page 1741.
To configure the Network Prevent: Modify SMTP Message action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Network Prevent: Modify SMTP Message action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Configure the action parameters.
See Table 50-28 on page 1834.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-28 Network Prevent: Modify SMTP Message parameters

Parameter Description

Subject Select the type of modification to make to the subject of the message from the following options:

■ Do not Modify – No text is changed in the subject.
■ Prepend – New text is added to the beginning of the subject.
■ Append – New text is added to the end of the subject.
■ Replace With – New text completely replaces the old subject text.

If you choose to modify the subject, specify the new text.

For example, if you want to prepend "VIOLATION" to the subject of the message, select Prepend
and enter VIOLATION in the text field.

Headers Enter a unique name and a value for each header you want to add to the message (up to three).

Enable Email Quarantine Connect (requires Symantec Messaging Gateway) Select this option to enable integration with Symantec
Messaging Gateway. When this option is enabled, Symantec Data Loss Prevention adds preconfigured x-headers to the
message that inform Symantec Messaging Gateway that the message should be quarantined.

For more information, see the Symantec Data Loss Prevention Email Quarantine Connect FlexResponse Implementation Guide.

See “Implementing response rules” on page 1758.
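The effect of the Subject and Headers parameters can be sketched with the Python standard library. This is an illustration only, not the product's implementation. The function names are hypothetical, the header name in the usage note is invented for the example, and the single space placed between the added text and the original subject is an assumption, because this guide does not state which separator the product uses.

```python
from email.message import EmailMessage
from typing import Optional

MAX_ADDED_HEADERS = 3  # the Headers parameter accepts up to three headers


def modify_subject(subject: str, mode: str, text: str = "") -> str:
    """Apply one of the four Subject options to a subject string."""
    if mode == "Do not Modify":
        return subject
    if mode == "Prepend":
        return f"{text} {subject}"   # assumption: single-space separator
    if mode == "Append":
        return f"{subject} {text}"   # assumption: single-space separator
    if mode == "Replace With":
        return text
    raise ValueError(f"Unknown subject option: {mode}")


def modify_smtp_message(message: EmailMessage, mode: str, text: str = "",
                        headers: Optional[dict] = None) -> EmailMessage:
    """Modify the subject and add up to three headers to a message."""
    headers = headers or {}
    if len(headers) > MAX_ADDED_HEADERS:
        raise ValueError("At most three headers can be added")
    if mode != "Do not Modify":
        original = str(message.get("Subject", ""))
        del message["Subject"]       # remove the old header before setting it
        message["Subject"] = modify_subject(original, mode, text)
    for name, value in headers.items():
        message[name] = value
    return message
```

For example, calling modify_smtp_message with the Prepend option and the text VIOLATION on a message whose subject is "Q3 forecast" yields the subject "VIOLATION Q3 forecast" in this sketch.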


Configuring the Network Prevent for Web: Remove HTTP/S Content action
The Network Prevent for Web: Remove HTTP/S Content response action removes confidential
data that is posted to Web mail sites (such as Gmail), blogs (such as Blogspot), and other
sites. This action also removes confidential data that is included in any files that users upload
to Web sites or attach to Web mail. This action only applies to HTTP/S POST commands; it
does not apply to GET commands.
See “About response rule actions” on page 1738.
This response rule action is only available for Network Prevent for Web.
See “Response rule actions for Network Prevent detection” on page 1741.
Symantec Data Loss Prevention recognizes Web form fields for selected Web mail, blog, and
social networking sites. If Network Prevent for Web cannot remove confidential data for a Web
site it recognizes, it creates a system event and performs a configured fallback option.

Note: Symantec Data Loss Prevention removes content for file uploads and, for Network
Prevent, Web mail attachments even for those sites that it does not recognize for HTTP content
removal.

To configure the Network Prevent for Web: Remove HTTP/S Content action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Network Prevent for Web: Remove HTTP/S Content action type from the
Actions list.
See “Configuring response rule actions” on page 1765.
3 Configure the action parameters.
See Table 50-29 on page 1835.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-29 Network Prevent for Web: Remove HTTP/S Content parameters

Field Description

Removal Message The message that appears in content (Web postings, Web mail, or files) from which the system has
removed confidential information. Only the recipient sees this message.

Fallback option The action to take if Network Prevent for Web cannot remove confidential information that was
detected in an HTTP or HTTPS post.

The available options are Block (the default) and Allow.


Note: Symantec Data Loss Prevention removes confidential data in file uploads and, for Network
Prevent, Web mail attachments, even for sites in which it does not perform content removal. The
Fallback option is taken only in cases where Symantec Data Loss Prevention detects confidential
content in a recognized Web form, but it cannot remove the content.

Rejection Message The message that Network Prevent for Web returns to a client when it blocks an HTTP or HTTPS
post. The client Web application may or may not display the rejection message, depending on how
the application handles error messages.

See “Implementing response rules” on page 1758.
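The role of the Fallback option can be summarized in a brief sketch. The Python below is illustrative only. The function name and the result keys are hypothetical, and the sketch covers only the case this topic describes: confidential content detected in a recognized Web form.

```python
def handle_recognized_form_post(removal_succeeded: bool,
                                fallback_option: str = "Block") -> dict:
    """Model the outcome for confidential content in a recognized Web form.

    fallback_option is "Block" (the default) or "Allow".
    """
    if fallback_option not in ("Block", "Allow"):
        raise ValueError(f"Unknown fallback option: {fallback_option}")
    if removal_succeeded:
        # Content was removed, so the post proceeds without the sensitive data.
        return {"post_forwarded": True, "content_removed": True,
                "system_event": False}
    # Removal failed: a system event is created and the fallback applies.
    return {"post_forwarded": fallback_option == "Allow",
            "content_removed": False, "system_event": True}
```

With the default Block setting, a failed removal stops the post. With Allow, the post goes through with the confidential content still in it, so Allow trades protection for availability.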

Configuring the Network Protect: Copy File action


The Network Protect: Copy File response rule action copies a sensitive file to the local file
system.
See “About response rule actions” on page 1738.
This response rule action is only available for Network Discover that is configured for Network
Protect.
See “Response rule actions for Network Prevent detection” on page 1741.
To configure the Network Protect: Copy File response rule action
1 Configure a network file share and specify a location to copy files to.
See “Configuring Network Protect for file shares” on page 2177.
2 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
3 Select the Network Protect: Copy File action type from the Actions list.
This action does not require you to configure any parameters.
See “Configuring response rule actions” on page 1765.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.

Configuring the Network Protect: Quarantine File action
The Network Protect: Quarantine File response rule action quarantines a file that the detection
server identifies as sensitive or protected.
See “About response rule actions” on page 1738.
This response rule action is only available for Network Discover that is configured for Network
Protect.
See “Response rule actions for Network Prevent detection” on page 1741.
To configure the Network Protect: Quarantine File response rule action
1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Network Protect: Quarantine File action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Configure the Network Protect: Quarantine File parameters.
See Table 50-30 on page 1837.
4 Click Save to save the configuration.
See “Manage response rules” on page 1761.

Table 50-30 Network Protect: Quarantine File configuration parameters

Parameter Description

Marker File Select this option to create a marker text file to replace the original file. This action notifies the user
what happened to the file instead of quarantining or deleting the file without any explanation.
Note: The marker file is the same type and has the same name as the original file, as long as it is a
text file. An example of such a file type is Microsoft Word. If the original file is a PDF or image file, the
system creates a plain text marker file. The system then gives the file the same name as the original
file with .txt appended to the end. For example, if the original file name is accounts.pdf, the marker file
name is accounts.pdf.txt.

Marker Text Specify the text to appear in the marker file. If you selected the option to leave the marker file in place
of the remediated file, you can use variables in the marker text.

To specify marker text, select the variable from the Insert Variable list.

For example, for Marker Text you might enter:

A message has violated the following rules in $POLICY$: $RULES$

Or, you might enter:

$FILE_NAME$ has been moved to $QUARANTINE_PARENT_PATH$

See “Implementing response rules” on page 1758.
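The marker file naming rule and the marker text variables can be illustrated with a short sketch. The Python below is not the product's implementation. The function names are hypothetical, and the set of extensions treated as text file types is an assumption made for the example, because this guide names only Microsoft Word as such a type.

```python
import re
from pathlib import PurePath

# Assumption: which extensions count as text file types for marker files.
TEXT_LIKE_EXTENSIONS = {".txt", ".doc", ".docx"}


def marker_file_name(original_name: str) -> str:
    """Return the marker file name for a quarantined file."""
    suffix = PurePath(original_name).suffix.lower()
    if suffix in TEXT_LIKE_EXTENSIONS:
        return original_name          # same name and type as the original
    return original_name + ".txt"     # plain text marker, .txt appended


def render_marker_text(template: str, values: dict) -> str:
    """Fill $NAME$ variables in the marker text; unknown ones stay as-is."""
    return re.sub(r"\$([A-Z_]+)\$",
                  lambda m: values.get(m.group(1), m.group(0)),
                  template)
```

For example, marker_file_name("accounts.pdf") returns accounts.pdf.txt, which matches the example in the Marker File description.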

Configuring the Network Protect: Encrypt File action


The Network Protect: Encrypt File response rule action encrypts a file that the detection server
identifies as sensitive or protected. This functionality is available only if the Network Discover
ICE license is installed and the Enforce Server has been configured to connect to the Symantec
ICE Cloud. The maximum supported file size is 2047 MB.
See “Configuring the Enforce Server to connect to the Symantec ICE Cloud” on page 224.
For information about how Symantec Data Loss Prevention interacts with Symantec ICE, refer
to the Symantec Information Centric Encryption Deployment Guide.

Note: When a file is encrypted, the file extension changes to .html. You must manually update
any links that point to the original unencrypted file.

See “About response rule actions” on page 1738.


For information about the limitations of the Network Protect: Encrypt File response rule action,
see “Response rule best practices” on page 1759.
This response rule action is only available for Network Discover that is configured for Network
Protect.
See “Response rule actions for Network Prevent detection” on page 1741.
To configure the Network Protect: Encrypt File response rule action


1 Configure a response rule at the Configure Response Rule screen.
See “Configuring response rules” on page 1763.
2 Add the Network Protect: Encrypt File action type from the Actions list.
See “Configuring response rule actions” on page 1765.
3 Click Save to save the configuration.
See “Manage response rules” on page 1761.
See “Implementing response rules” on page 1758.
Section 6
Remediating and managing incidents

■ Chapter 51. Remediating incidents

■ Chapter 52. Remediating Network incidents

■ Chapter 53. Remediating Endpoint incidents

■ Chapter 54. Remediating Discover incidents

■ Chapter 55. Working with Application incidents

■ Chapter 56. Managing and reporting incidents

■ Chapter 57. Hiding incidents

■ Chapter 58. Working with incident data

■ Chapter 59. Working with user risk

■ Chapter 60. Implementing lookup plug-ins


Chapter 51
Remediating incidents
This chapter includes the following topics:

■ About incident remediation

■ Remediating incidents

■ Executing Smart response rules

■ Incident remediation action commands

■ Response action variables

About incident remediation


As incidents occur in your system, individuals in your organization must analyze the incidents,
determine why they occurred, identify trends, and remediate the problems.
Symantec Data Loss Prevention provides a rich set of capabilities which can be used to build
an effective incident remediation process. Once you are ready to take action, you can use a
series of incident commands on the Incident Snapshot and Incident List pages.
Since the Incident Snapshot page displays details about one specific incident, you can select
a command to perform an action on the displayed incident.
On the Incident List page, you can perform an action on multiple incidents at one time. You
can select more than one incident from the list and then choose the desired command.
Table 51-1 describes the options that are involved in incident remediation:

Table 51-1 Options involved in incident remediation

Remediation options Description

Role-based access control Access to incident information in the Symantec Data Loss Prevention system
can be tightly controlled with role-based access control. Roles control which
incidents a particular remediator can take action on, as well as what
information within that incident is available to the remediator. For example,
access control can be used to ensure that a given remediator can act only
on incidents originating within a particular business unit. In addition, it might
prevent that business unit's staff from ever seeing high-severity incidents,
instead routing those incidents to the security department.

See “About role-based access control” on page 109.

Severity level assignment Incident severity is a measure of the risk that is associated with a particular
incident. For example, an email message containing 50 customer records
can be considered more severe than a message containing 50 violations of
an acceptable use policy. Symantec Data Loss Prevention lets you specify
what constitutes a severe incident by configuring it at the policy rule level.
Symantec Data Loss Prevention then uses the severity of the incident to
drive subsequent responses to the incident. This process lets you prioritize
incidents and devote your manual remediation resources to the areas where
they are needed most.

Custom attribute lookup Custom attribute lookup is the process of collecting additional information
about the incident from data sources outside of Enforce and the incident
itself. For example, a corporate LDAP server can be queried for additional
information about the message sender, such as the sender's manager name
or business unit.

See “About using custom attributes” on page 1969.

For example, you can use custom attributes as input to subsequent automated
responses to automatically notify the sender's manager about the policy
violation.

See “Setting the values of custom attributes manually” on page 1972.



Automated incident responses A powerful feature of the Enforce Server is the ability to automatically respond
to incidents as they arise. For example, you can configure the system to
respond to a serious incident by blocking the offending communication. You
can send an email message to the sender's manager. You can send an alert
to a security event management system. You can escalate the incident to
the security department. On the other hand, an acceptable use incident might
be dispensed with by sending an email message to the sender. Then you
can mark the incident as closed, requiring no further work. Between these
extremes, you can establish a policy that automatically encrypts transmissions
of confidential data to a business partner. All of these scenarios can be
handled automatically without user intervention.

See “Configuring response rule actions” on page 1765.

Smart Response Although the automated response is an important part of the remediation
process, Smart Response is necessary at times, particularly in the case of
more serious incidents. Symantec Data Loss Prevention provides a detailed
Incident Snapshot with all of the information necessary to determine the next
steps in remediation. You can use Smart Response to manually update
incident severity, status, and custom attributes, and to add comments to the
incident. You can also move the incident through the remediation workflow to resolve it.

See “Configuring response rule actions” on page 1765.


The following standard Smart Response actions are available:

■ Add Note
■ Log to a Syslog Server
■ Send Email Notification
■ Set Status

See “Configuring the Server FlexResponse action” on page 1788.

Distribution of aggregated incident reports You can create and automatically distribute aggregated incident reports to
data owners for remediation.

The Enforce Server handles all of these steps, except for Smart Response. You can handle
incidents in an entirely automated way. You can reserve manual intervention (Smart Response)
for only the most serious incidents.
See “Network incident snapshot” on page 1857.
See “Discover incident snapshot” on page 1882.
See “Endpoint incident snapshot” on page 1866.

Remediating incidents
When you remediate an incident, you can perform the following actions:
■ Set the incident’s status or severity.
■ Apply a Smart Response rule to the incident.
■ Set the incident’s custom attributes.
■ Add comments to the incident record.
■ Remediate incidents by going to an incident list or incident snapshot and selecting actions
to perform on one or more incidents.
■ Perform some combination of these actions.
You can import a solution pack during installation. Solution packs prepopulate incident lists
and incident snapshots with several remediation options and custom attributes. For complete
descriptions of all solution packs (including information about all remediation options and
custom attributes they contain), refer to the documentation for each solution pack in
the solution packs directory of the documentation.
To remediate incidents
1 Access an incident list or incident snapshot.
In incident lists, Symantec Data Loss Prevention displays available remediation options
in the Incident Actions drop-down menu. The menu becomes active when you select
one or more incidents in the list (with the check box). In incident snapshots, Symantec
Data Loss Prevention also displays the available remediation options. You can set a
Status or Severity from the drop-down menus.
See “Viewing incidents” on page 1911.
You can also edit the Attributes and provide related information.
2 Take either of the following actions:
■ When you view an incident list, select the incident(s) to be remediated (check the box).
You can select incidents individually or select all incidents on the current screen. Then
select the desired action from the Incident Actions drop-down menu. For example,
select Incident Actions > Set Status > Escalated.
You can perform as many actions as needed.
■ When you view an incident snapshot, you can set the Status and Severity from the
drop-down menus.
If a Smart Response has been previously set up, you can select a Smart Response
rule in the remediation bar.
See “About response rules” on page 1738.

For example, if one of the Solution Packs was installed, you can select Dismiss False
Positive in the remediation bar. When the Execute Response Rule screen appears,
click OK. This Smart Response rule changes the incident status from New to
Dismissed and sets the Dismissal Reason attribute to False Positive.
You can perform as many remediation actions as needed.

Executing Smart response rules


When you execute a response rule that sends an email, you can manually compose the
contents of the email notification.

Note: Sending an email notification to the sender applies to SMTP incidents only. Also, the
notification addressees that are based on custom attributes (such as "manager email") work
correctly only if populated by the attribute lookup plug-in.

To compose an email notification response


1 Enter optional emails for copies in the CC field.
2 Select the language.
3 Compose or edit the subject and body of the email.
4 Insert variables for the fields in the incident. The supported variables appear as links to
the right of the editable fields.
For example, if you want to include the policy and rules violated, you might enter:

A message has violated the following rules in $POLICY$:


$RULES$

5 Click OK to send the notification.
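The following Python sketch illustrates how $NAME$ placeholders in a notification body might be expanded into incident values. It is not Symantec Data Loss Prevention code. The function render_template, the sample incident dictionary, and its values are hypothetical and exist only for illustration.

```python
import re


def render_template(template: str, incident: dict) -> str:
    """Replace $NAME$ placeholders with values from the incident dict.

    Placeholders with no matching key are left untouched, so a mistyped
    variable name stays visible in the rendered text.
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return str(incident.get(name, match.group(0)))

    return re.sub(r"\$([A-Z_]+)\$", substitute, template)


# Hypothetical incident values, for illustration only.
incident = {
    "POLICY": "Customer Data Protection",
    "RULES": "Credit card number, SSN keyword",
}
body = "A message has violated the following rules in $POLICY$:\n\n$RULES$"
rendered = render_template(body, incident)
print(rendered)
```

Leaving unknown placeholders in place is a design choice of this sketch, not documented product behavior. Verify how the product handles unsupported variables before you rely on it.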


See “Adding a new response rule” on page 1762.
See “About incident remediation” on page 1841.
See “Response action variables” on page 1847.

Incident remediation action commands


In an incident list, use the Incident Actions drop-down to select remediation actions.
The following incident actions are available for an incident list:

Add Note Add a brief note to the selected incident(s). The comment appears
on the Incident History tab of the Incident Snapshot page for each
selected incident.

The limit for the Add Note field is 4000 bytes.

Delete Incidents Delete the selected incident(s) from the Symantec Data Loss
Prevention system.

Proceed cautiously when deleting incidents. All data that is
associated with the incident(s) is removed. This operation cannot
be reversed.

Export Selected: CSV Export the selected incident(s) to a comma-separated (.csv) file.

Export Selected: XML Export the selected incident(s) to an XML file.

Hide/Unhide Select one of the following incident hiding actions to set the hidden
state for the selected incidents:

■ Hide Incidents—Flags the selected incidents as archived.


■ Unhide Incidents—Restores the selected incidents to the
non-archived state.
■ Do Not Hide—Prevents the selected incidents from being
archived.
■ Allow Hiding—Allows the selected incidents to be archived.

See “About incident hiding” on page 1958.

Lookup Attributes Use the configured lookup plug-ins to look up the configured
attributes.

Set Attributes Display the Set Attributes page so you can enter or edit the attribute
values for the selected incident(s).

Set Data Owner Set the following Data Owner attributes:

■ Name
■ Email Address

Set Severity Change the severity that is set for the selected incident(s) to one of
the options under Set Severity.

Set Status Change the status of the selected incident(s) to one of the options
under Set Status. A system administrator can customize the options
that appear on this list on the Incident Attributes page.

See “About incident status attributes” on page 1962.



Run Smart Response Perform one of the listed responses on the selected incident(s).
When you click a response rule, the Execute Response Rule page
appears.

These manual response rules are available only if you have
permission to remediate.

See “About incident remediation” on page 1841.
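The limit for the Add Note field is stated in bytes, not characters, so a note that contains multi-byte characters can reach the limit with fewer than 4000 characters. The following Python sketch shows one way to check a note before you submit it. It assumes UTF-8 encoding, which this guide does not confirm, and the helper names note_fits and truncate_note are hypothetical.

```python
MAX_NOTE_BYTES = 4000  # limit stated for the Add Note field


def note_fits(note: str, encoding: str = "utf-8") -> bool:
    """Return True if the encoded note is within the byte limit.

    The UTF-8 default is an assumption, not documented product behavior.
    """
    return len(note.encode(encoding)) <= MAX_NOTE_BYTES


def truncate_note(note: str, encoding: str = "utf-8") -> str:
    """Cut the note to the byte limit without splitting a character."""
    raw = note.encode(encoding)[:MAX_NOTE_BYTES]
    # errors="ignore" drops a trailing, partially cut multi-byte character.
    return raw.decode(encoding, errors="ignore")


ascii_note = "a" * 4000       # 4000 bytes in UTF-8
accented_note = "é" * 2001    # 4002 bytes in UTF-8
print(note_fits(ascii_note), note_fits(accented_note))
```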

Response action variables


Response action variables can be used in response rules.
See “Executing Smart response rules” on page 1845.
The response action variables vary by incident type.
See “General incident variables” on page 1847.
See “Endpoint incident variables” on page 1849.
See “Network Monitor and Network Prevent incident variables” on page 1848.
See “Discover incident variables” on page 1848.

General incident variables


The following general variables are available for all incident types:

$APPLICATION_NAME$ Specifies the name of the application that is associated with the
incident.

$ATTACHMENT_FILENAME$ Specifies the name of the attached file.

$BLOCKED$ Indicates whether or not Symantec Data Loss Prevention blocked
the message (yes or no).

$DESTINATION_IP$ Specifies the destination IP address.

$INCIDENT_ID$ The unique identifier of the incident.

$INCIDENT_SNAPSHOT$ The fully qualified URL to the incident snapshot page for the incident.

$MATCH_COUNT$ The incident match count.

$OCCURED_ON$ Specifies the date on which the incident occurred. This date may be
different than the date the incident was reported.

$POLICY$ The name of the policy that was violated.



$POLICY_RULES$ A comma-separated list of one or more policy rules that were violated.

$PROTOCOL$ The protocol, device type, and target type of the incident, where
applicable.

$RECIPIENTS$ A comma-separated list of one or more message recipients.

$REPORTED_ON$ Specifies the date on which the incident was reported.

$MONITOR_NAME$ Specifies the detection server or cloud detector that created the
incident.

$SENDER$ The message sender.

$SEVERITY$ The severity that is assigned to the incident.

$STATUS$ Specifies the remediation status of the incident.

$SUBJECT$ The subject of the message.

$URL$ Specifies the file path or location.

Network Monitor and Network Prevent incident variables


The following Network Monitor and Network Prevent variables are available:

$DATAOWNER_NAME$ The person responsible for remediating the incident. This field must
be set manually, or with one of the lookup plug-ins.

Reports can automatically be sent to the data owner for remediation.

$DATAOWNER_EMAIL$ The email address of the person responsible for remediating the
incident. This field must be set manually, or with one of the lookup
plug-ins.

Discover incident variables


The following Network Discover/Cloud Storage Discover and Network Protect incident variables
are available:

$DATAOWNER_NAME$ The person responsible for remediating the incident. This field must
be set manually, or with one of the lookup plug-ins.

Reports can automatically be sent to the data owner for remediation.

$DATAOWNER_EMAIL$ The email address of the person responsible for remediating the
incident. This field must be set manually, or with one of the lookup
plug-ins.

$ENDPOINT_MACHINE$ The name of the endpoint computer that generated the violation.

$PATH$ The full path to the file in which the incident was found.

$FILE_NAME$ The name of the file in which the incident was found.

$PARENT_PATH$ The path to the parent directory of the file in which the incident was
found.

$QUARANTINE_PARENT_PATH$ The path to the parent directory in which the file was quarantined.

$SCAN_DATE$ The date of the scan that found the incident.

$TARGET$ The name of the target in which the incident was found.

Endpoint incident variables


The following Endpoint incident variables are available:

$APPLICATION_USER$ The name of the application user.

$DATAOWNER_NAME$ The person responsible for remediating the incident. This field must
be set manually, or with one of the lookup plug-ins.

Reports can automatically be sent to the data owner for remediation.

$DATAOWNER_EMAIL$ The email address of the person responsible for remediating the
incident. This field can be set manually, or with one of the lookup
plug-ins.

$ENDPOINT_LOCATION$ The location of the endpoint computer.

$ENDPOINT_MACHINE$ The name of the endpoint computer that generated the violation.

$ENDPOINT_USER_NAME$ The name of the endpoint user.

$MACHINE_IP$ The corporate IP address of the endpoint computer.

$USER_JUSTIFICATION$ The justification that was provided by the endpoint user.

Application incident variables


The following Application incident variables are available:

$DATAOWNER_NAME$ The person responsible for remediating the incident. This field must
be set manually.

Reports can automatically be sent to the data owner for remediation.



$DATAOWNER_EMAIL$ The email address of the person responsible for remediating the
incident. This field must be set manually.
Chapter 52
Remediating Network incidents
This chapter includes the following topics:

■ Network incident list

■ Network incident list—Actions

■ Network incident list—Columns

■ Network incident snapshot

■ Network incident snapshot—Heading and navigation

■ Network incident snapshot—General information

■ Network incident snapshot—Matches

■ Network incident snapshot—Attributes

■ Network summary report

Network incident list


A network incident list shows multiple network incident records with information about each
incident, such as the severity, associated policy, number of matches, and status.
Click a row of the incident list to view more details about a specific incident. Select specific
incidents (or groups of incidents) to modify or remediate by clicking the check boxes at the
left.
Network incidents include incidents from Network Monitor and Network Prevent, as well as
Symantec WSS incidents generated by the Symantec Cloud Detection Service for WSS.
When IPv6 addresses appear in reports, they follow these rules:

■ Addresses are normalized in the Source IP and Destination IP fields.


■ In the Recipient (URL) fields, addresses are represented as they have been provided,
which is usually a hostname and varies by protocol.
■ In the Sender fields, representation of addresses varies by protocol.
■ Normalized fields are used for IP-based filtering.
When IPv6 addresses appear in incident list filters, they follow these rules:
■ Addresses are normalized in the Source IP and Destination IP fields.
■ In the Recipient (URL), Domain, and Sender fields, addresses are represented as they have
been provided.
■ Normalized fields are used for IP-based filtering.
When IPv6 addresses appear in incident details, they follow these rules:
■ Addresses are normalized in the Source IP and Destination IP fields.
■ In the Recipient (URL) field, addresses are represented as they have been provided.
■ In the Sender field, addresses are represented as they have been provided.
■ Links to filtered lists behave like user input.
You can view normalized IPv6 addresses in an incident summary:
■ Addresses are summarized by the Source IP, Destination IP, Sender, and Domain fields.
■ Normalization occurs for fields as it does in the incident details.
You can view non-normalized IPv6 addresses in an incident summary:
■ Addresses are summarized by the Source IP, Destination IP, Sender, and Domain fields.
■ Normalization occurs for fields as it does in the incident details.
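Normalization matters for IP-based filtering because the same IPv6 address can be written in several equivalent textual forms. The following Python sketch uses the standard ipaddress module to reduce equivalent forms to one canonical string. It illustrates the general idea only. This guide does not specify the exact canonical form that Symantec Data Loss Prevention uses, and the helper name normalize_ip is hypothetical.

```python
import ipaddress


def normalize_ip(text: str) -> str:
    """Return a canonical textual form of an IPv4 or IPv6 address.

    Python's compressed lowercase form is used here for illustration;
    the product's own normalized form may differ.
    """
    return str(ipaddress.ip_address(text))


# Three spellings of the same IPv6 address.
forms = [
    "2001:0DB8:0000:0000:0000:0000:0000:0001",
    "2001:db8::1",
    "2001:DB8:0:0:0:0:0:1",
]
normalized = {normalize_ip(form) for form in forms}
print(normalized)
```

Because all three spellings collapse to a single value, a filter that compares normalized values matches the address regardless of how it was originally written.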

Note: Use caution when you click Select All. This action selects all incidents in the report (not
only those on the current page). Any incident command you subsequently apply affects all
incidents. To select only the incidents on the current page, select the checkbox at top left of
the incident list.

Incident information is divided into several columns. Click any column header to sort
alpha-numerically by that column's data. To sort in reverse order, click the column header a
second time. By default, Symantec Data Loss Prevention sorts incidents by date.
The Type column shows the icons that indicate the type of network incident. Table 52-1
describes the icons.

Table 52-1 Type of network incident

Icon Description

SMTP
The addition of the second icon indicates a message
attachment.

HTTP

Symantec Data Loss Prevention also detects the
Yahoo and MSN IM traffic that is tunneled through
HTTP.

The addition of the second icon indicates an
attachment to Web-based email.

HTTPS

FTP

NNTP

IM:MSN

IM:AIM

IM:Yahoo

TCP:custom_protocol

This column also indicates whether the communication was blocked or altered. Table 52-2
shows the possible values.

Table 52-2 Incident block or altered status

Icon Description

No icon. Blank if the communication was not blocked.

Indicates Symantec Data Loss Prevention blocked
the communication containing the matched text.

Indicates Symantec Data Loss Prevention removed
confidential data from Web postings or Web-based
email messages. This icon can also indicate that a
file was uploaded to a Web site or attached to a
Web-based email message.

Indicates that Symantec Data Loss Prevention
added or modified the headers on the message that
generated the incident.

Use the following links to learn more about the Network incident list page:

To learn more about See this section

Columns of the incident list table See “Network incident list—Columns” on page 1856.

Actions to perform on selected incidents See “Network incident list—Actions” on page 1854.

Details of a specific incident See “Network incident snapshot” on page 1857.

Viewing a summary of all network incidents See “Network summary report” on page 1861.

Common features of all Symantec Data Loss Prevention reports See “About incident reports” on page 1902.
See “Common incident report features” on page 1933.

See “Saving custom incident reports” on page 1914.

Network incident list—Actions


You can select one or more incidents and then remediate them using commands in the Incident
Actions drop-down list. The incident commands are as follows:

Action Description

Add Note Select to open a dialog box, type a comment, and
then click OK.

Hide/Unhide Select one of the following archive actions to set
the archive state for the selected incidents:

■ Hide Incidents—Flags the selected incidents
as archived.
■ Unhide Incidents—Restores the selected
incidents to the non-archived state.
■ Do Not Hide—Prevents the selected incidents
from being archived.
■ Allow Hiding—Allows the selected incidents to
be archived.

See “About incident hiding” on page 1958.

Delete Incidents Select to delete specified incidents.

Export Selected: CSV
Export Selected: XML Select to save specified incidents in a
comma-separated text (.csv) file or XML file, which
can be displayed in several common applications,
such as Microsoft Excel.

Lookup Attributes Use lookup plug-ins to look up incident custom
attributes.

Run Smart Response Select to run a Smart Response rule that you or
your administrator configured. (To configure a Smart
Response rule, navigate to Policy > Response
Rules, click Add Response Rule, and select Smart
Response.)

Set Attributes Select to set attributes for the selected incidents.

Set Data Owner Set the data owner name or email address. The
data owner is the person responsible for remediating
the incident.

Reports can automatically be sent to the data owner
for remediation.

Set Severity Select to set severity.

Set Status Select to set status.

See “About incident remediation” on page 1841.


See “Network incident list” on page 1851.

Network incident list—Columns


Incident information is divided into several columns. Click any column header to sort
alpha-numerically by that column's data. To sort in reverse order, click the column header a
second time. By default, Symantec Data Loss Prevention lists incidents by date.
The report includes the following columns:
■ Check boxes that let you select incidents to remediate.
You can select one or more incidents to which to apply commands from the Incident Actions
drop-down menu at the top of the list. Click the checkbox at the top of the column to select
all incidents on the current page. (Note that you can also click Select All at far right to select
all incidents in the report.)
■ Type
The protocol over which the match was detected.
See “Network incident list” on page 1851.
■ Subject/Sender/Recipient(s)
Message subject, sender email address or IP address, recipient email address(es), or
URL(s).
■ Sent
Date and time the message was sent.
■ ID/Policy
Symantec Data Loss Prevention incident ID number and the policy against which the
incident was logged.
■ Matches
Number of matches in the incident.
■ Sev
Incident severity as determined by the severity setting of the rule the incident matched.
The possible values are as follows:

Icon Description

High

Medium

Low

For information only

■ Status

Current incident status.
The possible values are as follows:
■ New
■ In Process
■ Escalated
■ False Positive
■ Configuration Errors
■ Resolved
You or your administrator can add new status designations on the Attribute Setup page.
See “Network incident list” on page 1851.

Network incident snapshot


An incident snapshot provides detailed information about a particular incident. It displays
general incident information, matches detected in the intercepted text, and incident attributes.
The snapshot also enables you to execute any Smart Response rules that you have configured.
The incident snapshot is divided into three panes, with navigation and Smart Response options.
Click on a link to view more help about the incident snapshot:

To learn more about See the section

Navigation and Smart Response options See “Network incident snapshot—Heading and
navigation” on page 1857.

General incident information (left-hand pane) See “Network incident snapshot—General
information” on page 1858.

Matches in incident (middle pane) See “Network incident snapshot—Matches”
on page 1860.

Attributes (right-hand pane) See “Network incident snapshot—Attributes”
on page 1861.

Network incident snapshot—Heading and navigation


The following page navigation tools appear near the top of the incident snapshot:

Previous Displays the previous incident in the source report.

Next Displays the next incident in the source report.



Returns to the source report (where you clicked the
link to get to this screen).

Updates the snapshot with any new data, such as
a new comment in the History section or a modified
status.

If you configured any Smart Response rules, Symantec Data Loss Prevention displays the
response options for executing the rules at the top of the page. Depending on the number of
Smart Response rules, a drop-down menu may also appear.
See “Network incident snapshot” on page 1857.

Network incident snapshot—General information


The left section of the snapshot displays general incident information. You can click on many
values to view an incident list that is filtered on that value. An icon may appear next to the
Status drop-down list to indicate whether the request that generated the incident was blocked
or altered.
See Table 52-2 on page 1853.
The current status and severity of the incident appear to the right of the snapshot heading. To
change one of the current values, click on it and choose another value from the drop-down
list.
The remaining portion of the general information pane is divided into four tabs:
■ Key Info
■ History
■ Notes
■ Correlations
Information in this section is divided into the following categories (not all of which appear for
every incident type):

Table 52-3 Incident general information tabs

Tab Name Description

Key Info The Key Info tab shows the policy that was violated in the incident. It also
shows the total number of matches for the policy, as well as matches per
policy rule. Click the policy name to view a list of all incidents that violated
the policy. Click view policy to view a read-only version of the policy.

This section also lists other policies that the same file violated. To view
the snapshot of an incident that is associated with a particular policy, click
go to incident next to the policy name. To view a list of all incidents that
the file created, click show all.

The Key Info tab also includes the following information:

■ The name of the detection server that recorded the incident.
■ The date and time the message was sent.
■ The sender email or IP address
■ The recipient email or IP address(es)
■ The SMTP heading or the NNTP subject heading
■ The Is Hidden field displays the archived state of the incident, whether
or not the incident is hideable, and allows you to toggle the Do Not
Hide flag for the incident.
■ Attachment file name(s). Click to open or save the file.
If a response rule tells Symantec Data Loss Prevention to discard the
original message, you cannot view the attachment.
■ The person responsible for remediating the incident (Data Owner
Name). This field must be set manually, or with a lookup plug-in.
Reports can automatically be sent to the data owner for remediation.
If you click on a hyperlinked Data Owner Name, a filtered list of
incidents by Data Owner Name is displayed.
■ The email address of the person responsible for remediating the
incident (Data Owner Email Address). This field must be set
manually, or with a lookup plug-in.
If you click on the hyperlinked Data Owner Email Address, a filtered
list of incidents by Data Owner Email Address is displayed.

History View the actions that were performed on the incident. For each action,
Symantec Data Loss Prevention displays the action date and time, the
actor (a user or server), and the action or the comment.

See “Executing Smart response rules” on page 1845.

See “Manage response rules” on page 1761.



Notes View any notes that you or others have added to the incident. Click Add
Note to add a note.

See “Incident snapshot notes tab” on page 1937.

Correlations You can view a list of those incidents that share attributes of the current
incident. For example, you can view a list of all incidents that a single
account generated. The Correlations tab shows a list of correlations that
match single attributes. Click on attribute values to view lists of those
incidents that are related to those values.

To search for other incidents with the same attributes, click Find Similar.
In the Find Similar Incidents dialog box that appears, select the desired
search attributes. Then click Find Incidents.
Note: The list of correlated incidents does not display related incidents
that have been hidden.

See “Network incident snapshot” on page 1857.


See “About incident hiding” on page 1958.

Network incident snapshot—Matches


Beneath the general information, Symantec Data Loss Prevention displays the message
content (if applicable) and the matches that caused the incident. Symantec Data Loss Prevention
displays the following types of message content, depending on protocol type:

Protocol Message content

SMTP Message body

HTTP Name-value pairs of the HTTP request

FTP Nothing shown

NNTP Message body

IM (all providers) IM conversation

TCP Data that was transmitted through custom protocol

Matches are highlighted in yellow and organized according to the message component (such
as header, body, or attachment) in which they were detected. Symantec Data Loss Prevention
displays the total relevant matches for each message component. It shows matches by the
order in which they appear in the original text. To view the rule that triggered a match, click
on the highlighted match.
See “About the Similarity Threshold and Similarity Score” on page 667.
See “Network incident snapshot” on page 1857.

Network incident snapshot—Attributes


Note: This section appears only if a system administrator has configured custom attributes.

You can view a list of custom attributes and their values, if any have been specified. Click on
attribute values to view an incident list that is filtered on that value. To add new values or edit
existing ones, click Edit. In the Edit Attributes dialog box that appears, type the new values
and click Save.
See “Setting the values of custom attributes manually” on page 1972.
See “Network incident snapshot” on page 1857.

Network summary report


The Network summary report provides summary information about the incidents that are found
on your network. You can organize the report by one or two summary criteria. A single-summary
report is organized by a single summary criterion, such as the policy that is associated with
each incident. A double-summary report is organized by two criteria, such as policy and incident
status.
To view the primary criteria and the secondary summary criteria available for the current report,
click the Advanced Filters & Summarization bar. The bar is near the top of the report. The
Summarize By: listboxes show the primary criteria and the secondary summary criteria. In
each listbox, Symantec Data Loss Prevention displays all out-of-the-box criteria in alphabetical
order, followed by any custom criteria that your system administrator has defined. Summary
reports take their name from the primary summary criterion (the value of the first listbox). If
you rerun a report with new criteria, the report name changes accordingly.
Summary entries are divided into several columns. Click any column header to sort
alpha-numerically by that column's data. To sort in reverse order, click the column header a
second time.

Table 52-4 Summary report columns

Column name Description

summary_criterion This column is named for the primary summary
criterion. It lists primary and (for double summaries)
secondary summary items. In a Policy Summary,
this column is named Policy and it lists policies.
Click on a summary item to view a list of incidents
that are associated with that item.

Total The total number of incidents that are associated
with the summary item. In a Policy Summary, this
column gives the total number of incidents that are
associated with each policy.

High Number of high-severity incidents that are
associated with the summary item. (The severity
setting of the rule that was matched determines the
incident severity.)

Med Number of medium-severity incidents that are
associated with the summary item.

Low Number of low-severity incidents that are associated
with the summary item.

Info The number of informational incidents that are
associated with the summary item.

Bar Chart A visual representation of the number of incidents
(of all severities) associated with the summary item.
The bar is broken into proportional, colored sections
to represent the various severities.

Matches Total number of matches associated with the
summary item.

If any of the severity columns contain totals, you can click on them to view a list of incidents
of the chosen severity.
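The difference between a single-summary report and a double-summary report can be sketched as a grouping operation over incident records. The following Python example is an illustration only and does not use any Symantec Data Loss Prevention interface. The function summarize, the record fields (policy, status, severity, matches), and the sample data are all hypothetical.

```python
from collections import defaultdict

SEVERITIES = ("High", "Med", "Low", "Info")


def summarize(incidents, primary, secondary=None):
    """Group incidents by one or two criteria and count them per severity.

    Each resulting row mirrors the summary report columns: Total, the
    four severity counts, and the total number of matches.
    """
    rows = defaultdict(
        lambda: {"Total": 0, "Matches": 0, **{s: 0 for s in SEVERITIES}}
    )
    for inc in incidents:
        if secondary is None:
            key = (inc[primary],)
        else:
            key = (inc[primary], inc[secondary])
        row = rows[key]
        row["Total"] += 1
        row[inc["severity"]] += 1
        row["Matches"] += inc["matches"]
    return dict(rows)


# Hypothetical incident records, for illustration only.
incidents = [
    {"policy": "PCI", "status": "New", "severity": "High", "matches": 3},
    {"policy": "PCI", "status": "Escalated", "severity": "Low", "matches": 1},
    {"policy": "HIPAA", "status": "New", "severity": "Med", "matches": 2},
]

single = summarize(incidents, "policy")            # single-summary report
double = summarize(incidents, "policy", "status")  # double-summary report
print(single)
print(double)
```

In this sketch, the single summary produces one row per policy, while the double summary produces one row per combination of policy and status.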
See “Common incident report features” on page 1933.
See “About dashboard reports and executive summaries” on page 1903.
See “About incident reports” on page 1902.
See “Saving custom incident reports” on page 1914.
Chapter 53
Remediating Endpoint incidents
This chapter includes the following topics:

■ About endpoint incident lists

■ Endpoint incident snapshot

■ Reporting on Endpoint Prevent response rules

■ Endpoint incident destination or protocol-specific information

■ Endpoint incident summary reports

About endpoint incident lists


An endpoint incident list shows endpoint incidents along with basic information such as
protocol or destination, severity, associated policy, number of matches, and status. Click on
any incident to view a snapshot containing more incident details. You can select specific
incidents (or groups of incidents) to modify or remediate.

Note: Endpoint reports show only the incidents that were captured by Endpoint Prevent.
Incidents that were captured by Endpoint Discover appear in Network Discover reports.

Incident information is divided into several columns. Click any column header to sort
alpha-numerically by the data in that column. To sort in reverse order, click the column header
a second time. By default, Symantec Data Loss Prevention lists incidents by date.
The report includes the following columns:
■ Check boxes that let you select incidents to remediate

You can select one or more incidents to which to apply commands from the Incident Actions drop-down
menu at the top of the list. Click the checkbox at the top of the column to select all incidents
on the current page. (You can click Select All at far right to select all incidents in the report.)

Table 53-1 Type of endpoint incident

Graphic Type of incident

CD/DVD burner (for example, Windows Media burner)

Removable media (for example, a USB flash drive or SD card)

Fixed drive (for example, the C:\ drive)

Endpoint copy to network share

Email/SMTP

HTTP

HTTPS

FTP

IM: MSN

IM: Yahoo

Print/Fax

Clipboard

Application File Access

A response column that indicates whether Symantec Data Loss Prevention blocked an
attempted violation or notified the end user about the violation of confidential data.
The possible values are as follows:

■ Blank if Symantec Data Loss Prevention did not block the violation or notify the end user
■ A red icon indicates that the violation was blocked by Symantec Data Loss Prevention or by
the user, or that the user cancel option time limit expired.
■ A notification icon indicates Symantec Data Loss Prevention notified the end user about
the violated confidential data policies. The notification icon also appears if the user allowed
the violating data transfer. The icon also appears if the user cancel time limit option has
expired and the default action is set to allow data transfers.
The other columns of this section appear as follows:

Table 53-2 Endpoint incident columns

Column Definition

File Name/Machine/User/Subject/Recipient File name, computer, endpoint user (domain and
logon name), subject title (if Email/SMTP violation),
and recipient user that is associated with the
incident.

When temporary files generate incidents on Mac
agents, the temporary file name displays in the File
Name column.

Occurred On Date Incident date and time.

Reported On Date Time and date that the incident was reported. If
the endpoint is disconnected from the corporate
network, incidents are reported when the
connection is restored.

ID/Policy Symantec Data Loss Prevention incident ID number


and the policy against which the incident was
logged.

Matches Number of matches in the incident.

Severity Incident severity as determined by the severity


setting of the rule the incident matched.

The possible values are as follows:

■ High
■ Medium
■ Low
■ For information only
Remediating Endpoint incidents 1866
Endpoint incident snapshot

Table 53-2 Endpoint incident columns (continued)

Column Definition

Status Current incident status.


The possible values are as follows:

■ New
■ In Process
■ Escalated
■ False positive
■ Configuration Errors
■ Resolved

You or your administrator can add new status designations on the Attribute Setup page.
See “Endpoint incident snapshot” on page 1866.
See “About incident remediation” on page 1841.
See “About incident reports” on page 1902.
See “Saving custom incident reports” on page 1914.

Endpoint incident snapshot


An incident snapshot provides detailed information about a particular Endpoint Prevent incident.
It displays general incident information, matches detected in the intercepted text, and details
about attributes, incident history, and the violated policy. You can also search for similar
incidents in the Correlations area.

Note: Endpoint Discover incidents are captured in Network Discover reports.

See “Discover incident lists” on page 1879.


Current status and severity appear under the snapshot heading. To change one of the current
values, click on it and choose another value from the drop-down list. If any action icon is
associated, it also appears here.
If you have configured any Smart Response rules, Symantec Data Loss Prevention displays
a Remediation bar (under the Status bar). The Remediation bar includes options for executing
the rules. Depending on the number of Smart Response rules, a drop-down menu may also
appear.
The top left section of the snapshot displays general incident information. You can click most
information values to view an incident list that is filtered on that value. Information in this section
is divided into the following categories (not all of which appear for every incident type):

Table 53-3 Type of incident

Icon Incident type

CD/DVD burners (for example, Windows Media burner)

Removable media (for example, a USB flash drive or SD card)

Local drive

Network Share

Email/SMTP

HTTP

HTTPS/SSL

FTP

IM: MSN

IM: Yahoo

Print/Fax

Clipboard

Application File Access

The following table contains the other informational sections:



Table 53-4 Incident sections

Section Description

Server Name of the Endpoint Server that detected the incident (for two-tier detection), or the
name of the Endpoint Server that received the incident from the Symantec DLP Agent.

Agent response The Endpoint Block, Endpoint Notify, Endpoint Quarantine, Endpoint FlexResponse,
Action Encrypted, Action Encryption Blocked, or User Cancel action, if any. The possible values
are as follows:

■ Blank or no icon if Symantec Data Loss Prevention did not block the copy or notify the
end user.
■ A red circle icon indicates that Symantec Data Loss Prevention blocked confidential data.
■ A message icon indicates that Symantec Data Loss Prevention notified the end user that the
data is confidential.
■ A green tick mark with a key indicates that Symantec Data Loss Prevention blocked the
user's action and encrypted the file or files that the user was trying to copy or move.
■ A red X icon with a key indicates that Symantec Data Loss Prevention blocked the user's
action but did not encrypt the file or files that the user was trying to copy or move.

See “Reporting on Endpoint Prevent response rules” on page 1871.

Incident Occurred On Date and time the incident occurred.

Incident Reported On Date and time the Endpoint Server detected the incident.

Is Hidden Displays the hidden state of the incident, whether or not the incident is hideable,
and allows you to toggle the Do Not Hide flag for the incident. See “About incident hiding”
on page 1958.

User Endpoint user name (for example, MYDOMAIN\bsmith).

User Justification The justification label, followed by the text that is presented to the end
user in the on-screen notification (for example, Manager Approved: "My manager approved the
transfer of this data."). Symantec Data Loss Prevention uses the label for classification and
filtering purposes in reports, but the endpoint user never sees it. Click the label to view a
list of incidents in which the end user chose this justification.

Machine Name Computer on which the incident occurred.

Machine IP (Corporate) The IP address of the violating computer if the computer was on the
corporate network.

File name Name of the file that violated the policy. The file name field appears only for
fixed-drive incidents.

Quarantine Result If you have Endpoint Discover: Quarantine response rules configured, you may
see one of the following quarantine scenarios:

■ File Quarantined
■ Quarantine Failed
■ Quarantine Result Timeout

Quarantine Location Displays the file path of the secure location where the file was moved.

Quarantine Details Displays the reason that the quarantine task failed to move the confidential
file. For example, the action may fail because the source file is missing, or the credentials
to access the secure location are incorrect.

The Quarantine Details field also displays information if the status of the quarantined file
is unknown because of a Quarantine Result Timeout event.

Endpoint Location Indicates whether or not the endpoint was connected to the corporate network
at the time the incident occurred.

Application Name The name of the application that caused the incident.

Destination The destination location or file path for the confidential data, depending on the
device or protocol.

Destination IP The destination IP address for the confidential data. The Destination IP address
appears only for specific network incidents.

Source The original file or data for the violation. The source primarily appears in
file-transfer incidents.

Sender The sender of the confidential data for network violations.

Recipient The intended recipient of the confidential data for network violations.

FTP User Name The originating user name for violating FTP transfers.

Attachments The associated file(s) or attachments sent (for network incidents). If your
administrator has configured Symantec Data Loss Prevention to retain endpoint incident data,
you can click on a file name to view file contents.

Data Owner The specified owner of the confidential data.

Data Owner Email Address The email address for the owner of the confidential data.

Access information The available ACL information. Only applicable to Endpoint Discover and
Endpoint Prevent local drive monitoring.

See “Incident snapshot access information section” on page 1938.

Other sections of the incident snapshot are common across all Symantec Data Loss Prevention
products. These common sections include:
■ Incident snapshot matches
See “Incident snapshot matches section” on page 1938.
■ Incident snapshot policy section
See “Incident snapshot policy section” on page 1938.
■ Incident snapshot correlations section
See “Incident snapshot correlations tab” on page 1937.
■ Incident snapshot attributes section. (This section appears only if a system administrator
has configured custom attributes.)
See “Incident snapshot attributes section” on page 1937.
■ Incident snapshot history tab
See “Incident snapshot history tab” on page 1936.
■ Incident snapshot notes tab
See “Incident snapshot notes tab” on page 1937.
The Endpoint incident snapshot also contains two sections that are not common across other
product lines. Those sections are:
■ Destination or protocol-specific information
See “Endpoint incident destination or protocol-specific information” on page 1872.
■ Reporting on Endpoint Prevent response rules
See “Reporting on Endpoint Prevent response rules” on page 1871.

Reporting on Endpoint Prevent response rules


If user activity on the endpoint triggers more than one response rule, Symantec Data Loss
Prevention determines which policy to apply based on an established order of precedence.
Only the response rule that is associated with the prevailing policy is executed. Symantec
Data Loss Prevention creates incidents for all policies that are violated. It indicates (in the
relevant incident snapshots) that the response rules were superseded.
See “Endpoint incident snapshot” on page 1866.
By default, the following list is the main order of precedence for Endpoint Prevent incidents:
■ Block
■ User Cancel
■ Endpoint FlexResponse
■ Notify

Note: For Endpoint Discover, Quarantine incidents always take precedence over Endpoint
FlexResponse incidents.
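
The precedence behavior described above can be pictured with a short sketch. The following Python example is illustrative only and is not part of Symantec Data Loss Prevention. The PRECEDENCE list mirrors the default order listed above; the function name, the dictionary fields, the policy names, and the tie-breaking behavior are assumptions made for the example.

```python
# Illustrative sketch only: models the default order of precedence listed above.
# Field names, policy names, and tie-breaking are hypothetical, not product behavior.
PRECEDENCE = ["Block", "User Cancel", "Endpoint FlexResponse", "Notify"]


def resolve_response_rules(violations):
    """Mark which violated policy prevails and which ones are superseded.

    violations: list of dicts with "policy" and "response" keys.
    Returns a new list with a "superseded" flag added to each entry.
    Every violated policy still produces an entry (one incident per policy).
    """
    if not violations:
        return []
    # A lower index in PRECEDENCE means a higher priority.
    # min() keeps the first entry when two entries have the same priority.
    prevailing = min(violations, key=lambda v: PRECEDENCE.index(v["response"]))
    return [dict(v, superseded=(v is not prevailing)) for v in violations]


incidents = resolve_response_rules([
    {"policy": "PCI", "response": "Notify"},
    {"policy": "Source Code", "response": "Block"},
])
for incident in incidents:
    print(incident["policy"], incident["response"], incident["superseded"])
```

In this sketch both policies still yield an incident, but only the Block rule is treated as executed; the Notify entry is flagged as superseded.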

Be aware of the following behavior regarding reporting of superseded incidents:


■ The snapshot of a superseded Endpoint Block or User Cancel incident still displays the
Blocked icon, because Symantec Data Loss Prevention did block the content in question.
The icon also indicates if the content was blocked because the user elected to block the
content. Alternately, the icon indicates that the user cancel time limit was exceeded and
the content was blocked.
■ The snapshot of a superseded Endpoint Notify incident does not include the Notify icon.
The Notify icon is not included because Symantec Data Loss Prevention did not display
the particular on-screen notification that was configured in the policy.
■ The snapshot of a superseded Endpoint Quarantine incident displays the Blocked icon
because the data did not move out of the secured area. The icon also indicates if the content
was blocked because the user elected to block the content. Alternately, the icon indicates
that the user cancel time limit was exceeded and the content was blocked. The History tab
of the incident snapshot always displays information on whether the Endpoint FlexResponse
rule was successful.
■ The snapshot of a superseded Endpoint FlexResponse incident displays the Blocked icon
because the data did not move out of the secured area. The icon also indicates if an
Endpoint Quarantine response rule was activated.
If you have configured Endpoint Prevent response rules to display on-screen notifications
prompting users to justify their actions, the following statements are true:
■ Symantec Data Loss Prevention displays the user justification in the snapshots of all the
incidents that are generated by the policies that include the executed response rule.
■ Symantec Data Loss Prevention displays the justification Superseded – Yes in the
snapshots of all superseded incidents that do not include the executed response rule.
■ If there is no user to enter a justification, for example if a user accesses a remote computer,
the justification reads N/A.
See “Network incident snapshot” on page 1857.
See “Configuring response rule conditions” on page 1764.
See “About incident reports” on page 1902.
See “Manage response rules” on page 1761.

Endpoint incident destination or protocol-specific information

Depending on the type of incident, the incident snapshot displays the following additional
information.

Table 53-5 Destination or protocol-specific information

Destination or protocol Description

URL For network incidents, denotes the URL where the incident occurred.

Source IP and Port For network incidents, denotes the IP address or port of the endpoint that
originated the incident. This information is only shown if the incident is created on this
endpoint.

Destination IP and Port The IP address of the destination endpoint that is associated with the
incident. This information is only shown if the incident is created on this endpoint.

Sender/Recipient Email For Email/SMTP and IM incidents, incidents also contain the email
addresses of the sender and recipient. The sender and recipient email addresses are only shown
if the incident occurs on them.

Subject The subject line of the Email/SMTP message is displayed.

FTP user name at the FTP Destination For FTP incidents, the user name at the FTP destination
is displayed.

Server IP For FTP incidents, the server IP address is shown.

File Name/Location For print/fax incidents, the name of the file and the location of the file
on the endpoint are displayed.

Print Job Name For print/fax incidents, the print job name is the file name of the printing job
that generated the incident.

Printer Name/Type For print/fax incidents, the printer name and type are displayed only if the
file cannot be identified from the print job name, or if the file was generated from an
Internet browser.

Application Window For Clipboard incidents, the application window is the application name from
which the contents of the Clipboard were taken.

Source Application For Clipboard incidents, the application name from which the contents of
the Clipboard were taken.

Source Application Window Title For Clipboard incidents, the application window name from which
the contents of the Clipboard were taken.

Title Bar For Clipboard incidents, the title bar of the window from which the data was copied.

See “Endpoint incident snapshot” on page 1866.

Endpoint incident summary reports


Endpoint incident summary reports provide information about those Endpoint incidents that
have been summarized by specific criteria. You can summarize incidents by one or more types
of criteria. A single-summary report is organized by a single summary criterion, such as the
policy that is associated with each incident. A double-summary report is organized by two
criteria, such as policy and incident status.

Note: Endpoint reports show only the incidents that are captured by Endpoint Prevent. Incidents
from Endpoint Discover appear in Network Discover reports.

To view the primary and the secondary summary criteria available for the report, go to the
Summarize By link. Click Edit. In the Primary and Secondary drop-down menus, Symantec
Data Loss Prevention displays all of the criteria in alphabetical order, followed by custom
criteria your system administrator defined. You can select criteria from the Primary and
Secondary drop-down menus and then click Run Now to create a new summary report.
Summary reports take their name from the primary summary criterion. If you rerun a report
with new criteria, the report name changes accordingly.
See “About filters and summary options for reports” on page 1940.
Summary entries are divided into several columns. Click any column header to sort
alpha-numerically by that column's data. To sort in reverse order, click the column header a
second time.

Table 53-6 Endpoint incident summary report details

Field Description

Summary criteria This column contains the name of whichever summary criterion you selected. If
you select a primary and a secondary summary criterion, only the primary criterion is displayed.

Total Total number of the incidents that are associated with the summary item. For example, in
a Policy Summary this column gives the total number of incidents that are associated with each
policy.

High Number of high-severity incidents that are associated with the summary item. (The severity
setting of the rule that was matched determines the level of severity.)

Med Number of medium-severity incidents that are associated with the summary item.

Low Number of low-severity incidents that are associated with the summary item.

Info Number of the informational incidents that are associated with the summary item.

Bar Chart A visual representation of the number of incidents (of all severities) associated
with the summary item. The bar is broken into proportional colored sections that represent the
various severities.

Matches Total number of matches associated with the summary item.

If any of the severity columns contain totals, you can click on them to view a list of
incidents of the chosen severity.
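
To make the columns in Table 53-6 concrete, the following Python sketch shows one way the per-item totals of a single-summary report could be computed from a list of incident records. It is an illustration under stated assumptions, not the product's implementation: the record fields (policy, severity, matches), the sample values, and the summarize function are all hypothetical.

```python
from collections import Counter, defaultdict


def summarize(incidents, criterion):
    """Group incidents by one summary criterion and count them per severity.

    Returns a dict that maps each summary item to its counts, for example
    {"PCI": {"Total": 2, "High": 1, "Low": 1, "Matches": 4}}.
    """
    rows = defaultdict(Counter)
    for incident in incidents:
        row = rows[incident[criterion]]
        row["Total"] += 1                       # Total column
        row[incident["severity"]] += 1          # High / Med / Low / Info columns
        row["Matches"] += incident["matches"]   # Matches column
    return {item: dict(counts) for item, counts in rows.items()}


# Hypothetical sample data, not taken from a real report.
sample = [
    {"policy": "PCI", "severity": "High", "matches": 3},
    {"policy": "PCI", "severity": "Low", "matches": 1},
    {"policy": "HIPAA", "severity": "High", "matches": 2},
]
summary = summarize(sample, "policy")
print(summary["PCI"])
```

Passing a different field name as the criterion (for example, a status field) would produce a summary organized by that criterion instead.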
Chapter 54
Remediating Discover incidents
This chapter includes the following topics:

■ About reports for Network Discover

■ About incident reports for Network Discover/Cloud Storage Discover

■ Discover incident reports

■ Discover incident lists

■ Discover incident actions

■ Discover incident entries

■ Discover incident snapshot

■ Discover summary reports

About reports for Network Discover


Symantec Data Loss Prevention has reports for incidents, Network Discover/Cloud Storage
Discover targets, scan details, and scan history.
The Network Discover/Cloud Storage Discover incident reports contain details about the
confidential data that is exposed.
See “About incident reports for Network Discover/Cloud Storage Discover” on page 1877.
For information about Network Discover/Cloud Storage Discover targets and scan history, go
to Manage > Discover Scanning > Discover Targets, then select one of the Discover targets
from the list. For information about Network Discover/Cloud Storage Discover scan details,
go to Manage > Discover Scanning > Scan History, then select one of the Discover scans
from the list.
See “Managing Network Discover/Cloud Storage Discover target scans” on page 2111.
Table 54-1 lists the Network Discover/Cloud Storage Discover reports.

Table 54-1 Network Discover/Cloud Storage Discover Reports

Report Navigation

Network Discover/Cloud Storage Discover Targets This report is on the Enforce Server
administration console, Manage menu, Discover Scanning > Discover Targets.

See “About the Network Discover/Cloud Storage Discover scan target list” on page 2111.

Scan Status This report is on the Enforce Server administration console, Manage menu,
Discover Scanning > Discover Servers.

See “Viewing Network Discover/Cloud Storage Discover server status” on page 2121.

Scan History (single target) This report is from the Enforce Server administration console,
Manage menu, Discover Scanning > Discover Targets. Click the link in the Scan Status column
to see the history of a particular scan target.

See “About Discover and Endpoint Discover scan histories” on page 2114.

Scan History (all targets) This report is from the Enforce Server administration console,
Manage menu, Discover Scanning > Scan History.

See “About Discover and Endpoint Discover scan histories” on page 2114.

Scan Details This report is from the Enforce Server administration console, Manage menu,
Discover Scanning > Scan History. Click the link in the Scan Status column to see the scan
details.

See “About Discover scan details” on page 2117.

About incident reports for Network Discover/Cloud Storage Discover

Use incident reports to track and respond to Network Discover/Cloud Storage Discover incidents.
You can save, send, export, or schedule Symantec Data Loss Prevention reports.
See “About Symantec Data Loss Prevention reports” on page 1899.
In the Enforce Server administration console, on the Incidents menu, click Discover. This
incident report displays all incidents for all Discover targets. You can select the standard reports
for all incidents, new incidents, target summary, policy by target, status by target, or top shares
at risk.
Use summaries and filter options to select which incidents to display.
See “About custom reports and dashboards” on page 1912.
See “About filters and summary options for reports” on page 1940.
You can create custom reports with combinations of filters and summaries to identify the
incidents to remediate.
For example, you can create the following reports:
■ A summary report of the number of incidents in each remediation category.
Select the summary Protect Status.
■ A report of all the incidents that were remediated with copy or quarantine.
Select the filter Protect Status with values of File Copied and File Quarantined.
■ A report of the Network Discover incidents that have not been seen before (to identify these
incidents and notify the data owners to remediate them).
Select the filter Seen Before?. Set a value of No.
■ A report of the Network Discover incidents that are still present (to know which incidents
to escalate for remediation).
Select the filter Seen Before?. Set a value of Yes.
■ A report using the summary filters, such as months since first detected.
Select the summary Months Since First Detected.
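
The filter combinations above can be thought of as simple predicates over incident records. The following Python sketch illustrates that idea only; it does not call any Symantec Data Loss Prevention API. The field names (protect_status, seen_before), the incident IDs, and the filter_incidents helper are hypothetical and chosen to echo the filter names above.

```python
def filter_incidents(incidents, **criteria):
    """Return the incidents whose fields match every given criterion.

    Each keyword argument names a field and supplies the set of accepted values.
    """
    return [
        incident
        for incident in incidents
        if all(incident.get(field) in accepted for field, accepted in criteria.items())
    ]


# Hypothetical sample records, not exported from a real system.
incidents = [
    {"id": 101, "protect_status": "File Quarantined", "seen_before": "No"},
    {"id": 102, "protect_status": "File Copied", "seen_before": "Yes"},
    {"id": 103, "protect_status": None, "seen_before": "Yes"},
]

# Incidents that were remediated with copy or quarantine.
remediated = filter_incidents(
    incidents, protect_status={"File Copied", "File Quarantined"}
)
# Incidents that have not been seen before.
new_findings = filter_incidents(incidents, seen_before={"No"})
# Incidents that are still present.
still_present = filter_incidents(incidents, seen_before={"Yes"})

print([i["id"] for i in remediated])
print([i["id"] for i in new_findings])
print([i["id"] for i in still_present])
```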

Discover incident reports


Use Network Discover/Cloud Storage Discover incident reports to monitor and respond to
Network Discover/Cloud Storage Discover incidents. You can save, send, export, or schedule
Symantec Data Loss Prevention reports.
In the Enforce Server administration console, on the Incidents menu, click Discover. This
incident report displays all incidents for all Discover targets. You can select the standard reports
for all incidents, new incidents, target summary, policy by target, status by target, or top shares
at risk.
Use summaries and filter options to select which incidents to display.
See “Incident report filter and summary options” on page 1934.
You can create custom reports with combinations of filters and summaries to identify the
incidents to remediate.
See “About custom reports and dashboards” on page 1912.

Network Discover has the following types of reports:


■ Incident list
See “Discover incident lists” on page 1879.
■ Incident snapshot
See “Discover incident snapshot” on page 1882.
■ Incident summary
See “Discover summary reports” on page 1885.

Discover incident lists


A Discover incident list shows the incidents that are reported during Discover scans (including
the incidents from Endpoint Discover). Individual incident records contain information such as
severity, associated policy, number of matches, and status.
See “Discover incident entries” on page 1880.
You can select specific incidents (or a group of incidents) to modify or remediate.
See “Discover incident actions” on page 1879.
You can click on any incident to view a snapshot containing more details.
See “Discover incident snapshot” on page 1882.
See “Discover incident reports” on page 1878.

Discover incident actions


You can select one or more incidents and then remediate them using commands in the Incident
Actions drop-down list.
The incident commands are as follows:
■ Add Note
Select to open a dialog box, type a comment, and then click OK.
■ Delete Incidents
Select to delete specified incidents.
■ Export Selected: CSV
Select to save specified incidents in a comma-separated text (.csv) file, which can be
displayed in several common applications, such as Microsoft Excel.
■ Export Selected: XML
Select to save specified incidents in an XML file, which can be displayed in several common
applications.
■ Hide/Unhide
Select one of the following actions to set the display state for the selected incidents:
■ Hide Incidents—Flags the selected incidents as hidden.
■ Unhide Incidents—Restores the selected incidents to the unhidden state.
■ Do Not Hide—Prevents the selected incidents from being hidden.
■ Allow Hiding—Allows the selected incidents to be hidden.
See “About incident hiding” on page 1958.
■ Set Attributes
Select to set attributes for the selected incidents.
■ Set Data Owner
Set the data owner name or email address. The data owner is the person responsible for
remediating the incident.
Reports can automatically be sent to the data owner for remediation.
■ Set Status
Select to set status.
■ Set Severity
Select to set severity.
■ Lookup Attributes
Use the lookup plug-ins to look up incident custom attributes.
■ Run Smart Response
Select to run a Smart Response rule you or your administrator configured.
See “Discover incident lists” on page 1879.
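
A file produced by Export Selected: CSV can also be processed with a script instead of a spreadsheet. The following Python sketch counts exported incidents by the value in one column. It is a minimal example under stated assumptions: the column headers (ID, Policy, Severity, Status) and the sample rows are hypothetical stand-ins, so check the header line of your own export before you adapt it.

```python
import csv
import io
from collections import Counter

# Hypothetical sample that stands in for an exported file. The column
# headers in a real export may differ from the ones assumed here.
SAMPLE_EXPORT = """ID,Policy,Severity,Status
2001,PCI,High,New
2002,PCI,Low,Resolved
2003,HIPAA,High,New
"""


def count_by_column(csv_file, column):
    """Count the rows of an open CSV file by the value in the named column."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


counts = count_by_column(io.StringIO(SAMPLE_EXPORT), "Status")
for status, total in counts.items():
    print(status, total)
```

To read a file on disk instead of the in-memory sample, pass an open file object, for example open("incidents.csv", newline=""), where the file name is again only a placeholder.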

Discover incident entries


Incident information is divided into several columns. Click any column header to sort
alpha-numerically by that column's data. To sort in reverse order, click the column header a
second time.
The report includes the following columns:
■ Check boxes that let you select incidents to remediate.
You can select one or more incidents to which to apply commands from the Incident
Actions drop-down menu.
Click the checkbox at the top of the column to select all incidents on the current page, or
click Select All to select all incidents in the report.

Note: Use caution when you use Select All. This option selects all incidents in the report,
not only those on the current page. Any incident command you subsequently apply affects
all incidents. You may want to configure the maximum-incident-batch-size property to
limit the number of incidents that a Server FlexResponse plug-in processes at one time.
See “Adding a Server FlexResponse plug-in to the plug-ins properties file” on page 2143.

■ Type
Type of target in which the match was detected.
An icon represents each target type.
This column also displays a remediation icon, if any response rule applied.
The possible values are as follows:

Blank if no response rule applied

Copied

Quarantined

Remediation Error

When you use a Server FlexResponse action for an Automated or Smart response rule,
one of the following icons may appear:

This incident was successfully remediated using a Server FlexResponse action.

The Server FlexResponse action is in process.

The Server FlexResponse action has an error.

These same icons may appear for other incident types as well, and you can execute Server
FlexResponse actions on those incidents.
See “Configuring the Server FlexResponse action” on page 1788.
■ Location/Target/Scan
Repository or file location, target name, and date and time of most recent scan.
■ File Owner
Username of file owner (for example, MYDOMAIN\Administrator).
■ ID/Policy
The Symantec Data Loss Prevention incident number and the policy the incident violated.
■ Matches
The number of matches in the incident.
■ Severity
Incident severity as determined by the severity setting of the rule the incident matched.
The possible values are:

High

Medium

Low

For information only

■ Status
The current incident status.
The possible values are:
■ New
■ In Process
■ Escalated
■ False Positive
■ Configuration Errors
■ Resolved
The following icon may be displayed near the status if this incident was seen before:

This icon is displayed if this incident has an earlier connected incident.

You or your administrator can add new status designations on the Attribute Setup page.
See “Configuring custom attributes” on page 1970.
See “Discover incident lists” on page 1879.
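
The note about Select All in this section mentions limiting how many incidents a Server FlexResponse plug-in processes at one time. The general idea of batching a large selection can be sketched as follows. This Python example is only a conceptual illustration: it is not the plug-in framework's code, the batches function is hypothetical, and the batch size of 3 is an arbitrary value rather than a product default.

```python
def batches(incident_ids, batch_size):
    """Yield consecutive slices that hold at most batch_size incident IDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(incident_ids), batch_size):
        yield incident_ids[start:start + batch_size]


# Seven hypothetical incident IDs, processed three at a time.
selected = list(range(1, 8))
chunks = list(batches(selected, 3))
for chunk in chunks:
    print(chunk)
```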

Discover incident snapshot


An incident snapshot provides detailed information about a particular incident. It displays
general incident information, matches detected in the content, and details about policy,
attributes, and incident history. You can also search for similar incidents in the Correlations
area.
Current status and severity appear under the snapshot heading. To change one of the current
values, click it and choose another value from the drop-down list.

Use the icons at the top right to print the report, or send it as email. To send reports, you or
your administrator must first enable report distribution in system settings.
See “Configuring the Enforce Server to send email alerts” on page 176.
If any Smart Response rules are set up, Symantec Data Loss Prevention displays a remediation
bar that includes buttons for executing the rules. Depending on the number of Smart Response
rules, a drop-down menu may also appear.
See “About incident remediation” on page 1841.
Incident data is divided into the following sections:
■ Key Info tab
■ Policy Matches
See “Incident snapshot policy section” on page 1938.
■ Incident Details
The following details are included:

Server Name of the Discover Server that detected the incident.

Remediation Detection Status The latest remediation status of the file that generated the
incident.

Target Network Discover target name.

Scan The date and time of the scan that registered the incident.

Detection Date The date and time that the incident was detected.

Protect Status For Box incidents, displays the remediation status of the content that
generated the incident.

Seen Before No, if this incident was not previously detected. Yes, if this incident was
previously detected.

Subject Email subject for integrated Exchange scans.

Sender Email sender for integrated Exchange scans.

Recipient Email recipient for integrated Exchange scans.

File Location Location of the file, repository, or item.

Click go to file to view the item or file, or go to directory to view the directory. If you
view an Endpoint Discover incident, you do not see the go to file or go to directory links.
In the Firefox browser, these links do not work without additional setup.

Is Hidden Displays the hidden state of the incident, whether or not the incident is
hideable, and lets you toggle the Do Not Hide flag for the incident. See
“About incident hiding” on page 1958.

URL For SharePoint, this URL is the item on the SharePoint server. Click
this URL to go to the item on the SharePoint server.

Document Name File or item name(s).

File Owner Creator of the file or item.

For SharePoint and Exchange incident snapshots, the File Owner is listed as unknown because
it is not applicable to these target types.

Extraction Date Date that the custom target adapter was run. (Applies to custom targets only.)

Scanned Machine Host name of the scanned computer.

For SharePoint this name is the web application name.

Notes Database Name of the IBM (Lotus) Notes database. (Applies to IBM (Lotus) Notes only.)

File Created The date and time that the file or item was created.

Last Modified Date and time of last change to the file or item.

Last Accessed Date and time of last user access to the file or item.

For SharePoint, this date is not valid.

Created By The user who created the file.

Modified By The user who last modified the file.

Data Owner Name The person responsible for remediating the incident. This field must be
set manually, or with a lookup plug-in.

Reports can automatically be sent to the data owner for remediation.

If you click on the hyperlinked Data Owner Name, a filtered list of incidents by Data Owner
Name is displayed.

Data Owner Email Address The email address of the person responsible for remediating the
incident. This field must be set manually, or with a lookup plug-in.

If you click on the hyperlinked Data Owner Email Address, a filtered list of incidents by
Data Owner Email Address is displayed.

■ Access Information
See “Incident snapshot access information section” on page 1938.
For SharePoint incident snapshots, the permission levels show the permissions from
SharePoint, for example Contribute or Design. The list in the incident snapshot shows
only the first 50 entries. All the ACL entries can be exported to a CSV file. The
permissions are comma-separated. Users or groups having Limited Access permission
levels are not recorded or shown.

Note: If you are scanning a SharePoint repository without using the SharePoint solution,
the incident snapshot will not show any SharePoint permissions information.

Box incident snapshots display collaborative folder information, including the
collaborators and their roles.
■ Shared Link Information
Cloud storage incident snapshots display shared link information, including whether a
link is shared, if it is password protected, if it can be downloaded, and the expiration
date of the link.
■ Message Body
For a SharePoint list item, the message body shows the name and value pairs in the
list.

■ Attributes
See “Incident snapshot attributes section” on page 1937.
■ History tab
See “Incident snapshot history tab” on page 1936.
■ Notes tab
See “Incident snapshot notes tab” on page 1937.
■ Correlations tab
See “Incident snapshot correlations tab” on page 1937.
■ Matches and file content
See “Incident snapshot matches section” on page 1938.
See “Discover incident reports” on page 1878.

Discover summary reports


Discover Summary Reports provide summary information about the incidents that are found
during Discover scans.
If you are running Endpoint Discover, the Discover Summary Reports also include Endpoint
Discover incidents.
You can filter or summarize the options in the reports.
See “Incident report filter and summary options” on page 1934.
You can extract the report information in selected formats.


You can click highlighted elements, such as the entries in the Totals column, to view details.
Icons provide navigation through long reports.
See “Page navigation in incident reports” on page 1934.
See “Discover incident reports” on page 1878.
Chapter 55
Working with Application incidents
This chapter includes the following topics:

■ About Applications incident reports

■ Applications incident list

■ Applications incident entries

■ Applications incident actions

■ Applications incident snapshot

■ Applications summary reports

About Applications incident reports


Use Applications incident reports to monitor and manage incidents from the REST Cloud
Detection Service and API Detection for Developer Apps Appliances. You can save, send,
export, or schedule Symantec Data Loss Prevention reports.
In the Enforce Server administration console, on the Incidents menu, click Applications. This
incident report displays all incidents for all REST Cloud Detection Service detectors and API
Detection for Developer Apps Appliances.
You can pre-filter your Applications incident reports by the Data-at-Rest and Data-in-Motion
data types:
■ Incidents > Applications > Data-at-Rest
■ Incidents > Applications > Data-in-Motion
You can select the following standard reports for all incidents:
■ Incidents - All
Displays a list of all incidents.
See “Applications incident list” on page 1889.
■ DIM - Incidents - All
Displays a list of all Data-in-Motion (DIM) incidents.
See “Applications incident list” on page 1889.
■ DIM - Incidents - New
Displays a list of all DIM incidents with a status of New.
See “Applications incident list” on page 1889.
■ DIM - Policy Summary
Displays a summary of DIM incidents by policy.
See “Applications summary reports” on page 1896.
■ DIM - Status by Policy
Displays a summary of DIM incidents by policy and incident status.
See “Applications summary reports” on page 1896.
■ DIM - High Risk Users - Last 30 Days
Displays a summary of DIM incidents associated with high-risk users in the last 30 days.
See “Applications summary reports” on page 1896.
■ DAR - Incidents - All
Displays a list of all Data-at-Rest (DAR) incidents.
See “Applications incident list” on page 1889.
■ DAR - Incidents - New
Displays a list of all DAR incidents with a status of New.
See “Applications incident list” on page 1889.
■ DAR - Application Summary
Displays a summary of DAR incidents by cloud application.
See “Applications summary reports” on page 1896.
■ DAR - Policy Summary
Displays a summary of DAR incidents by policy.
See “Applications summary reports” on page 1896.
■ DAR - Status by Application
Displays a summary of DAR incidents by status and cloud application.
See “Applications summary reports” on page 1896.
■ DAR - High Risk Users
Displays a summary of DAR incidents associated with high-risk users.
See “Applications summary reports” on page 1896.
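The standard list reports above can be read as fixed filters applied to the same set of incidents. The following Python sketch illustrates that idea with an in-memory list. The `Incident` record, its field names, and the sample values are hypothetical and are not part of any Symantec Data Loss Prevention interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    # Hypothetical record; the field names are illustrative only.
    incident_id: int
    data_type: str    # "DAR" or "DIM"
    status: str       # for example "New" or "Resolved"
    application: str


def list_report(incidents, data_type=None, status=None):
    """Return incidents that match the optional data type and status filters."""
    return [
        inc for inc in incidents
        if (data_type is None or inc.data_type == data_type)
        and (status is None or inc.status == status)
    ]


SAMPLE = [
    Incident(1, "DIM", "New", "Example App A"),
    Incident(2, "DAR", "New", "Example App A"),
    Incident(3, "DIM", "Resolved", "Example App B"),
]

all_incidents = list_report(SAMPLE)                # comparable to "Incidents - All"
dim_new = list_report(SAMPLE, "DIM", "New")        # comparable to "DIM - Incidents - New"
dar_all = list_report(SAMPLE, data_type="DAR")     # comparable to "DAR - Incidents - All"
```

In the console you select these reports from the navigation panel; the sketch only shows how the report names map to filter criteria.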
Use summary and filter options to select which incidents to display.
See “Incident report filter and summary options” on page 1934.
You can create custom reports with combinations of filters and summaries to monitor the
incidents.
See “About custom reports and dashboards” on page 1912.
Applications have the following types of reports:
■ Incident list
See “Applications incident list” on page 1889.
■ Incident snapshot
See “Applications incident snapshot” on page 1892.
■ Incident summary
See “Applications summary reports” on page 1896.

Applications incident list


An Applications incident list shows the incidents that are reported by the REST Cloud Detection
Service or API Detection for Developer Apps Appliance. Individual incident records contain
information such as severity, associated policy, number of matches, and status.

Note: If you have an existing Symantec Web Security Service (WSS) implementation using
the REST Cloud Detection Service, your WSS incidents appear in the Applications >
Data-in-Motion incident list. If you have a Symantec WSS implementation using the Cloud
Detection Service for WSS, your WSS incidents appear in the Network incident list.

See “Applications incident entries” on page 1889.


You can select specific incidents (or a group of incidents) to modify or manage.
See “Applications incident actions” on page 1891.
You can click on any incident to view a snapshot containing more details.
See “Applications incident snapshot” on page 1892.
See “About Applications incident reports” on page 1887.

Applications incident entries


Incident information is divided into several columns. Click any column header to sort
alpha-numerically by the data in that column. To sort in reverse order, click the column header
a second time.
The report includes the following columns:


■ Checkboxes that let you select incidents to manage.
You can select one or more incidents to which to apply commands from the Incident
Actions drop-down menu.
Click the checkbox at the top of the column to select all incidents on the current page,
or click Select All to select all incidents in the report.

Note: Use caution when you use Select All. This option selects all incidents in the report,
not only those on the current page. Any incident command you subsequently apply affects
all incidents.

■ Data Type
Specifies whether the incident is from Data-at-Rest (DAR) or Data-in-Motion (DIM).
■ Location/Application/Detection Date
The location of the sensitive data, the application with which the incident is associated,
and the date on which the policy violation was detected.
■ User
Displays the information of the user associated with the incident, if applicable.
■ ID/Policy
The Symantec Data Loss Prevention incident number and the policy the incident violated.
■ Matches
The number of matches in the incident.
■ Severity
Incident severity as determined by the severity setting of the rule the incident matched.
The possible values are:
■ High
■ Medium
■ Low
■ For information only

■ Status
The current incident status. The possible values are:
■ New
■ In Process
■ Escalated
■ False Positive
■ Configuration Errors
■ Resolved

See “Applications incident list” on page 1889.

Applications incident actions


You can select one or more incidents and then manage them using commands in the Incident
Actions drop-down list.
The incident commands are as follows:
■ Add Note
Select to open a dialog box, type a comment, and then click OK.
■ Delete Incidents
Select to delete specified incidents.
■ Export Selected: CSV
Select to save specified incidents in a comma-separated text (.csv) file, which can be
displayed in several common applications, such as Microsoft Excel.
■ Export Selected: XML
Select to save specified incidents in an XML file, which can be displayed in several common
applications.
■ Mark Accepted
Select to set the remediation status to Accepted.
■ Run Smart Response
Select to run the Quarantine or Restore File Smart Response rules.
■ Hide/Unhide
Select one of the following actions to set the display state for the selected incidents:
■ Hide Incidents—Flags the selected incidents as hidden.
■ Unhide Incidents—Restores the selected incidents to the unhidden state.
■ Do Not Hide—Prevents the selected incidents from being hidden.
■ Allow Hiding—Allows the selected incidents to be hidden.
See “About incident hiding” on page 1958.
■ Set Attributes
Select to set attributes for the selected incidents.


■ Set Data Owner
Select to set the data owner by user name or email address.
■ Set Severity
Select to set severity.
■ Set Status
Select to set status.
See “Applications incident list” on page 1889.
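After you export incidents with Export Selected: CSV, you can process the file with a script as well as open it in a spreadsheet. The following Python sketch counts exported rows by severity. The column names (`ID`, `Policy`, `Severity`, `Status`) and the sample rows are assumptions made for illustration; check the header row of your own export before you rely on them.

```python
import csv
import io
from collections import Counter

# Stand-in for the contents of an exported file. The column names are
# assumptions, not the documented export format.
SAMPLE_EXPORT = """ID,Policy,Severity,Status
101,Example Policy A,High,New
102,Example Policy A,Low,Resolved
103,Example Policy B,High,Escalated
"""


def count_by_column(csv_text, column):
    """Count the rows in CSV text by the value found in the named column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


severity_counts = count_by_column(SAMPLE_EXPORT, "Severity")
status_counts = count_by_column(SAMPLE_EXPORT, "Status")
```

To use a real export, read the file with `open(path, newline="")` and pass its text to `count_by_column`.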

Applications incident snapshot


An incident snapshot provides detailed information about a particular incident. It displays
general incident information, matches detected in the content, and details about policy,
attributes, and incident history. You can also search for similar incidents in the Correlations
area.
Current status and severity appear under the snapshot heading. To change one of the current
values, click it and choose another value from the drop-down list.
You can use the Accepted checkbox to set the remediation status to User Accepted. This
remediation status indicates that the incident was remediated by the user, CASB administrator,
or another incident responder.
Use the icons at the top right to print the report, or send it as email. To send reports, you or
your administrator must first enable report distribution in system settings.
See “Configuring the Enforce Server to send email alerts” on page 176.
Application incident data is divided into the following sections:
■ Key Info tab:
■ Policy Matches
See “Incident snapshot policy section” on page 1938.
■ Incident Details
The following details are included for both DAR and DIM incidents:

Data Type Specifies the DAR or DIM data type.

Detector Specifies the cloud detector that created the incident.

Is Hidden Displays the hidden state of the incident, whether or not the incident is hideable,
and lets you toggle the Do Not Hide flag for the incident. See “About incident
hiding” on page 1958.

Recipient For data uploads, the recipient is the site to which the data is uploaded.

For data downloads, the recipient is the user who downloads the data.

Date The date the incident was created.

Subject The subject field of the sensitive data. Click the subject link to view all incidents
with the same subject.

Data Owner Name The person responsible for remediating the incident. This field must be
set manually.

Reports can be sent automatically to the data owner for remediation.

Click Data Owner Name to view a filtered list of incidents for that data owner.

Data Owner Email Address The email address of the person responsible for remediating
the incident. This field must be set manually.

Click Data Owner Email Address to view a filtered list of incidents for that data
owner email address.

Request ID The unique detection request identifier from the Cloud Detection Service. You can
use this identifier to track this incident in external cloud consoles, such as Symantec
CloudSOC.

User Name The name of the user who is associated with the incident.

User Activity Type Specifies the type of user activity on the file. The possible activities are:
■ Create
■ Edit
■ Rename
■ Delete
■ Upload/Download

External Transaction ID The unique transaction identifier that is provided by the cloud
application. You can use this identifier to track this incident in external cloud consoles,
such as Symantec CloudSOC.

■ Site/Application Details
Specifies the following details about the website or cloud application that is associated
with the DAR or DIM incident:

Service Score The Shadow IT score provided by Symantec CloudSOC.

Application Name The name of the cloud application associated with the incident.

Site Risk Score The site risk score provided by Blue Coat WSS, based on information from
the Global Intelligence Network.

HTTP URL The HTTP URL accessed by the user.

■ User Details
This section provides the following details about the user who is associated with the
DAR or DIM incident:

User Threat score Specifies the user threat score as provided by Symantec CloudSOC or
Blue Coat WSS.

Documents Exposed Count Specifies the number of exposed documents for that user. Click
More Info to view document exposure information in your external cloud console.

User Activity Provides a link to user activity details in your external cloud console.

■ Data Exposure Details (DAR only)


This section provides the following details about the exposure of the sensitive data:

Document is Publically Exposed Specifies if the document is exposed in a publicly
accessible location.

Document is Internally Shared Specifies if the document is shared with or accessible to
all members of your organization.

Document is Exposed Specifies if the document is shared with or accessible to anyone
outside of your organization, or shared with or accessible to all members of your
organization.

Document is Internal Specifies if the document is within your organization.

Document Activity Count Specifies the number of times the document has been accessed.

Document Creator ID The identifier of the document creator.

Document ID The identifier of the document.

Document Parent Folder ID The identifier of the folder containing the document.

■ File Information (DAR only)


This section specifies the following information about the file containing the sensitive
data:

File Folder Specifies the folder that contains the file. Click More Info to go to the
exposures panel for that file.

Last Modified Specifies the date and time the file was last modified.

Sharing URL Specifies the URL at which the file is shared.

Document Type Specifies the document type of the file.

File Activity Click More Info to view the file activity in your external cloud console.

Alert in CASB Click More Info to view incident information in your external cloud console.

■ Data Transfer (DIM Only)


Specifies the following details about the device that is associated with the DIM incident:

Network Direction Specifies the direction of the network traffic, upload or download.

Connector Source Protocol Specifies the network protocol of the data transfer, such as https.

Source IP Specifies the originating IP address of the network traffic.

Destination IP Specifies the destination IP address of the network traffic.

Device is Compliant Specifies if the device complies with your organization's standards.

Device is Unmanaged Specifies if the device is not managed by your organization.

Device is Personal Specifies if the device is the personal property of the user.

Device is Trusted Specifies if the device is trusted by your organization.

HTTP Method Specifies the HTTP method that was called when the incident was created.

HTTP Cookies Lists any cookies that are associated with the incident.

Device OS Specifies the operating system of the device.



Device Type Specifies the type of device.

■ Location (DIM Only)


Specifies the following device location information:

Location Specifies the city and country location of the device.

Latitude Specifies the latitude coordinate of the device.

Longitude Specifies the longitude coordinate of the device.

■ Message Body
Provides a link to the original JSON-formatted message.

■ History
See “Incident snapshot history tab” on page 1936.
■ Notes
The notes tab displays any notes for this incident.
■ Correlations
See “Incident snapshot correlations tab” on page 1937.
■ Matches
See “Incident snapshot matches section” on page 1938.
See “About Applications incident reports” on page 1887.

Applications summary reports


Applications Summary Reports provide summary information about Application incidents.
You can filter or summarize the options in the reports.
See “Incident report filter and summary options” on page 1934.
You can extract the report information in selected formats.
You can click highlighted elements, such as the entries in the Totals column, to drill down into
details.
Icons provide navigation through long reports.
See “Page navigation in incident reports” on page 1934.
See “About Applications incident reports” on page 1887.
Chapter 56
Managing and reporting incidents
This chapter includes the following topics:

■ About Symantec Data Loss Prevention reports

■ About strategies for using reports

■ Setting report preferences

■ About incident reports

■ About dashboard reports and executive summaries

■ Viewing dashboards

■ Creating dashboard reports

■ Configuring dashboard reports

■ Choosing reports to include in a dashboard

■ About summary reports

■ Viewing summary reports

■ Creating summary reports

■ Viewing incidents

■ About custom reports and dashboards

■ Using IT Analytics to manage incidents

■ Filtering reports

■ Saving custom incident reports

■ Scheduling custom incident reports

■ Delivery schedule options for incident and system reports

■ Delivery schedule options for dashboard reports

■ Using the date widget to schedule reports

■ Editing custom dashboards and reports

■ Exporting incident reports

■ Exported fields for Network Monitor

■ Exported fields for Network Discover/Cloud Storage Discover

■ Exported fields for Endpoint Discover

■ Deleting incidents

■ Deleting custom dashboards and reports

■ Common incident report features

■ Page navigation in incident reports

■ Incident report filter and summary options

■ Sending incident reports by email

■ Printing incident reports

■ Incident snapshot history tab

■ Incident snapshot notes tab

■ Incident snapshot attributes section

■ Incident snapshot correlations tab

■ Incident snapshot policy section

■ Incident snapshot matches section

■ Incident snapshot access information section

■ Customizing incident snapshot pages

■ About filters and summary options for reports

■ General filters for reports



■ Summary options for incident reports

■ Advanced filter options for reports

About Symantec Data Loss Prevention reports


Use incident reports to track and respond to incidents. Symantec Data Loss Prevention reports
an incident when it detects data that matches the detection parameters of a policy rule.
Such data may include specific file content, an email sender or recipient, attachment file
properties, or many other types of information.
Each piece of data that matches detection parameters is called a match, and a single incident
may include any number of individual matches.
You can set a hiding flag on an incident to indicate that the incident has been hidden. By
default, hidden incidents do not appear in incident reports, but you can include them in incident
reports by setting Advanced Filters on the report. Including hidden incidents in a report may
slow down reporting activities. See “About incident hiding” on page 1958.
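The default handling of hidden incidents can be pictured as a filter that drops hidden records unless you explicitly ask for them. The following Python sketch is an illustration only. The `hidden` key and the `include_hidden` parameter are hypothetical names, and in the product you control this behavior through Advanced Filters on the report.

```python
def visible_incidents(incidents, include_hidden=False):
    """Return incidents for a report, leaving out hidden ones by default."""
    if include_hidden:
        return list(incidents)
    return [inc for inc in incidents if not inc["hidden"]]


# Hypothetical sample records.
SAMPLE = [
    {"id": 1, "hidden": False},
    {"id": 2, "hidden": True},
    {"id": 3, "hidden": False},
]

default_view = visible_incidents(SAMPLE)                     # hidden incident left out
full_view = visible_incidents(SAMPLE, include_hidden=True)   # hidden incident included
```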
Symantec Data Loss Prevention tracks incidents for all detection servers. These servers include
Network Discover/Cloud Storage Discover Server, Network Monitor Server, Network Prevent
for Email Server, Network Prevent for Web Server, and Endpoint Server.
You can specify the reports Symantec Data Loss Prevention displays in the navigation panel.
See “Setting report preferences” on page 1901.
Symantec Data Loss Prevention provides the following types of incident reports:
■ Incident lists show the individual incident records that contain information such as severity,
associated policy, number of matches, and status. You can click on any incident to see a
snapshot containing more details. You can also select specific incidents or groups of
incidents to modify or remediate.
Symantec Data Loss Prevention provides separate reports for incidents, which you view by
selecting Network, Endpoint, Discover, or User.
■ Summaries provide summary information about the incidents on your system. They are
organized with either one or two summary criteria. A single-summary report is organized
with a single summary criterion, such as the policy that is associated with each incident.
A double-summary report is organized with two criteria, such as policy and incident status.
By default, hidden incidents do not appear in the counts that display in summary reports,
but you can set Advanced Filters to include the hidden incidents. (See “About incident
hiding” on page 1958.)
■ Dashboards combine information from several reports. They include graphs and incident
totals representing the contents of various incident lists and summaries. Graphs can
sometimes contain lists of high-severity incidents or lists of summary groups. You can click
on report portlets (the individual tiles that contain report data) to drill down to the detailed
versions of the reports.
Executive summaries are similar to dashboards. They include similar information laid out
in an intuitive and easy-to-read manner. You cannot customize an executive summary.
Executive summaries do not include report portlets.
Symantec Data Loss Prevention ships with executive summaries for Network, Endpoint,
and Discover incidents.
You can create and save customized versions of all reports (except executive summaries) for
continued use.
See “About custom reports and dashboards” on page 1912.
Symantec Data Loss Prevention displays reports in separate sections on the Incidents > All
Reports screen as follows:
■ The Saved Reports section contains any shared reports that are associated with your
current role. This section appears only if you or other users in your current role have created
saved reports.
See “About custom reports and dashboards” on page 1912.
■ The Network section contains Symantec-provided incident lists, summaries, and dashboards
for network incidents.
■ The Endpoint section contains Symantec-provided incident lists, summaries, and
dashboards for endpoint incidents. Endpoint reports include the incidents that Endpoint
captures, such as Endpoint Block and Endpoint Notify incidents.
Incidents that Endpoint Discover captures appear in Discover reports.
■ The Discover section contains Symantec-provided incident lists, summaries, and
dashboards for Network Discover/Cloud Storage Discover and Endpoint Discover incidents.
■ The Applications section contains Symantec-provided incident lists and summaries for
cloud application incidents.
■ The Users section contains the user list and user risk summary, which displays users and
their associated Email and Endpoint incidents.

About strategies for using reports


Many companies configure their Symantec Data Loss Prevention reporting to accommodate
the following primary roles:
■ An executive responsible for overall risk reduction who monitors risk trends and develops
high-level initiatives to respond to those trends.
The executive monitors dashboards and summary reports (to get a general picture of data
loss trends in the organization). The executive also develops programs and initiatives to
reduce risk, and communicates this information to policy authors and incident responders.
The executive often monitors reports through email or some other exported report format.
Symantec Data Loss Prevention dashboards and summary reports let you monitor risk
trends in your organization. These reports provide a high-level overview of incidents.
Executives and managers can quickly evaluate risk trends and advise policy authors and
incident responders how to address these trends. You can view existing summary reports
and dashboards and create customized versions of these reports.
See “About dashboard reports and executive summaries” on page 1903.
See “About summary reports” on page 1909.
■ An incident responder, such as an InfoSec Analyst or InfoSec Manager, who monitors and
responds to particular incidents.
The responder monitors incident reports and snapshots to respond to the incidents that
are associated with a particular policy group, organizational department, or geographic
location. The responder may also author policies to reduce risk. These policies can originate
either at the direction of a risk reduction manager or based on their own experience tracking
incidents.
See “About incident remediation” on page 1841.

Setting report preferences


You can specify the reports that Symantec Data Loss Prevention displays in the navigation
panel for each of the report types.
To set reporting preferences
1 In the Enforce Server administration console, on the Incidents menu, click All Reports.
2 On the All Reports screen, click Edit Preferences.
The Edit Report Preferences screen lists any saved reports (for all your assigned roles).
The screen also lists Network, Endpoint, and Discover reports.
3 To display a report in the list, check the Show Report box for that report. To remove a
report from the list, clear Show Report for that report.
The selected list of reports displays in a left navigation panel for each of the types of
reports.
For example, to see the list of Network reports, on the Incidents menu, click Network.
4 After changing your preferences, click Save.
See “About custom reports and dashboards” on page 1912.

About incident reports


Use incident reports to track and respond to incidents on your network. Symantec Data Loss
Prevention reports an incident when it detects data that matches a detection rule in an active
policy. Such data may include specific file content, an email sender or recipient, attachment
file properties, or many other types of information. Each piece of data that matches a detection
rule is called a match, and a single incident may include any number of individual matches.

Note: You can configure which reports appear in the navigation panel. To do so, go to All Reports
and then click Edit Preferences.

Symantec Data Loss Prevention provides the following types of incident reports:

Incident lists These show individual incident records containing information such as severity,
associated policy, number of matches, and status. You can click on any incident
to view a snapshot containing more details. You can select specific incidents or
groups of incidents to modify or remediate.

Summaries These show incident totals organized by a specific incident attribute such as status
or associated policy. For example, a Policy Summary includes rows for all policies
that have associated incidents. Each row includes a policy name, the total number
of associated incidents, and incident totals by severity. You can click on any severity
total to view the list of relevant incidents.

Double summaries These show incident totals organized by two incident attributes. For example, a
policy trend summary shows the total incidents by policy and by week. Similar to
the policy summary, each entry includes a policy name, the total number of
associated incidents, and incident totals by severity. In addition, each entry includes
a separate line for each week, showing the week's incident totals and incidents by
severity.

Dashboards and executive summaries These are quick-reference dashboards that combine
information from several reports. They include graphs and incident totals representing the
contents of various incident lists, summaries, and double summaries. Graphs are sometimes
beside lists of high-severity incidents or lists of summary groups. You can click on
constituent report names to drill down to the reports that are represented on the
dashboard.

Symantec Data Loss Prevention ships with executive summaries for Network,
Endpoint, and Discover reports, and these are not customizable.

You can create dashboards yourself, and customize them as desired.

Custom Lists the shared reports that are associated with your current role. (Such reports
appear only if you or other users in your current role have created them.)

Network Lists the network incident reports.



Endpoint Lists the Endpoint incident reports. Endpoint reports include incidents such as
Endpoint Block and Endpoint Notify incidents.

Incidents from Endpoint Discover are included in Discover reports.

Discover Lists Network Discover/Cloud Storage Discover and Endpoint Discover incident
reports.

The folder risk report displays file share folders ranked by prioritized risk. The risk
score is based on the relevant information from the Symantec Data Loss Prevention
incidents plus the information from the Data Insight Management Server.

See the Symantec Data Loss Prevention Data Insight Implementation Guide.

Users The User List lists the data users in your organization. The User Risk Summary
lists all users with their associated Email and Endpoint incidents.

See “About custom reports and dashboards” on page 1912.


See “Common incident report features” on page 1933.
See “Network incident snapshot” on page 1857.
See “Discover incident snapshot” on page 1882.
See “Endpoint incident snapshot” on page 1866.
See “Network incident list” on page 1851.
See “Discover incident lists” on page 1879.
See “About endpoint incident lists” on page 1863.
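The difference between a summary and a double summary, as described in the table above, is the number of attributes used for grouping. The following Python sketch groups a small set of hypothetical incident records first by policy, and then by policy and status. The tuple layout, the policy names, and the function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical incident records as (policy, status, severity) tuples.
INCIDENTS = [
    ("Policy A", "New", "High"),
    ("Policy A", "New", "Low"),
    ("Policy A", "Resolved", "High"),
    ("Policy B", "New", "Medium"),
]


def single_summary(incidents):
    """Group by policy: total incidents plus incident totals by severity."""
    summary = defaultdict(lambda: {"total": 0, "by_severity": Counter()})
    for policy, _status, severity in incidents:
        summary[policy]["total"] += 1
        summary[policy]["by_severity"][severity] += 1
    return dict(summary)


def double_summary(incidents):
    """Group by policy and then by status, counting incidents in each pair."""
    summary = defaultdict(Counter)
    for policy, status, _severity in incidents:
        summary[policy][status] += 1
    return {policy: dict(counts) for policy, counts in summary.items()}


policy_summary = single_summary(INCIDENTS)      # one row per policy
status_by_policy = double_summary(INCIDENTS)    # one row per policy and status
```

Each key of `policy_summary` corresponds to one row of a policy summary, and each nested key of `status_by_policy` corresponds to one line beneath that row in a double summary.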

About dashboard reports and executive summaries


Dashboards and executive summaries are the quick-reference report screens that present
summary information from several incident reports.
See “About incident reports” on page 1902.
Dashboards have two columns of reports. The left column displays a pie chart or graph and
an incident totals bar. The right column displays the same types of information as in the left
column. The right column also displays either a list of the most significant incidents or a list of
summary items with associated incident totals. The most significant incidents are ranked using
severity and match count. You can click on a report to see the full report it represents.
Dashboards consist of up to six portlets, each providing a quick summary of a report you
specify.
You can create customized dashboards for users with specific security responsibilities. If you
choose to share a dashboard, the dashboard is accessible to all users in the role under which
you create it. (Note that the Administrator user cannot create shared dashboards.)
Dashboards have two columns of report portlets (tiles that contain report data). Portlets in the
left column display a pie chart or graph and the totals bar. Portlets in the right column display
the same types of information as those in the left. However, they also display either a list of
the most significant incidents or a list of summary criteria and associated incidents. The incidents
are ranked using severity and match count. The summary criteria highlight any high-severity
incident totals. You can choose up to three reports to include in the left column and up to three
reports to include in the right column.
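
The column limits described above can be pictured with a small model. This is only an illustrative sketch: the `Dashboard` class, its field names, and `add_report` are hypothetical and are not part of any Symantec Data Loss Prevention API. The only facts taken from the text are the two columns and the limit of three reports per column.

```python
from dataclasses import dataclass, field

MAX_REPORTS_PER_COLUMN = 3  # from the text: up to three reports per column


@dataclass
class Dashboard:
    """Hypothetical model of a dashboard layout (not a product API)."""
    name: str
    left_column: list = field(default_factory=list)   # chart only
    right_column: list = field(default_factory=list)  # chart and table

    def add_report(self, column, report):
        """Add a report to the 'left' or 'right' column, enforcing the limit."""
        target = {"left": self.left_column, "right": self.right_column}[column]
        if len(target) >= MAX_REPORTS_PER_COLUMN:
            raise ValueError(
                f"{column} column already holds {MAX_REPORTS_PER_COLUMN} reports"
            )
        target.append(report)

    @property
    def portlet_count(self):
        """Total portlets; never more than six because of the column limits."""
        return len(self.left_column) + len(self.right_column)


dashboard = Dashboard(name="Weekly risk overview")
for report in ("Policy Summary", "Status by Week", "Top 5 Protocol Summary"):
    dashboard.add_report("left", report)
dashboard.add_report("right", "Top 5 High Risk Senders")
print(dashboard.portlet_count)  # 4
```

Adding a fourth report to the left column in this sketch raises an error, which mirrors the three drop-down lists that the Configure Dashboard screen offers for each column.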
To create custom dashboards, click Incident Reports at the top of the navigation panel and,
in the Incident Reports screen that appears, click Create Dashboard. The Administrator can
create only private dashboards, but other users can decide whether to share a new dashboard
or keep it private.
See “About custom reports and dashboards” on page 1912.
To edit the contents of any custom dashboard, go to the desired dashboard and click Customize
near the top of the screen.
See “Configuring dashboard reports” on page 1907.
To display a custom dashboard at logon, specify it as the default logon report.
See “Setting report preferences” on page 1901.
Symantec Data Loss Prevention includes three executive summaries: Executive Summary
- Discover, Executive Summary - Endpoint, and Executive Summary - Network. Unlike
dashboards, executive summaries cannot be created or customized.
Executive summaries include the following reports:
Executive Summary - Discover
■ Policy Distribution across Targets: A pie chart that specifies the distribution of policies
across various Discover scan targets, including the percentage and number of incidents
generated per policy.
■ Top 5 Content Roots: A bar graph displaying the top five content roots that have generated
incidents, including the severity of the incidents generated for each content root.
■ Top 5 Target Summary: A bar graph displaying the top five incident-generating targets
from the last completed Discover scan, including the severity of the incidents generated
on each target.
■ Status by Target: A pie chart that specifies the status of various Discover scan targets,
including the percentage and number of incidents generated per policy.
Executive Summary - Endpoint
■ Policy Summary: A pie chart that specifies the number and percentage of incidents for
each Endpoint policy.
■ Top 5 Highest Offenders: A bar graph that displays the top five incident-generating
endpoints, including the severity of the incidents associated with each endpoint.
■ Top 5 Incident Type Summary: A bar graph that displays the top five incident types, such
as Clipboard or Local Drive.
■ User Justification Summary: A pie chart displaying the types of user justifications for
endpoint incidents, including the percentage for each justification.
■ Endpoint Location Summary: A pie chart displaying the connection status for
incident-generating endpoints.
■ Incident Status Summary: A pie chart displaying the status of all endpoint incidents, with
a percentage for each status category.
Executive Summary - Network
■ Policy Summary: A pie chart that specifies the number and percentage of incidents for
each Network policy.
■ Top 5 High Risk Senders: A bar graph that displays the top five high-risk senders, including
the severity of the incidents associated with each sender.
■ Top 5 Protocol Summary: A bar graph that displays the top five incident-generating
network protocols, including the severity of the incidents associated with each protocol.
■ Top 5 Recipient Domains: A bar graph that displays the top five incident-generating
recipient domains, including the severity of the incidents associated with each domain.
■ Status by Week: A bar graph displaying the incidents of the last 30 days, broken down by
week, and including the severity of the incidents generated.
■ Sender IP Summary: A pie chart displaying the incident-generating sender IP addresses,
including the number and percentage of incidents per sender IP.
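
As an illustration of the kind of aggregation behind a report such as Status by Week, the following sketch groups incidents from the last 30 days by week and severity. It is written under stated assumptions: the record fields (`detected`, `severity`) are hypothetical, and ISO week numbering is used only because the guide does not say how the product delimits a week.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta


def status_by_week(incidents, today, window_days=30):
    """Count incidents per (ISO year, ISO week) and severity within the window."""
    cutoff = today - timedelta(days=window_days)
    weeks = defaultdict(Counter)
    for incident in incidents:
        detected = incident["detected"]
        if cutoff <= detected <= today:
            iso_year, iso_week = detected.isocalendar()[:2]
            weeks[(iso_year, iso_week)][incident["severity"]] += 1
    return dict(weeks)


incidents = [
    {"detected": date(2019, 8, 19), "severity": "High"},
    {"detected": date(2019, 8, 14), "severity": "Medium"},
    {"detected": date(2019, 8, 13), "severity": "High"},
    {"detected": date(2019, 7, 1), "severity": "High"},  # older than 30 days
]
summary = status_by_week(incidents, today=date(2019, 8, 19))
for week, counts in sorted(summary.items()):
    print(week, dict(counts))
```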

Viewing dashboards
This procedure shows you how to view a dashboard.
To view a dashboard
1 In the Enforce Server administration console, on the Incidents menu, click Incident
Reports. Under Reports, click the name of a dashboard.
Dashboards consist of up to six portlets that each provide a summary of a particular report.
2 To see the entire report for a portlet, click the portlet.
Symantec Data Loss Prevention displays the appropriate incident list or summary report.
3 Browse through the incident list or summary report.
See “Viewing incidents” on page 1911.
See “About summary reports” on page 1909.

Creating dashboard reports


You can create custom dashboards and reports.
If you are logged on as a user other than the administrator, Symantec Data Loss Prevention
lets you choose whether to share your dashboard or keep it private.
To create a dashboard
1 In the Enforce Server administration console, on the Incidents menu, click Incident
Reports.
2 On the Incident Reports screen that appears, click Create Dashboard.
The Configure Dashboard screen appears.
3 Choose whether to share your dashboard or keep it private.
If you choose to share a dashboard, the dashboard is accessible to all users assigned
the role under which you create it.
If you are logged on as Administrator, you do not see this choice.

Note: Symantec Data Loss Prevention automatically designates all dashboards that the
administrator creates as private.

Click Next.
4 In the General section, for Name, type a name for the dashboard.
5 For Description, type an optional description for the dashboard.
6 In the Delivery Schedule section, you can regenerate and send the dashboard report to
specified email accounts.
If SMTP is not set up on your Enforce Server, you do not see the Delivery Schedule
section.
If you have configured your system to send alerts and reports, you can set a time to
regenerate and send the dashboard report to specified email accounts.
See “Configuring the Enforce Server to send email alerts” on page 176.
If you have not configured Symantec Data Loss Prevention to send reports, skip to the
next step.
To set a schedule, locate the Delivery Schedule section and select an option from the
Schedule drop-down list. (You can alternatively select No Schedule.)
For example, select Send Weekly On.
Enter the data that is required for your Schedule choice. Required information includes
one or more email addresses (separated by commas). It may also include calendar date,
time of day, day of the week, day of the month, or last date to send.
See “Delivery schedule options for dashboard reports” on page 1919.
7 For the Left Column, you can choose what to display in a pie chart or graph. For the
Right Column, you can also display a table of the information.
See “Choosing reports to include in a dashboard” on page 1909.
Select a report from as many as three of the Left Column (Chart Only) drop-down lists.
Then select a report from as many as three of the Right Column (Chart and Table)
drop-down lists.
8 Click Save.
9 You can edit the dashboard later from the Edit Report Preferences screen.
To display a custom dashboard at logon, specify it as the default logon report on the Edit
Report Preferences screen.
See “Editing custom dashboards and reports” on page 1921.

Configuring dashboard reports


You can create custom dashboards that are tailored for users with specific roles.
Dashboards consist of up to six portlets, each providing a quick summary of a report you
specify.
If you choose to share a dashboard, the dashboard is accessible to all users assigned the role
under which you create it.

Note: The Administrator user cannot create shared dashboards.

To configure a custom dashboard


1 In the General section, for Name, type a name for the dashboard.
2 For Description, type an optional description for the dashboard.
3 In the Delivery Schedule section, you can regenerate and send the dashboard report to
specified email accounts.
If SMTP is not set up on your Enforce Server, you do not see the Delivery Schedule
section.
If you have configured your system to send alerts and reports, you can set a time to
regenerate and send the dashboard report to specified email accounts.
See “Configuring the Enforce Server to send email alerts” on page 176.
If you have not configured Symantec Data Loss Prevention to send reports, skip to the
next step.
To set a schedule, locate the Delivery Schedule section and select an option from the
Schedule drop-down list. (You can alternatively select No Schedule.)
For example, select Send Weekly On.
Enter the data that is required for your Schedule choice. Required information includes
one or more email addresses (separated by commas). It may also include calendar date,
time of day, day of the week, day of the month, or last date to send.
See “Delivery schedule options for dashboard reports” on page 1919.
4 For the Left Column, you can choose what to display in a pie chart or graph. For the
Right Column, you can also display a table of the information.
See “Choosing reports to include in a dashboard” on page 1909.
Select a report from as many as three of the Left Column (Chart Only) drop-down lists.
Then select a report from as many as three of the Right Column (Chart and Table)
drop-down lists.
5 Click Save.
6 You can edit the dashboard later from the Edit Report Preferences screen.
To display a custom dashboard at logon, specify it as the default logon report on the Edit
Report Preferences screen.
See “Editing custom dashboards and reports” on page 1921.

Choosing reports to include in a dashboard


Dashboards have two columns of report portlets.
Portlets in the left column display a pie chart or graph.
Portlets in the right column display the same information as those in the left. They also display
either a list of the most significant incidents or a summary. Incidents are ranked by severity
and match count. You can display a list of summary criteria and associated incidents that
highlight any high-severity incident totals.
You can choose up to three reports to include in the left column, and up to three reports to
include in the right column.
To choose reports to include
1 Choose a report from as many as three of the Left Column (Chart Only) drop-down lists.
2 Choose a report from as many as three of the Right Column (Chart and Table) drop-down
lists.
3 After you configure the dashboard, click Save.
See “Configuring dashboard reports” on page 1907.
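
The ranking of the most significant incidents by severity and match count can be sketched as a two-key sort. The exact ordering that the product applies is not documented here, so this sketch assumes that severity is compared first (High before Medium, Low, and Informational) and that a higher match count breaks ties. The record layout and the function name are hypothetical.

```python
# Assumed severity order; lower rank sorts first.
SEVERITY_RANK = {"High": 0, "Medium": 1, "Low": 2, "Informational": 3}


def rank_incidents(incidents, limit=5):
    """Return the top incidents by severity, then by descending match count."""
    ordered = sorted(
        incidents,
        key=lambda i: (SEVERITY_RANK[i["severity"]], -i["match_count"]),
    )
    return ordered[:limit]


incidents = [
    {"id": 101, "severity": "Medium", "match_count": 40},
    {"id": 102, "severity": "High", "match_count": 3},
    {"id": 103, "severity": "High", "match_count": 12},
    {"id": 104, "severity": "Low", "match_count": 90},
]
top = rank_incidents(incidents, limit=3)
print([i["id"] for i in top])  # [103, 102, 101]
```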

About summary reports


Symantec Data Loss Prevention provides two types of summary reports: single summaries
and double summaries.
Single summaries show incident totals organized by a specific incident attribute such as status
or associated policy. For example, a policy summary includes a row for each policy that has
associated incidents. Each row includes a policy name, the total number of associated incidents,
and incident totals by severity.
Double summaries show incident totals organized by two incident attributes. For example, a
policy trend summary shows incident totals organized by policy and week. As
in a policy summary, each entry includes a policy name, the total number of associated
incidents, and incident totals by severity. In addition, each entry includes a separate line for
each week, showing the week's incident totals and incidents by severity.
See “Summary options for incident reports” on page 1944.
You can create custom summary reports from any incident list.
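
The difference between single and double summaries can be shown with a short grouping sketch. The incident records and function names below are hypothetical; the code only mirrors the structure described above (totals plus per-severity counts, grouped by one attribute or by two).

```python
from collections import Counter, defaultdict

incidents = [
    {"policy": "PCI", "week": "2019-W32", "severity": "High"},
    {"policy": "PCI", "week": "2019-W32", "severity": "Low"},
    {"policy": "PCI", "week": "2019-W33", "severity": "High"},
    {"policy": "HIPAA", "week": "2019-W33", "severity": "Medium"},
]


def single_summary(incidents, attribute):
    """One row per attribute value: total incidents and totals by severity."""
    summary = defaultdict(Counter)
    for incident in incidents:
        row = summary[incident[attribute]]
        row["Total"] += 1
        row[incident["severity"]] += 1
    return summary


def double_summary(incidents, primary, secondary):
    """One entry per primary value, with a separate line per secondary value."""
    summary = defaultdict(lambda: defaultdict(Counter))
    for incident in incidents:
        row = summary[incident[primary]][incident[secondary]]
        row["Total"] += 1
        row[incident["severity"]] += 1
    return summary


by_policy = single_summary(incidents, "policy")
by_policy_week = double_summary(incidents, "policy", "week")
print(by_policy["PCI"]["Total"], by_policy["PCI"]["High"])  # 3 2
print(by_policy_week["PCI"]["2019-W32"]["Total"])           # 2
```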

Viewing summary reports


This procedure shows you how to view a summary report.
To view a summary report


1 In the Enforce Server administration console, on the Incidents menu, select one of the
types of reports.
For example, select Network, and then click Policy Summary.
The report consists of summary entries (rows) that are divided into several columns. The
first column is named for the primary summary criterion. It lists primary and (for double
summaries) secondary summary items. For example, in a Policy Summary this column
is named Policy and it lists policies. Each entry includes a column for total number of
associated incidents. It also includes columns showing the number of incidents of High,
Medium, Low, and Informational severity. Finally, it includes a bar chart that represents
the number of incidents by severity.
2 Optionally, you can sort the report alphanumerically by a particular column's data. To do
so, click the desired column heading. To sort in reverse order, click the column heading
a second time.
3 To identify areas of potential risk, click the High column heading to display summary
entries by number of high-severity incidents.
4 Click an entry to see a list of associated incidents. In any of the severity columns, you can
click the total to see a list of incidents of the chosen severity.
See “Viewing incidents” on page 1911.

Creating summary reports


This procedure shows you how to create a summary report.
To create a summary report from an incident list
1 In the Enforce Server administration console, on the Incidents menu, select one of the
types of reports, and then click an incident list.
For example, select Discover, and then the report Incidents-All Scans.
2 Click the Advanced Filters & Summarization bar (near the top of the report).
In Summarize By for the primary listbox and secondary listbox that appear, Symantec
Data Loss Prevention displays all Symantec-provided criteria in alphabetical order. These
criteria precede any custom criteria the administrator has defined.
See “Summary options for incident reports” on page 1944.
3 Select a criterion from the primary listbox, and an optional criterion from the secondary
listbox. For example, select Policy Group and then Policy. (Note that options in the
secondary listbox appear only after you choose an option from the primary listbox.)
4 To create the summary report, click Apply.


Summary reports take their name from the primary summary criterion. If you rerun a report
with new criteria, the report name changes accordingly.
5 Save the report.
See “Saving custom incident reports” on page 1914.

Viewing incidents
Symantec Data Loss Prevention incident lists display the individual incident records with
information about the incidents. You can click on any incident to see a snapshot containing
more details. You can select specific incidents or groups of incidents to modify or remediate.
Symantec Data Loss Prevention provides incident lists for Network, Endpoint, and Discover
incidents.
To view incidents
1 In the Enforce Server administration console, on the Incidents menu, select one of the
types of reports.
For example, select Discover. In the left navigation panel, click Incidents-All Scans.
The incident list displays the individual incident records that contain information such as
severity, associated policy, number of matches, and status.
2 Optionally, use report filters to narrow down the incident list.
See “Filtering reports” on page 1914.
3 To view more details of a particular incident, click the incident.
The incident snapshot appears, displaying general incident information, matches detected
in the intercepted text, and details about policy, attributes, and incident history.
You can also search for similar incidents from the Correlations tab.
4 Optionally, click through the incident snapshot to view more information about the incident.
The following list describes the ways you can access more information through the
snapshot:
■ You can find information about the policy that detected the incident. On the Key Info
tab, the Policy Matches section displays the policy name. Click on the policy name
to see a list of incidents that are associated with that policy. Click view policy to see
a read-only version of the policy.
This section also lists other violated policies with the same file or message. When
multiple policies are listed, you can see the snapshot of an incident that is associated
with a particular policy. Click go to incident next to the policy name. To see a list of
all incidents that the file or message created, click show all.
■ You can view lists of the incidents that share various attributes with the current incident.
The Correlations tab shows a list of correlations that match single attributes. Click
on attribute values to see the lists of incidents that are related to those values.
For example, the current network incident is triggered from a message from a particular
email account. You can bring up a list of all incidents that this account created.
■ For most network incidents, you can access any attachments that are associated with
the network message. To do so, locate the Attachments field in the Incident Details
section of the snapshot and click the attachment file name.
For a detailed description of incident snapshots and the actions you can perform through
them, see the online Help.
5 When you finish viewing incidents, you can exit the incident snapshot or incident list, or
you can choose one or more incidents to remediate.
See “Remediating incidents” on page 1844.

About custom reports and dashboards


You can filter and summarize reports, and then save them for continued use. When saving a
customized report, you can configure Symantec Data Loss Prevention to send the report
according to a specific schedule.
Symantec Data Loss Prevention displays the titles of customized reports under Incidents >
All Reports.
The All Reports screen displays all out-of-the-box and custom reports available to your
assigned role(s). The list includes shared custom reports and the dashboards that you or
anyone else in your current role created. Several standard reports are available with Symantec
Data Loss Prevention.
Symantec Data Loss Prevention displays each report's name, associated product, and
description. For custom reports, Symantec Data Loss Prevention indicates whether the report
is shared or private and displays the report generation and delivery schedule.
You can modify existing reports and save them as custom reports, and you can also create
custom dashboards. Custom reports and dashboards are listed in the Saved Reports section
of the navigation panel.
You can click any report on the list to re-run it with current data.
You can view and run custom reports created by users who have any of the roles
that are assigned to you. You can edit or delete only the custom reports that are associated
with the current role. The only custom reports visible to the Administrator are the reports that
the Administrator user created.
A set of tables lists all the options available for filtering and summarizing reports.
See “About summary reports” on page 1909.


See “Summary options for incident reports” on page 1944.
See “General filters for reports” on page 1941.
See “Advanced filter options for reports” on page 1949.

Create Dashboard Lets you create a custom dashboard that displays summary data from several
reports you specify. For users other than the Administrator, this option leads to the
Configure Dashboard screen, where you specify whether the dashboard is private
or shared. All Administrator dashboards are private.

See “Creating dashboard reports” on page 1906.

Saved (custom) reports associated with your role appear near the top of the screen.
The following options are available for your current role's custom reports:

Click this icon next to a report to display the save report or configure dashboard
screen. You can change the name, description, or schedule, or (for dashboards
only) change the reports to include.

See “Saving custom incident reports” on page 1914.

See “Configuring dashboard reports” on page 1907.

Click this icon next to a report to display the screen to change the scheduling of this
report. If this icon does not display, then this report is not currently scheduled.

See “Saving custom incident reports” on page 1914.

Click this icon next to a report to delete that report. A dialog prompts you to confirm
the deletion. When you delete a report, you cannot retrieve it. Make sure that no
other role members need the report before you delete it.

Using IT Analytics to manage incidents


IT Analytics Solution is a Business Intelligence (BI) application that complements and expands
upon the reporting that is offered by Symantec Data Loss Prevention. It provides
multi-dimensional analysis and robust graphical reporting features to Symantec Management
Platform. This functionality lets you create on-the-fly ad-hoc reports without advanced knowledge
of databases or third-party reporting tools. IT Analytics provides this powerful on-the-fly ad-hoc
reporting with pivot tables, pre-compiled aggregations for fast answers to typically long-running
queries, and easy export to .PDF, Excel, .CSV and .TIF files.
For more information, see the IT Analytics landing page at the Symantec Support Center, at
https://support.symantec.com/en_US/dpl.56005.html.

Filtering reports
You can filter an incident list or summary report.
To filter an incident list
1 In the Enforce Server administration console, on the Incidents menu, select one of the
types of reports.
For example, select Network, and then click Policy Summary.
2 In the Filter area, current filters are displayed, as well as options for adding and running
other filters.
3 Modify the default filters as needed. For example, from the Status filter drop-down lists,
select Equals and New.
For Network and Endpoint reports, the default filters are Date and Status. For Discover
reports, the default filters are Status, Scan, and Target ID.
4 To add a new filter, select filter options from the drop-down lists. Click Advanced Filters
& Summarization for additional options. Click Add Filter on the right, for additional filter
options.
Select the filter type and parameters from left to right as if writing a sentence. For example,
from the advanced filters, Add Filter options, select Policy and Is Any Of, and then select
one or more policies to view in the report. Hold down Ctrl or Shift to select more than one
item in the listbox.
5 Click Apply to update the report.
6 Save the report.
See “Saving custom incident reports” on page 1914.
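
The way filters read "from left to right as if writing a sentence" (attribute, operator, value) can be sketched as composable predicates. This is a conceptual illustration only: the helper names are hypothetical, and the sketch assumes that multiple filters are combined so that an incident must satisfy all of them, which the guide does not state explicitly.

```python
def equals(attribute, value):
    """Filter such as: Status Equals New."""
    return lambda incident: incident.get(attribute) == value


def is_any_of(attribute, values):
    """Filter such as: Policy Is Any Of <selected policies>."""
    allowed = set(values)
    return lambda incident: incident.get(attribute) in allowed


def apply_filters(incidents, filters):
    """Keep the incidents that satisfy every filter (assumed AND semantics)."""
    return [i for i in incidents if all(f(i) for f in filters)]


incidents = [
    {"id": 1, "status": "New", "policy": "PCI"},
    {"id": 2, "status": "Resolved", "policy": "PCI"},
    {"id": 3, "status": "New", "policy": "HIPAA"},
    {"id": 4, "status": "New", "policy": "Source Code"},
]
filters = [equals("status", "New"), is_any_of("policy", ["PCI", "HIPAA"])]
matching = apply_filters(incidents, filters)
print([i["id"] for i in matching])  # [1, 3]
```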

Saving custom incident reports


After you summarize or filter a report, you can save it for continued use. When you save a
customized report, Symantec Data Loss Prevention displays the report title under Saved
Reports in the All Reports section. If a user chooses to share the report, Symantec Data
Loss Prevention displays the report link only for users who belong to the same role as the user
who created the report.
See “About custom reports and dashboards” on page 1912.
You can edit the report later on the Edit Preferences screen.
See “Editing custom dashboards and reports” on page 1921.
Optionally, you can schedule the report to be run automatically on a regular basis.
See “Scheduling custom incident reports” on page 1915.
To save a custom report


1 Set up a customized filter or summary report.
See “About custom reports and dashboards” on page 1912.
Click Save > Save As.
2 Enter a unique report name and describe the report. The report name can include up to
50 characters.
3 In the Sharing section, users other than the administrator can share a custom report.

Note: This section does not appear for the administrator.

The Sharing section lets you specify whether to keep the report private or share it with
other role members. Role members are other users who are assigned to the same role.
To share the report, select Share Report. All role members now have access to this
report, and all can edit or delete the report. If your account is deleted from the system,
shared reports remain in the system. Shared reports are associated with the role, not with
any specific user account. If you do not share a report, you are the only user who can
access it. If your account is deleted from the system, your private reports are deleted as
well. If you log on with a different role, the report is visible on the All Reports screen, but
not accessible to you.
4 Click Save.
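
The sharing rules in step 3 can be summarized in a small model. It is a sketch under stated assumptions, not product code: the `SavedReport` class and both helper functions are hypothetical, and they encode only what the step describes (the 50-character name limit, role-based access to shared reports, owner-only access to private reports, and deletion of private reports along with the owner's account).

```python
from dataclasses import dataclass

MAX_NAME_LENGTH = 50  # from step 2: the name can include up to 50 characters


@dataclass
class SavedReport:
    """Hypothetical record for a saved custom report."""
    name: str
    role: str     # role under which the report was created
    owner: str    # user who created the report
    shared: bool

    def __post_init__(self):
        if len(self.name) > MAX_NAME_LENGTH:
            raise ValueError("report name exceeds 50 characters")


def can_access(report, user, active_role):
    """Shared reports are open to the role; private reports only to the owner."""
    if report.role != active_role:
        return False  # listed under a different role, but not accessible
    return report.shared or report.owner == user


def reports_after_account_deletion(reports, deleted_user):
    """Shared reports stay with the role; the user's private reports are removed."""
    return [r for r in reports if r.shared or r.owner != deleted_user]


reports = [
    SavedReport("PCI weekly", role="Compliance", owner="alice", shared=True),
    SavedReport("Alice scratch", role="Compliance", owner="alice", shared=False),
]
print(can_access(reports[0], "bob", "Compliance"))      # True
print(can_access(reports[1], "bob", "Compliance"))      # False
print(can_access(reports[0], "alice", "Investigator"))  # False
print([r.name for r in reports_after_account_deletion(reports, "alice")])
```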

Scheduling custom incident reports


Optionally, you can schedule a saved report to be run automatically on a regular basis.
You can also schedule the report to be emailed to specified addresses or to the data owners
on a regular schedule.
See the Symantec Data Loss Prevention Data Insight Implementation Guide.
To schedule a custom report


1 Click Send > Schedule Distribution.
If SMTP is not set up on your Enforce Server, you are not able to select the Send menu
item to send the report.
See “Configuring the Enforce Server to send email alerts” on page 176.
2 Specify the Delivery Details:

To: Select whether the report is sent to specified email addresses or to the data owners.

Manual - Sent to specified e-mail addresses Enter the specific email addresses manually in
the text box.

Auto - Send to incident data owners To send the report to the data owners, the Send
report data with emails setting must be enabled for this option to appear.

See “Configuring the Enforce Server to send email alerts” on page 176.

If you select to have the report sent to the incident data owners, then the email address
in the incident attribute Data Owner Email Address is the address where the report is sent.

This Data Owner Email Address must be set manually, or with a lookup plug-in.

See the Symantec Data Loss Prevention Data Insight Implementation Guide.

A maximum of 10000 incidents can be distributed per data owner.

CC: Enter the email addresses manually in the text box.

Subject: Use the default subject or modify it.

Body: Enter the body of the email.

Response action variables can also be entered in the body.

See “Response action variables” on page 1847.

3 In the Schedule Delivery section, specify the delivery schedule.


See “Delivery schedule options for incident and system reports” on page 1917.
4 In the Change Incident Status / Attributes section, you can implement workflow.
The Auto - Send to incident data owners option must be set for this section to appear.
See “Configuring the Enforce Server to send email alerts” on page 176.
5 To change the status of the incidents after the report is sent, select a status value
from the drop-down list. You can choose any of the valid status values.
6 You can also enter new values for any custom attributes.
These attributes must be already set up.
See “About incident status attributes” on page 1962.
7 Select one of the custom attributes from the drop-down list.
8 Click Add.
9 In the text box, enter the new value for this custom attribute.
After the report is sent, the selected custom attributes are set to the new values for
the incidents that were included in the report.
10 Click Next.
11 Enter the name and description of the saved report.
12 Click Save.
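
To picture how the Auto - Send to incident data owners option relates incidents to recipients, the following sketch groups incidents by their Data Owner Email Address value and applies the limit of 10000 incidents per data owner. The field name `data_owner_email` and the function are hypothetical. How the product treats incidents beyond the limit is not described in this guide, so the sketch simply leaves them out of the batch; treat that as an assumption.

```python
from collections import defaultdict

MAX_INCIDENTS_PER_OWNER = 10000  # limit stated in the delivery details


def group_by_data_owner(incidents, limit=MAX_INCIDENTS_PER_OWNER):
    """Return (batches keyed by owner address, incidents with no owner address)."""
    batches = defaultdict(list)
    unassigned = []
    for incident in incidents:
        address = incident.get("data_owner_email")
        if not address:
            # The attribute must be set manually or by a lookup plug-in.
            unassigned.append(incident)
        elif len(batches[address]) < limit:
            batches[address].append(incident)
        # Assumption: incidents beyond the limit are not added to the batch.
    return dict(batches), unassigned


incidents = [
    {"id": 1, "data_owner_email": "owner.a@example.com"},
    {"id": 2, "data_owner_email": "owner.b@example.com"},
    {"id": 3, "data_owner_email": "owner.a@example.com"},
    {"id": 4, "data_owner_email": ""},
]
batches, unassigned = group_by_data_owner(incidents)
for address, batch in batches.items():
    print(address, [i["id"] for i in batch])
print("no data owner:", [i["id"] for i in unassigned])
```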

Delivery schedule options for incident and system reports
The Schedule Delivery section lets you set up a schedule for the report.

Note: If your Enforce Server is not configured to send email, or you are not allowed to send
reports, the Schedule Delivery section does not appear.

When you make a selection from the list, additional fields appear.
To remove scheduling of a report that was previously scheduled, click the Remove option.
The following table describes the additional fields available for each option on the list.
Delivery Details Specify the following delivery details:

■ Send To
Specify Manual to specify the email addresses.
Specify Auto for automatic sending to data owners.
■ To
Enter one or more email addresses. Separate them with commas.
■ CC
Enter one or more email addresses. Separate them with commas.
■ Subject
Provide a subject for the email.
■ Body
Enter the body of the email. Use variables for items such as the policy name.
See “Response action variables” on page 1847.

One time Select One time to schedule the report to be run once at a future time, and then
specify the following details for that report:

■ Time
Select the time you want to generate the report.
■ Send Date
Enter the date you want to generate the report, or click the date widget and
select a date.

Daily Select Daily to schedule the report to be run every day, and then specify the following
details for that report:

■ Time
Select the time you want to generate the report.

■ Until

Enter the date you want to stop generating daily reports, click the date widget and
select a date, or select Indefinitely.

Weekly Select Weekly on to schedule the report to be run every week, and then specify
the following details for that report:

■ Time
Select the time you want to generate the report.
■ Days of Week
Click to check one or more check boxes to indicate the day(s) of the week you
want to generate the report.
■ Until
Enter the date you want to stop generating weekly reports, click the date widget
and select a date, or select Indefinitely.
Monthly Select Monthly on to schedule the report to be run every month, and then specify
the following details for that report:

■ Time
Select the time you want to generate the report.
■ Day of Month
Enter the date on which you want to generate the report each month.
■ Until
Enter the date you want to stop generating monthly reports, click the date widget
and select a date, or select Indefinitely.

See “Saving custom incident reports” on page 1914.


See “Working with saved system reports” on page 168.
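
The schedule options in the table above (One time, Daily, Weekly, Monthly, each with an Until date or Indefinitely) can be sketched as a generator of run dates. The function and its parameters are hypothetical and only illustrate the semantics; in particular, skipping months that lack the chosen day (for example, day 31) is an assumption, because the guide does not say how the product handles that case.

```python
from datetime import date, timedelta
from itertools import islice


def run_dates(frequency, start, until=None, weekdays=(), day_of_month=None):
    """Yield the dates on which a scheduled report would be generated.

    until=None stands for the Indefinitely option, so the generator may be
    endless; callers should slice it. Weekdays use Monday=0 ... Sunday=6.
    """
    if frequency not in {"once", "daily", "weekly", "monthly"}:
        raise ValueError(f"unknown frequency: {frequency!r}")
    if frequency == "weekly" and (
        not weekdays or not all(0 <= d <= 6 for d in weekdays)
    ):
        raise ValueError("weekly schedules need weekdays in the range 0-6")
    if frequency == "monthly" and (
        day_of_month is None or not 1 <= day_of_month <= 31
    ):
        raise ValueError("monthly schedules need a day of the month (1-31)")

    current = start
    while until is None or current <= until:
        if frequency == "once":
            yield current
            return
        if (
            frequency == "daily"
            or (frequency == "weekly" and current.weekday() in weekdays)
            or (frequency == "monthly" and current.day == day_of_month)
        ):
            yield current
        current += timedelta(days=1)


# Weekly on Monday (0) and Thursday (3), from 19 Aug 2019 until 31 Aug 2019.
weekly = list(
    run_dates("weekly", date(2019, 8, 19), date(2019, 8, 31), weekdays=(0, 3))
)
print(weekly)

# Monthly on day 1, Until set to Indefinitely: take the first three runs.
monthly = list(islice(run_dates("monthly", date(2019, 8, 19), day_of_month=1), 3))
print(monthly)
```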

Delivery schedule options for dashboard reports


The Delivery Schedule section lets you set up a schedule for the report.

Note: If your Enforce Server is not configured to send email, or you are not allowed to send
reports, the Delivery Schedule section does not appear.

When you make a selection from the Schedule drop-down list, additional fields appear.
The following table describes the additional fields available for each option on the list.

No Schedule Select No Schedule to save the report without a schedule.

Once Select Once to schedule the report to be run once at a future time, and then specify
the following details for that report:

■ On

Enter the date you want to generate the report, or click the date widget and select
a date.

■ At

Select the time you want to generate the report.

■ Send To

Enter one or more email addresses. Separate them with commas.


Send Every Day Select Send Every Day to schedule the report to be run every day, and then specify
the following details for that report:

■ At

Select the time you want to generate the report.


■ Until

Enter the date you want to stop generating daily reports, click the date widget and
select a date, or select Indefinitely.

■ Send To

Enter one or more email addresses. Separate them with commas.

Send Weekly On Select Send Weekly On to schedule the report to be run every week, and then
specify the following details for that report:

■ Day

Click to check one or more check boxes to indicate the day(s) of the week you want
to generate the report.

■ At

Select the time you want to generate the report.

■ Until

Enter the date you want to stop generating weekly reports, click the date widget
and select a date, or select Indefinitely.

■ Send To

Enter one or more email addresses. Separate them with commas.

Send Monthly On Select Send Monthly On to schedule the report to be run every month, and then
specify the following details for that report:

■ Day of each month

Enter the date on which you want to generate the report each month.

■ At

Select the time you want to generate the report.

■ Until

Enter the date you want to stop generating monthly reports, click the date widget
and select a date, or select Indefinitely.

■ Send To

Enter one or more email addresses. Separate them with commas.

See “Configuring dashboard reports” on page 1907.


Using the date widget to schedule reports


Use the date widget to specify dates for reports. The date widget enters the date that you
select into the date field. You can click Today to enter the current date.
To use the date widget
1 Click the date widget.
2 Click the left arrow or the right arrow on either side of the month to change the month.
3 Click the left arrow or the right arrow on either side of the year to change the year.
4 Click the desired date on the calendar.

Editing custom dashboards and reports


You can edit any custom report or dashboard that you create.
To edit a custom dashboard or report
1 In the Enforce Server administration console, on the Incidents menu, select Incident
Reports.
The Incident Reports dashboard appears and displays Saved Reports near the top.
2 Click the edit icon next to the report or dashboard that you want to edit.
The Save Report screen or the Save Dashboard screen appears. You can edit the name,
description, and schedule of any custom report or dashboard, and you can select different
component reports for a custom dashboard.
See “Saving custom incident reports” on page 1914.
3 When you finish editing, click Save.

Exporting incident reports


A report can be exported to a comma-separated text (.csv) file or to an XML file.
You can set up a CSV delimiter other than a comma. You can specify which fields are exported
to XML. These options must be set in your profile before you export a report.
See “Editing a user profile” on page 87.
To export a report
1 Click Incidents, and select a type of report.
2 Navigate to the report that you want to export. Filter or summarize the incidents in the
report, as desired.
See “Common incident report features” on page 1933.
3 Check the boxes on the left side of the incidents to select the incidents to export.
4 In the Export drop-down, select Export All: CSV or Export All: XML.

Note: See the current version of the Incident Reporting and Update API Developers Guide
for the location of the XML schema files for exported reports and for a description of
individual XML elements.

5 Click Open or Save. If you selected Save, a Save As dialog box opens, and you can
specify the location and the file name.
See “Exported fields for Network Monitor” on page 1922.
See “Exported fields for Endpoint Discover” on page 1924.
See “Exported fields for Network Discover/Cloud Storage Discover” on page 1923.
See “Printing incident reports” on page 1936.
See “Sending incident reports by email” on page 1935.
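After you export a report to CSV, you can process the file with ordinary scripting tools. The following Python sketch counts exported incidents by severity. It assumes that the export begins with a header row, that the header contains a column named Severity (matching the exported field names listed in the following sections), and that you pass the delimiter configured in your user profile. These are assumptions for illustration; verify the header names against an actual export.

```python
import csv
import io
from collections import Counter


def count_by_severity(csv_text, delimiter=","):
    """Count exported incidents by the value in the 'Severity' column.

    Assumes the first row is a header that contains a 'Severity' column;
    adjust the column name if your export differs.
    """
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    return Counter(row["Severity"] for row in reader)


# Hypothetical sample data that uses a semicolon as the CSV delimiter.
sample = "ID;Severity;Status\n101;High;New\n102;Low;Closed\n103;High;Escalated\n"
print(count_by_severity(sample, delimiter=";"))
```

To process a real export, read the file contents into a string and pass the delimiter that is set in your profile.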

Exported fields for Network Monitor


The following fields are exported for Network Monitor:

Type Incident type (for example SMTP, HTTP, or FTP).

Message Status Status of this incident message.

Severity Severity of this incident (High, Medium, or Low).

Sent Date and time the message was sent.

ID Unique identifier for this incident.

Policy Name of the policy that triggered this incident.

Matches The number of times that this item matches the detection parameters of a policy rule.

Subject Subject of the message.


Recipient(s) Recipient of the message.

Status Status of this incident (New, Escalated, Dismissed, or Closed).

Has Attachment Indicates whether this message has an attachment.

Data Owner Name The person responsible for remediating the incident. This field must be set
manually, or with one of the lookup plug-ins.

Reports can automatically be sent to the data owner for remediation.

Data Owner Email The email address of the person responsible for remediating the incident. This
field must be set manually, or with one of the lookup plug-ins.

Custom attributes are also exported.

Exported fields for Network Discover/Cloud Storage Discover


The following fields are exported for Network Discover/Cloud Storage Discover:

Type Target type (for example file system, Lotus Notes, or SQL Database).

Message Status Status of this incident message.

Severity Severity of this incident (High, Medium, or Low).

Detection Date Date that an incident was detected.

Seen Before Was this incident previously seen? The value is Yes or No.

Subject Email subject for integrated Exchange scans.

Sender Email sender for integrated Exchange scans.

Recipient Email recipient for integrated Exchange scans.

ID Unique identifier for this incident.

Policy Name of the policy that triggered this incident.

Matches The number of times that this item matches the detection parameters of a policy rule.

Location Location (path) of this item.

Status Status of this incident (New, Escalated, Dismissed, or Closed).


Target Name of the scan target.

Scan Date and time when the file was scanned.

File Owner Owner of the file.

Last Modified Date Date and time when the item was last modified.

File Create Date Date and time when the item was created.

Last Access Date Date and time when the item was last accessed (not shown for NFS targets).

Data Owner Name The person responsible for remediating the incident. This field must be set
manually, or with one of the lookup plug-ins.

Reports can automatically be sent to the data owner for remediation.

Data Owner Email The email address of the person responsible for remediating the incident. This
field must be set manually, or with one of the lookup plug-ins.

Custom attributes are also exported.

Exported fields for Endpoint Discover


The following fields are exported for Endpoint Discover:

Type Target type (for example Removable Storage).

Severity Severity of this incident (High, Medium, or Low).

Occurred On Date that an incident was detected.

ID Unique identifier for this incident.

Policy Name of the policy that triggered this incident.

Matches The number of times that this item matches the detection parameters of a policy rule.

Status Status of this incident (New, Escalated, Dismissed, or Closed).

File Name Name of the file that violated the policy.

File Path Path of the file.


Note: The file location appears only for fixed drive incidents.

Machine Computer on which the incident occurred.


User Endpoint user name.

Prevention Status Status from Endpoint (for example Action Blocked).

Subject Subject of the message.

Recipient(s) Recipient of the message.

Has Attachment Indicates whether this message has an attachment.

Data Owner Name The person responsible for remediating the incident. This field must be set
manually, or with one of the lookup plug-ins.

Reports can automatically be sent to the data owner for remediation.

Data Owner Email The email address of the person responsible for remediating the incident. This
field must be set manually, or with one of the lookup plug-ins.

Custom attributes are also exported.

Deleting incidents
Incident reporting performance often deteriorates when the number of incidents in your system
exceeds one million (1,000,000). Symantec recommends keeping your incident count below
this threshold by deleting incidents to maintain good system performance.
Incident deletion is permanent: you can delete incidents, but you cannot recover the incidents
that you have deleted. Symantec Data Loss Prevention offers options for deleting only certain
parts of the data that triggered the incident.
After you have marked incidents for deletion, you can view, configure, run, and troubleshoot
the incident deletion process from the Enforce Server administration console. You can mark
incidents for deletion manually or automatically.
See “About automatically flagging incidents for deletion” on page 1929.
You can also delete hidden incidents.
See “Deleting hidden incidents” on page 1961.
To delete an incident
1 On the Incident Report screen, select the incident or incidents you want to delete, then
click Incident Actions > Delete Incidents.
2 On the Delete Incidents screen, select from the following deletion options:

Delete incident completely Permanently deletes the incident(s) and all associated data (for
example, any emails and attachments). Note that you cannot recover the incidents
that have been deleted.

Retain incident, but delete message data Retains the actual incident(s) but discards the
Symantec Data Loss Prevention copy of the data that triggered the incident(s). You have the
option of deleting only certain parts of the associated data. The rest of the
data is preserved.

Delete Original Message Deletes the message content (for example, the email message or HTML
post). This option applies only to Network incidents.

Delete Attachments/Files This option refers to files (for Endpoint and Discover incidents) or
email or posting attachments (for Network incidents). The options are:

■ All, which deletes all attachments. Choose this option to delete all files
(for Endpoint and Discover incidents) or email attachments (for Network
incidents). Attachments and files are added to the incident deletion
queue after their associated incidents have been deleted.

■ Attachments/Files with no violations. This option deletes only those
attachments in which Symantec Data Loss Prevention found no matches.
Choose this option when you have incidents with individual files taken
from a compressed file (Endpoint and Discover incidents) or several
email attachments (Network incidents).

3 Click Cancel or Delete.


Delete marks the incident for deletion and adds it to the incident deletion queue. You
cannot recover an incident after it has been marked for deletion. Symantec Data Loss
Prevention permanently deletes the incidents in the incident deletion queue when it runs
the incident deletion job.

About the incident deletion process


You can view, configure, run, and troubleshoot the incident deletion process on the Incident
Deletion screen of the Enforce Server administration console: System > Incident Data >
Incident Deletion. This screen shows you the number of incidents in the incident deletion
queue, the deletion schedule, and a history of deletion jobs.
The incident deletion queue includes all incidents marked for deletion by all your Symantec
Data Loss Prevention users. In addition to viewing the number of incidents marked for deletion,
you can start and stop a deletion job manually from the incident deletion queue.
You can view detailed information about your deletion jobs in the deletion jobs history section,
including the number of incidents and attachments or files deleted, the job start and end time,
the job duration, whether or not the job was stopped manually, and the job status (Completed,
Failed, or In Progress). In the case of failed deletion jobs, you can click the Failed link to see
the error message and problem statement. This information may be useful to your Oracle
database administrator in troubleshooting the job failure. If this information is insufficient to
resolve your deletion job issues, you can export information from any job to a CSV file and
send it to Symantec Data Loss Prevention Support for additional help.
By default, the incident deletion job runs nightly at 11:59 P.M. in the Enforce Server's local
time zone. When the job runs, it also creates an event on the System > Servers and Detectors
> Events screen. This event is created whether or not any incidents are actually deleted.

Configuring the incident deletion job schedule


The default incident deletion job schedule is daily at 11:59 P.M. in the Enforce Server's local
time zone. You can configure the deletion job schedule to run at any other scheduled time.
Symantec suggests running your incident deletion at a time when your system is idle or not
in heavy use.
To configure the incident deletion job schedule
1 Click the Schedule Deletion Job calendar icon.
2 In the Schedule Incident Deletion dialog box, specify one of the following options:
■ No Regular Schedule: Select this option to turn off the deletion job schedule.
■ Once: Specify a day and time for a single incident deletion job.
■ Daily: Specify a daily time for incident deletion jobs.
■ Weekly: Specify a day and time for incident deletion jobs.
■ Monthly: Specify a day of the month and time for incident deletion jobs. To
accommodate differences between months, the day value must be between 1 and 28.

3 Click Submit.

Note: The incident deletion job schedule is reset to the default value during the upgrade process.
If you are using a custom incident deletion job schedule, reconfigure the schedule after the
upgrade process is complete.
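The restriction on the Monthly option exists because days 29 through 31 do not occur in every month. The following Python sketch illustrates the idea by computing the next monthly run time for a given day and time. The function name and its behavior are hypothetical and are shown only to explain the 1 to 28 limit; they do not describe how the Enforce Server schedules jobs internally.

```python
from datetime import datetime


def next_monthly_run(now, day, hour, minute):
    """Return the next monthly run time strictly after `now`.

    `day` must be between 1 and 28 so that it exists in every month.
    """
    if not 1 <= day <= 28:
        raise ValueError("day of month must be between 1 and 28")
    candidate = now.replace(day=day, hour=hour, minute=minute,
                            second=0, microsecond=0)
    if candidate <= now:
        # This month's slot has passed: roll over to the following month.
        year = candidate.year + (1 if candidate.month == 12 else 0)
        month = 1 if candidate.month == 12 else candidate.month + 1
        candidate = candidate.replace(year=year, month=month)
    return candidate


# Example: a job scheduled for day 28 at 11:59 P.M., evaluated on 31 January.
print(next_monthly_run(datetime(2019, 1, 31, 12, 0), 28, 23, 59))
```

Because the day value never exceeds 28, the rollover to the following month always produces a valid date, including in February.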

Starting and stopping incident deletion jobs


If there are incidents pending deletion, you can start an incident deletion job manually from
the incident deletion queue. You can also stop any incident deletion job that is currently running.
To start and stop incident deletion jobs manually
1 Click Start Deletion to start an incident deletion job manually.
2 When an incident deletion job is running, the progress bar will show you how many
incidents have been deleted.
3 Click Stop Deletion to stop an incident deletion job.
The progress bar refreshes every 30 seconds by default. If you are deleting a large number
of incidents (over 500,000), the refresh process may degrade the performance of the deletion
job. You can adjust the refresh rate in the manager.properties file.
To configure the progress bar refresh rate
1 Open the manager.properties file:
■ On Windows systems: \Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\manager.properties

■ On Linux systems:
/opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config/manager.properties

2 Set a new value in milliseconds for the
com.vontu.incident.deletion.progress.refreshRate property. For example, to set
the refresh rate to two minutes (120 seconds):

com.vontu.incident.deletion.progress.refreshRate=120000

3 Save and close the manager.properties file, then restart the Symantec DLP Manager
service.
See “About Symantec Data Loss Prevention services” on page 101.
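Because the property value is expressed in milliseconds, it is easy to enter a value that is off by a factor of 1000. The following Python sketch converts a value in seconds and updates the property in the text of a properties file. The function name is hypothetical, and the sketch works on a string rather than on the file itself. Back up manager.properties before you change it.

```python
def set_refresh_rate(properties_text, seconds):
    """Return the properties text with the refresh rate set to `seconds`.

    The property value is written in milliseconds. If the property is
    not present in the text, it is appended as a new line.
    """
    key = "com.vontu.incident.deletion.progress.refreshRate"
    new_line = f"{key}={int(seconds * 1000)}"
    lines = properties_text.splitlines()
    for i, line in enumerate(lines):
        # Match "key=value" lines, ignoring whitespace around the key.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Example: change a 30-second refresh rate to two minutes.
original = "com.vontu.incident.deletion.progress.refreshRate=30000\n"
print(set_refresh_rate(original, 120), end="")
```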

Working with the deletion jobs history


The deletion jobs history section shows you your previously run incident deletion jobs, including:
■ The number of incidents deleted.
■ The number of attachments and files deleted.
■ The deletion job start and end time.
■ The deletion job duration.
■ Whether or not the deletion job was stopped manually.
■ The deletion job status.
If a deletion job failed, a link will appear in the status column. Click the link to see the error
message and problem statement. This information may be useful to your Oracle database
administrator for troubleshooting a failed deletion job.
If you cannot resolve an incident deletion job issue yourself, you can export detailed
deletion job information to send to Symantec Data Loss Prevention Support.
To view and export failed deletion job information
1 In the Deletion jobs history list, click the Failed link for the failed job you want to view.
The error message and problem statement that appear may be useful to your Oracle
database administrator for troubleshooting your incident deletion job issues. If you need
additional help, continue to step 2.
2 To export information for a failed deletion job, select the job in the Deletion jobs history
list, then click Export.
3 Save the ZIP file to send to Symantec Data Loss Prevention Support for analysis. The
data contained in the ZIP file is intended for use by Symantec Data Loss Prevention
Support only, and will not be helpful for your in-house troubleshooting efforts.

About automatically flagging incidents for deletion


You can automatically flag incidents for deletion based on criteria that you define. For example,
you might want to automatically flag incidents for deletion based on their age. Flagging incidents
for deletion automatically can save you a significant amount of time and effort, especially if
you have a large number of incidents in your system.
Incidents that you have automatically flagged for deletion are permanently deleted from your
system when the next incident deletion job runs. Unlike manual deletion, which lets you retain
an incident while deleting only parts of its data, automatic deletion flagging marks the entire
incident for deletion, including the message data and attachments.
See “About the incident deletion process” on page 1926.
To automatically flag incidents for deletion, you first create custom incident reports with your
criteria, such as incident age. You can have one active report per incident category: Network,
Endpoint, Discover, and Applications. These report types are license-dependent: you cannot
create or review reports for which you do not have a license.
See “About creating incident reports for automatic incident deletion flagging” on page 1930.
After you have created your custom incident reports, you configure and manage incident
deletion flagging jobs on the System > Incident Deleter > Flag Incidents for Deletion page.
See “Configuring automatic incident deletion flagging” on page 1931.
See “Managing automatic incident deletion flagging” on page 1931.
You must have Symantec Data Loss Prevention administrator privileges to configure automatic
incident deletion flagging.

About creating incident reports for automatic incident deletion flagging


You create custom reports that include your criteria for automatic incident deletion flagging on
the Incidents page for each specific incident type. Symantec recommends that you use
single-summary reports only for incident deletion flagging.
See “About custom reports and dashboards” on page 1912.
See “Saving custom incident reports” on page 1914.
The most useful system report to start from when creating custom incident reports for incident
deletion flagging is the Incidents > incident type > Incidents - All report. This system report
includes all incidents present in your system for a given incident type.
The following procedure gives an example for flagging Network incidents created between 1
January 2016 and 1 January 2017 for deletion. This is a simple example that only involves
filtering the list of all Network incidents by a range of dates. No additional filters or summarization
are applied in this example.
To create a report to filter Network incidents within a range of dates
1 In the Enforce Server administration console, navigate to Incidents > Network > Incidents
- All.
2 In the Filter section, select Status: Equals All.
3 In the Date section, select Custom, then enter a start date of 1/1/16 and an end date of
1/1/17.
4 Click Apply.
5 Click Save > Save As.
6 Enter a name for and description of your report in the Save Report As dialog box, then
click Save.
You can now view your custom report on the Incidents > All Reports page, and you can
select it when you configure your automatic incident deletion flagging job.
You can use Advanced Filters & Summarization to further refine your reports.
If you have hidden incidents from reports, those incidents will not be deleted even if they meet
the criteria you select. You must unhide those incidents you wish to automatically flag for
deletion.
See “Unhiding hidden incidents” on page 1959.
See “Filtering reports” on page 1914.

Configuring automatic incident deletion flagging


You configure automatic incident deletion flagging on the System > Incident Deleter > Flag
Incidents for Deletion page. Automatic incident deletion flagging configuration consists of
selecting your custom incident reports and scheduling incident deletion flagging jobs. You
must have Symantec Data Loss Prevention administrator privileges to configure automatic
incident deletion flagging.
See “About creating incident reports for automatic incident deletion flagging” on page 1930.
To configure automatic incident deletion flagging
1 In the Enforce Server administration console, navigate to the System > Incident Deleter
> Flag Incidents for Deletion page.
2 Click Configure.
3 On the configuration page, select the report or reports that include the incidents you want
to flag for incident deletion. You can select one report per incident type. You cannot select
system reports for incident deletion flagging.
4 Set a schedule for your incident deletion flagging jobs. You can schedule incident deletion
flagging jobs to run at a specific time once, every day, every week, or every month. You
can also select No Regular Schedule if you prefer to schedule your incident deletion jobs
manually.
There are two considerations to keep in mind when scheduling incident deletion flagging
jobs:
■ The incident deletion flagging jobs should run to completion before your scheduled
incident deletion jobs.
■ The incident deletion flagging jobs should run at a time when Symantec Data Loss
Prevention is not running any other jobs.

5 Click Save.

Managing automatic incident deletion flagging


You manage automatic incident deletion flagging jobs on the System > Incident Deleter >
Flag Incidents for Deletion page. On this page you can view your custom reports for incident
deletion flagging, the schedule for the upcoming incident deletion flagging job, and the incident
deletion flagging job history.
You can link directly to your incident deletion flagging job report by clicking the report name
in the Selected reports for incident deletion flagging section.
You can view incident deletion flagging job history in the Job history of incident deletion
flagging section. For each job history, Symantec Data Loss Prevention displays the following
information:
■ Job ID: The identifier for the incident deletion flagging job.
■ Job started: The start time for the incident deletion flagging job.
■ Report Name: The name of the custom report used to flag incidents for deletion.
■ #Incidents Flagged: The number of incidents flagged for deletion by that job.
■ Status: The status of the incident deletion flagging job.
You can delete incident deletion flagging jobs by selecting one or more jobs using the
checkboxes, then clicking Delete. Note that there is no confirmation prompt when you delete
incident deletion flagging jobs, though the deletion is recorded in the Tomcat logs.

Troubleshooting automatic incident deletion flagging


Automatic incident deletion flagging includes two event codes useful for tracking incident
deletion flagging jobs. It also logs information about the process to the Tomcat logs.
The system event codes are:
■ 2318: Incident deletion flagging process started.

■ 2319: Incident deletion flagging process ended.

Tomcat logs include the following information (line breaks added for legibility):

Timestamp- Thread: 111 INFO
[com.vontu.manager]
User "Administrator" initiated incident action
"Marked for Deletion" for 6 incident(s)

Timestamp- Thread: 111 INFO
[com.vontu.manager]
Incident deletion flagging process ended.

Timestamp- Thread: 119 INFO
[com.vontu.manager.system.incident.deletion.IncidentFlagDeletionListController]
The flagged incident deletion jobs have been deleted. Number of jobs deleted are: N

Be aware that incident deletion flagging jobs can fail due to insufficient space for undo/redo
actions in the Symantec Data Loss Prevention database. For detailed information about
managing the database, see the Symantec Data Loss Prevention System Maintenance Guide.
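If you need to track how many incidents each flagging job marked, you can search the Tomcat logs for the "Marked for Deletion" entries shown above. The following Python sketch extracts the user name and incident count from such entries. The regular expression is based only on the sample entry in this section; the exact log layout on your system may differ, so treat the pattern as an assumption.

```python
import re

# Pattern derived from the sample log entry; \s+ tolerates a line break
# between the two halves of the message.
FLAGGED = re.compile(
    r'User "(?P<user>[^"]+)" initiated incident action\s+'
    r'"Marked for Deletion" for (?P<count>\d+) incident\(s\)'
)


def flagged_counts(log_text):
    """Return (user, count) pairs for each flagging entry in the log text."""
    return [(m.group("user"), int(m.group("count")))
            for m in FLAGGED.finditer(log_text)]


sample = (
    'Timestamp- Thread: 111 INFO [com.vontu.manager] '
    'User "Administrator" initiated incident action '
    '"Marked for Deletion" for 6 incident(s)\n'
)
print(flagged_counts(sample))
```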

Deleting custom dashboards and reports


You can delete any custom report or dashboard that you create.
To delete a custom dashboard or report


1 In the Enforce Server administration console, on the Incidents menu, select Incident
Reports.
The Incident Reports dashboard appears and displays Saved Reports near the top.
2 Click the delete icon next to the report or dashboard to delete it.
3 Click OK to confirm.
4 Symantec Data Loss Prevention deletes the report, and removes it from the Incident
Reports screen.

Common incident report features


The following options are common to incident report lists:
■ Icons to perform the following tasks for a report:
■ Save
You can save the current report as a custom saved report.
See “Saving custom incident reports” on page 1914.
■ Send
You can email the report or schedule the report distribution.
See “Saving custom incident reports” on page 1914.
■ Export
You can export the current report as CSV or XML.
See “Exporting incident reports” on page 1921.
■ Delete Report
If this report is not a saved report, then the Delete Report option does not appear.

■ Report filters and summary options
See “Incident report filter and summary options” on page 1934.
■ Page navigation icons
See “Page navigation in incident reports” on page 1934.
The following summary reports are available for the types of incidents:
■ Network
See “Network summary report” on page 1861.
■ Endpoint
See “Endpoint incident summary reports” on page 1874.
■ Discover
See “Discover summary reports” on page 1885.

Page navigation in incident reports


All reports except executive summaries include page navigation options. Symantec Data Loss
Prevention displays the number of currently visible incidents out of total report incidents (for
example, 1-19 of 19 or 1-50 of 315).
Reports with more than 50 incidents have the following options:

Displays the first page of the report.

Displays the previous page.

Displays the next page.

Displays the last page.

Show All Displays all items on one single page.

Use the Show All link on an Incident List with caution when the system contains
more than 500 incidents. Browser performance degrades drastically if more than
500 incidents are displayed on the Incident List page.

Select All Selects all incidents on all pages, so you can update them all at once. (Available
only on Incident Lists.) Click Unselect All to cancel.
Note: Use caution when you choose Select All. This option selects all the incidents
in the report (not only those on the current page). Any incident command that you
subsequently apply affects all the incidents.

To select only the incidents on the current page, select the checkbox at top left of
the incident list.

See “Common incident report features” on page 1933.
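The incident count shown on each page (for example, 1-50 of 315) follows from the page number and the page size. The following Python sketch reproduces that label. The page size of 50 is inferred from the examples in this section, and the function name is hypothetical.

```python
def page_label(page, total, page_size=50):
    """Return a label such as '1-50 of 315' for a 1-based page number."""
    first = (page - 1) * page_size + 1
    if page < 1 or first > total:
        raise ValueError("page is out of range for this report")
    # The last page may hold fewer incidents than the page size.
    last = min(page * page_size, total)
    return f"{first}-{last} of {total}"


print(page_label(1, 19))
print(page_label(1, 315))
print(page_label(7, 315))
```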

Incident report filter and summary options


Filters are separated into commonly used filters, and advanced filters and summarizations.
The common filters include the following options:

Status Select Equals, Is Any Of, or Is None Of. Then select status values.
Hold down Ctrl and click to select more than one separate status
value. Hold down Shift and click to select a range.

Date (Network and Endpoint reports) Use the drop-down menu to select a date range, such as
Last Week or Last Month. The default is All Dates.

Severity Check the boxes to select the severity values.


Scan (Discover reports) For Discover reports, select the scan to report. You can select the
most recent scan, the initial scan, or a scan in progress. All Scans
is the default.

Target ID For Discover reports, select the name of the target to report. All
Targets is the default.

Click the Advanced Filters & Summarization bar to expand the section with filter and summary
options.
Click Add Filter to add an advanced filter.
Select a primary and optional secondary option for summarization. A single-summary report
is organized with a single summary criterion, such as the policy that is associated with each
incident. A double-summary report is organized with two criteria, such as policy and incident
status.

Note: If you select a condition in which you enter the content to be matched in the text field,
your entire entry must match exactly. For example, if you enter "apples and oranges", that
exact text must appear in the specified component for it to be considered a match. The sentence
"Bring me the apples and the oranges" is not considered a match.

For a complete list of the report filter and summary options, see the Symantec Data Loss
Prevention Administration Guide.
See “Common incident report features” on page 1933.
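The exact-match rule in the note above can be illustrated with a short Python sketch. The entry must appear verbatim, as one unbroken string, in the component that is searched. The function name is hypothetical, and the sketch is case-sensitive; this guide does not state whether the product ignores case, so treat that as an assumption.

```python
def matches_exactly(entry, component_text):
    """Return True if the whole entry appears verbatim in the component."""
    return entry in component_text


# The full phrase appears unbroken, so this is a match.
print(matches_exactly("apples and oranges", "We sell apples and oranges."))

# The words appear, but not as the exact phrase, so this is not a match.
print(matches_exactly("apples and oranges", "Bring me the apples and the oranges"))
```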

Sending incident reports by email


You can send a copy of the current report to any email address.
To send reports, your system administrator must configure an SMTP server. The Administrator
must specify a report distribution option on the System > Settings page. You must also specify
an email address for your user account.
See “Configuring the Enforce Server to send email alerts” on page 176.
To send a report
1 Click Incidents, and select a type of report.
2 Navigate to the report that you want to send. Filter or summarize the incidents in the
report, as desired.
See “Common incident report features” on page 1933.
3 Click Send in the upper right corner.


Alternatively, you can use the Send menu (above the filters).
See “Saving custom incident reports” on page 1914.
4 In the Send Report dialog box, specify the following options:

To Enter one or more email addresses (comma-separated).

Subject Enter a subject for the message.

Message Enter the message.

5 Click Send or Cancel.


See “Printing incident reports” on page 1936.
See “Exporting incident reports” on page 1921.

Printing incident reports


You can print a report to any available printer.
To print a report
1 Click Incidents, and select a type of report.
2 Navigate to the report that you want to print. Filter or summarize the incidents in the
report, as desired.
See “Common incident report features” on page 1933.
3 Click Print in the upper right corner.
4 An image of the report appears in a browser window.
5 The printer selection dialog box appears, and you can select a printer.
See “Sending incident reports by email” on page 1935.
See “Exporting incident reports” on page 1921.

Incident snapshot history tab


You can view the actions that were performed on the incident. For each action, the History
tab displays the action date and time, the actor (a user or server), and the action or the
comment.
See “Discover incident snapshot” on page 1882.
See “Network incident snapshot” on page 1857.
See “Endpoint incident snapshot” on page 1866.

Incident snapshot notes tab


You can add a note to an incident, or view existing notes for that incident, on the Notes tab.
To add a note, click Add Note. The limit for notes is 4000 bytes.
See “Discover incident snapshot” on page 1882.
See “Network incident snapshot” on page 1857.
See “Endpoint incident snapshot” on page 1866.
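The note limit is measured in bytes, not characters, so text that contains non-ASCII characters reaches the limit sooner. The following Python sketch checks a note against the limit. It assumes UTF-8 encoding, which this guide does not confirm, and the function name is hypothetical.

```python
NOTE_LIMIT_BYTES = 4000  # limit stated for incident notes


def note_fits(note, encoding="utf-8"):
    """Return True if the encoded note is within the 4000-byte limit."""
    return len(note.encode(encoding)) <= NOTE_LIMIT_BYTES


# 4000 ASCII characters occupy 4000 bytes in UTF-8.
print(note_fits("a" * 4000))

# 2001 accented characters occupy 4002 bytes in UTF-8.
print(note_fits("\u00e9" * 2001))
```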

Incident snapshot attributes section


You can view a list of custom attributes and their values, if any have been specified. Click on
attribute values to view an incident list that is filtered on that value. To add new values or edit
existing ones, click Edit. In the Edit Attributes dialog box that appears, type the new values
and click Save. Hidden incidents are not displayed in the filtered list.

Note: This section appears only if a system administrator has configured custom attributes.

See “Discover incident snapshot” on page 1882.
See “Endpoint incident snapshot” on page 1866.
See “Network incident snapshot” on page 1857.

Incident snapshot correlations tab


You can view lists of the incidents that share various attributes of the current incident.
For example, if the copying of a file triggered the current incident, you can bring up a list of all
the incidents that are related to the copying of this file. The Correlations tab shows a list of
correlations that are matched to single attributes. Click on attribute values to view lists of the
incidents that are related to those values.
To search for other incidents with the same attributes, click Find Similar. In the Find Similar
Incidents dialog box that appears, select the desired search attributes. Then click Find
Incidents. Hidden incidents are not displayed when you search for similar incidents.
See “Discover incident snapshot” on page 1882.
See “Endpoint incident snapshot” on page 1866.
See “Network incident snapshot” on page 1857.
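
The Find Similar behavior described above can be pictured as an attribute-equality search that skips hidden incidents. The following Python sketch is an illustration only, not the product's implementation. The dictionary keys (id, file_name, sender, hidden) and the function name are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def find_similar(current: Dict[str, Any],
                 incidents: Iterable[Dict[str, Any]],
                 attributes: List[str]) -> List[Dict[str, Any]]:
    """Return non-hidden incidents that share every selected attribute value."""
    similar = []
    for incident in incidents:
        if incident["id"] == current["id"]:
            continue  # skip the incident that is being viewed
        if incident.get("hidden", False):
            continue  # hidden incidents are not displayed in the search
        if all(incident.get(name) == current.get(name) for name in attributes):
            similar.append(incident)
    return similar


current = {"id": 1, "file_name": "payroll.xls", "sender": "user1", "hidden": False}
others = [
    {"id": 2, "file_name": "payroll.xls", "sender": "user2", "hidden": False},
    {"id": 3, "file_name": "payroll.xls", "sender": "user1", "hidden": True},
    {"id": 4, "file_name": "notes.txt", "sender": "user1", "hidden": False},
]

# Search on the file name only: incident 3 matches but is hidden.
print([i["id"] for i in find_similar(current, others, ["file_name"])])
```

Selecting more attributes narrows the result, because an incident must match on every selected attribute.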

Incident snapshot policy section


The Policy area shows the policy that was violated in the incident and indicates if the policy
blocked a move or notified the user. It also shows the total number of matches for the policy,
as well as matches per policy rule. Click the policy name to view a list of all incidents that
violated the policy. Click view policy to view a read-only version of the policy.
You see the icons that describe the following information:
■ Symantec Data Loss Prevention blocked a copy of the sensitive information.
■ Symantec Data Loss Prevention notified the user about the copy of confidential data.
This section also lists other policies that are violated by the same file. To view the snapshot
of an incident that is associated with a particular policy, click the Go to Incident link next to
the policy name. To view a list of all incidents that are related to the file, click show all.
See “Discover incident snapshot” on page 1882.
See “Endpoint incident snapshot” on page 1866.
See “Network incident snapshot” on page 1857.

Incident snapshot matches section


In the Matches section, Symantec Data Loss Prevention displays the content (if applicable)
and the matches that caused the incident.
Matches are highlighted in yellow. This section shows the match total and displays the matches
in the order in which they appear in the original content. To view the rule that triggered a match,
click on the highlighted match.
See “Discover incident snapshot” on page 1882.
See “Endpoint incident snapshot” on page 1866.
See “Network incident snapshot” on page 1857.
See “About the Similarity Threshold and Similarity Score” on page 667.

Incident snapshot access information section


The Access Information section of an incident snapshot shows the Access Control Lists for
that object.
Access Control Lists (ACL) are lists of the permissions that are attached to an object or piece
of data. The list contains information about all users who have read and write permissions for
the file. Use the list to view which users have access to the file as well as which actions each
user can perform. The permissions for each user or group are not set through Symantec Data
Loss Prevention. Administrators set the permissions for each file using other types of programs
on the endpoint. Permissions are generally set at the time that the file is created.
For example, User 1 has permission to access the file Example1.doc. User 1 can view and
edit the file. User 2 also has access to the file Example1.doc. However, User 2 can only view
the file. User 2 does not have permission to make changes to the file. In the ACL, both User
1 and User 2 are listed with the permissions that have been granted to them.
Table 56-1 shows the combinations.

Table 56-1 Access control list example

Name Permission

User 1 GRANT READ

User 1 GRANT WRITE

User 2 GRANT READ

The ACL contains a new line for each permission granted. The ACL only contains one line for
User 2 because User 2 only has one permission, to read the file. User 2 cannot make any
changes to the file. User 1 has two entries because User 1 has two permissions: reading the
file and editing it.
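
The one-line-per-permission layout in Table 56-1 can be modeled as a list of (name, permission) pairs. The following Python sketch groups the entries by user to show what each user can do. It is an illustration only; the data structure and function names are hypothetical and do not represent how Symantec Data Loss Prevention stores ACL data.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Entries copied from Table 56-1: one line per granted permission.
acl: List[Tuple[str, str]] = [
    ("User 1", "GRANT READ"),
    ("User 1", "GRANT WRITE"),
    ("User 2", "GRANT READ"),
]


def permissions_by_user(entries: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Collect every permission that is listed for each user."""
    grouped: Dict[str, Set[str]] = defaultdict(set)
    for name, permission in entries:
        grouped[name].add(permission)
    return dict(grouped)


def can_edit(entries: List[Tuple[str, str]], name: str) -> bool:
    """A user can edit the file only if a GRANT WRITE line exists for that user."""
    return "GRANT WRITE" in permissions_by_user(entries).get(name, set())


print(permissions_by_user(acl))
print(can_edit(acl, "User 1"), can_edit(acl, "User 2"))
```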
You can view ACL information only on Discover and Endpoint local drive incident snapshots.
You cannot view ACL information on any other type of incidents.
The Access Information section appears on the Key Info tab of the incident snapshot.
See “Discover incident snapshot” on page 1882.
See “Endpoint incident snapshot” on page 1866.
See “Network incident snapshot” on page 1857.

Customizing incident snapshot pages


You can customize the appearance of the incident snapshot page.
To customize the appearance of the incident snapshot page
1 From an incident snapshot, click Customize Layout (in the upper-right corner).
2 Select the information to appear on each of the tabs in the incident snapshots.
Tab 1 always contains the Key Info, and cannot be changed.
3 For each of the areas on the incident snapshot screen, select the information that appears.
4 Click Save.

About filters and summary options for reports


You can set a number of filters and summaries for Symantec Data Loss Prevention incident
reports.
These filters let you see the incidents and incident data in different ways.
The set of filters applies separately to Network, Endpoint, and Storage events.
Figure 56-1 shows the locations of the options to filter and summarize reports.

Figure 56-1 Filter and summary options

Callouts in the figure: General filters; Advanced filters; Summary options; Current filters and summary options.

The filters and summary options are in the following sections:

General filters | The general filter options are the most commonly used. They are always visible in the incident list report. | See “General filters for reports” on page 1941.
Advanced filters | The advanced filters provide many additional filter options. You must click the Advanced Filters & Summarization bar, and then click Add Filter to view these filter options. | See “Advanced filter options for reports” on page 1949.
Summary options | The summary options provide ways to summarize the incidents in the list. You must click the Advanced Filters & Summarization bar to view these summary options. | See “Summary options for incident reports” on page 1944.

Symantec Data Loss Prevention contains many standard reports. You can also create custom
reports or save report summary and filter options for reuse.
See “About Symantec Data Loss Prevention reports” on page 1899.

General filters for reports


General filters for reports are a small set of commonly used filters.
Most of these filters are applicable for all the products. Network Discover/Cloud Storage
Discover contains some general filters that relate to scans of storage. For example, you can
filter the incidents that are in a particular scan. These filters are not applicable to Network
Prevent or Endpoint Prevent.
Table 56-2 lists the general filter options for report status values.
You can also create custom status values.
See “About incident status attributes” on page 1962.
These status filters are available for Network, Endpoint, and Discover incidents.

Table 56-2 General filters for status values

Name Description

Equals The status is equal to the field that is selected in the next drop-down.

Is Any Of The status can be any of the fields that are selected in the next drop-down.
Shift-click to select multiple fields.

Is None Of The status is none of the fields that are selected in the next drop-down.
Shift-click to select multiple fields.
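
The three status operators in Table 56-2 behave like simple membership tests. The following Python sketch shows one way to read them. It is an illustration only; the function name is hypothetical, and the status values in the example are sample values, not a list of the statuses that your deployment defines.

```python
from typing import List


def status_matches(status: str, operator: str, selected: List[str]) -> bool:
    """Evaluate a status filter against the values chosen in the drop-down."""
    if operator == "Equals":
        # Equals compares against exactly one selected value.
        return [status] == list(selected)
    if operator == "Is Any Of":
        return status in selected
    if operator == "Is None Of":
        return status not in selected
    raise ValueError(f"Unknown operator: {operator}")


print(status_matches("New", "Equals", ["New"]))
print(status_matches("New", "Is Any Of", ["New", "Escalated"]))
print(status_matches("New", "Is None Of", ["Resolved", "Dismissed"]))
```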

Table 56-3 lists the general filter options by date.


These date filters are available for Network and Endpoint incidents.

Table 56-3 General filters by date

Name Description

All Dates All of the dates that contain incidents.

Current Month to Date All of the incidents that were reported for the current month up to today's
date.

Current Quarter to Date All of the incidents that were reported for the current quarter up to today's
date.

Current Week to Date All of the incidents that were reported for the current week.

Current Year to Date All of the incidents that have been reported for the current year up to today's
date.

Custom A custom time frame. Select the dates that you want to view from the
calendar menu.

Last 7 Days All of the incidents that were reported in the previous seven days.

Last 30 Days All of the incidents that were reported in the previous 30 days.

Last Month All of the incidents that were reported during the previous calendar month.

Last Week All of the incidents that were reported during the previous calendar week.

Last Quarter All of the incidents that were reported during the previous quarter.

Last Year All of the incidents that were reported during the last calendar year.

Today All of the incidents that were reported today.

Yesterday All of the incidents that were reported yesterday.
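
Each named date filter in Table 56-3 corresponds to a date range relative to the current day. The following Python sketch shows one plausible mapping for several of the options. It rests on assumptions that this guide does not confirm: ranges are inclusive, weeks start on Monday, quarters are calendar quarters, and Last 7 Days and Last 30 Days include today. The function name is hypothetical.

```python
from datetime import date, timedelta
from typing import Tuple


def date_range(name: str, today: date) -> Tuple[date, date]:
    """Return an inclusive (start, end) range for a named date filter."""
    if name == "Today":
        return today, today
    if name == "Yesterday":
        yesterday = today - timedelta(days=1)
        return yesterday, yesterday
    if name == "Last 7 Days":
        return today - timedelta(days=6), today    # assumption: includes today
    if name == "Last 30 Days":
        return today - timedelta(days=29), today   # assumption: includes today
    if name == "Current Week to Date":
        # assumption: the week starts on Monday (weekday() == 0)
        return today - timedelta(days=today.weekday()), today
    if name == "Current Month to Date":
        return today.replace(day=1), today
    if name == "Current Quarter to Date":
        first_month = 3 * ((today.month - 1) // 3) + 1
        return date(today.year, first_month, 1), today
    if name == "Current Year to Date":
        return date(today.year, 1, 1), today
    if name == "Last Month":
        end = today.replace(day=1) - timedelta(days=1)
        return end.replace(day=1), end
    raise ValueError(f"Unsupported date filter: {name}")


start, end = date_range("Current Quarter to Date", date(2019, 8, 21))
print(start.isoformat(), end.isoformat())
```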

Table 56-4 lists the general filter options by severity. Check the box to select the severities to
include in the filter.
These severity filters are available for Network, Endpoint, and Discover incidents.

Table 56-4 General filters for severity values

Name Description

High Lists only the high-severity incidents. Displays how many high-severity
incidents are in the incident list.

Info Lists only the incidents that are informational only. Informational incidents
are not assigned any other severity. Displays how many informational
incidents are in the incident list.

Low Lists only the low-severity incidents. Displays how many low-severity
incidents are in the incident list.

Medium Lists only the medium-severity incidents. Displays how many
medium-severity incidents are in the incident list.

Table 56-5 lists the general filter options for Network Discover scans. This filter is only available
for Discover incidents.

Table 56-5 General filters for scans

Name Description

All Scans All of the incidents that have been reported in all of the scans that have
been run.

Initial Scan All of the incidents that were reported in the initial scan.

In Process All of the incidents that have been reported in the scans that are currently
in progress.

Last Completed Scan All of the incidents that were reported in the last complete scan.

You can filter Discover incidents by Target ID. This filter is only available for Discover incidents.
Select the target, or select All Targets. Shift-click to select multiple fields.
Table 56-6 lists the general filter options by detection date for Discover incidents.

Table 56-6 General filters by date

Name Description

All Dates All of the dates that contain incidents.

Current Month to Date All of the incidents that were reported for the current month up to today's
date.

Current Quarter to Date All of the incidents that were reported for the current quarter up to today's
date.

Current Week to Date All of the incidents that were reported for the current week.

Current Year to Date All of the incidents that have been reported for the current year up to today's
date.

Custom A custom time frame. Select the dates that you want to view from the
calendar menu.

Custom Since The Symantec DLP Agents that have connected to the Endpoint Server
from a specific date to the present date. Select the date where you want
the filter to begin.

Custom Before The Symantec DLP Agents that have connected to an Endpoint Server
before a specific date. Select the final date for the filter.

Last 7 Days All of the incidents that were reported in the previous seven days.

Last 30 Days All of the incidents that were reported in the previous 30 days.

Last Month All of the incidents that were reported during the previous calendar month.

Last Week All of the incidents that were reported during the previous calendar week.

Last Quarter All of the incidents that were reported during the previous quarter.

Last Year All of the incidents that were reported during the last calendar year.

Today All of the incidents that were reported today.

Yesterday All of the incidents that were reported yesterday.

Summary options for incident reports


Incident report summaries provide options for a summary of the information that is contained
within the incidents. For example, you can summarize incidents by the status or the policy.

Note: Hidden incidents are not included in report summaries unless the Advanced filter option
for the Is Hidden filter is set to Show All.
See “About incident hiding” on page 1958.
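
A summary is, in effect, a count of incidents grouped by one attribute (the primary summary) and optionally by a second attribute (the secondary summary). The following Python sketch illustrates that idea, including the exclusion of hidden incidents unless they are explicitly shown. It is not the product's implementation; the dictionary keys, the sample policy names, and the function name are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Optional


def summarize(incidents: Iterable[Dict[str, Any]],
              primary: str,
              secondary: Optional[str] = None,
              show_all: bool = False) -> Counter:
    """Count incidents by a primary and, optionally, a secondary attribute.

    Hidden incidents are skipped unless show_all is True, which mirrors the
    Is Hidden filter being set to Show All.
    """
    counts: Counter = Counter()
    for incident in incidents:
        if incident.get("hidden", False) and not show_all:
            continue
        if secondary is None:
            key = incident.get(primary)
        else:
            key = (incident.get(primary), incident.get(secondary))
        counts[key] += 1
    return counts


incidents = [
    {"status": "New", "policy": "Policy A", "hidden": False},
    {"status": "New", "policy": "Policy B", "hidden": False},
    {"status": "Resolved", "policy": "Policy A", "hidden": True},
]

print(summarize(incidents, "status"))
print(summarize(incidents, "status", show_all=True))
print(summarize(incidents, "status", "policy"))
```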

Table 56-7 lists the summary options for incident reports.


Table 56-7 Summary filters

Name | Description | Applicable products
Agent Configuration | Summarize the agents and incidents by the associated agent configuration entity. If you have more than one agent configuration entity configured, you can summarize or filter by a specific entity from the drop-down menu. If the default agent configuration entity is the only entity configured, you will not see the drop-down menu. | Endpoint
Agent Response | Summarize incidents by how the agent has responded to the incident. | Endpoint
Content Root | Summarize the incidents by the content root path. | Discover
Data Owner Email Address | The email address of the person responsible for remediating the incident. This field must be set manually, or with a lookup plug-in. | Network, Endpoint, Discover
Data Owner Name | The person responsible for remediating the incident. This field must be set manually, or with a lookup plug-in. Reports can automatically be sent to the data owner for remediation. | Network, Endpoint, Discover
Destination IP | Summarize the incidents by the destination IP address. | Network, Endpoint
Detection Month | Summarize the incidents by the month in which they were detected. | Discover
Detection Quarter | Summarize the incidents by the calendar quarter in which they were detected. | Discover
Detection Week | Summarize the incidents by the week in which they were detected. | Discover
Detection Year | Summarize the incidents by the year in which they were detected. | Discover
Device Instance ID | Summarize the incidents by the specific device that created the violation. | Endpoint
Domain | Summarize the incidents by the domain name. | Network
Endpoint Location | Summarize the incidents by the location of the endpoint. The location can be one of the following: On the Corporate Network; Off the Corporate Network. | Endpoint
File Name | Summarize the incidents by the file name that is associated with the incident. | Endpoint
File Owner | Summarize the incidents by the owner of the file. | Discover
Investigating State | Summarize the agents by the current status. | Endpoint, Discover
Location | Summarize the incidents by their location. | Discover
Log Level | Summarize the agents by their configured log levels. | Endpoint
Machine IP (Corporate) | Summarize the incidents by the IP address of a machine on the corporate network. | Endpoint
Machine Name | Summarize the incidents by the computer name on which the incidents were created. | Endpoint
Month | Summarize the incidents by the month in which they were created. | Network, Endpoint
Months Since First Detected | Summarize the incidents by how many months have passed since the incident was first detected. | Discover
Network Prevention Action | Summarize the incidents by the action from Network Prevent. | Network
No primary summary selected | Placeholder selection to denote that no primary summary has been selected. | Network, Endpoint, Discover
No secondary summary selected | Placeholder selection to denote that no secondary summary has been selected. | Network, Endpoint, Discover
Policy | Summarize the incidents by the policy from which they were created. | Network, Endpoint, Discover
Policy Group | Summarize the incidents by the policy group to which they belong. | Network, Discover
Protect Status | Summarize the incidents by the Network status of the incidents. | Discover
Protocol | Summarize the incidents by the protocol that generated the incident. | Network
Protocol or Endpoint Destination | Summarize the incidents by the protocol or the endpoint destination where the incidents were created. | Endpoint
Remediation Detection Status | Summarize the incidents by their remediation detection status. | Discover
Quarantine Failure Reason | Summarize the incidents by the reason that the quarantine response action failed. | Endpoint, Discover
Quarter | Summarize the incidents by the quarter in which they were created. | Network, Endpoint
Quarters Since First Detected | Summarize the incidents by how many quarters have passed since the incident was first detected. | Discover
Recipient | Summarize the incidents by the recipient. | Discover
Scan | Summarize the incidents by which scan was used to find the incidents. | Discover
Scanned Machine | Summarize the incidents by the computers that have been scanned. | Discover
Sender | Summarize the incidents by the sender. | Network, Endpoint, Discover
Server or Detector | Summarize the incidents by the server on which they were created. | Network, Endpoint
Source IP | Summarize the incidents by the source IP address from which they were created. | Network, Endpoint
Source File | Summarize the incidents by the source file that violated the policy. | Endpoint
Status | Summarize the incidents by the incident status. | Network, Endpoint, Discover
Subject | Summarize the incidents by the subject. | Discover
Target ID | Summarize the incidents by the target scan ID. | Discover
Target Type | Summarize the incidents by the type of target on which the incident was generated. | Discover
User Justification | Summarize the incidents by the justification that was input by the user. | Endpoint
User Name | Summarize the incidents by the user who generated the incident. | Endpoint
Week | Summarize the incidents by the week in which they were created. | Network, Endpoint
Weeks Since First Detected | Summarize the incidents by how many weeks have passed since the incident was first detected. | Discover
Year | Summarize the incidents by the year in which they were created. | Network, Endpoint
Years Since First Detected | Summarize the incidents by how many years have passed since the incident was first detected. | Discover

Advanced filter options for reports


Advanced report filters let you filter incidents related to specific actions or text strings. For
example, you can filter the incidents that relate to a specific keyword. Or, you can filter out the
incidents that relate to a certain action. These filters combine a set of chooser fields or text
boxes to create the advanced filter.
Table 56-8, Table 56-9, and Table 56-10 list the advanced filter options for reports.

Table 56-8 Advanced filters, first field

Name | Description | Applicable products
Agent Configuration | Summarize the agents and incidents by the associated agent configuration entity. If you have more than one agent configuration entity configured, you can summarize or filter by a specific entity from the drop-down menu. If the default agent configuration entity is the only entity configured, you will not see the drop-down menu. | Endpoint
Agent Configuration Status | Summarize the agent by the status of the configuration entity. Current Configuration: the configuration on the agent is the same as the configuration on the Endpoint Server. Outdated Configuration: the configuration on the agent is different from the configuration on the Endpoint Server. Unknown/deleted Configuration: the agents either cannot report which configuration is installed, or the configuration on the agent has been deleted from the Endpoint Server. | Endpoint
Agent Response | Filter incidents by how the agent has responded to the incident. | Endpoint
Application Name | Filter the incidents by the name of the application where the incident was generated. | Endpoint
Application Window Title | Filter the incidents by a string in the title of the window where the incident was generated. | Endpoint
Attachment File Name | Filter incidents by the file name of the attachment that is associated with the incident. | Network
Attachment File Size | Filter incidents by the size of the attachment that is associated with the incident. | Network
Box: Collaborator | Filter incidents by Box collaborators. | Discover
Box: Collaborator Role | Filter incidents by the role of the Box collaborator. Roles include: Co-owner, Editor, Previewer, Previewer Uploader, Uploader, Viewer, Viewer Uploader. | Discover
Box: Shared Link | Filter incidents by the presence or absence of a shared link. | Discover
Box: Shared Link Download Allowed | Filter incidents by the presence or absence of a shared link that allows downloads. | Discover
Box: Shared Link Expiration Date | Filter incidents by the expiration date setting of a shared link. | Discover
Box: Shared Link Password Protected | Filter incidents by the presence or absence of a password-protected shared link. | Discover
Content Root | Filter the incidents by the content root path. | Discover
Data Owner Email Address | The email address of the person responsible for remediating the incident. This field must be set manually, or with a lookup plug-in. | Network, Endpoint, Discover
Data Owner Name | The person responsible for remediating the incident. This field must be set manually, or with a lookup plug-in. Reports can automatically be sent to the data owner for remediation. | Network, Endpoint, Discover
Destination IP | Filter the incidents by the destination IP address for the message that generated the incident. | Network, Endpoint
Detection Date | Filter the incidents by the date that the incident was detected. | Discover
Device Instance ID | Summarize the incidents by the specific device that created the violation. | Endpoint
Document Name | Filter the incidents by the name of the violating document. | Discover
Domain | Filter the incidents by the domain name that is associated with the incident. | Network
Endpoint Location | Filter the incidents by the endpoint location. The location can be one of the following: On the Corporate Network; Off the Corporate Network. | Endpoint
File Last Modified Date | Filter the incidents by the last date when the file was modified. | Endpoint, Discover
File Location | Filter the incidents by the location of the violating file. | Endpoint
File Name | Filter the incidents by the name of the violating file. Wildcards are not allowed, but you can specify a partial match, for example .pdf. | Endpoint, Discover
File Owner | Filter the incidents by the owner of the violating files. | Discover
File Size | Filter the incidents by the size of the violating file. | Endpoint, Discover
Incident History Issuer | Filter the incidents by the user responsible for issuing the history of the incident. | Network, Endpoint, Discover
Incident ID | Filter the incidents by the ID of the incidents. | Network, Endpoint, Discover
Incident Match Count | Filter the incidents by the number of incident matches. | Network, Endpoint, Discover
Incident Notes | Filter the incidents by a string in the incident notes. | Network, Endpoint, Discover
Incident Reported On | Filter the incidents by the date that the incident was reported. | Endpoint
Investigating State | Filter the agents by the investigation state. You can select one of the following: Investigating; Not Investigating. | Discover, Endpoint
Is Hidden | Filters hidden incidents. You can select one of the following: Show All; Show Hidden. See “About incident hiding” on page 1958. | Network, Endpoint, Discover
Is Hiding Allowed | Filters the incidents based on the state of the Is Hiding Allowed flag. Select the Is Any Of operator from the second field, then select either the Allow Hiding or Do Not Hide option from the third field. See “About incident hiding” on page 1958. | Network, Endpoint, Discover
Last Connection Time | Filter agents according to the last time each agent connected to the Endpoint Server. | Endpoint
Location | Filter the incidents by their location. Location can include the server where the incidents were generated. | Discover
Machine IP (Corporate) | Filter the incidents by the IP address of the computer on which the incidents were created. | Endpoint
Machine Name | Filter the incidents by the computer name on which the incidents were created. | Endpoint
Network Prevent Action | Filter the incidents by the action from Network Prevent. | Network
Policy | Filter the incidents by the policy from which they were created. | Network, Endpoint, Discover
Policy Group | Filter the incidents by the policy group to which they belong. | Network, Endpoint, Discover
Policy Rule | Filter the incidents by the policy rule that generated the incidents. | Network, Endpoint, Discover
Protect Status | Filter the incidents by the Network Protect status of the incidents. | Discover
Protocol | Filter the incidents by the protocol to which they belong. | Network
Protocol or Endpoint Destination | Filter the incidents by the protocol or the endpoint destination that generated the incident. | Endpoint
Read ACL: File | Filter the incidents by the File access control list. | Endpoint, Discover
Read ACL: Share | Filter the incidents by the Share access control list. | Discover
Recipient | Filter the incidents by the name of the recipient of the message that generated the incident. | Network, Endpoint, Discover
Remediation Detection Status | Filter the incidents by their remediation detection status. | Discover
Scanned Machine | Filter the incidents by the computers that have been scanned. | Discover
Seen Before | Filter the incidents on whether an earlier connected incident exists. | Discover, but not for SQL Database incidents (where Seen Before is always false)
Sender | Filter the incidents by the sender. | Network, Endpoint, Discover
Server or Detector | Filter the incidents by the server on which they were created. | Network, Endpoint, Discover
SharePoint ACL: Permission Level | Filter the incidents on the permission level of the SharePoint access control list. | Discover
SharePoint ACL: User/Group | Filter the incidents on the user or group in the SharePoint access control list. | Discover
Source IP | Filter the incidents by the source IP address from which they were created. | Network
Subject | Filter incidents by the subject line of the message that generated the incident. | Network, Discover
Superseded | Filter the incidents by whether the incident responses have been superseded by other responses. | Discover, Endpoint
Target Type | Filter the incidents by the type of target that is associated with the incidents. | Discover
Time Since First Detected | Filter the incidents by how much time has passed since the incident was first detected. | Discover, but not for SQL Database incidents
URL | Filter the incidents by the URL where the violations occurred. | Discover
User Justification | Filter the incidents by the justification that was input by the user. | Endpoint
User Name | Filter the incidents by the user who generated the incident. | Endpoint

The second field in the advanced filters lets you select the match type in the filter.
Table 56-9 Advanced filters, second field

Name Description

Contains Any Of Lets you modify the filter to include any words in the text string, or lets
you choose from a list in the third field.

Contains Ignore Case Lets you modify the filter to ignore a specific text string.

Does Not Contain Ignore Case Lets you modify the filter to filter out the ignored text string.

Does Not Match Exactly Lets you modify the filter to match on any combination of the text string.

Ends with Ignore Case Lets you modify the filter so that only the incidents that end with the ignored
text string appear.

Is Any Of Lets you modify the filter so that the results include any of the text string,
or lets you choose from a list in the third field.

Is Between Lets you modify the filter so that the numerical results are between a range
of specified numbers.

Is Greater Than Lets you modify the filter so that the numerical results are greater than a
specified number.

Is Less Than Lets you modify the filter so that the numerical results are less than a
specified number.

Is None Of Lets you modify the filter so that the results do not include any of the text
string, or lets you choose from a list in the third field.

Is Unassigned Lets you modify the filter to match incidents for which the value specified
in the first field is unassigned.

Matches Exactly Lets you modify the filter to match exactly the text string.

Matches Exactly Ignore Case Lets you modify the filter so that the filter must match the ignored text
string exactly.

Starts with Ignore Case Lets you modify the filter so that only the incidents that start with the
ignored text string appear.

The third field in the advanced filters lets you select from a list of items, or provides an empty
box to enter a string.
This third field varies depending on the selections in the first and second fields.
For a list of items, use Shift-click to select multiple items.
For strings, wildcards are not allowed, but you can enter a partial string.
For example, you can enter .pdf to select any PDF file.
If you do not know what text to enter, use the summary options to view the list of possible text
values. You can also see a summary of how many incidents are in each category.
See “Summary options for incident reports” on page 1944.
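
An advanced filter combines three fields: an attribute, a match type, and a value. The following Python sketch shows how such a triple might be evaluated against one incident, including the partial-string match on .pdf. It is an illustration only. The operator semantics are inferred from the operator names, the inclusive bounds for Is Between and the treatment of unassigned values are assumptions, and the function name and sample data are hypothetical.

```python
from typing import Any, Dict


def matches(incident: Dict[str, Any], field: str, operator: str, value: Any) -> bool:
    """Evaluate one (field, operator, value) filter against an incident."""
    actual = incident.get(field)
    if operator == "Is Unassigned":
        return actual is None
    if actual is None:
        return False  # assumption: unassigned values match no other operator
    if operator == "Contains Ignore Case":
        return str(value).lower() in str(actual).lower()
    if operator == "Starts with Ignore Case":
        return str(actual).lower().startswith(str(value).lower())
    if operator == "Ends with Ignore Case":
        return str(actual).lower().endswith(str(value).lower())
    if operator == "Matches Exactly":
        return actual == value
    if operator == "Is Any Of":
        return actual in value
    if operator == "Is None Of":
        return actual not in value
    if operator == "Is Greater Than":
        return actual > value
    if operator == "Is Less Than":
        return actual < value
    if operator == "Is Between":
        low, high = value
        return low <= actual <= high  # assumption: bounds are inclusive
    raise ValueError(f"Unsupported operator: {operator}")


incidents = [
    {"File Name": "Q3-forecast.PDF", "Incident Match Count": 12},
    {"File Name": "notes.txt", "Incident Match Count": 3},
]

# A partial string is enough; no wildcard characters are used.
pdf_incidents = [i for i in incidents
                 if matches(i, "File Name", "Contains Ignore Case", ".pdf")]
print([i["File Name"] for i in pdf_incidents])
```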
Table 56-10 lists some of the options in the third field.

Table 56-10 Advanced filters, third field

Name Description

Blocked The user was blocked from performing the action that caused the incident.

Action Encrypted A managed user tried to copy or move a sensitive file using a supported
channel and the file was automatically encrypted.

Action Encrypted Blocked A user action was blocked and a file was not encrypted either because
an unmanaged user attempted to copy or move it using a supported
channel, or because a managed user attempted to copy or move the file
using an unsupported channel.

Content Removed The content in violation was removed.

No Remediation No incident remediation has occurred for this incident.

None No action was taken regarding the violation that caused the incident.

Protect File Copied The file in violation was copied to another location.

Protect File Quarantined The file in violation was quarantined to another location.

User Notified The user was notified that a violation had occurred.
Chapter 57
Hiding incidents
This chapter includes the following topics:

■ About incident hiding

■ Hiding incidents

■ Unhiding hidden incidents

■ Preventing incidents from being hidden

■ Deleting hidden incidents

About incident hiding


Incident hiding lets you flag specified incidents as "hidden." Because these hidden incidents
are excluded from normal incident reporting, you can improve the reporting performance of
your Symantec Data Loss Prevention deployment by hiding any incidents that are no longer
relevant. The hidden incidents remain in the database; they are not moved to another table,
database, or other type of offline storage.
You can set filters on incident reports in the Enforce Server administration console to display
only hidden incidents or to display both hidden and non-hidden incidents. Using these reports,
you can flag one or more incidents as hidden by using the Hide/Unhide options that are
available when you select one or more incidents and click the Incident Actions button. The
Hide/Unhide options are:
■ Hide Incidents—Flags the selected incidents as hidden.
■ Unhide Incidents—Restores the selected incidents to the unhidden state.
■ Do Not Hide—Prevents the selected incidents from being hidden.
■ Allow Hiding—Allows the selected incidents to be hidden.
The hidden state of an incident displays in the incident snapshot screen in the Enforce Server
administration console. The History tab of the incident snapshot includes an entry for each
time the Do Not Hide or Allow Hiding flags are set for the incident.
See “Filtering reports” on page 1914.
Access to hiding functionality is controlled by roles. You can set the following user privileges
on a role to control access:
■ Hide Incidents—Grants permission for a user to hide incidents.
■ Unhide Incidents—Grants permission for a user to show hidden incidents.
■ Remediate Incidents—Grants permission for a user to set the Do Not Hide or Allow
Hiding flags.
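
The relationship between the Hide/Unhide options and the role privileges can be sketched in Python as follows. This is a hypothetical model for illustration, not product code: the Incident class and apply_action function are invented names, and the sketch assumes that Hide Incidents skips any incident that is flagged Do Not Hide.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Incident:
    incident_id: int
    hidden: bool = False
    do_not_hide: bool = False  # set by Do Not Hide, cleared by Allow Hiding


# Privilege that each Hide/Unhide option requires, as listed in the guide.
REQUIRED_PRIVILEGE = {
    "Hide Incidents": "Hide Incidents",
    "Unhide Incidents": "Unhide Incidents",
    "Do Not Hide": "Remediate Incidents",
    "Allow Hiding": "Remediate Incidents",
}


def apply_action(action: str, incidents: Iterable[Incident],
                 privileges: Set[str]) -> List[int]:
    """Apply a Hide/Unhide option and return the IDs of incidents that changed."""
    if REQUIRED_PRIVILEGE[action] not in privileges:
        raise PermissionError(f"role lacks the privilege required for {action!r}")
    changed = []
    for incident in incidents:
        before = (incident.hidden, incident.do_not_hide)
        if action == "Hide Incidents":
            # Assumption: incidents flagged Do Not Hide are skipped.
            if not incident.do_not_hide:
                incident.hidden = True
        elif action == "Unhide Incidents":
            incident.hidden = False
        elif action == "Do Not Hide":
            incident.do_not_hide = True
        elif action == "Allow Hiding":
            incident.do_not_hide = False
        if (incident.hidden, incident.do_not_hide) != before:
            changed.append(incident.incident_id)
    return changed
```

In this model, a role that holds only the Hide Incidents privilege can hide incidents but receives an error when it attempts to unhide them or to change the Do Not Hide flag.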
See “About role-based access control” on page 109.
See “Hiding incidents” on page 1959.
See “Unhiding hidden incidents” on page 1959.
See “Preventing incidents from being hidden” on page 1960.

Hiding incidents
To hide incidents
1 Open the Enforce Server administration console and navigate to an incident report.
2 Select the incidents you want to hide, either by selecting the incidents manually or by
setting filters or advanced filters to return the set of incidents that you want to hide.
3 Click the Incident Actions button and select Hide/Unhide > Hide Incidents.
The selected incidents are hidden.

Unhiding hidden incidents


To restore hidden incidents
1 Open the Enforce Server administration console and navigate to an incident report.
2 Select the Advanced Filters & Summarization link.
3 Click the Add filter button.
4 Select Is Hidden in the first drop-down list.
5 Select Show Hidden from the second drop-down list.
6 Select the incidents you want to unhide, either by selecting incidents manually or by setting
filters or advanced filters to return the set of incidents you want to unhide.
7 Click the Incident Actions button and select Hide/Unhide > Unhide Incidents.
The selected incidents are unhidden.

Preventing incidents from being hidden


You can prevent incidents from being hidden using either an incident report or an incident
snapshot.
To prevent incidents from being hidden using an incident report
1 Open the Enforce Server administration console and navigate to an incident report.
2 Select the incidents you want to prevent from being hidden. You can select incidents
manually or by setting filters or advanced filters to return the set of incidents you want to
prevent from being hidden.
3 Click the Incident Actions button and select Hide/Unhide > Do Not Hide.
The selected incidents are prevented from being hidden.

Note: You can allow incidents to be hidden that you have prevented from being hidden
by selecting the incidents and then selecting Hide/Unhide > Allow Hiding from the
Incident Actions button.

To prevent an incident from being hidden using the incident snapshot


1 Open the Enforce Server administration console and navigate to an incident report.
2 Click on an incident to open the incident snapshot.
3 On the Key Info tab, in the Incident Details section, click Do Not Hide.

Note: You can allow an incident to be hidden that you have prevented from being hidden
by opening the incident snapshot and then clicking Allow Hiding in the Incident Details
section.

Deleting hidden incidents


To delete hidden incidents
1 Open the Enforce Server administration console and navigate to an incident report.
2 Click the Advanced Filters & Summarization link.
3 Click Add filter.
4 Select Is Hidden in the first drop-down list.
5 Select Show Hidden from the second drop-down list.
6 Select the incidents you want to delete. You can select the incidents manually or you can
set filters or advanced filters that return the set of incidents you want to delete.
7 Click the Incident Actions button and select Delete incidents.
8 Select one of the following delete options:

Delete incident          Permanently deletes the incident(s) and all associated data (for example,
completely               any emails and attachments). Note that you cannot recover the incidents
                         that have been deleted.

Retain incident, but     Retains the actual incident(s) but discards the Symantec Data Loss
delete message data      Prevention copy of the data that triggered the incident(s). You have the
                         option of deleting only certain parts of the associated data. The rest of the
                         data is preserved.

Delete Original          Deletes the message content (for example, the email message or HTML
Message                  post). This option applies only to Network incidents.

Delete                   This option refers to files (for Endpoint and Discover incidents) or email or
Attachments/Files        posting attachments (for Network incidents). The options are as follows:

                         ■ All: Deletes all attachments. For example, choose this option to delete
                           files (for Endpoint and Discover incidents) or email attachments (for
                           Network incidents).
                         ■ Attachments with no violations: Deletes only those attachments in
                           which Symantec Data Loss Prevention found no matches. For example,
                           choose this option when you have incidents with individual files taken
                           from a compressed file (Endpoint and Discover incidents) or several
                           email attachments (Network incidents).

9 Click the Delete button.


Chapter 58
Working with incident data
This chapter includes the following topics:

■ About incident status attributes

■ Configuring status attributes and values

■ Configuring status groups

■ Export web archive

■ Export web archive—Create Archive

■ Export web archive—All Recent Events

■ About custom attributes

■ About using custom attributes

■ How custom attributes are populated

■ Configuring custom attributes

■ Setting custom attributes

■ Setting the values of custom attributes manually

About incident status attributes


Incident status attributes are specified and configured from the Attributes screen (System >
Incident Data > Attributes).
Any status attribute listed on this screen can be assigned to any given incident by selecting it
from the incident snapshot Status drop-down menu.
The system attributes page contains the following attributes to assist in incident remediation:
■ Status Values
The Status Values section lists the current incident status attributes that can be assigned
to a given incident. Use this section to create new status attributes, modify them, and
change the order that each attribute appears in drop-down menus.
See “Configuring status attributes and values” on page 1964.
■ Status Groups
The Status Groups section lists the current incident status groups and their composition.
Use this section to create new status groups, modify them, and change the order in which
the groups appear in drop-down menus.
See “Configuring status groups” on page 1965.
■ Custom Attributes on the Custom Attributes tab
The Custom Attributes tab provides a list of all of the currently defined custom incident
attributes. Custom attributes provide information about the incident or information associated
with the incident, for example, the email address of the person who caused the incident, that
person's manager, why the incident was dismissed, and so on. Use this tab to add, configure,
delete, and order custom incident attributes.
See “About custom attributes” on page 1968.
The process for handling incidents goes through several stages from discovery to resolution.
Each stage is identified by a different status attribute such as "New," "Investigation," "Escalated,"
and "Resolved." This lets you track the progress of the incident through the workflow, and filter
lists and reports by incident status.
The solution pack you installed when you installed Symantec Data Loss Prevention provides
an initial default set of status attributes and status attribute groups. You can create new status
attributes, or modify existing ones. The status attribute values and status groups you use
should be based on the workflow your organization uses to process incidents. For example,
you might assign all new incidents a status of "New." Later, you might change the status to
"Assigned," "Investigation," or "Escalated." Eventually, most incidents will be marked as
"Resolved" or as "Dismissed."
For list and report filtering, you can also create status groups.
Based on the preferences of your organization and the commonly used terminology in your
industry, you can:
■ Customize the names of the status attributes and add new status attributes.
■ Customize the names of the status groups and add new status groups.
■ Set the order in which status attributes appear on the Status drop-down list of an incident.
■ Specify the default status attribute that is automatically assigned to new incidents.
See “Configuring status attributes and values” on page 1964.
See “About incident reports” on page 1902.
See “About incident remediation” on page 1841.
See “About custom attributes” on page 1968.

Configuring status attributes and values


As incidents are processed from discovery to resolution, each stage can be marked with a
different status. The status lets you track the progress of the incident through your workflow.
Based on the preferences of your organization and the commonly used terminology in your
industry, you can define the different statuses that you want to use for workflow tracking.
The Status Values section lists the available incident status attributes that can be assigned
to a given incident. The order in which status attributes appear in this list determines the order
they appear in drop-down menus used to set the status of an incident. You can perform the
following actions from the Status Values section:

Action                                     Procedure

Create a new incident status attribute.    Click the Add button.

Delete an incident status attribute.       Click the attribute's red X and then confirm your decision.

Change an incident status attribute.       Click on the attribute you want to change, enter a new name,
                                           and click Save.

                                           To change the name of an existing status, click on the pencil
                                           icon for that status, enter the new name, and click Save.

Make an incident status attribute the      Click [set as default] for an attribute to make it the default
default.                                   status for all new incidents.

Change an incident status attribute's      ■ Click [up] to move an attribute up in the order.
order in drop-down menus.                  ■ Click [down] to move an attribute down in the order.
To create a new incident status attribute


1 Go to the Attributes screen (System > Incident Data > Attributes).
Click the Status tab.
2 Click the Add button in the Status Values section.
3 Enter a name for the new status attribute.
4 Click Save.
See “Configuring status groups” on page 1965.
See “About incident status attributes” on page 1962.

Configuring status groups


Incident status attributes can be assigned to status groups that match the workflow of your
organization. For example, an Open status group might include the status attributes of New,
Investigation, and Escalated. You can then filter incident lists and reports based on their
status group. For example, you can list all incidents with status attributes that belong to the
Open status group.
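
As a rough illustration of how a status group acts as a filter, the following Python sketch maps group names to sets of status attributes and selects the incidents whose status belongs to a group. The Open group comes from the example above; the Closed group, the incident IDs, and the function name are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

STATUS_GROUPS: Dict[str, Set[str]] = {
    "Open": {"New", "Investigation", "Escalated"},  # example from this guide
    "Closed": {"Resolved", "Dismissed"},            # hypothetical second group
}


def filter_by_status_group(incidents: Iterable[Tuple[int, str]],
                           group: str) -> List[int]:
    """Return the IDs of incidents whose status belongs to the named group."""
    members = STATUS_GROUPS[group]
    return [incident_id for incident_id, status in incidents if status in members]
```

Because an incident carries only a status attribute, changing the composition of a group changes which incidents the group filter returns without modifying the incidents themselves.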
System > Incident Data > Attributes brings you to Status Groups.
For your convenience, you can group incident statuses to match the workflow of your
organization. You use Status Groups to add or modify the name of a status group, and specify
which status values to include in the group.
The Status Groups section lists the available incident status groups that can be used to filter
incidents. For each group, the status attributes included in the group are listed. You can perform
the following actions from the Status Groups section:

Action                                     Procedure

Create a new incident status group.        Click the Add Status Group button.

Delete an incident status group.           Click the group's red X and then confirm your decision.

Change the name or incident status         Click on the group you want to change. Click the pencil icon.
attributes of a group.                     Change the name, check or uncheck attributes, and click Save.

Change a status group's order in           ■ Click [up] to move a group up in the order.
drop-down menus.                           ■ Click [down] to move a group down in the order.

To define a new status group


1 Go to the Attributes screen (System > Incident Data > Attributes).
Click the Status tab.
2 Click the Add Status Group button in the Status Groups section.
3 Enter a name for the new status group.
4 Click the check boxes for the status attributes that you want to include in this group.
Status attributes are defined with the Add button in the Status Values section.
See “Configuring status attributes and values” on page 1964.
5 Click Save.
See “Configuring status attributes and values” on page 1964.
See “About incident status attributes” on page 1962.

Export web archive


Use this screen to save an incident list report as an archive of HTML pages. An archive allows
personnel without direct access to Symantec Data Loss Prevention to study incident data,
drilling down into individual incidents as needed.
When you export incidents as a Web Archive, the archive is placed in the directory
\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\archive\webarchive.

Note: You cannot archive summary reports or dashboards.

When exporting incidents, please note the following considerations:


■ An archive cannot be summarized like a normal report.
■ An archive contains no filters, so it may be difficult to locate a specific incident in an archive
containing a large number of incidents.
■ Exporting an archive of incidents does not remove the incidents from the administration
console.
■ You can export only one archive at a time.
Export Web Archive is a user privilege that must be assigned to a role. You can export web
archives only if your role provides access to this feature. Since role access also determines
what information is contained in incident reports, it also applies to archiving those incident
reports. The information that is contained in the archive you create is the same information
contained in the original incident report.
See “About configuring roles and users” on page 110.
The Export web archive screen is divided into two sections:
See “Export web archive—Create Archive” on page 1966.
See “Export web archive—All Recent Events” on page 1967.

Export web archive—Create Archive


In the Create Archive section, complete the following information:

Field Description

Archive Name Specify a name for the archive you are creating
using normal Windows naming conventions.

Report to Export From the drop-down list, select the report that you
want to archive. Any reports you created are
available along with default report options.
The Network options are as follows:

■ Incidents - Week, Current—Network incidents
from the current week.
■ Incidents - All—All network incidents.
■ Incidents - New—Network incidents with status
of New.
The Endpoint options are as follows:

■ Incidents - Week, Current—Endpoint incidents
from the current week.
■ Incidents - All—All endpoint incidents.
■ Incidents - New—Only endpoint incidents with
status of New.
The Discover options are as follows:

■ Incidents - Last Scan—Discover incidents from
the last completed scan. (Incidents from a
currently active scan are not included.)
■ Incidents - Scan in Process—Discover
incidents from the current scan.
■ Incidents - All Scans—All Discover incidents.
■ Incidents - New—Discover incidents with status
of New.

After you complete the fields, click Create to compile the archive.
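
If you generate archive names in a script before you enter them, you can check them against the general Windows file naming rules first. The following Python sketch encodes those general rules (reserved characters, reserved device names, and no trailing space or period). The function name is hypothetical, and whether the Enforce Server applies exactly these checks is an assumption.

```python
import re

# Device names that Windows reserves regardless of extension.
_RESERVED = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}
# Characters that Windows does not allow in file names, plus control characters.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def is_valid_windows_name(name: str) -> bool:
    """Return True if the name follows general Windows file naming rules."""
    if not name or _INVALID_CHARS.search(name):
        return False
    if name[-1] in " .":
        return False  # names cannot end with a space or a period
    stem = name.split(".", 1)[0].upper()
    return stem not in _RESERVED
```

For example, a name such as Q3-network-incidents passes these checks, while weekly:report fails because it contains a colon.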
See “Export web archive” on page 1966.

Export web archive—All Recent Events


The All Recent Events section displays a list of events related to this archive. (The list appears
only after you click Create to create the archive.) Event entries show the following information:
■ The event type (Error, Warning, or System Information).
■ The event date and time
■ A brief description of the event
To see the details of any event, click on the event entry in the list. To see the full Events Report
for this archive, click show all.
See “Export web archive” on page 1966.

About custom attributes


"Custom attributes" are incident data fields that provide a way to capture and store supplemental
incident information. The additional data that is contained in custom attributes can be used to:
■ Drive workflow.
■ Execute incident response actions.
■ Supply report metrics.
■ Enable incident response teams to act faster on incidents.
■ Enable increased remediation and report automation.
You create the custom attributes that you need for these purposes. Custom attributes provide
information about an incident or information associated with an incident, for example, the email
address of the person who caused the incident, that person's manager, why the incident was
dismissed, and so on.
The Custom Attributes tab of the Attributes screen (System > Incident Data > Attributes)
is used for working with custom attributes. The Attributes screen contains the following tabs:
■ Status. The Status tab provides a list of all of the currently defined incident status attributes
and status attribute groups. Use this tab to add, configure, delete, and order incident status
attributes and incident status groups.
See “About incident status attributes” on page 1962.
■ Custom Attributes. The Custom Attributes tab provides a list of all of the currently defined
custom incident attributes. Use this tab to add, configure, delete, and order custom incident
attributes.
The solution pack you loaded when you installed Symantec Data Loss Prevention provides
an initial default set of custom attributes. The Custom Attributes tab provides a list of all of the
currently defined custom attributes that may be applied to any incident. This tab is for creating,
modifying, and deleting custom attributes for your installation as a whole. Applying any of these
custom attributes, or attribute values, to an individual incident is done from the incident snapshot,
or by using a lookup plug-in.
On the Custom Attributes tab, you can perform the following functions:

Action                                     Procedure

Create a new custom attribute.             Click the Add button.

Delete a custom attribute.                 Click the attribute's red "X" and then confirm your decision.

                                           Note that you cannot delete a custom attribute that is currently
                                           assigned to one or more incidents. You must assign a different
                                           attribute to the affected incident(s) before you can delete the
                                           custom attribute successfully.

Change the name, email status, or          Click on the attribute you want to change, change its
attribute group of an attribute.           parameters, and click Save.

Change the attribute's order in            ■ Click [up] to move an attribute up in the order.
drop-down menus.                           ■ Click [down] to move an attribute down in the order.

Reload lookup plug-ins.                    Click Reload Lookup Plug-ins to reload any custom attribute
                                           plug-ins that have been unloaded by the system.

                                           Reloading lookup plug-ins affects all incidents. You may need
                                           to reload lookup plug-ins if any of the following are true:

                                           ■ A plug-in was problematic and the system unloaded it, but
                                             now the problem is fixed.
                                           ■ The network was down or disconnected for some reason,
                                             but it is functioning properly now.
                                           ■ A plug-in stores data in a cache, and you want to update
                                             the cache manually.

See “About incident status attributes” on page 1962.


See “Configuring custom attributes” on page 1970.
See “Setting the values of custom attributes manually” on page 1972.

About using custom attributes


When an incident is created, the Enforce Server retrieves data regarding that incident. Some
of that data is in the form of "attributes." See the Symantec Data Loss Prevention Administration
Guide for more information about incident attributes.
"Custom attributes" are a particular kind of attribute that is used to capture and store
supplemental data that is related to the incident, such as the name of a relevant manager
or department. You create the custom attributes that you need.
The additional data that is contained in custom attributes can be used for:
■ Enabling a workflow
■ Executing incident response actions
■ Including in report metrics


■ Enabling incident response teams to act faster on incidents
■ Enabling increased remediation and report automation

How custom attributes are populated


For each incident, custom attributes can be populated (their values can be set in the incident
data) in the following ways:
■ Automatically when the incident is detected by means of a lookup plug-in, as described in
this guide
■ Automatically when the incident is detected by means of an automated response rule
■ Automatically when a user executes a Smart Response Rule
■ Manually (through data entry) by specific users after detection
Custom attributes can also be re-populated automatically by clicking on the Lookup option in
the Attribute section of the Incident Snapshot screen. This action replaces the existing values
that are stored in the custom attribute fields with the values returned by the new lookup.

Note: If the new lookup returns null or empty values for any custom attribute fields, those empty
values overwrite the existing values.
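
The overwrite behavior of a repeated lookup can be sketched in Python as a dictionary update. This is a simplified model, not the product's implementation: the function name and attribute names are hypothetical, and the sketch assumes that attributes the lookup does not return at all keep their stored values.

```python
from typing import Dict, Optional

Attributes = Dict[str, Optional[str]]


def apply_lookup(stored: Attributes, lookup_result: Attributes) -> Attributes:
    """Return the custom attribute values after a repeated lookup.

    Every field that the lookup returns replaces the stored value,
    even when the returned value is None or an empty string.
    """
    updated = dict(stored)          # leave the caller's dictionary untouched
    updated.update(lookup_result)   # returned values always win, including None
    return updated
```

In this model, a stored Department value of "Finance" is lost if the new lookup returns None for Department, which is why you might confirm that the external data source is reachable and complete before you run a new lookup.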

Configuring custom attributes


Use the Configure Custom Attribute screen to add or modify a custom attribute.
Custom attributes can be grouped into attribute groups, similar to how statuses are grouped
into status groups, to organize the information in a useful way. Examples of common attribute
groups include Employee Information, Manager Information, and Remediation Information.
All custom attributes are available for all incidents.
To create custom attributes and add them to a group
1 On the Enforce Server, click System > Incident Data > Attributes > Custom Attributes.
Note that a number of custom attributes were defined and loaded for you by the Solution
Pack that you selected during installation. All existing custom attributes are listed in the
Custom Attributes window.
2 To create a new custom attribute, click the Add option.
3 Type a name for the custom attribute in the Name box. If appropriate, check the Is Email
Address box.
The name you give to a custom attribute does not matter. But a custom attribute you
create must be structured the same as the corresponding external data source. For
example, suppose an external source stores department information as separate
geographic location and department name. In this case, you must create corresponding
location and department name custom attributes. You cannot create a single department
ID custom attribute combining both the location and the department name.
4 Select an attribute group from the Attribute Group drop-down list. If necessary, create
a new attribute group. Select Create New Attribute Group from the drop-down list, and
type the new group name in the text box that appears.
5 Click Save.
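
The requirement in step 3 that custom attributes mirror the structure of the external data source can be illustrated with a short Python sketch. The record fields, attribute names, and function name below are all hypothetical; the point is only that each external field maps to its own custom attribute rather than to a combined one.

```python
from typing import Dict

# Hypothetical record from an external source that stores the geographic
# location and the department name as separate fields.
external_record = {"location": "Berlin", "department_name": "Finance"}

# Hypothetical mapping: one custom attribute per external field.
FIELD_TO_ATTRIBUTE = {
    "location": "Department Location",
    "department_name": "Department Name",
}


def to_custom_attributes(record: Dict[str, str]) -> Dict[str, str]:
    """Map each external field to its corresponding custom attribute."""
    missing = [field for field in FIELD_TO_ATTRIBUTE if field not in record]
    if missing:
        raise KeyError(f"external record is missing fields: {missing}")
    return {attr: record[field] for field, attr in FIELD_TO_ATTRIBUTE.items()}
```

A single combined attribute, such as a department ID built from both fields, would have no matching field in this external record and so could not be populated from it.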
See “Configuring custom attributes” on page 1970.
See “About incident status attributes” on page 1962.
See “Configuring status groups” on page 1965.
See “Configuring status attributes and values” on page 1964.

Setting custom attributes


Once you define your custom attributes, they become available to every incident. Each incident
receives its own set of custom attributes (though some name-value pairs may be empty
depending on circumstances). The custom attribute values for an incident can be populated
and changed independently of other incidents.
You can edit the custom attribute values if you have been assigned to a role that includes edit
access for custom attributes. If you want to update a group of incidents, you can select those
incidents on the incident list page. You can then select the Set Attributes command from the
Incident Actions menu. You can select Lookup Attributes to look up the values of custom
attributes. Note that the Set Attributes command and Attributes section on the Incident
Snapshot page are available only if at least one custom attribute is defined.
To set custom attributes for incidents
1 On the incident list page, select the incident or incidents you want to set custom attributes
for, then click Incident Actions > Set Attributes.
The Set Incident Attributes page appears.
2 Select the custom attributes you want to set for the incident or incidents.
3 Click Save.
4 Generate a new incident, or view an existing incident, and verify that it contains the new
custom attribute.
See “Configuring custom attributes” on page 1970.


See “About incident status attributes” on page 1962.
See “Configuring status attributes and values” on page 1964.

Setting the values of custom attributes manually


You can manually specify incident remediation status or workflow progress with values in
custom attributes.

Note: To auto-populate custom attribute values, use one or more lookup plug-ins. See “About
lookup plug-ins” on page 1986.

To set the value of custom attributes


1 Display an incident snapshot.
2 Click the Edit option in the Attributes section of the incident snapshot.
3 To set a value for a custom attribute, enter the value in the appropriate attributes field.
4 When you are finished setting values, click Save.
Chapter 59
Working with user risk
This chapter includes the following topics:

■ About user risk

■ About user data sources

■ About identifying users in web incidents

■ Viewing the user list

■ Viewing user details

■ Working with the user risk summary

About user risk


The user risk summary gives you insight into the behavior of specific individuals in your
organization by associating users with web, email, and endpoint incidents. This information
helps you focus your data loss prevention efforts on those users posing the highest risk to the
security of your data.
Table 59-1 provides an overview of the steps for creating and working with user risk
summary reports.

Table 59-1 User Risk Summary workflow

Step Action Description

1 Create custom user attributes You can create custom attributes for filtering and working with user
risk summary reports. For example, you can create an attribute named
Employment Status to track the employment status of each of your
users. You can then import that information in a file that is exported
from your enterprise resource planning system, such as SAP.

See “Defining custom attributes for user data” on page 1976.



2 Import user data You can import user data from an Active Directory connection or from
a CSV file. Incidents are associated with specific users by email
address and logon credentials. You can also upload files with your
custom attributes, such as information from your enterprise resource
planning system. Symantec Data Loss Prevention provides a CSV
template file that you can use to format any data you want to upload.

See “Bringing in user data” on page 1977.

3 Configure IP address to user name Symantec Data Loss Prevention can resolve user names from IPv4
resolution addresses in HTTP/S and FTP incidents. The domain controller agent
queries Windows Events in the Microsoft Active Directory Security
Event Log of the domain controller. Symantec Data Loss Prevention
associates these Windows Events with user data in your database.

See “About identifying users in web incidents ” on page 1981.

4 View the User List The User List is a list of all users in your system, including their email
address, domain, and logon name.

See “Viewing the user list” on page 1983.

You can view details for specific users in the user snapshot.

See “Viewing user details” on page 1984.

5 View the User Risk Summary The User Risk Summary displays your users and their associated
Endpoint and Network incidents. Use the User Risk Summary to
drill into your user-centric incident data to help you find the
highest-risk users. You can sort and filter this list by policies, custom
attributes, incident status, incident severity, user name identified by
IP address, number of incidents, date, incident type, and user name.

See “Working with the user risk summary” on page 1984.

6 Export user risk summary or user You can export data from the user risk summary and user snapshots
snapshot data. to a CSV file.

See “Working with the user risk summary” on page 1984.

See “Viewing user details” on page 1984.
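
The idea behind the User Risk Summary, associating incidents with users and ranking users by incident count, can be sketched in Python as follows. This is a conceptual illustration only: the class names, field names, and matching logic are hypothetical, and the sketch assumes that Network incidents are matched by email address and Endpoint incidents by logon name.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class User:
    name: str
    email: Optional[str] = None
    login: Optional[str] = None  # DOMAIN\LOGIN format


@dataclass(frozen=True)
class Incident:
    kind: str      # "Network" or "Endpoint"
    identity: str  # email address (Network) or logon name (Endpoint)


def rank_users(users: Iterable[User],
               incidents: Iterable[Incident]) -> List[Tuple[str, int]]:
    """Count incidents per user and sort by count, highest first."""
    users = list(users)
    by_email = {u.email.lower(): u.name for u in users if u.email}
    by_login = {u.login.lower(): u.name for u in users if u.login}
    counts: Counter = Counter()
    for incident in incidents:
        index = by_email if incident.kind == "Network" else by_login
        name = index.get(incident.identity.lower())
        if name is not None:  # incidents with no matching user are ignored
            counts[name] += 1
    # Sort by descending count, then by name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

A user who appears only in the email index cannot be matched to Endpoint incidents in this model, which mirrors the reason the guide asks for both an email address and logon information.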

Using the information that is provided in the user risk summary, you can see who the high-risk
users are and determine the appropriate course of action to take. Such actions might include:
■ Determining whether or not a user poses an active threat to your data security.
■ Applying additional policies to monitor a user's behavior more closely.
■ Applying additional response rules to block actions or send alerts.
■ Escalating a user's behavior to their manager or other responsible party.


To work with user risk data, a Symantec Data Loss Prevention user must have the User
Reporting privilege. Be aware that users with this privilege are automatically able to view and
access all incidents and incident types in Symantec Data Loss Prevention. The user risk
summary is intended for use by high-level remediators or information security officers. This
privilege is not part of any predefined role.
See “Configuring roles” on page 114.

About user data sources


You can bring in data about your users in CSV file format or through an Active Directory
connection.
User data is information about people in your organization who may have access to data that
you want to keep secure. To track user risk, you must provide the user's first and last name,
their email address (to track Network incidents) and logon information (to track Endpoint
incidents). You can also provide additional standard directory attribute information, such as
the user's address and phone number, as well as custom attributes such as the user's
employment status.
Table 59-2 lists the required and optional standard user data attributes:

Table 59-2 Standard user data

Attribute            Required or optional                          Description

FIRST_NAME           Required                                      The user's given name.
LAST_NAME            Required                                      The user's surname.
EMAIL                Required if no logon information is included  The user's email address.
LOGIN                Required if no email address is included      The user's logon information, in DOMAIN\LOGIN format.
TELEPHONE_NUMBER     Optional                                      The user's telephone number.
EMPLOYEE_ID          Optional                                      The user's employee identification number.
TITLE                Optional                                      The user's job title.
DEPARTMENT           Optional                                      The user's job department.
STREET_ADDRESS       Optional                                      The user's street address.
STATE_OR_PROVINCE    Optional                                      The state or province in which the user resides.
COUNTRY              Optional                                      The country in which the user resides.
POSTAL_CODE          Optional                                      The postal code for the user's address.

See “Defining custom attributes for user data” on page 1976.


See “Bringing in user data” on page 1977.
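
The required-attribute rules for standard user data can be checked before a file is imported. The following Python sketch is illustrative only and is not part of Symantec Data Loss Prevention. The function name validate_user_record and the exact reading of the DOMAIN\LOGIN format (non-empty text on each side of a single backslash) are assumptions.

```python
import re

REQUIRED_NAMES = ("FIRST_NAME", "LAST_NAME")
# Assumption: DOMAIN\LOGIN means non-empty text on both sides of one backslash.
LOGIN_PATTERN = re.compile(r"^[^\\]+\\[^\\]+$")


def validate_user_record(record):
    """Return a list of problems found in one user record (empty if valid)."""
    problems = []
    for name in REQUIRED_NAMES:
        if not record.get(name, "").strip():
            problems.append("missing " + name)
    email = record.get("EMAIL", "").strip()
    login = record.get("LOGIN", "").strip()
    # EMAIL is required only if LOGIN is absent, and the reverse.
    if not email and not login:
        problems.append("need EMAIL or LOGIN")
    if login and not LOGIN_PATTERN.match(login):
        problems.append("LOGIN is not in DOMAIN\\LOGIN format")
    return problems
```

A record that supplies both names and either an email address or a well-formed logon produces an empty list. Any other record produces one message per problem.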

Defining custom attributes for user data


You can create custom attributes to improve relevance while filtering and working with user
risk summary reports. Useful custom attributes might include employment status, the name
of the user's manager, the user's job function, and other information that might be stored in
your enterprise resource planning system or additional user data source.
You must create custom attributes before entering any user data. Each custom attribute is
assigned a unique identification number as it is created. You must add these custom attribute
identification numbers to your data file before you import it to Symantec Data Loss Prevention.
See “Adding a file-based user data source” on page 1977.
To define custom attributes for user data
1 In the Enforce Server administration console, go to System > Users > Attributes.
2 Click Add. The User Attribute dialog box appears.
3 Enter the custom attribute in the Name field. The custom attribute can be a maximum of
60 characters.
4 Click Submit.
To view and edit user custom attributes
1 In the Enforce Server administration console, go to System > Users > Attributes.
2 The custom attributes appear in the User Custom Attributes list. You can take these
actions:
■ To filter the User Custom Attributes list, click Filters, then use the text fields for ID
or Attribute Name to enter a filter value.
■ To edit a custom attribute, click the attribute name or click the edit icon in the Actions
column, then edit the attribute in the User Attribute dialog box.
■ To delete a custom attribute, click the delete icon in the Actions column.

Bringing in user data


You can bring in user data from a file or an Active Directory connection.
See “Adding a file-based user data source” on page 1977.
See “Adding an Active Directory user data source” on page 1978.
After you have added your user data sources, you can schedule Symantec Data Loss Prevention
to regularly import data from those data sources to ensure that your user data is always up to
date. You can also import a user data source manually.
See “Importing a user data source” on page 1980.

Adding a file-based user data source


You can bring in user data from a .csv file. For your convenience, Symantec Data Loss
Prevention provides an annotated .csv template that you can use to ensure that your data is
formatted correctly. The template includes all the standard user attributes, as well as formatting
examples and instructions for adding custom attributes. The template also includes headers
for any custom attributes that you have defined at the time you download the template.
To create a user data file from a template
1 In the Enforce Server administration console, go to System > Users > Data Sources.
2 On the Data Sources page, click Download CSV Template on the right-hand side of the
page.
3 Open the template file and provide the information for the standard user-data attributes.
See “About user data sources” on page 1975.
4 The template file includes column headers for any custom attributes you have defined.
To add custom attributes manually, create a new column for each attribute, then populate
the rows as appropriate.
You must enter the column headers in this format: ID[Attribute Name]. For example,
1[Employment Status].
See “Defining custom attributes for user data” on page 1976.
5 Save the file (in .csv format) to a location on your Enforce Server.
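
If you generate the user data file from another system rather than editing the template by hand, the custom attribute headers must still follow the ID[Attribute Name] format. The following Python sketch shows one way to produce such headers. The function names, the sample user, and the choice of standard columns are assumptions for illustration. Treat the downloaded template as the authoritative layout.

```python
import csv
import io

# Assumption: a subset of the standard attributes; the real template has more.
STANDARD_HEADERS = ["FIRST_NAME", "LAST_NAME", "EMAIL", "LOGIN"]


def custom_header(attribute_id, attribute_name):
    """Format a custom attribute column header as ID[Attribute Name]."""
    return "{0}[{1}]".format(attribute_id, attribute_name)


def build_user_csv(users, custom_attributes):
    """Return CSV text; custom_attributes maps attribute ID to attribute name."""
    custom_cols = [custom_header(i, n) for i, n in sorted(custom_attributes.items())]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=STANDARD_HEADERS + custom_cols, lineterminator="\n"
    )
    writer.writeheader()
    for user in users:
        writer.writerow(user)  # columns missing from a user are left empty
    return out.getvalue()
```

For example, custom_header(1, "Employment Status") returns 1[Employment Status], which matches the header format shown in step 4.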
To add a file-based user data source
1 In the Enforce Server administration console, go to System > Users > Data Sources.
2 On the Data Source Management page, click Add > CSV User Source. The Add CSV
User Source dialog box appears.
3 In the Add CSV User Source dialog box, enter the following information:
■ Name: Specify a name for the data source.


■ File Path: Specify the path to the user data file. This file must be on the Enforce Server.
■ Delimited by: Specify the delimiter for the file. Valid delimiters are comma, pipe,
semicolon, and tab.
■ Encoded by: Specify the character encoding format.
■ Error Threshold Percentage: Specify the percentage of user records that can be
invalid before the file is rejected and the import process fails. Records with duplicate
email addresses or logons count against the error threshold.

4 Click Submit.
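
The effect of the Error Threshold Percentage can be estimated before an import. The following Python sketch is a rough illustration, not the product's implementation. It assumes that the import fails when the share of invalid and duplicate records is strictly greater than the threshold. The product may instead fail at the threshold itself.

```python
def import_would_fail(total_records, invalid_records, duplicate_records,
                      threshold_percent):
    """Return True if the share of bad records exceeds the threshold.

    Assumption: rejection happens when the bad-record percentage is strictly
    greater than threshold_percent. Duplicate email addresses or logons count
    as bad records, as described for the Error Threshold Percentage field.
    """
    if total_records == 0:
        return False
    bad = invalid_records + duplicate_records
    return (100.0 * bad / total_records) > threshold_percent
```

With 200 records and a threshold of 5, six invalid records plus five duplicates (5.5 percent) exceed the threshold, while six invalid records plus four duplicates (5.0 percent) do not under this assumption.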

Adding an Active Directory user data source


You can use an existing Active Directory connection to bring in user data. To add custom
attributes for users that are added from an Active Directory source, create and import a user
data file that includes the users' first and last names, email or logon information, and the custom
attributes you want to use. Symantec Data Loss Prevention automatically associates the
file-based user data with the existing user records brought in from your Active Directory source.
Symantec Data Loss Prevention uses this Active Directory filter to retrieve user data (line
breaks added for readability):

(&
(objectClass=user)
(objectCategory=person)
(sAMAccountType=805306368)
(!
(|
(&
(sAMAccountType=805306368)
(sAMAccountName=-*)
)
(&
(sAMAccountType=805306368)
(sAMAccountName=_*)
)
)
)
)
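
Read as a whole, the filter selects normal user accounts (sAMAccountType 805306368) that are person objects, and excludes accounts whose sAMAccountName begins with a hyphen or an underscore. The following Python sketch restates that logic as a predicate over a plain dictionary so the intent is easier to follow. The dictionary layout and the function name are assumptions. The sketch does not query a directory.

```python
NORMAL_USER_ACCOUNT = 805306368  # 0x30000000, a normal user account

# The same filter on one line, as it would be passed to an LDAP client.
DEFAULT_FILTER = (
    "(&(objectClass=user)(objectCategory=person)(sAMAccountType=805306368)"
    "(!(|(&(sAMAccountType=805306368)(sAMAccountName=-*))"
    "(&(sAMAccountType=805306368)(sAMAccountName=_*)))))"
)


def matches_default_filter(entry):
    """Mirror the default filter for an entry given as a plain dict."""
    name = entry.get("sAMAccountName", "")
    is_user = "user" in entry.get("objectClass", [])
    is_person = entry.get("objectCategory") == "person"
    is_normal_account = entry.get("sAMAccountType") == NORMAL_USER_ACCOUNT
    # The negated OR clause drops names that start with "-" or "_".
    excluded_prefix = name.startswith("-") or name.startswith("_")
    return is_user and is_person and is_normal_account and not excluded_prefix
```

Under this reading, an account named _svc_backup is skipped even if it is otherwise a normal user account.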

Your Active Directory credentials must have permission to access the following user attributes:

FIRST_NAME givenName
LAST_NAME sn
EMAIL mail
LOGIN_NAME sAMAccountName
TELEPHONE telephoneNumber
TITLE title
COUNTRY co
DEPARTMENT department
EMPLOYEE_ID employeeId
STREET_ADDRESS streetAddress
LOCALITY_NAME l
POSTAL_CODE postalCode
STATE_OR_PROVINCE st
OBJECT_DISINGUISHED_NAME distinguishedName

Your Active Directory credentials must also have permission to access the RootDSE record.
Symantec Data Loss Prevention reads these attributes from RootDSE:

namingContexts
defaultNamingContext
rootDomainNamingContext
configurationNamingContext
schemaNamingContext
isGlobalCatalogReady
highestCommittedUSN

See “Configuring directory server connections” on page 156.


See “Defining custom attributes for user data” on page 1976.
See “Adding a file-based user data source” on page 1977.
To add an Active Directory user data source
1 In the Enforce Server administration console, go to System > Users > Data Sources.
2 On the Data Source Management page, click Add > AD User Source. The Add AD
User Source dialog box appears.
3 In the Add AD User Source dialog box, enter the following information:
■ Name: Specify a name for the data source.
■ Directory Connection: Select an existing Active Directory connection.
■ Advanced Options > AD Custom Filter: Specify an optional filter for your Active
Directory user data source, such as a workgroup. For example:
(&(region=North America)(!(systemAccount=true)))

4 Click Submit.

Note: A best practice is that you should refer to directory connection objects with baseDNs in
the user section of your directory tree. For example: ou=Users,dc=corp,dc=company,dc=com.

Importing a user data source


After you have added your user data sources, you can schedule Symantec Data Loss Prevention
to regularly import data from those data sources to ensure that your user data is always up to
date. You can also import a user data source manually.
Records with duplicate logons or email addresses are excluded from user data source imports.
The number of records excluded from the import displays at the end of the import process,
and the duplicate information appears in the logs.
To view details for a user data source import, click the Status link.
To schedule import of a user data source
1 In the Enforce Server administration console, go to System > Users > Data Sources.
2 On the Data Source Management page, click the Schedule icon for your desired data
source.
3 Choose one of these options for scheduling:
■ Once: Specify a single day and time for user data import.
■ Daily: Specify a time for daily import of the user data source.
■ Weekly: Specify a day and time for weekly import of the user data source.
■ Monthly: Specify a day and time for monthly import of the user data source.

4 Click Submit.
To import a data source manually
1 In the Enforce Server administration console, go to System > Users > Data Sources.
2 On the Data Source Management page, select the data source you want to import.
3 Click Import.
To view data source import details


1 In the Enforce Server administration console, go to System > Users > Data Sources.
2 On the Data Source Management page, click the Status link for your desired data source.
The Import Details dialog box appears.
3 The Import Details dialog box displays the following information for all imports:
■ Name: The name of the imported data source.
■ Status: Done, Completed with Errors, Failed.
■ Queued at: The time that the data source import was entered in the import queue.
■ Started at: The start time of the data source import.
■ Completed at: The completion time of the data source import.
For successful imports and imports completed with errors, the Import Details dialog box
displays the following additional information:
■ Added records: The number of added user records.
■ Updated records: The number of updated user records.
■ Skipped errored records: The number of records skipped because of errors in the
user data source.
■ Skipped duplicate records: The number of records skipped because of duplicate
user data.
For failed imports, the Import Details dialog box displays the following additional
information:
■ Last successful import: The date and time of the last successful import of the user
data source.
■ Failure reason: The reason for the import failure.

About identifying users in web incidents


The IP address in a Network Prevent for Web incident can be used to determine the user name
that is associated with that incident. Using the domain controller agent, Symantec Data Loss
Prevention collects Windows Events from the Security event log on the Microsoft Active
Directory domain controller server. These events are stored in the Symantec Data Loss
Prevention database, where a look-up service can resolve the IP address to its associated
user name. You don't need to cross-check incidents with domain controller logs to determine
the actual user responsible for each incident. You can view specific user names that are
associated with incidents (rather than IP addresses) in the User Risk Summary report. See
“Working with the user risk summary” on page 1984.
User identification requires an Enforce Server, Network Prevent for Web, domain controller
servers, and an Active Directory domain controller. See the section "Installing the domain
controller Agent" in the Symantec Data Loss Prevention Installation Guide for complete
instructions on installing the domain controller Agent. It is available at the Symantec Support
Center at http://www.symantec.com/doc/DOC9257. After you install all of the required
components, you can enable User Identification by configuring a mapping schedule on the
User Identification page.

Note: Symantec Data Loss Prevention supports the use of multiple domain controllers.

Enabling user identification and configuring the mapping schedule


The domain controller agent queries Windows Events in the Microsoft Active Directory Security
Event Log of the domain controller. Symantec Data Loss Prevention associates these Windows
Events with user data in your database. The IPv4 address data from the domain controller
may not correspond precisely to a given user. If you have any doubt that the resolved username
is correct, verify that the user was logged in at the time of the incident before taking any incident
response actions.
The user identification lookup job on the Enforce Server checks the database for new events
from the domain controller every day at 4:00 A.M. by default.
Symantec Data Loss Prevention stores the user records received from the domain controller
agent in the Symantec Data Loss Prevention database. User records are purged every 3 days
by default.
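
The caution about IPv4 addresses not corresponding precisely to a user follows from how IP-to-user resolution works in general: an address is attributed to whoever most recently logged on from it. The following Python sketch illustrates that idea only. It is not the product's lookup service. The event tuple layout, the function name, and the simplification that retention is measured back from the incident time are all assumptions.

```python
from datetime import datetime, timedelta


def resolve_user(ip_address, incident_time, logon_events, retention_days=3):
    """Return the user of the latest logon from ip_address before the incident.

    logon_events is a list of (timestamp, ip_address, user_name) tuples.
    Assumption: events older than retention_days before the incident are
    treated as already purged and are ignored.
    """
    oldest = incident_time - timedelta(days=retention_days)
    candidates = [
        (ts, user)
        for ts, ip, user in logon_events
        if ip == ip_address and oldest <= ts <= incident_time
    ]
    if not candidates:
        return None  # no logon event on record for this address
    return max(candidates)[1]  # latest timestamp wins
```

If two users log on from the same address on the same day, the incident is attributed to whichever logon is the most recent one before the incident. This is why the resolved user name should be verified before any response action is taken.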
To set the Mapping Schedule and enable User Identification
1 Click Configure from the System > Incident Data > User Identification page.
2 Click Once, Daily, Weekly, or Monthly to schedule a mapping job. The default is No
Regular Schedule. Scheduling must be configured to enable mapping.
3 Click Save when you are done.
To set up data retention parameters
1 Go to the System > Incident Data > User Identification > Configure page.
2 The default time for the system to keep user login events is 3 days. If you want to change
this value, enter another value in the User data retention field.
3 Click Save when you are done.
To specify the domain controller warning schedule


1 Go to the System > Incident Data > User Identification > Configure page.
2 Specify the domain controller warning in days. This is the number of days since the last
connection of a domain controller. The default is 8 days.
3 Click Save when you are done.
If you want to discontinue use of User Identification, you need to stop the mapping job. If you
don't stop the mapping job, it continues to run, even if the domain controllers are in a suspended
state.
To stop scheduled mapping
1 Go to the System > Incident Data > User Identification > Configure page.
2 Check the box next to Stop mapping. Suspending mapping does not stop any jobs that
are in progress.
3 Click Save when you are done.

Checking the status of the domain controllers


After you have set a mapping schedule, you can go to the System > Incident Data > User
Identification page and check the status of your domain controllers. You can sort controllers by:
■ State: Active or Suspended
■ Domain controller name
■ Last connection time
■ Days since last connection
■ Warnings
■ Login timeout
You can suspend a domain controller by clicking the green Active button. You can activate
a suspended domain controller by clicking the red Suspended button.

Viewing the user list


The user list displays all users that you have entered in Symantec Data Loss Prevention. In
the user list, you can view the names, email addresses, and domain and logon information for
each user. You can sort the list by first or last name, and you can search the list by name, email
address, domain, or logon. Clicking on an individual user's name takes you to the user detail
view.
See “Viewing user details” on page 1984.


The user list does not display incident data, only user data.
To view the user list
1 In the Enforce Server administration console, go to Incidents > Users > User List.
2 To sort the user list by first or last name, click one of the sort icons in the appropriate
column.
3 To search the user list, enter your search term in the search field at the upper-right corner
of the list. You can search on the user's first and last name, logon, and email address.
Only one search term is handled at a time.

Viewing user details


The user snapshot shows all user information and incidents for a specific user. You reach the
user detail view by clicking a user's name on the user list. You can also export the user snapshot
to a CSV file.
See “Viewing the user list” on page 1983.
To view user details
1 In the Enforce Server administration console, go to Incidents > Users > User List.
2 Click the name of the user for whom you want to view details.
3 On the User page, you can view a list of incidents, as well as user information, standard
attributes, and custom attributes. For users identified by IP address, there is also data
about the last activity time.
4 To export the user snapshot to a CSV file, click Export.

Working with the user risk summary


The user risk summary displays all users who have incidents associated with them. You can
sort and filter the user risk summary to gain insight into the user risk in your organization. For
example, you can view incidents that are associated with specific policies, or with custom
attributes that you have entered, such as job function or employment status. If you want to
return to a particular view of the user risk summary, you can save the URL and bookmark it
in your web browser. You can also export data from the user risk summary to a CSV file.
To view the user risk summary
1 In the Enforce Server administration console, go to Incidents > Users > User Risk
Summary.
2 To sort the list, click one of the sort icons in one of the columns.
3 To filter the list, select your filter values using the options above the user risk summary
list:

Filter       Default value   Description

Policies     All             Select a policy or policies by expanding the policy group and
                             checking the appropriate box or boxes.

Attributes   None (0)        Enter up to two custom attributes to filter the list. Select the
                             attribute from the drop-down list, then specify an include or
                             exclude condition and enter your desired values. To add a second
                             attribute filter, click Add Attribute Filter.

Status       All             Filter the list by incident status.

Date         Last 7 Days     Filter the list by date or date range.

Type         All             Filter the list by incident type, such as Email/SMTP, Printer/Fax,
                             or HTTP.

Include      All             You can filter the list by incident severity. You must select at
                             least one severity level. You can also include or exclude user
                             names identified by IP address.

4 After you have selected your filter values, click Apply.


5 To save a particular filter configuration, click Get Link and copy the provided URL to your
web browser bookmarks.
6 To export data from the user risk summary to a CSV file, click Export. You can export
the current page or all pages in the user risk summary.
Chapter 60
Implementing lookup plug-ins
This chapter includes the following topics:

■ About lookup plug-ins

■ Implementing and testing lookup plug-ins

■ Configuring the CSV Lookup Plug-In

■ Configuring LDAP Lookup Plug-Ins

■ Configuring Script Lookup Plug-Ins

■ Configuring migrated Custom (Legacy) Lookup Plug-Ins

About lookup plug-ins


A lookup plug-in lets you connect the Enforce Server to an external system to retrieve
supplemental data related to an incident. The data is stored as attributes. Lookup plug-ins let
you add additional context to incidents to facilitate remediation workflow. For example, consider
an email message that triggers an incident. A lookup plug-in can be used to retrieve and display
the name and the email address of the sender's manager from a directory server based on
the email sender's address.
Lookup plug-ins use incident attributes and custom attributes in coordination with each other.
The system generates incident attributes when a policy rule is violated. You define custom
attributes for custom incident data. Continuing the example, on detection of the incident, the
system generates the incident attribute "sender-email" and populates it with the email address
of the sender. The lookup plug-in uses this key-value pair to look up the values for custom
attributes "Manager Name" and "Manager Email" from an LDAP server. The plug-in populates
the custom attributes and displays them in the Incident Snapshot.
See “About custom attributes” on page 1968.


See “About using custom attributes” on page 1969.
See “How custom attributes are populated” on page 1970.

Types of lookup plug-ins


Symantec Data Loss Prevention provides several types of lookup plug-ins, including CSV,
LDAP, Script, Data Insight, and Custom (Legacy). The following table describes each type of
lookup plug-in in more detail.
See “About lookup plug-ins” on page 1986.

Table 60-1 Types of lookup plug-ins

Type Description

CSV The CSV Lookup Plug-in lets you retrieve incident data from a comma-separated values (CSV)
file uploaded to the Enforce Server. You can configure one CSV Lookup Plug-in per Enforce Server
instance.

See “About the CSV Lookup Plug-In” on page 1988.

LDAP The LDAP Lookup Plug-in lets you retrieve incident data from a directory server, such as Microsoft
Active Directory, Oracle Directory Server, or IBM Tivoli. You can configure multiple instances of
the LDAP Lookup Plug-in.

See “About LDAP Lookup Plug-Ins” on page 1988.

Script The Script Lookup Plug-in lets you write a script to retrieve incident data from any external resource.
For example, you can use a Script Lookup Plug-in to retrieve incident data from external resources
such as proxy log files or DNS systems. You can configure multiple instances of the Script Lookup
Plug-in.

See “About Script Lookup Plug-Ins” on page 1988.

Data Insight The Data Insight Lookup Plug-in lets you retrieve incident data from Symantec Data Insight so
that you can locate and manage data at risk. You can configure one Data Insight Lookup Plug-in
per Enforce Server instance.

Custom (Legacy) The Custom (Legacy) Lookup Plug-in lets you use Java code to retrieve incident data from any
external resource.

See “About Custom (Legacy) Lookup Plug-Ins” on page 1989.


Note: As the name indicates, the Custom (Legacy) Lookup Plug-in is reserved for legacy Java
plug-ins. For new custom plug-in development, you must use one of the other types of lookup
plug-ins.

About the CSV Lookup Plug-In


The CSV Lookup Plug-In extracts data from a comma-separated values (CSV) file stored on
the Enforce Server. The plug-in uses data from the CSV file to populate custom attributes for
an incident at the time the incident is generated.
The CSV Lookup Plug-In receives a group of lookup parameters that contain data about an
incident from the Enforce Server. One or more of the lookup parameters in the group is mapped
to column heads in a CSV file. For example, the sender-email lookup parameter might be
mapped to the Email column in the CSV file. The value in the lookup parameter is used as a
key to find a matching value in the corresponding CSV column. When a match is found, the
CSV row that contains the matching value provides the data that is returned to the Enforce
Server. The Enforce Server uses the data in that row to populate the custom attributes for that
incident. For example, if the sender-email lookup parameter contains the value
[email protected], the plug-in searches the Email column for a row that contains
[email protected]. That row is then used to provide the data to populate the custom
attributes for the incident.
The CSV Lookup Plug-In uses an in-memory database to process large files.
See “Configuring the CSV Lookup Plug-In” on page 2006.
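
The key-to-column matching described above can be pictured with a short Python sketch. It illustrates the matching idea only and is not the plug-in's code. The function name, the sample column names, and the sample data are assumptions.

```python
import csv
import io


def csv_lookup(csv_text, key_column, key_value, delimiter=","):
    """Return the first row (as a dict) whose key_column equals key_value."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    for row in reader:
        if row.get(key_column) == key_value:
            return row  # this row supplies the custom attribute values
    return None  # no match, so no attributes are populated


# Hypothetical data file with an Email key column and two attribute columns.
SAMPLE = (
    "Email,Manager Name,Manager Email\n"
    "ana@example.com,Bo Chen,bo@example.com\n"
    "cy@example.com,Di Ruiz,di@example.com\n"
)
```

Here the value of the sender-email lookup parameter is the key_value, Email is the mapped column, and the remaining columns of the matching row populate the custom attributes.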

About LDAP Lookup Plug-Ins


The LDAP Lookup Plug-In pulls data from a live LDAP system (such as Microsoft Active
Directory, Oracle Directory Server, or IBM Tivoli). It then uses that data to populate custom
attributes for an incident at the time the incident is generated.
The LDAP Lookup Plug-In receives a group of lookup parameters that contain data about an
incident from the Enforce Server. These lookup parameters are then used in LDAP queries to
pull data out of an existing LDAP directory. For example, the value of the sender-email lookup
parameter might be compared to the values in the email attribute of the directory. If the
sender-email lookup parameter contains [email protected], a query can be
constructed to search for a record whose email attribute contains [email protected].
Data in the record that the search returns is inserted into the custom attributes for the incident.
See “Configuring LDAP Lookup Plug-Ins” on page 2015.
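
A query of this kind is an LDAP search filter built from the lookup parameter value. The following Python sketch shows how such a filter string can be assembled with the value escaped according to RFC 4515. The function names and the directory attribute name mail are assumptions. The attribute mapping syntax that the plug-in itself uses is described in the configuration topic.

```python
def escape_filter_value(value):
    """Escape the characters that RFC 4515 reserves in LDAP filter values."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def build_email_filter(sender_email, attribute="mail"):
    """Build an equality filter that matches one directory attribute."""
    return "({0}={1})".format(attribute, escape_filter_value(sender_email))
```

Escaping matters because incident data is not trusted input. An unescaped asterisk in a sender address would otherwise turn an exact match into a wildcard search.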

About Script Lookup Plug-Ins


You can write one or more Script Lookup Plug-ins to query data repositories for attribute values.
For example, you can write a script that queries a DNS server for information about a sender
that is involved in an incident. A Script Lookup Plug-In can use the output from such scripts
to populate custom attributes in incident records.
Unlike the CSV or LDAP Lookup Plug-ins, the Script Lookup Plug-In does not use in-line
attribute maps to specify how to look up parameter keys. Instead, you write this functionality
into each script as needed.
To implement a Script Lookup Plug-In, you can use any scripting language that reads standard
input (stdin) and writes standard output (stdout). The examples in the user interface and in
this documentation use Python version 2.6.
See “Configuring advanced plug-in properties” on page 2005.
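
The general shape of such a script can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a supported sample: it assumes that lookup parameters arrive on stdin as one key=value pair per line and that attribute values are returned on stdout in the same form. The lookup table, the attribute name Sender Host, and all function names are hypothetical. Confirm the actual input and output conventions in the Script Lookup Plug-In configuration topics before adapting it.

```python
import sys

# Hypothetical lookup table standing in for an external resource such as DNS.
HOSTS_BY_SENDER = {"ana@example.com": "ws-0142.corp.example.com"}


def parse_parameters(lines):
    """Parse key=value lines into a dict, skipping malformed lines."""
    params = {}
    for line in lines:
        key, sep, value = line.rstrip("\n").partition("=")
        if sep and key:
            params[key.strip()] = value.strip()
    return params


def lookup(params):
    """Return custom attribute values keyed by (hypothetical) attribute name."""
    host = HOSTS_BY_SENDER.get(params.get("sender-email", ""))
    return {"Sender Host": host} if host else {}


def format_output(attributes):
    """Render attributes as key=value lines in a stable order."""
    return ["{0}={1}".format(n, v) for n, v in sorted(attributes.items())]


def main(stdin=None, stdout=None):
    """Read parameters from stdin and write attribute values to stdout."""
    stdin = stdin or sys.stdin
    stdout = stdout or sys.stdout
    for line in format_output(lookup(parse_parameters(stdin))):
        stdout.write(line + "\n")
```

To run the sketch as a standalone script, call main() at the end of the file. Because the script decides for itself which parameter to use as a key, the mapping logic lives in lookup() rather than in an in-line attribute map.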

About the Data Insight lookup plug-in


The Veritas Data Insight lookup plug-in retrieves data from a Veritas Data Insight Management
Server and uses it to populate attributes for a Network Discover incident at the time the incident
is generated. The Data Insight lookup plug-in connects Symantec Data Loss Prevention with
Symantec Data Insight to retrieve attribute values. Data Insight can be used to provide granular
context to incidents, including up-to-date data owner information. The values for incident
attributes are viewed and populated at the Incident Snapshot screen.
The Data Insight lookup plug-in requires a Data Insight license separate from Symantec Data
Loss Prevention licensing. If your system is not licensed for Data Insight, the Data Insight
lookup plug-in is not available. If you are licensed for Data Insight, refer to the Symantec Data
Loss Prevention Data Insight Implementation Guide for details on integrating with Data Insight.

About Custom (Legacy) Lookup Plug-Ins


You can use a Custom (Legacy) Lookup Plug-In to migrate legacy Custom Java Lookup
Plug-Ins to the Enforce Server administration console. Because Custom Java Lookup Plug-Ins
are no longer the preferred way to create new plug-ins, the information presented here is
provided to support organizations using legacy plug-ins but upgrading to Data Loss Prevention
version 15.1. As an alternative to migrating legacy Custom Java Lookup Plug-Ins, consider
rewriting such plug-ins using a Script Lookup Plug-In or one of the other supported lookup
plug-ins, such as CSV or LDAP.
See “Types of lookup plug-ins” on page 1987.

Note: Custom (Legacy) Lookup Plug-Ins should only be used for migrating legacy lookup
plug-ins implemented using the Java Lookup API. New Custom Java Lookup Plug-Ins are not
supported.

See “Configuring migrated Custom (Legacy) Lookup Plug-Ins” on page 2031.


About lookup parameters


When an incident is created, the Enforce Server generates incident attributes and populates
them with data it captures from the incident. You use one or more incident attributes as lookup
parameter keys to retrieve external data and populate custom attributes with values that have
been retrieved from the external system. You choose which lookup parameters to use for your
lookup plug-ins at the Lookup Parameters screen. At least one lookup parameter must be
present in the external data source for the lookup to be performed.
While some attributes are created for all incident types, others are specific to the incident type.
For example, the incident attribute sender-email is specific to SMTP incidents. Attributes
specific to Endpoint and Discover incidents are prefaced by an identifier, such as
discover-name and endpoint-machine-name. For administrative convenience, lookup
parameters are organized into groups. An incident exposes all of the lookup parameters in
each lookup parameter group that is enabled. On lookup, some of the name-value pairs in
that group may be valueless depending on the type of incident. For example, the attribute
value of the sender-email parameter is null for Discover incidents (sender-email=null).
Lookup plug-ins do not change the system-defined values of lookup parameters. The plug-in
only uses these parameters as keys to perform the lookup and populate custom attributes.
For example, if a lookup plug-in uses the subject lookup parameter, the value of this attribute
is not changed by a value for this attribute in the external data source; the Enforce Server
ignores the value after the lookup is made. There are two exceptions, however:
data-owner-name and data-owner-email. These system-defined incident attributes function
like custom attributes and their values are populated by retrieved values.
When you map the keys to your data source, the plug-in searches the keys in order until it
finds the first matching value. When a matching value is located, the plug-in stops searching
for the keys. The plug-in uses the data in the row that contains the first matching value to
populate the relevant custom attributes. Therefore, key values are not used in combination,
but rather the first value that is found is the key. Because the plug-in stops searching after it
finds the first matching value, the order in which you list the keys in your attribute mapping is
significant. Refer to the individual attribute mapping topics and examples for nuances among
the lookup plug-in attribute mapping syntax.
To perform a lookup, you must map at least one lookup parameter key to a field in your external
data source. Each lookup parameter group that you enable is a separate database query for
the Enforce Server to perform. All database queries are executed for each incident before
lookup. To avoid the performance impact of unnecessary database queries, you should only
enable attribute groups that your lookup plug-ins require.
Because the plug-in stops searching after it finds the first matching lookup parameter key-value
pair, the order in which you list the keys in your attribute map is significant. Refer to the attribute
mapping examples for the specific type of plug-in you are implementing.
See “Selecting lookup parameters” on page 1996.
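
The first-match behavior and the significance of key order can be illustrated with a short Python sketch. It models the rule as described in this topic and is not the plug-in implementation. The function name, the data source fields Email and Host, and the sample rows are assumptions.

```python
def first_match(rows, incident_params, key_map):
    """Find the first key, in mapping order, that matches a data source row.

    key_map is an ordered list of (lookup_parameter, data_source_field) pairs.
    Returns (lookup_parameter, row), or (None, None) if nothing matches.
    """
    for param, field in key_map:
        value = incident_params.get(param)
        if value is None:
            continue  # parameter is valueless for this incident type
        for row in rows:
            if row.get(field) == value:
                return param, row  # stop: later keys are never consulted
    return None, None


# Hypothetical data source rows.
ROWS = [
    {"Email": "ana@example.com", "Host": "ws-0001", "Owner": "Ana"},
    {"Email": "bo@example.com", "Host": "ws-0142", "Owner": "Bo"},
]
```

With the keys mapped in the order sender-email, then endpoint-machine-name, an incident that carries both values is resolved by sender-email alone, even if the machine name would have matched a different row. Reversing the order of the keys reverses that outcome.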

About plug-in deployment


A lookup plug-in is deployed by enabling it through the user interface. Each lookup plug-in
must be enabled, even if there is only one. If multiple plug-ins are enabled, you chain them
together and specify their order of execution.
The selected lookup parameter keys apply globally to all deployed lookup plug-ins. If plug-ins
are reloaded, all deployed plug-ins are reloaded.
You can only deploy one CSV Lookup Plug-in and one Data Insight Lookup Plug-in per Enforce
Server instance.
See “Enabling lookup plug-ins” on page 2001.

About plug-in chaining


When you create a lookup plug-in, you map the lookup parameter keys and custom attributes
to fields in the external data source. All deployed lookup plug-ins receive a reference to the
same attribute map. This allows plug-ins to be chained together and executed in sequence.
In a lookup plug-in chain, the first plug-in uses the lookup parameters that are passed to it by
the Enforce Server to look up attribute values. The second plug-in uses data that is passed to
it by the first plug-in including the lookup parameters and any variables created by the previous
lookup. This continues in sequence for all plug-ins in the chain.
A plug-in chain is useful when information must be pulled from different sources to populate
custom attributes for an incident. A chain is also useful when there are differences or
dependencies between the “keys” needed to unlock the correct data.
For example, consider the following plug-in chain:
1. A Script Lookup Plug-in performs a DNS lookup using one or more parameters.
2. A CSV Lookup Plug-in uses the result of the script lookup to retrieve incident data from a
CSV file that is an extract from an asset management system.
3. An LDAP Lookup Plug-in uses the result of the CSV lookup to obtain data from a corporate
LDAP directory.
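The example chain can be modeled as a sequence of functions that share one attribute map. The following Python sketch is an assumption-laden illustration, not the product's plug-in API: the function names, the intermediate variables host-name and asset-owner, and the in-memory dictionaries that stand in for DNS, the CSV extract, and the LDAP directory are all hypothetical.

```python
def dns_script(attrs):
    # Stand-in for a Script Lookup Plug-in: resolve sender-ip to a host name.
    hosts = {"10.0.0.5": "laptop-42"}  # hypothetical DNS data
    attrs["host-name"] = hosts.get(attrs.get("sender-ip"), "")
    return attrs


def csv_assets(attrs):
    # Stand-in for a CSV Lookup Plug-in: find the asset owner by host name.
    owners = {"laptop-42": "jsmith"}  # hypothetical asset extract
    attrs["asset-owner"] = owners.get(attrs.get("host-name"), "")
    return attrs


def ldap_directory(attrs):
    # Stand-in for an LDAP Lookup Plug-in: find the owner's manager.
    managers = {"jsmith": "Pat Lee"}  # hypothetical directory data
    attrs["Manager"] = managers.get(attrs.get("asset-owner"), "")
    return attrs


def run_chain(plugins, lookup_params):
    """Pass one attribute map through every plug-in in execution order."""
    attrs = dict(lookup_params)
    for plugin in plugins:
        attrs = plugin(attrs)
    return attrs


result = run_chain([dns_script, csv_assets, ldap_directory],
                   {"sender-ip": "10.0.0.5"})
wrong_order = run_chain([ldap_directory, csv_assets, dns_script],
                        {"sender-ip": "10.0.0.5"})
print(result["Manager"], repr(wrong_order["Manager"]))
```

In the sketch, each plug-in depends on a value that the previous one wrote, so running the same plug-ins in the reverse order leaves the Manager attribute empty.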
See “Chaining lookup plug-ins” on page 2002.
See “Chaining multiple Script Lookup Plug-Ins” on page 2027.

About upgrading lookup plug-ins


Prior to Symantec Data Loss Prevention version 11.6, lookup plug-ins were implemented
manually using property files; there was no user interface for configuring lookup plug-ins. The
lookup plug-in user interface was introduced in version 11.6.

If you are upgrading to version 12.0 or later, existing lookup plug-ins are automatically upgraded
to the new framework and added to the user interface for configuration and deployment. In
addition, the plug-in state is preserved by the upgrade; that is, a plug-in that was enabled
before the upgrade should be turned on in the user interface after the upgrade.
If the upgrade of a lookup plug-in does not succeed, the system displays the following error
message:

INFO: IN PROCESS: Errors detected in lookup plugin configuration.
Your lookup plugins may require manual configuration after the upgrade.

In this case, check the plug-in at the System > Incident Data > Lookup Plugins screen and manually configure
it following the instructions provided with this documentation. Refer to the Symantec Data Loss
Prevention Release Notes for known issues related to the upgrade of lookup plug-ins.

Implementing and testing lookup plug-ins


The following table describes the workflow for implementing and testing lookup plug-ins. Linked
sections explain these steps in more detail.

Table 60-2 Implementing and testing lookup plug-ins

Step Description

1 Decide what external data you want to extract and load into incidents as custom attributes.

See “About using custom attributes” on page 1969.

2 Identify the sources from which custom attribute data is to be obtained and the appropriate
lookup plug-in for retrieving this information.

See “Types of lookup plug-ins” on page 1987.

3 Create a custom attribute for each individual piece of external data that you want to include in
incident snapshots and reports.

See “Configuring custom attributes” on page 1970.

4 Determine which lookup parameter groups include the specific lookup parameters you need
to extract the relevant data from the external sources.

See “About lookup parameters” on page 1990.



5 Configure the plug-in to extract data from the external data source and populate the custom
attributes.

See “Configuring the CSV Lookup Plug-In” on page 2006.

See “Configuring LDAP Lookup Plug-Ins” on page 2015.

See “Configuring Script Lookup Plug-Ins” on page 2020.

See “Configuring migrated Custom (Legacy) Lookup Plug-Ins” on page 2031.

6 Enable the plug-in on the Enforce Server.

See “Enabling lookup plug-ins” on page 2001.

7 Set the execution order for multiple plug-ins.

See “Chaining lookup plug-ins” on page 2002.

8 Verify privileges. The end user must have Lookup Attribute privileges to use a lookup plug-in
to look up attribute values.

See “Configuring roles” on page 114.

9 Generate an incident. The incident must be of the type that exposes one or more incident
attributes that you have designated as parameter keys.

See “Configuring policies” on page 413.

10 View the incident details. For the incident you generated, go to the Incident Snapshot screen.
In the Attributes section, you should see the custom attributes you created. Note that they are
unpopulated (have no value). If you do not see the custom attributes, verify the privileges and
that the custom attributes were created.

11 If the lookup plug-in is properly implemented, you see the Lookup button available in the
Attributes section of the Incident Snapshot. Once you click Lookup you see that the value
for each custom attribute is populated. After the initial lookup, the connection is maintained and
subsequent incidents will have their custom attributes automatically populated by that lookup
plug-in; the remediator does not need to click Lookup for subsequent incidents. If necessary
you can reload the plug-ins.

See “Troubleshooting lookup plug-ins” on page 2003.

See “Reloading lookup plug-ins” on page 2002.



Managing and configuring lookup plug-ins


The System > Incident Data > Lookup Plugins screen is the home page for creating,
configuring, and managing lookup plug-ins. Lookup plug-ins are used for remediation to retrieve
incident-related data from an external data source and populate incident attributes.
See “About lookup plug-ins” on page 1986.
You create and configure lookup plug-ins at the Lookup Plugins List Page.

Table 60-3 Creating and configuring lookup plug-ins

Action Description

New Plugin Select this option to create a new plug-in.

See “Creating new lookup plug-ins” on page 1995.

Modify Plugin Chain Select this option to enable (deploy) plug-ins and to set the order of lookup for multiple
plug-ins.

See “Enabling lookup plug-ins” on page 2001.

Lookup Parameters Select this option to choose which lookup parameter groups to use as keys to
populate attribute fields from external data sources.

See “Selecting lookup parameters” on page 1996.

Reload Plugins Select this option to refresh the system after making changes to enabled plug-ins
or if the external data is updated. This action automatically performs the enabled
lookups in order and populates the incidents as they are created.
See “Reloading lookup plug-ins” on page 2002.

For each configured lookup plug-in, the system displays the following information at the Lookup
Plugins List Page. You use this information to manage lookup plug-ins.

Table 60-4 Managing lookup plug-ins

Display field Description

Execution Sequence This field displays the order in which the system executes lookup plug-ins.

See “Enabling lookup plug-ins” on page 2001.

Name This field displays the user-defined name of each lookup plug-in.

Click the Name link to edit that plug-in.

See “Creating new lookup plug-ins” on page 1995.



Type This field displays the type of lookup plug-in. You can configure one CSV and one
Data Insight Lookup Plug-in per Enforce Server instance. You can configure multiple
instances of the LDAP, Script, and Custom (Legacy) lookup plug-ins.

See “Types of lookup plug-ins” on page 1987.

Description This field displays the user-defined description of each lookup plug-in.

See “Implementing and testing lookup plug-ins” on page 1992.

Status This field displays the state of each lookup plug-in, either On (green) or Off (red).
To edit the state of a plug-in, click Modify Plugin Chain.

See “Enabling lookup plug-ins” on page 2001.

For each configured lookup plug-in, you can perform the following management functions at
the Lookup Plugins List Page.

Table 60-5 Sorting and grouping lookup plug-ins

Action Description

Edit Click the pencil icon in the Actions column to edit the plug-in.

Delete Click the X icon in the Actions column to delete the plug-in. You must confirm the
action to execute it; otherwise, cancel it.

Sort Sort the selected display column in ascending or descending order.

Group Group the plug-ins according to the selected display column. For example, where
you have multiple plug-ins, it may be useful to group them by Type or by Status.

Creating new lookup plug-ins


You must have Server Administration privileges to create and configure lookup plug-ins.
See “Configuring roles” on page 114.
To create a new lookup plug-in
1 Navigate to System > Incident Data > Lookup Plugins in the Enforce Server
administration console.
2 Click New Plugin at the Lookup Plugins List Page screen.

3 Select the type of lookup plug-in you want to create and configure it.

CSV

See “Configuring the CSV Lookup Plug-In” on page 2006.

LDAP

See “Configuring LDAP Lookup Plug-Ins” on page 2015.

Script

See “Configuring Script Lookup Plug-Ins” on page 2020.

Data Insight

Custom (Legacy)

See “Configuring migrated Custom (Legacy) Lookup Plug-Ins” on page 2031.

4 Click Save to apply the lookup plug-in configuration.


The system displays a success (green) message if the plug-in was successfully saved or
an error (red) message if the plug-in is misconfigured and could not be saved.
See “Troubleshooting lookup plug-ins” on page 2003.
5 Click Modify Plugin Chain to enable the lookup plug-in and, if needed, to chain multiple plug-ins.
See “Enabling lookup plug-ins” on page 2001.
See “Chaining lookup plug-ins” on page 2002.

Selecting lookup parameters


The System > Incident Data > Lookup Plugins > Edit Lookup Plugin Parameters page lists the Lookup
Parameter Keys that you select to trigger the lookup of attribute values. Lookup parameter
keys are organized into attribute groups. Selections made at this screen apply to all lookup
plug-ins deployed on the Enforce Server.
To perform a lookup, you must map at least one lookup parameter key to a field in your external
data source. Each lookup parameter group that you enable is a separate database query for
the Enforce Server to perform. All database queries are executed for each incident before
lookup. To avoid the performance impact of unnecessary database queries, you should only
enable attribute groups that your lookup plug-ins require.
Because the plug-in stops searching after it finds the first matching lookup parameter key-value
pair, the order in which you list the keys in your attribute map is significant. Refer to the attribute
mapping examples for the specific type of plug-in you are implementing.
See “About lookup parameters” on page 1990.

To enable one or more lookup parameter keys


1 Navigate to System > Incident Data > Lookup Plugins in the Enforce Server administration console.
2 Click Lookup Parameters at the Lookup Plugins List Page.
3 Select (check) one or more attribute groups at the Edit Lookup Plugin Parameters page.
Click View Properties to view all of the keys for that attribute group.
■ Attachment: see Table 60-6
■ Incident: see Table 60-7
■ Message: see Table 60-8
■ Policy: see Table 60-9
■ Recipient: see Table 60-10
■ Sender: see Table 60-11
■ Server: see Table 60-12
■ Monitor: see Table 60-13
■ Status: see Table 60-14
■ ACL: see Table 60-15
■ Cloud Applications and API Appliance: see Table 60-16

4 Save the configuration.


Verify the success message indicating that all enabled plug-ins were reloaded.

Table 60-6 Attachment lookup parameters

Lookup parameter key Description and comments

attachment-nameX Name of the attached file, where X is the unique index to distinguish between
multiple attachments, for example: attachment-name1, attachment-size1;
attachment-name2, attachment-size2; etc.

attachment-sizeX Original size of the attached file, where X is the unique index to distinguish
between multiple attachments. See above example.
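Indexed keys such as attachment-name1 and attachment-size1 can be regrouped per attachment by their trailing index. The Python sketch below is an illustration only; the helper name group_indexed and the sample values are hypothetical, and the same approach would apply to other indexed keys such as recipient-emailX.

```python
import re
from collections import defaultdict

# A key name followed by a numeric index, for example "attachment-name2".
INDEXED_KEY = re.compile(r"^(?P<name>.+?)(?P<index>\d+)$")


def group_indexed(params, prefix):
    """Group indexed lookup parameters that start with prefix by their index."""
    grouped = defaultdict(dict)
    for key, value in params.items():
        match = INDEXED_KEY.match(key)
        if match and match.group("name").startswith(prefix):
            grouped[int(match.group("index"))][match.group("name")] = value
    return [grouped[index] for index in sorted(grouped)]


# Hypothetical lookup parameters for an incident with two attachments.
params = {
    "attachment-name1": "q3-forecast.xlsx",
    "attachment-size1": "48213",
    "attachment-name2": "notes.txt",
    "attachment-size2": "912",
    "incident-id": "35",
}
attachments = group_indexed(params, "attachment-")
print(attachments)
```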

Table 60-7 Incident lookup parameters

Lookup parameter key Description

date-detected Date and time when the incident was detected, for example:
date-detected=Tue May 15 15:08:23 PDT 2012.

incident-id The incident ID assigned by Enforce Server. The same ID can be seen in the
incident report. For example: incident-id=35.

protocol The name of the network protocol that was used to transfer the violating message,
such as SMTP and HTTP. For example: protocol=Email/SMTP.

data-owner-name The person responsible for remediating the incident. This attribute is not populated
by the system. Instead, it is set manually in the Incident Details section of the
Incident Snapshot screen, or automatically using a lookup plug-in.

Reports based on this attribute can automatically be sent to the data owner for
remediation.

data-owner-email The email address of the person responsible for remediating the incident. This
attribute is not populated by the system. Instead, it is set manually in the Incident
Details section of the Incident Snapshot screen, or automatically using a lookup
plug-in.
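A script that consumes date-detected may need to parse the format shown in the example (Tue May 15 15:08:23 PDT 2012). The following Python sketch is one possible approach under stated assumptions: the value always has six space-separated fields, and the time zone abbreviation is returned as text rather than interpreted, because abbreviations such as PDT are not reliably parsed by strptime. The function name parse_incident_date is hypothetical.

```python
from datetime import datetime


def parse_incident_date(value):
    """Split a date-detected style value into a naive datetime and a zone name."""
    parts = value.split()
    if len(parts) != 6:
        raise ValueError("unexpected date format: %r" % value)
    weekday, month, day, clock, zone, year = parts
    # Parse everything except the zone abbreviation, which is kept as text.
    naive = datetime.strptime(
        " ".join([weekday, month, day, clock, year]), "%a %b %d %H:%M:%S %Y")
    return naive, zone


detected, zone = parse_incident_date("Tue May 15 15:08:23 PDT 2012")
print(detected.isoformat(), zone)
```

The %a and %b directives depend on the process locale, so the sketch assumes an English (or default C) locale.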

Table 60-8 Message lookup parameters

Lookup parameter key Description

date-sent Date and time when the message was sent if it is an email. For example:
date-sent=Mon Aug 15 11:46:55 PDT 2011.

subject Subject of the message if it is an email incident.

file-create-date Date that the file was created in its current location, whether it was originally
created there, or copied from another location. Retrieved from the operating
system.

file-access-date Date that the file was examined.

file-created-by User who placed the file on the endpoint.

file-modified-by Fully-qualified user credential for the computer where the violating copy action
took place.

file-owner The name of the user or the computer where the violating file is located.

discover-content-root-path Root path of the file that caused a Discover incident.

discover-location Full path of the file that caused a Discover incident.

discover-name The name of the violating file.



discover-extraction-date Date a subfile was extracted from an encapsulated file during Discover scanning.

discover-server The name of the repository to be scanned.

discover-notes-database Specific attribute for Discover scan of Lotus Notes repository.

discover-notes-url Specific attribute for Discover scan of Lotus Notes repository.

endpoint-volume-name The name of the local drive where an endpoint incident occurred.

endpoint-dos-volume-name The Windows name of the local drive where an endpoint incident occurred.

endpoint-application-name Name of application most recently used to open (or create) the violating file.

endpoint-application-path Path of the application that was used to create or open the violating file.

endpoint-file-name The name of the violating file.

endpoint-file-path Location the file was copied to.

Table 60-9 Policy lookup parameter

Lookup parameter key Description and comments

policy-name The name of the policy that was violated, for example: policy-name=Keyword
Policy.

Table 60-10 Recipient lookup parameters

Lookup parameter key Description

recipient-emailX The email address of the recipient, where X is the unique index to distinguish
between multiple recipients; for example: recipient-email1,
recipient-ip1, recipient-url1; recipient-email2, recipient-ip2,
recipient-url2; etc.

recipient-ipX The IP address of the recipient, where X is the unique index to distinguish
between multiple recipients. See above example.

recipient-urlX The URL of the recipient, where X is the unique index to distinguish between
multiple recipients. See above example.

Table 60-11 Sender lookup parameters

Lookup parameter key Description

sender-email The email address of the sender for Network Prevent for Email (SMTP) incidents.

sender-ip The IP address of the sender for Endpoint and Network incidents on protocols
other than SMTP.

sender-port The port of the sender for Network incidents on protocols other than SMTP.

endpoint-user-name The user who was logged on to the endpoint when the violation occurred.

endpoint-machine-name Name of the endpoint where the violating file resides.

Table 60-12 Server lookup parameters

Lookup parameter key Description and comments

server-name The name of the detection server that reported the incident. This name is
user-defined and entered when the detection server is deployed. For example:
server-name=My Network Monitor.

Table 60-13 Monitor lookup parameters

Lookup parameter key Description

monitor-name The name of the detection server that reported the incident. This name is
user-defined and entered when the detection server is deployed. For example:
monitor-name=My Network Monitor.

monitor-host The IP address of the detection server that reported the incident. For example:
monitor-host=127.0.0.1

monitor-id The system-defined numeric identifier of the detection server. For example:
monitor-id=1.

Table 60-14 Status lookup parameter

Lookup parameter key Description and comments

incident-status Current status of the incident. For example:
incident-status=incident.status.New.

Table 60-15 ACL lookup parameters

Lookup parameter key Description

acl-principalX A string that indicates the user or group to whom the ACL applies.

acl-typeX A string that indicates whether the ACL applies to the file or to the share.

acl-grant-or-denyX A string that indicates whether the ACL grants or denies the permission.

acl-permissionX A string that indicates whether the ACL denotes read or write access.

Table 60-16 Cloud Applications and API Appliance lookup parameters

Lookup parameter key Description

common-user-name A string representing the name of the user as displayed in reports.

common-user-id A string representing the unique identifier of the user.

client-user-id A string representing the identifier for the user within the client domain
making the detection request.

client-domain A string representing the domain of the REST client making a
detection request.

common-owner A string representing the user identification of the data owner. Used
in data-at-rest (DAR) requests only.

common-sharedwith An array of user identifiers for all users the file is shared with. Used
in DAR requests only.

Enabling lookup plug-ins


To enable a lookup plug-in, you change its status from Off, which is the initial status
of a plug-in after it is configured, to On. You enable lookup plug-ins at the System > Incident
Data > Lookup Plugins > Modify Plugin Chain page.
See “About plug-in deployment” on page 1991.
To enable a lookup plug-in
1 Navigate to System > Incident Data > Lookup Plugins in the Enforce Server
administration console.
2 Click Modify Plugin Chain at the Lookup Plugins List Page.

3 In the Dedicated Actions field, select (check) the On option.


4 Click Save to apply the configuration.
If the plug-in cannot be loaded the system will report an error and the plug-in state will
remain Off. In this case, check the latest Tomcat log file for the error.
See “Troubleshooting lookup plug-ins” on page 2003.

Chaining lookup plug-ins


The System > Incident Data > Lookup Plugins > Modify Lookup Plugin Execution Chain
page is where you enable lookup plug-ins and specify the execution order when multiple lookup
plug-ins are deployed.
See “Enabling lookup plug-ins” on page 2001.
If you enable multiple lookup plug-ins you must specify their order of execution. When plug-ins
are chained together, output from a previous plug-in is available as input to subsequent lookup
plug-ins.
See “About plug-in deployment” on page 1991.
To chain multiple lookup plug-ins
1 Navigate to System > Incident Data > Lookup Plugins in the Enforce Server
administration console.
2 Click Modify Plugin Chain at the Lookup Plugins List Page.
3 In the Execution Sequence field, select the execution order from the drop-down menu.
4 Click Save to apply the chaining configuration.

Reloading lookup plug-ins


If you have changed the configuration of a lookup plug-in, or the external data has changed,
you need to reload the lookup plug-ins. Reloading plug-ins refreshes the system and
automatically performs the enabled lookups in order and populates the incident attributes as
incidents are detected.
In addition to reloading plug-ins if changes are made, you may need to reload lookup plug-ins
if any of the following are true:
■ A plug-in was problematic and the system unloaded it, but now the problem is fixed.
■ The network was down or disconnected for some reason, but it is functioning properly now.
■ A plug-in stores data in a cache, and you want to update the cache manually.

To reload lookup plug-ins


1 Navigate to System > Incident Data > Lookup Plugins in the Enforce Server
administration console.
2 Click Reload Plugins to reload all enabled plug-ins.

Note: Administrators can also reload lookup plug-ins from the Custom Attributes tab of
the System > Incident Data > Attributes screen.

Troubleshooting lookup plug-ins


Symantec Data Loss Prevention provides logging and error messages specific to lookup
plug-ins. The most common errors involve the failure of a plug-in to load due to one or more
misconfigurations. If a lookup plug-in fails to load, the exception is logged as a warning at the
system events screen and in the Tomcat log. In addition, the attribute map and plug-in execution
chain are logged in the Tomcat log.
To troubleshoot lookup plug-in errors
1 Navigate to the System > Servers and Detectors > Overview screen and look for any
warnings in the Recent Error and Warning Events table at the bottom of the page.
2 On the Enforce Server host, open the log file
c:\ProgramData\Symantec\DataLossPrevention\EnforceServer\15.5\protect\Enforce\
logs\tomcat\localhost.<date>.log (Windows) or
/var/log/Symantec/DataLossPrevention/EnforceServer/15.5/
tomcat/localhost.<date>.log (Linux).

3 Troubleshoot errors that appear in the Tomcat localhost log file.
See Table 60-17.
4 Configure detailed logging for lookup plug-ins if the plug-in fails but errors are not logged.
See “Configuring detailed logging for lookup plug-ins” on page 2004.
5 Refer to the troubleshooting topics for specific plug-ins.
See “Testing and troubleshooting the CSV Lookup Plug-In ” on page 2012.
See “Testing and troubleshooting LDAP Lookup Plug-ins” on page 2018.
See “Script Lookup Plug-In tutorial” on page 2027.

Table 60-17 Troubleshooting lookup plug-ins

Problem Solution

Lookup plug-in fails to load If the plug-in failed to load, search for a message in the log file similar to the following:

SEVERE [com.vontu.enforce.workflow.attributes.AttributeLookupLoader] Error loading plugin [<Plugin_Name>]

Note the "Cause" section that follows this type of error message. Any such entries
will explain why the plug-in failed to load.

Attributes are not populated by the lookup If the plug-in loads but attributes are not populated, look in the log for the attribute
map. Verify that values are being populated, including for the lookup parameters that
you enabled. To do this, search for a lookup parameter key that you have enabled,
such as sender-email.
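Searching a large localhost log by hand can be slow, so a small script may help. The Python sketch below scans log lines for the "Error loading plugin" message described in the table and reports the line number and plug-in name. The function name find_load_errors and the sample lines are hypothetical; only the message text is taken from this guide, and the exact layout of real log lines may differ.

```python
import re

# Matches the load-failure message and captures the plug-in name in brackets.
LOAD_ERROR = re.compile(r"Error loading plugin \[(?P<name>[^\]]+)\]")


def find_load_errors(lines):
    """Return (line_number, plugin_name) for each load failure found."""
    failures = []
    for number, line in enumerate(lines, start=1):
        match = LOAD_ERROR.search(line)
        if match:
            failures.append((number, match.group("name")))
    return failures


# Hypothetical sample; a real run would iterate over the open log file instead.
sample_log = [
    "an unrelated line of log output",
    "SEVERE [com.vontu.enforce.workflow.attributes.AttributeLookupLoader] "
    "Error loading plugin [CSV Lookup]",
]
failures = find_load_errors(sample_log)
print(failures)
```

Because a file object yields lines when iterated, the same function can be called on an open localhost.<date>.log file. Review the "Cause" entries that follow each reported line number.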

Configuring detailed logging for lookup plug-ins


The system provides detailed logging configuration for lookup plug-ins. You can configure the
logging levels for lookup plug-ins on the Configuration tab of the System > Servers and Detectors >
Logs screen. Configuring the logs for lookup plug-ins provides more detailed log messages in the
Tomcat localhost log.
See “Troubleshooting lookup plug-ins” on page 2003.
To configure and collect the logs for lookup plug-ins
1 Navigate to the System > Servers and Detectors > Logs screen.
2 Select the Configuration tab.
3 For the Enforce Server, select the Custom Attribute Lookup Logging entry from the
Diagnostic Logging Setting drop-down menu.
4 Click Configure Logs.
5 In the Collection tab, select the Debug and Trace Logs option for the Enforce Server.
6 Click Collect Logs.
7 At the bottom of the page, click Download to download the logs. Use the Refresh button
to refresh the page. The logs are packaged in a ZIP file.
8 Open the ZIP file or save it to the file system and extract it.
9 Navigate to directory
c:\ProgramData\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\logs\tomcat
(Windows) or /var/log/Symantec/DataLossPrevention/EnforceServer/15.5/tomcat
(Linux).

10 Open the file localhost.<date>.log using a text editor. Open the file with the most
recent date.
11 Search for the name of the lookup plug-in. You should see several messages.
12 If necessary, verify the lookup plug-in logging properties in file
ManagerLogging.properties in your config directory.

com.vontu.logging.ServletLogHandler.level=FINEST
com.vontu.enforce.workflow.attributes.CustomAttributeLookup.level=FINEST
com.vontu.lookup.level=FINEST
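To confirm that all three properties are set as shown, the file contents can be checked with a short script. The Python sketch below is a simplified illustration: the function names are hypothetical, and the parser handles only plain key=value lines and comment lines, not every feature of the Java properties format (such as line continuations).

```python
# The three logging properties and levels listed in this guide.
EXPECTED = {
    "com.vontu.logging.ServletLogHandler.level": "FINEST",
    "com.vontu.enforce.workflow.attributes.CustomAttributeLookup.level": "FINEST",
    "com.vontu.lookup.level": "FINEST",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if separator:
            props[key.strip()] = value.strip()
    return props


def missing_levels(text):
    """Return the expected properties that are absent or set to another level."""
    props = parse_properties(text)
    return sorted(key for key, level in EXPECTED.items() if props.get(key) != level)


# Hypothetical file contents with one property at the wrong level.
sample = """
# lookup plug-in logging
com.vontu.logging.ServletLogHandler.level=FINEST
com.vontu.enforce.workflow.attributes.CustomAttributeLookup.level=INFO
"""
print(missing_levels(sample))
```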

Configuring advanced plug-in properties


The file Plugins.properties in your config directory (\Program
Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\config\ [Windows]
or /opt/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/config [Linux])
contains several advanced properties for configuring lookup plug-ins. Generally these properties
do not need to be modified unless necessary according to the following descriptions.

Table 60-18 Advanced properties for lookup plug-ins

Property Default Description

AttributeLookup.output.parameters data-owner-name, data-owner-email The Attribute Lookup Output Parameters property is a
comma-separated list that specifies which parameters can be
modified by lookup plug-ins. Generally, the values for lookup
parameter keys are set by the system when an incident is created.
Because these parameters are used to look up custom attribute
values, they are not modified by the looked up values if they are
different from the system-defined values.

However, this property lets you modify the output of the Data
Owner Name and Data Owner Email attributes based on
retrieved values. These parameters are specified in lookup plug-in
configurations and scripts using the same syntax as custom
attributes. Both attributes are enabled by selecting the Incident
attribute group.

You can disable this feature by removing one or both of the
entries. If removed, the output for either parameter is not changed
by a looked up value.

AttributeLookup.timeout 60000 To avoid a system freeze due to unanticipated lookup problems,
the Enforce Server limits the amount of time given to each lookup
plug-in. This timeout is configured in the
com.vontu.api.incident.attributes.AttributeLookup.timeout
property in the Plugins.properties file.

If a lookup exceeds the 60-second default timeout, the incident
attribute framework unloads the associated plug-in. After such a
runaway lookup, the Enforce Server cannot execute that particular
lookup for any subsequent incidents. If the plug-in times out
frequently, you can extend the timeout by modifying the period
(in milliseconds).
Note: Increasing this value may result in slower incident
processing times because of slow attribute lookups.

AttributeLookup.auto true The automatic lookup property specifies whether the lookup
should be triggered automatically when a new incident is detected.
This property automatically populates incident attributes using
the deployed lookup plug-ins after the initial lookup is executed.

You can disable auto-lookup by changing the property value to
false. If this property is disabled, remediators must click Lookup
for every incident.

After setting the AttributeLookup.auto property to false,
make sure you restart the Symantec DLP Incident Persister
service. If you do not restart the service the custom attributes will
continue to be automatically populated.

AttributeLookup.reload false The automatic plug-in reload property specifies whether all
plug-ins should be automatically reloaded each day at 3:00 A.M.
Change to true to enable.
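The timeout behavior described for AttributeLookup.timeout can be modeled roughly as follows. This Python sketch is not the Enforce Server's implementation; it assumes a simple model in which each plug-in is a callable, a lookup that exceeds the timeout causes its plug-in to be removed from the set of loaded plug-ins, and the remaining plug-ins continue to run. All names are hypothetical, and the timeout is shortened to milliseconds so that the example finishes quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as LookupTimeout


def run_lookups(plugins, attributes, timeout_ms=60000):
    """Run each plug-in in order and unload any that exceeds the timeout.

    plugins is a dict of name -> callable(dict) -> dict (a hypothetical shape).
    Returns the names of the plug-ins that were unloaded.
    """
    unloaded = []
    pool = ThreadPoolExecutor(max_workers=max(1, len(plugins)))
    try:
        for name, lookup in list(plugins.items()):
            future = pool.submit(lookup, dict(attributes))
            try:
                attributes.update(future.result(timeout=timeout_ms / 1000.0))
            except LookupTimeout:
                del plugins[name]  # the plug-in is no longer used for lookups
                unloaded.append(name)
    finally:
        pool.shutdown(wait=False)
    return unloaded


def fast_lookup(attrs):
    return {"Manager": "Pat Lee"}


def runaway_lookup(attrs):
    time.sleep(0.5)  # stands in for a lookup that does not return in time
    return {"Department": "unknown"}


plugins = {"runaway": runaway_lookup, "fast": fast_lookup}
attributes = {"sender-email": "bob@example.com"}
unloaded = run_lookups(plugins, attributes, timeout_ms=50)
print(unloaded, sorted(plugins), attributes)
```

In this model, raising the timeout keeps the slow plug-in loaded at the cost of waiting longer per incident, which mirrors the trade-off noted for the property.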

Configuring the CSV Lookup Plug-In


You can only configure one CSV Lookup Plug-In per Enforce Server instance.
See “About the CSV Lookup Plug-In ” on page 1988.

Table 60-19 Configuring the CSV Lookup Plug-In

Step Action Description

1 Create custom attributes. Define the custom attributes for the information you want to look up.
See “Setting the values of custom attributes manually” on page 1972.

2 Create the CSV data source file. Create the CSV file that contains the data to be used to populate custom
attributes for incident remediation.

See “Requirements for creating the CSV file” on page 2008.

3 Create a new CSV plug-in. See “Creating new lookup plug-ins” on page 1995.

4 Name and describe the plug-in. The name string is limited to 100 characters. We recommend that you
enter a description for the lookup plug-in.

5 Specify the file path. Provide the path to the CSV file. The CSV file must be local to the Enforce
Server.

See “Specifying the CSV file path” on page 2009.

6 Choose the File Delimiter. Specify the delimiter that is used in the CSV file. The pipe delimiter [|] is
recommended.

See “Choosing the CSV file delimiter” on page 2009.

7 Choose the File Encoding. For example: UTF-8

See “Selecting the CSV file character set” on page 2009.

8 Map the attributes. Map the system and the custom attributes to the CSV file column heads
and define the keys to use to extract custom attribute data. Keys map to
column heads, not custom attributes.

The syntax is as follows:

attr.attribute_name=column_head

keys=column_head_first:column_head_next:column_head_3rd

See “Mapping attributes and parameter keys to CSV fields” on page 2010.

9 Save the plug-in. Verify that the correct save message for the plug-in is displayed.

10 Select the Lookup Parameter Keys. Define the keys which are used to extract custom attribute data.

See “Selecting lookup parameters” on page 1996.

11 Enable the lookup plug-in. The CSV Lookup Plug-In must be enabled on the Enforce Server.

See “Enabling lookup plug-ins” on page 2001.

12 Test the lookup plug-in. Troubleshoot the plug-in if necessary.

See “Testing and troubleshooting the CSV Lookup Plug-In ” on page 2012.

Requirements for creating the CSV file


The CSV Lookup Plug-In requires a CSV file that is stored on the Enforce Server.
When creating a CSV file, keep in mind the following requirements:
■ The first data row of the CSV file must contain column headers.
■ Column header fields cannot be blank.
■ Make sure that there are no white spaces at the end of the column header fields.
■ Make sure that all rows have the same number of columns.
■ Each row of the file must be on a single, non-breaking line.
■ One or more columns in the file are used as key-fields for data lookups. You specify in the
attribute mapping which column heads are to be used as key fields. You also specify the
key field search order. Common key fields typically include email address,
Domain\UserName (for Endpoint incidents), and user name (for Storage incidents).
■ The data values in the key field columns must be unique. If multiple columns are used as
key fields (for example, EMP_EMAIL and USER_NAME), the combination of values in each row
must be unique.
■ Fields in data rows (other than the column header row) can be empty, but at least one key
field in each row should contain data.
■ The same type of delimiter must be used for all values in the column header and data rows.
■ If the CSV file is read-only, make sure that the CSV file has a new line at the end of the
file. The system attempts to add a new line to the file on execution of the plug-in. If the file
is read-only and the system cannot add the new line, the plug-in does not load.
■ For Discover scan incidents, the file-owner lookup parameter does not include a domain.
To use file-owner as the key, the CSV file column that corresponds to file-owner should
be in the format owner. The format DOMAIN\owner does not result in a successful lookup.
This restriction applies only to Discover incidents; other kinds of incidents can include a
domain.
For example, the column-header row and a data-row of a pipe-delimited CSV file might
look like:

email|first_name|last_name|domain_user_name|user_name|department|manager|manager_email
[email protected]|John|Smith|CORP\jsmith1|jsmith1|Accounting|Mei Wong|[email protected]
■ If more than 10% of the rows in the CSV file violate any of these requirements, the plug-in
does not load.
■ For accuracy in the lookup, the CSV file needs to be kept up to date.
See “About the CSV Lookup Plug-In ” on page 1988.
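The requirements above can be checked before you point the plug-in at a file. The following Python sketch is illustrative only and is not part of Symantec Data Loss Prevention. The function name check_lookup_csv, its arguments, and the wording of its messages are assumptions made for this example, and the check covers only some of the listed requirements (column headers, column counts, key fields, the trailing new line, and the 10% threshold).

```python
def check_lookup_csv(text, delimiter="|", key_columns=()):
    """Pre-flight check of lookup CSV text; returns (problems, exceeds_10_percent)."""
    problems = []
    lines = text.splitlines()
    if not lines:
        return ["file is empty"], True
    if not text.endswith("\n"):
        problems.append("file does not end with a new line")

    # The first row must hold non-blank headers without trailing white space.
    header = lines[0].split(delimiter)
    for name in header:
        if name.strip() == "":
            problems.append("blank column header")
        elif name != name.rstrip():
            problems.append("trailing white space in column header %r" % name)
    for key in key_columns:
        if key not in header:
            problems.append("key %r is not a column header" % key)
    key_index = [header.index(k) for k in key_columns if k in header]

    seen = set()
    bad_rows = 0
    data = lines[1:]
    for number, line in enumerate(data, start=2):
        fields = line.split(delimiter)
        issue = None
        if len(fields) != len(header):
            issue = "expected %d columns, found %d" % (len(header), len(fields))
        elif key_index:
            values = tuple(fields[i] for i in key_index)
            if not any(values):
                issue = "no key field contains data"
            elif values in seen:
                issue = "duplicate key values"
            seen.add(values)
        if issue:
            bad_rows += 1
            problems.append("line %d: %s" % (number, issue))
    # Flag the file when more than 10% of the data rows have a problem.
    return problems, bad_rows > 0.10 * len(data)
```

For example, calling check_lookup_csv(text, "|", ("email", "user_name")) on the contents of a pipe-delimited file returns a list of messages and a flag that is True when more than 10% of the data rows are affected.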

Specifying the CSV file path


To configure the CSV Lookup Plug-In you must specify the CSV File Path property for the
location of the CSV file. The CSV file must be stored locally on the Enforce Server.
You can enter either an absolute file path or a relative file path. For example:
■ ../../../../symantecDLP_csv_lookup_file/senders2.csv

■ C:/SymantecDLP_csv_lookup_file/senders2.csv

On Windows you can use either forward or backward slashes. For example:
C:/Symantec/DataLossPrevention/EnforceServer/15.5/Protect/plugins/employees.csv
or
C:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\plugins\employees.csv
On Linux you can only use forward slashes.
The system validates the file path when you save the configuration. If the system cannot locate
the file it reports an error and does not let you save the configuration. Make sure that the CSV
file is not open and is stored locally to the Enforce Server.

Choosing the CSV file delimiter


Use the Delimiter property to specify the CSV file delimiter.
The following delimiters are supported:
■ Comma
■ Pipe
■ Tab
■ Semicolon
The recommended practice is to use the pipe character (“|”) as the delimiter. Use of the comma
delimiter is discouraged because commas are often included in data fields as part of the data.
For example, a street address might contain a comma.
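If your source data is currently comma-delimited, you can re-delimit it before use. The following Python sketch is one possible approach, not a Symantec-provided tool. It assumes that the source file uses standard double-quote quoting for fields that contain commas, and the function name to_pipe_delimited is hypothetical.

```python
import csv
import io


def to_pipe_delimited(comma_text):
    """Re-delimit comma-separated text with pipes; fails if a value contains '|'."""
    out = io.StringIO()
    for row in csv.reader(io.StringIO(comma_text)):
        if any("|" in field for field in row):
            raise ValueError("a field already contains a pipe: %r" % (row,))
        # Quotes are dropped because commas no longer need protecting.
        out.write("|".join(row) + "\n")
    return out.getvalue()
```

A row such as Ana,"12 Main St, Apt 4" becomes Ana|12 Main St, Apt 4, so the comma inside the street address no longer collides with the delimiter.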

Selecting the CSV file character set


You must specify the character set for the CSV file. The default is UTF-8.
All supported character sets are listed in the drop-down menu.

Mapping attributes and parameter keys to CSV fields


To configure the CSV Lookup Plug-In, you enter the execution code in the Attribute Mapping
field. This code maps the lookup parameter keys and custom attributes to column headers in
the CSV file. One or more attribute=column pairs are used to map the incident attributes to the
column heads. The keys property in the attribute map identifies which columns to use for the
lookup.
Here is an example CSV file attribute mapping:

attr.Store-ID=store-id
attr.Store\ Address=store_address
attr.incident-id=incident-id-key
attr.sender-email=sender-email-key
keys=sender-email-key:incident-id-key

With this example in mind, adhere to the following syntactical rules when mapping the attributes
to CSV file data.

Table 60-20 Attribute mapping syntax for CSV files

Example: attr.Store-ID=store-id
Syntax: attr.attribute_name=column_head
Description: Attributes map to column header names in attribute-column pairs. Here, Store-ID is a
custom attribute and store-id is a column header name in the CSV file.

Example: attr.Store\ Address=store_address
Syntax: attr.attribute\ name=column\ head
Description: Spaces are allowed before and after the = sign (except for the LDAP Lookup Plugin).
Blank spaces in attribute and column names must be preceded by a backslash. Here, the custom
attribute is named Store Address.

Example:
attr.Store-ID=store-id
attr.Store\ Address=store_address
Syntax:
attr.attribute_name=column_head
attr.attribute_name=column_head
Description: Each attribute-column pair is entered on a separate line.

Example: attr.Store\ Address=STORE_ADDRESS
Description: All syntax is case sensitive. The identifier attr. must be lower case. Incident attributes
must match the system-definition string precisely.

Example:
attr.incident-id=incident-id-key
attr.sender-email=sender-email-key
Syntax: attr.attribute_name=column_head
Description: System attributes are mapped to column header names. The column name does not
have to match the system attribute, nor does it require the word "key".

Example: keys=sender-email-key:incident-id-key
Syntax: keys=<column_name_1st>:column_name_2nd
Description: Keys map the column name headers to the incident attribute keys you want to use to
look up the attribute values. The keys map to the column header names, not to the incident
attribute names. The order of appearance determines priority. Once the first incident is located in
the CSV file, the other attributes are populated.
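Because the system does not validate the attribute map, it can help to check a mapping offline. The following Python sketch shows one way to apply the syntax rules above. It is an illustration, not the parser that the plug-in uses. The function name parse_attribute_mapping is hypothetical, and the final check assumes that every key should also appear as the column head of an attr. line, as in the examples in this section.

```python
def parse_attribute_mapping(text):
    """Split a CSV attribute mapping into ({attribute: column}, [key columns])."""
    attributes, keys = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        left, sep, right = line.partition("=")
        if not sep:
            raise ValueError("missing '=' in %r" % raw)
        # A backslash followed by a space stands for a literal space in a name.
        left = left.strip().replace("\\ ", " ")
        right = right.strip().replace("\\ ", " ")
        if left == "keys":
            keys = right.split(":")
        elif left.startswith("attr."):  # the identifier is case sensitive
            attributes[left[len("attr."):]] = right
        else:
            raise ValueError("line must start with 'attr.' or 'keys': %r" % raw)
    # Assumption: keys name column heads that an attr. line maps to.
    unknown = [k for k in keys if k not in attributes.values()]
    if unknown:
        raise ValueError("keys are not mapped column heads: %r" % unknown)
    return attributes, keys
```

Run against the example mapping shown earlier, the function returns the custom attribute Store Address mapped to store_address and the key order sender-email-key, incident-id-key. A mapping that lists an attribute name such as sender-email in the keys line is rejected.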

CSV attribute mapping example


Consider another mapping example for the CSV Lookup Plug-In .

attr.sender-email = Email
attr.endpoint-user-name = Username
attr.file-owner = File-owner
attr.sender-ip = IP

attr.First\ Name = FIRST_NAME
attr.Last\ Name = LAST_NAME
attr.Business\ Unit = Org
attr.Manager\ Email = Mgr_email
attr.Employee\ ID = EMPLOYEE_NUMBER
attr.Phone\ Number = Phone
attr.Manager\ Last\ Name = Mgr_lastname
attr.Manager\ First\ Name = Mgr_firstname
attr.Employee\ Email = Emp_email

keys = Email:Username:File-owner:IP

Note the following about this example:



■ The first four lines map lookup parameters to column headers.


■ The remaining nine lines map custom attributes to column headers.
■ A backslash is prepended before each instance of a white-space character in an attribute
or column name. In this example, attr.Employee\ Email = Emp_email maps the
Employee Email custom attribute to the Emp_email column head.
■ The keys property identifies and sequences the keys that are used to extract custom
attribute data. Each key is separated with a colon. The order in which you list the keys
determines the search sequence. In this example (keys =
Email:Username:File-owner:IP), the plug-in first searches the Email column for a value
that matches the lookup parameter value of the sender-email that has been passed to
the plug-in. If no matching value is found, the plug-in then searches the Username column
for a value that matches the endpoint-user-name lookup parameter. If no matching value
is found in that column, it then goes on to search the next key (File-owner), and so on.
■ The plug-in stops searching after it finds the first matching parameter key-value pair. As a
result, the order in which you list the key column heads is significant.
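The search sequence described above can be sketched in a few lines of Python. This is a simplified model for reasoning about key order, not the plug-in's actual implementation. The function name lookup_row and the use of dictionaries for rows and parameters are assumptions made for this example.

```python
def lookup_row(rows, mapping, keys, parameters):
    """Return the first row matched by walking the key columns in order.

    rows: list of dicts keyed by column head
    mapping: {lookup parameter name: column head}
    keys: column heads in search order
    parameters: {lookup parameter name: value} passed for one incident
    """
    column_to_parameter = {column: name for name, column in mapping.items()}
    for column in keys:
        wanted = parameters.get(column_to_parameter.get(column))
        if not wanted:
            continue  # this parameter was not supplied for the incident
        for row in rows:
            if row.get(column) == wanted:
                return row  # stop at the first matching key-value pair
    return None
```

With keys listed as Email and then Username, an incident that supplies both a sender-email and an endpoint-user-name value is resolved through the Email column whenever that column contains a match. The Username column is consulted only when the Email search finds nothing.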

Testing and troubleshooting the CSV Lookup Plug-In


If the plug-in does not load, or if the plug-in loads but fails to populate the custom attributes
with looked up values, troubleshoot as follows:
To test and troubleshoot the CSV Lookup Plug-In
1 Verify that the CSV file conforms to the requirements. If more than 10% of the rows in the
CSV file violate any of the CSV file requirements, the lookup plug-in does not load.
See “Requirements for creating the CSV file” on page 2008.
2 Verify that the delimiter you selected is the one used in the CSV file. Note that the system
defaults to comma, whereas the recommendation is pipe.
See “Choosing the CSV file delimiter” on page 2009.
3 Check the attribute mapping. No system-provided validation is available for the attribute
map. Make sure that your attribute map adheres to the syntax.
To avoid common syntactical errors, keep the following rules in mind:
■ Every entry in the attribute-mapping field is case-sensitive.
■ Spaces in attribute and column names must be identified by a backslash.
■ For every attribute=column pair, the data to the right of the equals sign (=) must be a
column head name.
■ Keys are column header names, not incident attributes.
4 If the plug-in fails to load, or the plug-in fails to return looked up values, check the file
c:\ProgramData\Symantec\DataLossPrevention\EnforceServer\Protect\logs\tomcat\localhost.<latest-date>.log (Windows) or
/var/log/Symantec/DataLossPrevention/EnforceServer/15.5/tomcat/localhost.<latest-date>.log (Linux).

■ Check that the database and table are created and that the CSV file is loaded into the
table. To verify, look for lines similar to the following:

INFO [com.vontu.lookup.csv.CsvLookup]
creating database
create table using SQL
importing data from file into table LOOKUP having columns

Note: To process large files, the CSV Lookup Plug-In uses an in-memory database
(Apache Derby). Only one instance of Derby can be running per Enforce Server. If a
previous instance is running, the CSV Lookup Plug-In does not load. If the database
and table are not created, restart the Symantec DLP Manager service and reload the
plug-in.

5 If the plug-in fails to return looked up values, check the file
c:\ProgramData\Symantec\DataLossPrevention\EnforceServer\Protect\logs\tomcat\localhost.<latest-date>.log (Windows) or
/var/log/Symantec/DataLossPrevention/EnforceServer/15.5/tomcat/localhost.<latest-date>.log (Linux).

Look for a warning message indicating that "SQL query did not return any results." In this
case, make sure that the attribute mapping matches the CSV column heads and reload
the plug-in if changes were made.
See “Troubleshooting lookup plug-ins” on page 2003.
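When the log file is large, a short script can report whether the messages mentioned in the steps above are present. The following Python sketch is a convenience example only. The phrases it searches for are taken from this section, the exact wording in your log may differ, and the function name summarize_csv_lookup_log is hypothetical.

```python
def summarize_csv_lookup_log(log_text):
    """Report which CSV lookup milestones and warnings appear in Tomcat log text."""
    markers = {
        "database_created": "creating database",
        "table_created": "create table using SQL",
        "data_imported": "importing data from file into table LOOKUP",
        "empty_query": "SQL query did not return any results",
    }
    lowered = log_text.lower()
    # Case-insensitive substring search; no log format is assumed.
    return {name: phrase.lower() in lowered for name, phrase in markers.items()}
```

Read the contents of the localhost.<latest-date>.log file into a string and pass it to the function. A result in which data_imported is False suggests that the CSV file was not loaded, and a result in which empty_query is True points to the attribute mapping.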

CSV Lookup Plug-In tutorial


This tutorial provides instructions for implementing a simple CSV Lookup Plug-In . The purpose
of this tutorial is to introduce you to the lookup plug-in feature from a hands-on approach. If
you have experience generating incidents, creating custom attributes, and implementing lookup
plug-ins this tutorial may be too basic.
See “About the CSV Lookup Plug-In ” on page 1988.
To implement a simple CSV Lookup Plug-In
1 Create the following custom attributes at System > Attributes > Custom Attributes:
■ Manager
■ Department
■ Email Address

2 Create a pipe-delimited CSV file containing the following data.

SENDER|MGR|DEPT|EMAIL
[email protected]|Merle Manager|Engineering|[email protected]

3 Save the CSV file to the same volume drive where the Enforce Server is installed.
For example: C:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\plugins\lookup\csv_lookup_file.csv.

4 Create a basic keyword policy.


See “Configuring policies” on page 413.
5 Generate an email incident.
To trigger the lookup for this example, the incident should be an SMTP incident with the
sender of the email being the address [email protected]. Change the value of sender in
the CSV to match the actual value of the email sender.
6 Create a new CSV Lookup Plug-In at System > Incident Data > Lookup Plugins > New
Plugin.
7 Configure the lookup plug-in as follows:
■ Name: CSV Lookup Plug-in
■ Description: Look up manager of email sender from CSV file.
■ CSV File Path: C:\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\plugins\lookup\csv_lookup_file.csv
■ Delimiter: Pipe [|]
■ File Encoding: UTF-8
■ Attribute Mapping
Map the system-defined attributes, custom attributes, and lookup parameter keys on
separate lines as follows:

attr.sender-email=SENDER
attr.Manager=MGR
attr.Department=DEPT
attr.Email\ Address=EMAIL
keys=SENDER
attr.sender-email = SENDER
This is a lookup parameter key from the Sender group. It is mapped to the corresponding column
header in the CSV file.

attr.Manager = MGR
This is a custom attribute defined in Step 1. It is mapped to the corresponding column header in
the CSV file.

attr.Department = DEPT
This is a custom attribute defined in Step 1. It is mapped to the corresponding column header in
the CSV file.

attr.Email\ Address = EMAIL
This is a custom attribute defined in Step 1 whose name contains a space. It is mapped to the
corresponding column head in the CSV file.

keys = SENDER
This line declares one key to perform the lookup. The lookup ceases once the first key is located,
and the attribute values are populated.

8 Save the plug-in configuration.


9 Select System > Lookup Plugins > Lookup Parameters and select the following lookup
parameter key group:

Sender This group contains the sender-email key.

10 Select System > Lookup Plugins > Modify Plugin Chain and enable the plug-in.
11 Open the Incident Snapshot for the incident generated in Step 5.
12 Verify that the unpopulated custom attributes you created in Step 1 appear in the Attributes
pane to the right of the screen.
If they do not, complete Step 1.
13 Verify that the Lookup option appears in the Attributes pane above the custom attributes.
If it does not, verify that the Lookup Attributes privilege is granted to the user.
Click Reload Plugin after making any changes.
14 Click the Lookup option.
The custom attributes should be populated with values looked up and retrieved from the
CSV file.
15 Troubleshoot the plug-in as necessary.
See “Testing and troubleshooting the CSV Lookup Plug-In ” on page 2012.

Configuring LDAP Lookup Plug-Ins


To configure one or more LDAP Lookup Plug-ins, complete these tasks.

Table 60-21 Configuring LDAP Lookup Plug-ins

Step Action Description

1 Create custom attributes. See “Configuring custom attributes” on page 1970.

2 Configure a connection to the LDAP server. A functioning connection to an LDAP server must be
available.

See “Requirements for LDAP server connections” on page 2016.

The connection to the LDAP server can be configured from the link in the LDAP Lookup Plug-In.

See “Configuring directory server connections” on page 156.

3 Create a new LDAP Lookup Plug-In. See “Creating new lookup plug-ins” on page 1995.

4 Map the attributes. Map the attributes to the corresponding LDAP directory fields. The syntax is
as follows:

attr.CustomAttributeName = search_base:
(search_filter=$variable$):
ldapAttribute

See “Mapping attributes to LDAP data” on page 2017.

See “Attribute mapping examples for LDAP” on page 2018.

5 Save and enable the plug-in. The LDAP Lookup Plug-In must be enabled on the Enforce Server.

See “Enabling lookup plug-ins” on page 2001.

6 Test and troubleshoot the LDAP Lookup Plug-In. See “Troubleshooting lookup plug-ins” on page 2003.

Requirements for LDAP server connections


The following conditions must be met for Symantec Data Loss Prevention to establish a
connection with an LDAP directory:
■ The LDAP directory must be running on a host that is accessible to the Enforce Server.
■ There must be an LDAP account that Symantec Data Loss Prevention can use. This
account must have read-only access. You must know the user name and password of the
account.
■ You must know the Fully Qualified Domain Name (FQDN) of the LDAP server (the IP
address cannot be used).
■ You must know the port on the LDAP server which the Enforce Server uses to communicate
with the LDAP server. The default is 389.

You can use an LDAP lookup tool such as Softerra LDAP Browser to confirm that you have
the correct credentials to connect to the LDAP server. Also confirm that you have the right
fields defined to populate your custom attributes.
See “About LDAP Lookup Plug-Ins” on page 1988.
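Because the LDAP server must be identified by its FQDN rather than by an IP address, you may want to check a host value before you enter it. The following Python sketch is a simple illustration that uses the standard ipaddress module. The function name looks_like_fqdn is hypothetical, and the check does not confirm that the name resolves or that the server is reachable.

```python
import ipaddress


def looks_like_fqdn(host):
    """Return True if host is a dotted name rather than a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return False  # literal IPv4 or IPv6 address
    except ValueError:
        pass
    # Require at least two non-empty labels, for example host.example.com.
    labels = host.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)
```

A value such as enforce.dlp.company.com passes the check, whereas 10.0.0.5 or a bare host name such as ldapserver does not.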

Mapping attributes to LDAP data


You map system and custom attributes to LDAP data in the Attribute Mapping field. Each
mapping is entered on a separate line. The order in which these mapping entries appear does
not matter.
The attribute mapping syntax for LDAP Lookup Plug-ins is as follows:

attr.CustomAttributeName = search_base:
(search_filter=$variable$):
ldapAttribute

The following table describes this syntax in more detail.

Table 60-22 LDAP mapping syntax details

Element Description

CustomAttributeName The name of the custom attribute as it is defined in the Enforce Server.
Note: If the name of the attribute contains white-space characters, you must
precede each instance of the white space with a backslash. A white-space
character is a space or a tab. For example, you need to enter the Business
Unit custom attribute as: attr.Business\ Unit

See “Configuring custom attributes” on page 1970.

search_base Identifies the LDAP directory.

search_filter The name of the LDAP attribute (field) that corresponds to the lookup parameter
(or other variable) passed to the plug-in from the Enforce Server.

variable The name of the lookup parameter that contains the value to be used as a key to
locate the correct data in the LDAP directory.

In cases where multiple plug-ins are chained together, the parameter might be a
variable that is passed to the LDAP Lookup Plug-In by a previous plug-in.

ldapAttribute The LDAP attribute whose data value is returned to the Enforce Server. This value
is used to populate the custom attribute that is specified in the first element of the
entry.

See “About LDAP Lookup Plug-Ins” on page 1988.


Attribute mapping examples for LDAP


The following mappings provide additional attribute mapping examples for LDAP Lookup
Plug-ins.
The following example attribute mapping searches the hr.corp LDAP directory for a record
with an attribute for mail whose value matches the value of the sender-email lookup
parameter. It returns to the Enforce Server the value of the givenName attribute for that record.

attr.First\ Name = dc=corp,dc=hr:(mail=$sender-email$):givenName

In the following attribute mapping example, a separate line is entered for each custom attribute
that is to be populated. In addition, note the use of the TempDeptCode temporary variable. The
department code is needed to obtain the department name from the LDAP hierarchy. But only
the department name needs to be stored as a custom attribute. The TempDeptCode variable
is created for this purpose.

attr.First\ Name = cn=users:(mail=$sender-email$):firstName
attr.Last\ Name = cn=users:(mail=$sender-email$):lastName
attr.TempDeptCode = cn=users:(mail=$sender-email$):deptCode
attr.Department = cn=departments:(deptCode=$TempDeptCode$):name
attr.Manager = cn=users:(mail=$sender-email$):manager
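To see how the parts of a mapping line fit together, and how a temporary variable such as TempDeptCode feeds a later search filter, consider the following Python sketch. It only models the syntax described in this section and does not contact an LDAP server. The function names parse_ldap_mapping and fill_filter are hypothetical, and the department code D42 used below is an invented sample value.

```python
import re


def parse_ldap_mapping(line):
    """Split one mapping line into (attribute, search_base, filter, ldap_attribute)."""
    left, _, right = line.partition(" = ")
    attribute = left.strip()[len("attr."):].replace("\\ ", " ")
    # The search base ends at the colon before "(", and the returned LDAP
    # attribute starts after the colon that follows the final ")".
    search_base, _, rest = right.strip().partition(":(")
    search_filter, _, ldap_attribute = rest.rpartition("):")
    return attribute, search_base, "(" + search_filter + ")", ldap_attribute


def fill_filter(search_filter, variables):
    """Replace each $name$ placeholder with its value from variables."""
    return re.sub(r"\$([^$]+)\$", lambda m: variables[m.group(1)], search_filter)
```

For the line attr.Department = cn=departments:(deptCode=$TempDeptCode$):name, the first function returns the search base cn=departments, the filter (deptCode=$TempDeptCode$), and the LDAP attribute name. If an earlier mapping stored D42 in TempDeptCode, fill_filter produces (deptCode=D42).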

Testing and troubleshooting LDAP Lookup Plug-ins


Complete these steps to troubleshoot LDAP Lookup Plug-In implementations.
See “About LDAP Lookup Plug-Ins” on page 1988.
To troubleshoot an LDAP Lookup plug-in
1 If the plug-in does not save correctly, verify the configuration.
Before using the LDAP Lookup Plug-In you should test the connection to the LDAP server.
You can use a lookup tool such as the Softerra LDAP Browser to help confirm that you
have the correct fields defined.
See “Configuring directory server connections” on page 156.
2 Make sure that the plug-in is enabled.
3 Make sure that you created the Custom Attribute definitions.
In particular, check the attribute mapping. The attribute names must be identical.
4 If you made changes, or edited the lookup parameter keys, reload the plug-in.
See “Reloading lookup plug-ins” on page 2002.
5 Select Incidents > All Incidents for the detection server you are using to detect the
incident.
6 Select (check) several incidents and select Lookup Attributes from the Incident Actions
drop-down menu. (This action looks up attribute values for all incidents for that form of
detection.)
7 Check the Incident Snapshot screen for an incident. Verify that the Lookup Custom
Attributes are filled with entries retrieved from the LDAP lookup.
8 If the correct values are not populated, or there is no value in a custom attribute you have
defined, make sure that no connection errors are recorded in the Incident History
tab.
9 Check the Tomcat log file.
See “Troubleshooting lookup plug-ins” on page 2003.

LDAP Lookup Plug-In tutorial


This tutorial provides steps for implementing a simple LDAP Lookup Plug-In .
To implement an LDAP Lookup Plug-In
1 Create the following custom attributes at System > Attributes > Custom Attributes:
LDAP givenName
LDAP telephoneNumber
2 Create a directory connection for the Active Directory server at System > Settings >
Directory Connections.
For example:
■ Hostname: enforce.dlp.company.com
■ Port: 389
■ Base DN: dc=enforce,dc=dlp,dc=com
■ Encryption: None
■ Authentication: Authenticated
■ username: userName
■ password: password

3 Test the connection. The system indicates if the connection is successful.


4 Create a new LDAP plug-in at System > Lookup Plugins > New Plugin > LDAP.
Name: LDAP Lookup Plug-in
Description: Description for the LDAP Plug-in.
5 Select the directory connection created in Step 2.


6 Map the attributes to LDAP metadata.

attr.LDAP\ givenName = cn=users:(|(givenName=$endpoint-user-name$)(mail=$sender-email$)(streetAddress=$discoverserver$)):givenName
attr.LDAP\ telephoneNumber = cn=users:(|(givenName=$endpoint-user-name$)(mail=$sender-email$)(streetAddress=$discoverserver$)):telephoneNumber

7 Save the plug-in. Verify that the correct save message for the plug-in is displayed.
8 Enable the following keys at the System > Lookup Plugins > Lookup Parameters page.
■ Incident
■ Message
■ Sender

9 Create an incident that generates one of the lookup parameters. For example, an email
incident exposes the sender-email attribute. There must be some corresponding information
in the Active Directory server.
10 Open the Incident Snapshot for the incident.
11 Click the Lookup button and verify that the custom attributes created in Step 1 are
populated in the right panel.

Configuring Script Lookup Plug-Ins


Complete these steps to implement one or more Script Lookup Plug-Ins to look up external
information.
See “Writing scripts for Script Lookup Plug-Ins” on page 2021.

Table 60-23 Configuring a Script Lookup Plug-In

Step Action Description

1 Create custom attributes. See “Configuring custom attributes” on page 1970.

2 Create the script. See “Writing scripts for Script Lookup Plug-Ins” on page 2021.

3 Define the Lookup Parameter Keys. Select the keys to use to extract custom attribute data.

See “Selecting lookup parameters” on page 1996.

4 Create a new Script Plugin. See “Creating new lookup plug-ins” on page 1995.

5 Enter the Script Command. This value is the local path to the script engine executable on the
Enforce Server host.

See “Specifying the Script Command” on page 2022.

6 Specify the Arguments. This value is the path to the Python script file to use for attribute lookup
and any command line arguments. Begin the script path with the -u argument to improve
lookup performance.

See “Specifying the Arguments” on page 2023.

7 Enable the stdin and stdout options. Enable both options to help prevent script injection attacks.

See “Enabling the stdin and stdout options” on page 2023.

8 Optionally, enable protocol filtering. You can specify the incident types by protocol for passing
attribute values to lookup scripts.

See “Enabling incident protocol filtering for scripts” on page 2024.

9 Optionally, enable and encrypt credentials. You can encrypt and pass credentials required by the
script to connect to external systems.

See “Enabling and encrypting script credentials” on page 2025.

10 Save the plugin. Verify that the correct save message for the plugin is displayed.

See “Creating new lookup plug-ins” on page 1995.

11 Enable the lookup plugin. You can chain scripts together and chain scripts with other lookup plugins.

12 Test the lookup plugin. See “Troubleshooting lookup plug-ins” on page 2003.

Writing scripts for Script Lookup Plug-Ins


If you are using the Script Lookup Plug-In , you must write a script to extract data and populate
the custom attributes of each incident. The Script Lookup Plug-In passes attributes to scripts
as key-value pairs. In return, scripts must output a set of key-value pairs to standard out
(stdout). The plugin uses these key-value pairs to populate custom attributes.

When writing scripts for use with the Script Lookup Plug-In , adhere to the following syntax
requirements and calling conventions, including how a script plugin passes arguments to
scripts and the required format for script output.

Table 60-24 Script plugin calling conventions

Convention: Input
Syntax: attribute_name=attribute_value
Description: The Script Lookup Plug-In passes attributes to scripts as command-line parameters in
the form key=value.

Convention: Output
Syntax: stdout
Description: To work with the plugin and populate attributes, scripts must output a set of key-value
pairs to standard out (stdout). Newline characters must separate output key-value pairs. For example:

host-name=mycomputer.company.corp
username=DOMAIN\bsmith

Convention: Exit code
Syntax: 0
Description: Scripts must exit with an exit code of 0. If scripts exit with any other code, the Enforce
Server assumes that an error has occurred in script execution and terminates the attribute lookup.

Convention: Error handling
Syntax: stderr to a file
Description: Scripts cannot print out error or debug information. Redirect stderr to a file. In Python
this would be:

import sys
fsock = open(r"C:\error.log", "a")
sys.stderr = fsock

See “Example script” on page 2029.
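The calling conventions in Table 60-24 can be brought together in a short script skeleton. The following Python sketch is a minimal illustration under stated assumptions and is not the example script referenced above. The output key EmailDomain stands for a hypothetical custom attribute, the rule that derives it is invented for this example, and the log file name lookup-error.log is a placeholder.

```python
import sys


def parse_arguments(argv):
    """Collect key=value command-line parameters into a dictionary."""
    attributes = {}
    for item in argv:
        key, sep, value = item.partition("=")
        if sep:
            attributes[key] = value
    return attributes


def build_output(attributes):
    """Return the newline-separated key=value text to print on stdout."""
    # Hypothetical rule for illustration: derive an attribute named
    # EmailDomain from the sender-email lookup parameter.
    sender = attributes.get("sender-email", "")
    domain = sender.partition("@")[2]
    pairs = [("EmailDomain", domain)] if domain else []
    return "".join("%s=%s\n" % pair for pair in pairs)


def main(argv):
    # Keep diagnostics off stdout; the log file name is only an example.
    sys.stderr = open("lookup-error.log", "a")
    sys.stdout.write(build_output(parse_arguments(argv)))
    return 0  # any other exit code is treated as a failed lookup


# A real script would end with: sys.exit(main(sys.argv[1:]))
```

Keeping the parsing and output steps in separate functions makes it possible to test the lookup logic without the Enforce Server. When the script is invoked with sender-email=ann@example.com as a parameter, it writes the single line EmailDomain=example.com to stdout.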

Specifying the Script Command


The Script Command field specifies the path to the script engine for executing the script.
These instructions are specific to Python.
To specify the script command
1 Download and install version 2.6 of Python on the Enforce Server host, if you have not
already done so.
2 Enter the local path to the python.exe executable file.
For example:
■ Windows: c:\python26\python.exe
■ Linux: /usr/local/bin/python

3 Enter the Arguments.


See “Specifying the Arguments” on page 2023.

Specifying the Arguments


The Arguments field specifies the path to the script and any additional command line
arguments. These instructions are specific to Python.
To specify the Arguments
1 After writing a script, copy it to the Enforce Server host, or to a file share that is accessible
by the Enforce Server.
2 Make sure that permissions are set correctly on the directory and the script file.
Both the directory and file must be readable and executable by the protect user.
3 Enter the -u argument in the Arguments field.
This argument forces stdin, stdout, and stderr to be totally unbuffered, which improves
lookup performance.
4 Enter the fully qualified path to the script file.
For example:
■ Windows: -u,c:\python26\scripts\ip-lookup.py
■ Linux: -u,/opt/python26/scripts/ip-lookup.py

Note: The system does not validate the file location.

5 Save the plugin configuration.

Enabling the stdin and stdout options


When you configure a Script Lookup Plug-In you can choose to Enable stdin and Enable
stdout. If these options are enabled, the system checks the script input and output for unsafe
characters such as command delimiters and logical operators that could be exploited by a
UNIX or Windows shell.
Because you are running the script on the host where the Enforce Server is installed, you
should enable both options, unless you are certain that your script is safe. If enabled, the logs
will indicate invalid and unescaped characters.
See Table 60-25 on page 2024.

Table 60-25 Invalid characters for attribute names

Invalid character Description

Empty string
Empty strings are not allowed.

@
.
Attributes containing these characters will be ignored during processing if the stdin and stdout
options are enabled.

$
%
Attributes containing the $ and % characters are allowed if these characters are properly escaped
by a backslash.
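If a script needs to emit text that contains $ or % while the stdin and stdout options are enabled, it can escape those characters first. The following Python sketch shows one way to do so. The function name escape_attribute_text is hypothetical, and the sketch assumes that a single preceding backslash is the escaping that the system expects.

```python
import re


def escape_attribute_text(text):
    """Backslash-escape every '$' and '%' that is not already escaped."""
    # The lookbehind skips characters already preceded by a backslash.
    return re.sub(r"(?<!\\)([$%])", r"\\\1", text)
```

Applying the function twice gives the same result as applying it once, so it is safe to call on text that may already be escaped.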

Enabling incident protocol filtering for scripts


Optionally, you can specify the incident types (by protocol) for passing attribute values to lookup
scripts. If you do not enable protocol filtering, your Script Lookup Plug-In will apply to all
incidents.
For example, you can limit the passing of attribute values to those incidents that are detected
over HTTP. When you filter by protocol, Enforce Server still captures the incidents that are
detected over other protocols. But it does not use the Script Lookup Plug-In to populate those
incidents with attribute values.

To enable protocol filtering


1 Navigate to the System > Lookup Plugins > Edit Script Lookup Plugin screen in the
Enforce Server administration console.
See “Configuring Script Lookup Plug-Ins” on page 2020.
2 At the Script Lookup Plugin screen, select (check) the Enable protocol filtering option.
This action displays all the protocols that are available for filtering. Note that protocols are
detection server-specific.

Note: Network protocols are configured at the System > Settings > Protocols screen.
Endpoint protocols are configured at the System > Agents > Agent Configuration screen.
Discover protocols are configured at the Policies > Discover Scanning > Discover
Targets screen. After an incident is generated, the protocol value for the incident is displayed
at the top of the Incident Snapshot screen.

3 Specify the protocols you want to include in the lookup.


If you enable protocol filtering, you must select at least one protocol on which to filter.
4 Save the plug-in configuration.

Enabling and encrypting script credentials


If your script is connecting to an external system that requires credentials, you can enable
credentials for your script. If you enable credentials through the user interface option, you must
encrypt them. Symantec Data Loss Prevention provides the Credential Utility, which lets you
encrypt credentials and use them to authenticate to an external data source.
When the Enforce Server invokes the Script Lookup Plug-In , the plug-in decrypts any
credentials at runtime and passes them to the script as attributes. The credentials are then
available for use within the script. The Credential Utility uses the same platform encryption
keys that are used to protect user accounts and incident information within the Symantec Data
Loss Prevention system.
See Table 60-26 on page 2026.
If you choose to use credentials in clear text, you must hard-code them into your script. In this
case, the Enforce Server passes the values you exported to the clear-text credential file. These
values are passed in the following format: key=value.
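
As a minimal sketch of how a script might consume values in the key=value format, the following Python function turns a list of such strings into a dictionary. The function name parse_attributes is hypothetical, and the assumption that the values reach the script as a list of strings (for example, command-line arguments) is made only for illustration; check how your script actually receives its attributes.

```python
def parse_attributes(args):
    """Convert strings such as 'username=msantos' into a dictionary.

    Hypothetical helper: assumes each attribute arrives as one
    'key=value' string.
    """
    attributes = {}
    for arg in args:
        # Split on the first '=' only, so a value may itself contain '='.
        key, separator, value = arg.partition("=")
        if not separator or not key:
            # Ignore anything that is not in key=value form.
            continue
        attributes[key] = value
    return attributes
```

A script could then read attributes["username"] and attributes["password"] when it connects to the external system. Avoid writing credential values to logs or to standard output.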

Table 60-26 Enabling and encrypting credentials

Step 1
Action: Create a text file that contains the credentials that are needed by the script
to access the appropriate external systems.
Description: The format of this file is key=value, where key is the name of the credential.
For example:
    username=msantos
    password=esperanza9

Step 2
Action: Save this credential file to the file system local to the Enforce Server.
Description: The file needs to be saved to the Enforce Server temporarily.
For example: C:\temp\MyCredentials.txt.

Step 3
Action: On the Enforce Server, open a shell or command prompt and change directories to
\Program Files\Symantec\DataLossPrevention\EnforceServer\15.5\Protect\bin.
Description: This directory on the Enforce Server contains the Credential Generator Utility.

Step 4
Action: Issue a command to generate an encrypted credential file.
Description: The command syntax is as follows:
    CredentialGenerator.bat in-cleartext-filepath out-encrypted-filepath
For example, on Windows you would issue the following:
    CredentialGenerator.bat C:\temp\MyCredentials.txt C:\temp\MyCredentialsEncrypted.txt
You can open the output file in a text editor to verify that it is encrypted.

Step 5
Action: Select Enable Credentials.
Description: At the System > Lookup Plugins > Edit Script Lookup Plugin page, select (check)
the Enable Credentials option.

Step 6
Action: Enter the Credentials File Path.
Description: Enter the fully qualified path to the encrypted credentials file.
For example:
