Informatica Data Privacy Management User Guide
10.5.2
April 2022
© Copyright Informatica LLC 2015, 2022
This software and documentation are provided only under a separate license agreement containing restrictions on use and disclosure. No part of this document may be
reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica LLC.
Informatica, the Informatica logo, [and any other trademarks appearing in the document] are trademarks or registered trademarks of Informatica LLC in the United
States and many jurisdictions throughout the world. A current list of Informatica trademarks is available on the web at https://www.informatica.com/trademarks.html.
Other company and product names may be trade names or trademarks of their respective owners.
Subject to your opt-out rights, the software will automatically transmit to Informatica in the USA information about the computing and network environment in which the
Software is deployed and the data usage and system statistics of the deployment. This transmission is deemed part of the Services under the Informatica privacy policy
and Informatica will use and otherwise process this information in accordance with the Informatica privacy policy available at https://www.informatica.com/in/
privacy-policy.html. You may disable usage collection in the Administrator tool.
Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies. All rights
reserved. Copyright © Sun Microsystems. All rights reserved. Copyright © RSA Security Inc. All Rights Reserved. Copyright © Ordinal Technology Corp. All rights
reserved. Copyright © Aandacht c.v. All rights reserved. Copyright Genivia, Inc. All rights reserved. Copyright Isomorphic Software. All rights reserved. Copyright © Meta
Integration Technology, Inc. All rights reserved. Copyright © Intalio. All rights reserved. Copyright © Oracle. All rights reserved. Copyright © Adobe Systems Incorporated.
All rights reserved. Copyright © DataArt, Inc. All rights reserved. Copyright © ComponentSource. All rights reserved. Copyright © Microsoft Corporation. All rights
reserved. Copyright © Rogue Wave Software, Inc. All rights reserved. Copyright © Teradata Corporation. All rights reserved. Copyright © Yahoo! Inc. All rights reserved.
Copyright © Glyph & Cog, LLC. All rights reserved. Copyright © Thinkmap, Inc. All rights reserved. Copyright © Clearpace Software Limited. All rights reserved. Copyright
© Information Builders, Inc. All rights reserved. Copyright © OSS Nokalva, Inc. All rights reserved. Copyright Edifecs, Inc. All rights reserved. Copyright Cleo
Communications, Inc. All rights reserved. Copyright © International Organization for Standardization 1986. All rights reserved. Copyright © ej-technologies GmbH. All
rights reserved. Copyright © Jaspersoft Corporation. All rights reserved. Copyright © International Business Machines Corporation. All rights reserved. Copyright ©
yWorks GmbH. All rights reserved. Copyright © Lucent Technologies. All rights reserved. Copyright © University of Toronto. All rights reserved. Copyright © Daniel
Veillard. All rights reserved. Copyright © Unicode, Inc. Copyright IBM Corp. All rights reserved. Copyright © MicroQuill Software Publishing, Inc. All rights reserved.
Copyright © PassMark Software Pty Ltd. All rights reserved. Copyright © LogiXML, Inc. All rights reserved. Copyright © 2003-2010 Lorenzi Davide, All rights reserved.
Copyright © Red Hat, Inc. All rights reserved. Copyright © The Board of Trustees of the Leland Stanford Junior University. All rights reserved. Copyright © EMC
Corporation. All rights reserved. Copyright © Flexera Software. All rights reserved. Copyright © Jinfonet Software. All rights reserved. Copyright © Apple Inc. All rights
reserved. Copyright © Telerik Inc. All rights reserved. Copyright © BEA Systems. All rights reserved. Copyright © PDFlib GmbH. All rights reserved. Copyright ©
Orientation in Objects GmbH. All rights reserved. Copyright © Tanuki Software, Ltd. All rights reserved. Copyright © Ricebridge. All rights reserved. Copyright © Sencha,
Inc. All rights reserved. Copyright © Scalable Systems, Inc. All rights reserved. Copyright © jQWidgets. All rights reserved. Copyright © Tableau Software, Inc. All rights
reserved. Copyright© MaxMind, Inc. All Rights Reserved. Copyright © TMate Software s.r.o. All rights reserved. Copyright © MapR Technologies Inc. All rights reserved.
Copyright © Amazon Corporate LLC. All rights reserved. Copyright © Highsoft. All rights reserved. Copyright © Python Software Foundation. All rights reserved.
Copyright © BeOpen.com. All rights reserved. Copyright © CNRI. All rights reserved.
This product includes software developed by the Apache Software Foundation (http://www.apache.org/), and/or other software which is licensed under various
versions of the Apache License (the "License"). You may obtain a copy of these Licenses at http://www.apache.org/licenses/. Unless required by applicable law or
agreed to in writing, software distributed under these Licenses is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
or implied. See the Licenses for the specific language governing permissions and limitations under the Licenses.
This product includes software which was developed by Mozilla (http://www.mozilla.org/), software copyright The JBoss Group, LLC, all rights reserved; software
copyright © 1999-2006 by Bruno Lowagie and Paulo Soares and other software which is licensed under various versions of the GNU Lesser General Public License
Agreement, which may be found at http://www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any
kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.
The product includes ACE(TM) and TAO(TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University, University of California,
Irvine, and Vanderbilt University, Copyright (©) 1993-2006, all rights reserved.
This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (copyright The OpenSSL Project. All Rights Reserved) and
redistribution of this software is subject to terms available at http://www.openssl.org and http://www.openssl.org/source/license.html.
This product includes Curl software which is Copyright 1996-2013, Daniel Stenberg, <[email protected]>. All Rights Reserved. Permissions and limitations regarding this
software are subject to terms available at http://curl.haxx.se/docs/copyright.html. Permission to use, copy, modify, and distribute this software for any purpose with or
without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
The product includes software copyright 2001-2005 (©) MetaStuff, Ltd. All Rights Reserved. Permissions and limitations regarding this software are subject to terms
available at http://www.dom4j.org/ license.html.
The product includes software copyright © 2004-2007, The Dojo Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to
terms available at http://dojotoolkit.org/license.
This product includes ICU software which is copyright International Business Machines Corporation and others. All rights reserved. Permissions and limitations
regarding this software are subject to terms available at http://source.icu-project.org/repos/icu/icu/trunk/license.html.
This product includes software copyright © 1996-2006 Per Bothner. All rights reserved. Your right to use such materials is set forth in the license which may be found at
http://www.gnu.org/software/kawa/Software-License.html.
This product includes OSSP UUID software which is Copyright © 2002 Ralf S. Engelschall, Copyright © 2002 The OSSP Project Copyright © 2002 Cable & Wireless
Deutschland. Permissions and limitations regarding this software are subject to terms available at http://www.opensource.org/licenses/mit-license.php.
This product includes software developed by Boost (http://www.boost.org/) or under the Boost software license. Permissions and limitations regarding this software
are subject to terms available at http://www.boost.org/LICENSE_1_0.txt.
This product includes software copyright © 1997-2007 University of Cambridge. Permissions and limitations regarding this software are subject to terms available at
http://www.pcre.org/license.txt.
This product includes software copyright © 2007 The Eclipse Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms
available at http://www.eclipse.org/org/documents/epl-v10.php and at http://www.eclipse.org/org/documents/edl-v10.php.
This product includes software licensed under the terms at http://www.tcl.tk/software/tcltk/license.html, http://www.bosrup.com/web/overlib/?License, http://
www.stlport.org/doc/license.html, http://asm.ow2.org/license.html, http://www.cryptix.org/LICENSE.TXT, http://hsqldb.org/web/hsqlLicense.html, http://
httpunit.sourceforge.net/doc/license.html, http://jung.sourceforge.net/license.txt, http://www.gzip.org/zlib/zlib_license.html, http://www.openldap.org/software/
release/license.html, http://www.libssh2.org, http://slf4j.org/license.html, http://www.sente.ch/software/OpenSourceLicense.html, http://fusesource.com/downloads/
license-agreements/fuse-message-broker-v-5-3- license-agreement; http://antlr.org/license.html; http://aopalliance.sourceforge.net/; http://www.bouncycastle.org/
licence.html; http://www.jgraph.com/jgraphdownload.html; http://www.jcraft.com/jsch/LICENSE.txt; http://jotm.objectweb.org/bsd_license.html; . http://www.w3.org/
Consortium/Legal/2002/copyright-software-20021231; http://www.slf4j.org/license.html; http://nanoxml.sourceforge.net/orig/copyright.html; http://www.json.org/
license.html; http://forge.ow2.org/projects/javaservice/, http://www.postgresql.org/about/licence.html, http://www.sqlite.org/copyright.html, http://www.tcl.tk/
software/tcltk/license.html, http://www.jaxen.org/faq.html, http://www.jdom.org/docs/faq.html, http://www.slf4j.org/license.html; http://www.iodbc.org/dataspace/
iodbc/wiki/iODBC/License; http://www.keplerproject.org/md5/license.html; http://www.toedter.com/en/jcalendar/license.html; http://www.edankert.com/bounce/
index.html; http://www.net-snmp.org/about/license.html; http://www.openmdx.org/#FAQ; http://www.php.net/license/3_01.txt; http://srp.stanford.edu/license.txt;
http://www.schneier.com/blowfish.html; http://www.jmock.org/license.html; http://xsom.java.net; http://benalman.com/about/license/; https://github.com/CreateJS/
EaselJS/blob/master/src/easeljs/display/Bitmap.js; http://www.h2database.com/html/license.html#summary; http://jsoncpp.sourceforge.net/LICENSE; http://
jdbc.postgresql.org/license.html; http://protobuf.googlecode.com/svn/trunk/src/google/protobuf/descriptor.proto; https://github.com/rantav/hector/blob/master/
LICENSE; http://web.mit.edu/Kerberos/krb5-current/doc/mitK5license.html; http://jibx.sourceforge.net/jibx-license.html; https://github.com/lyokato/libgeohash/blob/
master/LICENSE; https://github.com/hjiang/jsonxx/blob/master/LICENSE; https://code.google.com/p/lz4/; https://github.com/jedisct1/libsodium/blob/master/
LICENSE; http://one-jar.sourceforge.net/index.php?page=documents&file=license; https://github.com/EsotericSoftware/kryo/blob/master/license.txt; http://www.scala-
lang.org/license.html; https://github.com/tinkerpop/blueprints/blob/master/LICENSE.txt; http://gee.cs.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/
intro.html; https://aws.amazon.com/asl/; https://github.com/twbs/bootstrap/blob/master/LICENSE; https://sourceforge.net/p/xmlunit/code/HEAD/tree/trunk/
LICENSE.txt; https://github.com/documentcloud/underscore-contrib/blob/master/LICENSE, and https://github.com/apache/hbase/blob/master/LICENSE.txt.
This product includes software licensed under the Academic Free License (http://www.opensource.org/licenses/afl-3.0.php), the Common Development and
Distribution License (http://www.opensource.org/licenses/cddl1.php) the Common Public License (http://www.opensource.org/licenses/cpl1.0.php), the Sun Binary
Code License Agreement Supplemental License Terms, the BSD License (http://www.opensource.org/licenses/bsd-license.php), the new BSD License (http://
opensource.org/licenses/BSD-3-Clause), the MIT License (http://www.opensource.org/licenses/mit-license.php), the Artistic License (http://www.opensource.org/
licenses/artistic-license-1.0) and the Initial Developer’s Public License Version 1.0 (http://www.firebirdsql.org/en/initial-developer-s-public-license-version-1-0/).
This product includes software copyright © 2003-2006 Joe Walnes, 2006-2007 XStream Committers. All rights reserved. Permissions and limitations regarding this
software are subject to terms available at http://xstream.codehaus.org/license.html. This product includes software developed by the Indiana University Extreme! Lab.
For further information please visit http://www.extreme.indiana.edu/.
This product includes software Copyright (c) 2013 Frank Balluffi and Markus Moeller. All rights reserved. Permissions and limitations regarding this software are subject
to terms of the MIT license.
DISCLAIMER: Informatica LLC provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the implied
warranties of noninfringement, merchantability, or use for a particular purpose. Informatica LLC does not warrant that this software or documentation is error free. The
information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation
is subject to change at any time without notice.
NOTICES
This Informatica product (the "Software") includes certain drivers (the "DataDirect Drivers") from DataDirect Technologies, an operating company of Progress Software
Corporation ("DataDirect") which are subject to the following terms and conditions:
1. THE DATADIRECT DRIVERS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT INFORMED OF THE POSSIBILITIES
OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT LIMITATION, BREACH OF CONTRACT, BREACH
OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.
The information in this documentation is subject to change without notice. If you find any problems in this documentation, report them to us at
[email protected].
Informatica products are warranted according to the terms and conditions of the agreements under which they are provided. INFORMATICA PROVIDES THE
INFORMATION IN THIS DOCUMENT "AS IS" WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT.
Part I: Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Adding the Maximum Risk Score to Data Store Details. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Adding Custom Risk Score Factors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Changing Anomaly Factor Weights. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Changing the Default Risk Cost. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Changing the Default Conformance Ranges. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Changing the Default Anomaly Severity Ranges. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Configuring Axon Data Governance Integration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Configuring a Brand Logo. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Adding a Custom Logo. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Image File Specifications. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Chapter 3: Locations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Locations Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Locations Workspace. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Location Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Location Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Creating a Location. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Exporting Locations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Importing Locations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Editing a Location. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Copying a Location. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Changing Data Store Location Assignments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Deleting a Location. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Chapter 5: Extensions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Extensions Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Extension Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Supported Data Store Types for Protection Extension Plugins. . . . . . . . . . . . . . . . . . . . . . . . . 66
Extensions Workspace. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Extensions List Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Extension Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Custom Extension Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Email Extension Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Encryption Protection Extension Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Persistent Data Masking - Big Data Protection Extension Properties. . . . . . . . . . . . . . . . . . 70
Persistent Data Masking - Remote Domain Protection Extension Properties. . . . . . . . . . . . . 71
Service Management Extension Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
System Log Extension Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Extension Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Adding or Updating a Protection Extension on the Data Domain Details Page. . . . . . . . . . . . 74
Creating an Extension. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Editing an Extension. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Deleting an Extension. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Synchronizing Users. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Importing Data Stores. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Importing Data Store Aliases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
Importing Data Owners. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
Importing Protection Statuses. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Importing Custom Lineage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Importing Connection Assignments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Importing Catalog Metadata. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Importing Catalog Resources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
Exporting Data Breach Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Exporting Data Store Aliases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Exporting Data Store Configurations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Merging Data Stores. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Deleting a Data Store. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Snowflake Connection Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
Snowflake Advanced Scanner Data Store Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
SQL Server Integration Services Connection Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Creating a Classification Policy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
Exporting Classification Policies. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
Importing Classification Policies. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Editing a Classification Policy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
Copying a Classification Policy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
Deleting a Classification Policy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
Exporting Scans. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
Evaluate Classification Policies Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
Evaluate Risk Scores Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
Evaluate Security Policy Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
File Management Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
Rules and Guidelines for File Management Jobs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
Import Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
Rules and Guidelines for Import Jobs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Import Catalog Results Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Incremental Scan Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
Orchestration Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
Protection Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
Reset Classification Results Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
Salesforce Import Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
Subject Data Report Purge Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
Subject Registry Scan Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
Privacy Dashboard Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
Sync Catalog Updates Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
Synchronize Users Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
Job Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
Pausing a Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
Resuming a Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
Stopping a Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
Terminating a Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
Exporting Jobs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
Downloading Scan Job Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
Downloading Rejected Records for an Import Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
Deleting an LDAP Connection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
Import and Synchronization from Salesforce. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
Creating a Salesforce Connection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
Running Salesforce Synchronization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
Editing a Salesforce Connection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
Deleting a Salesforce Connection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
User Import from a CSV File. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
CSV File Format to Import Users. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
Example CSV File to Import Users. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
Rules and Guidelines to Import Users. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
Steps to Import Users. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
User Group Import from a CSV File. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
CSV File Format to Import User Groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
Example of CSV File to Import User Groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
Rules and Guidelines to Import User Groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
Steps to Import Groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
User Group Membership Import from a CSV File. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
CSV File Format to Import Group Memberships. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
Example of CSV File to Import Group Memberships. . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Rules and Guidelines to Import Group Memberships. . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Steps to Import Group Memberships. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
User Access Import from a CSV File. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
CSV File Format to Import User Access. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
Example of CSV File to Import User Access. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
Rules and Guidelines to Import User Access. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
Steps to Import User Access. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
User Aliases Import from a CSV File. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
CSV File Format to Import User Aliases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
Example of CSV File to Import User Aliases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
Rules and Guidelines to Import User Aliases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
Steps to Import User Aliases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
Export of Users and User Access. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
Exporting Users. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
Exporting User Groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
Exporting User Group Memberships. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
Exporting User Access. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
Exporting User Aliases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
Top Anomalous Factors Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
Top Data Stores Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
Top Users Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
Anomalous Factors View. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
Data Stores View. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
Users View. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
Anomalies List. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
Anomaly Details. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
Anomalous Factors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
Anomaly Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Flagging or Unflagging an Anomaly. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
Marking an Anomaly as Read or Unread. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
Filtering Anomalies. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
Exporting Anomalies. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
Deleting an Anomaly. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
Suppressing an Anomaly. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
Chapter 19: Manual Actions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
Manual Actions Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
Creating a Service Management Ticket. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386
Creating a Service Management Ticket from the Anomaly Detection Workspace. . . . . . . . . 386
Creating a Service Management Ticket from the Proliferation Page. . . . . . . . . . . . . . . . . . 388
Creating a Service Management Ticket from the Security Policy Violations Workspace. . . . . 390
Creating a Service Management Ticket from the Sensitive Fields Page. . . . . . . . . . . . . . . . 392
Creating a Service Management Ticket from the Top Data Domains List Page. . . . . . . . . . . 395
Creating a Service Management Ticket from the Top Data Stores Grid Page. . . . . . . . . . . . 397
Protecting Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400
Protecting Data from the Anomaly Detection Workspace. . . . . . . . . . . . . . . . . . . . . . . . . 400
Protecting Data from the Proliferation Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
Protecting Data from the Security Policy Violations Workspace. . . . . . . . . . . . . . . . . . . . 403
Protecting Data from the Sensitive Fields Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
Protecting Data from the Top Data Domains List Page. . . . . . . . . . . . . . . . . . . . . . . . . . 406
Protecting Data from the Top Data Stores Grid Page. . . . . . . . . . . . . . . . . . . . . . . . . . . 408
Running a Custom Script. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
Running a Custom Script from the Anomaly Detection Workspace. . . . . . . . . . . . . . . . . . . 409
Running a Custom Script from the Proliferation Page. . . . . . . . . . . . . . . . . . . . . . . . . . . 411
Running a Custom Script from the Security Policy Violations Workspace. . . . . . . . . . . . . . 413
Running a Custom Script from the Sensitive Fields Page. . . . . . . . . . . . . . . . . . . . . . . . . 415
Running a Custom Script from the Top Data Domains List Page. . . . . . . . . . . . . . . . . . . . 417
Running a Custom Script from the Top Data Stores Grid Page. . . . . . . . . . . . . . . . . . . . . 419
Sending an Email. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
Sending an Email from the Anomaly Detection Workspace. . . . . . . . . . . . . . . . . . . . . . . 421
Sending an Email from the Proliferation Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
Sending an Email from the Security Policy Violations Workspace. . . . . . . . . . . . . . . . . . . 425
Sending an Email from the Sensitive Fields Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
Sending an Email from the Top Data Domains List Page. . . . . . . . . . . . . . . . . . . . . . . . . 429
Sending an Email from the Top Data Stores Grid Page. . . . . . . . . . . . . . . . . . . . . . . . . . 431
Writing a System Log Message. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
Writing a System Log Message from the Anomaly Detection Workspace. . . . . . . . . . . . . . . 434
Writing a System Log Message from the Proliferation Page. . . . . . . . . . . . . . . . . . . . . . . 436
Writing a System Log Message from the Security Policy Violations Workspace. . . . . . . . . . 438
Writing a System Log Message from the Sensitive Fields Page. . . . . . . . . . . . . . . . . . . . . 440
Writing a System Log Message from the Top Data Domains List Page. . . . . . . . . . . . . . . . 442
Writing a System Log Message from the Top Data Stores Grid Page. . . . . . . . . . . . . . . . . 443
Data Store Policy Example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
Decryption Policy Example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
User Activity Policy Example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
Security Policies Workspace. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
Security Policies List Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
Security Policy Details Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
Security Policy Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
Security Policy Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
Creating a Security Policy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
Editing a Security Policy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Copying a Security Policy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Deleting a Security Policy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Exporting Violations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478
Deleting a Violation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
Enterprise-Level Summary of Sensitive Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
Discovery Bar Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512
Risk Score Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512
Protection Status Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
Sensitivity Level Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515
Residual Risk Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515
Security Dashboard Indicators. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516
Sensitive Data by Data Store Groups Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
Sensitive Data for Location Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518
Sensitive Data Proliferation Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519
Top Data Domains Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520
Top Data Stores Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 522
Top Departments Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 524
Top Users Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
User Access Indicator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528
Pages Accessed from the Security Dashboard. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
Data Domain Details Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
Data Domains List Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
Data Stores Grid Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
Data Stores List Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
Departments Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 538
Proliferation Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
Sensitive Data by Data Store Group Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542
Sensitive Data Proliferation by Classification Policy Page. . . . . . . . . . . . . . . . . . . . . . . . 543
Sensitive Data Proliferation by Data Store Group Page. . . . . . . . . . . . . . . . . . . . . . . . . . 544
Sensitive Data Locations Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
Sensitive Fields Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 546
Sensitive Files Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555
User Access Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561
User Activity Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 562
User Profile Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564
Users Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567
Security Dashboard Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569
Customizing the Security Dashboard. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569
Filtering Information on the Security Dashboard. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570
Exporting Information on the Security Dashboard. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576
Summary Workspace Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587
Exporting Information from the Summary Workspaces. . . . . . . . . . . . . . . . . . . . . . . . . . 587
Subject Registry Workspace. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 611
Subject Registry Search. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 612
Subject Registry Summary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613
Subject Registry Workspace Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613
Importing a Master List of Categories. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 614
Importing a Master List of Purposes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 614
Subject Details Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 615
Subject Registry Data Store Details Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 616
DSAR Report Formats. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
Customize the DSAR Report PDF Format. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 619
Subject Details Page Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620
Applying and Removing Legal Holds. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 621
Preparing a DSAR Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 622
Downloading a DSAR Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 622
Creating a Service Management Ticket. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623
Running a Custom Script. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
Sending an Email. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
Appendix B: Updating Keystore and Truststore Certificates to Maintain
Secure Communication. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 646
Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 646
Updating the Keystore and Truststore with Informatica Certificates. . . . . . . . . . . . . . . . . . . . . 646
Updating the Keystore and Truststore with OpenSSL-Generated Certificates. . . . . . . . . . . . . . . 647
Index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
Preface
Use the Data Privacy Management User Guide to learn how to use the Data Privacy Management application.
You can discover, classify, and protect sensitive data, detect user activity on sensitive data, view data
subject information, and generate data subject reports for compliance with data privacy regulations.
Informatica Resources
Informatica provides you with a range of product resources through the Informatica Network and other online
portals. Use the resources to get the most from your Informatica products and solutions and to learn from
other Informatica users and subject matter experts.
Informatica Network
The Informatica Network is the gateway to many resources, including the Informatica Knowledge Base and
Informatica Global Customer Support. To enter the Informatica Network, visit
https://network.informatica.com.
To search the Knowledge Base, visit https://search.informatica.com. If you have questions, comments, or
ideas about the Knowledge Base, contact the Informatica Knowledge Base team at
[email protected].
Informatica Documentation
Use the Informatica Documentation Portal to explore an extensive library of documentation for current and
recent product releases. To explore the Documentation Portal, visit https://docs.informatica.com.
If you have questions, comments, or ideas about the product documentation, contact the Informatica
Documentation team at [email protected].
Informatica Product Availability Matrices
Product Availability Matrices (PAMs) indicate the versions of the operating systems, databases, and types of
data sources and targets that a product release supports. You can browse the Informatica PAMs at
https://network.informatica.com/community/informatica-network/product-availability-matrices.
Informatica Velocity
Informatica Velocity is a collection of tips and best practices developed by Informatica Professional Services
and based on real-world experiences from hundreds of data management projects. Informatica Velocity
represents the collective knowledge of Informatica consultants who work with organizations around the
world to plan, develop, deploy, and maintain successful data management solutions.
You can find Informatica Velocity resources at http://velocity.informatica.com. If you have questions,
comments, or ideas about Informatica Velocity, contact Informatica Professional Services at
[email protected].
Informatica Marketplace
The Informatica Marketplace is a forum where you can find solutions that extend and enhance your
Informatica implementations. Leverage any of the hundreds of solutions from Informatica developers and
partners on the Marketplace to improve your productivity and speed up time to implementation on your
projects. You can find the Informatica Marketplace at https://marketplace.informatica.com.
Informatica Global Customer Support
To find your local Informatica Global Customer Support telephone number, visit the Informatica website at
the following link:
https://www.informatica.com/services-and-training/customer-success-services/contact-us.html.
To find online support resources on the Informatica Network, visit https://network.informatica.com and
select the eSupport option.
Chapter 1
Introduction to Data Privacy Management
You can use Data Privacy Management to monitor and track subject data that you store in data stores within
an organization. Create a Subject Registry that consists of all data stores that contain subject data. To
comply with governmental or organizational regulations, you might need to identify and provide a summary of
all information that you store on a data subject, for example, to satisfy GDPR, right-to-be-forgotten, or other data
privacy requests. To respond to and comply with such requests, you must be able to identify where you store
information and how you share the information. You must also be able to track and delete all information on
a subject if a request to do so is raised. Run scans to identify subject data and create a Subject Registry. You
can raise subject requests and track the requests to completion. Monitor Subject Registry data and requests
from the Data Privacy dashboard.
To identify sensitive data in your organization and sensitive information about a data subject, you scan data
stores. A data store is a repository object that connects to the data source that you want to analyze. You can
create data stores that connect to relational databases, Cloud applications, Cloudera Navigator, file systems,
Hadoop, Informatica Data Engineering Integration, Informatica Intelligent Cloud Services (IICS), Informatica
PowerCenter, and SQL Server Integration Services. You scan data stores based on data domain and
classification policy configuration.
Data domains define the rules that identify specific columns in a data store. For example, you can create data
domains to identify social security numbers or credit card numbers. You assign data domains to
classification policies, which are collections of related data domains. You can define classification policies
based on data security standards and data privacy regulation requirements. For example, you can create a
classification policy that identifies PII. You include data domains for social security number, first name, last
name, and date of birth. You can also create a classification policy that identifies information about a data
subject. For example, to comply with GDPR, you create a GDPR classification policy to automate the process
of creating reports to satisfy data subject requests about the personal information that your enterprise holds
for an individual.
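For illustration only, the following sketch expresses the kind of pattern-matching rule that a data domain for U.S. Social Security numbers might capture. It is written in generic Python, not in Data Privacy Management rule syntax, and the pattern, conformance threshold, and sample values are assumptions made for the example.

import re

# Hypothetical data-match rule for a "Social Security Number" data domain.
# A real data domain is configured in Data Privacy Management; this sketch only
# illustrates the idea: a column is flagged when enough of its sampled values
# fit the expected pattern.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")

def column_matches_ssn(values, min_conformance=0.8):
    """Return True if at least min_conformance of the sampled values look like SSNs."""
    non_empty = [value for value in values if value]
    if not non_empty:
        return False
    matches = sum(1 for value in non_empty if SSN_PATTERN.match(value.strip()))
    return matches / len(non_empty) >= min_conformance

# Example: sampled values from one column of a scanned table (illustrative data only).
sample = ["123-45-6789", "987-65-4321", "222-33-4444", "555-66-7777", "n/a"]
print(column_matches_ssn(sample))  # True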
You scan a data store to identify sensitive and private data. Data Privacy Management classifies the data
stores based on the match to a classification policy in the scan. Data Privacy Management evaluates
PowerCenter mappings to determine sensitive data proliferation. For example, you might have mappings that
move sensitive data from one source to multiple targets. To identify if sensitive data is protected, Data
Privacy Management verifies if a PowerCenter mapping contains a data masking transformation to mask
data.
After you scan data stores, you view the scan results on the Overview workspace. In the Security Dashboard
view, the Overview workspace displays a high-level summary of the total risk score and protection status of
all data stores. The dashboard also includes predefined queries for a detailed analysis of the scan results.
For example, you can view scan results by region, department, organizational hierarchy, and classification
policy. In the Privacy Dashboard view, the Overview workspace displays a high-level summary of subject data
and indicators that show information about data subjects, data subject access requests (DSARs), data
subject types, and data stores associated with data subjects.
You can create security policies and alerts to automate the delegation of data subject reporting, identification
of suspicious behavior such as unauthorized use, detection of policy violations, and sensitive data protection
with Persistent Data Masking, Dynamic Data Masking, and encryption. You can also intervene manually to
protect sensitive data or send notifications about potential data security risks or data privacy regulation
compliance.
Example
You are a security officer for a financial company. The company has applications for investment banking,
commercial lending, and retail. The data is stored in several relational databases. The CIO wants to conduct
an internal audit to verify compliance for PCI data security standards across all applications. You need to
identify the databases that contain PCI data such as account numbers, birth dates, and credit cards. You also
need to verify if sensitive data is protected and which sensitive data is sent to other databases.
As a security officer, you also need to respond and take action on requests raised by data subjects. You
identify what data and in which data stores the organization stores subject data. You provide a summary of
that data, and can delete all data stored on a subject if a valid request is raised.
Use the following capabilities of Data Privacy Management to achieve the data security goals of your
organization:
Identify all sensitive data that exists in relational databases, cloud applications, and file systems. You
might not know where sensitive data exists in all applications. You might have an internal catalog of
applications that contain sensitive data. However, you are unsure of the accuracy and completeness of
the catalog. Use Data Privacy Management to validate your internal catalog and to quickly identify where
sensitive data exists.
After you discover individual table columns that contain sensitive data, identify if you have groupings of
sensitive data as defined by data security standards. If you have groups of data defined as sensitive, you
must comply with industry requirements for security management. For example, if you are a merchant
that accepts credit cards, you need to comply with PCI data security standards. Use Data Privacy
Management to identify the classification of sensitive data.
Data Privacy Management can identify sensitive data that is masked by a PowerCenter data masking
transformation.
You can manually identify the sensitive data protection status for table columns that are not included in
PowerCenter mappings.
Sensitive data can originate in one system but can end up in multiple systems. For example, you use
PowerCenter to extract, transform, and load data from a production system to various test and
development systems. Use Data Privacy Management to identify the target systems to which sensitive
data is extracted and loaded.
Identify how many and which users have access to sensitive data
After you discover sensitive data and import users, find out who can access sensitive data. Use Data
Privacy Management to determine the number of users with access to sensitive data. You can view the
data stores that a user can access, and the user groups that include a specific user.
Data Privacy Management displays details about each user and their activity, and ranks users by activity
on sensitive data. You can determine the type of information the user accessed. You can also detect any
unusual patterns of access such as increased activity on sensitive data during non-business hours.
The dashboard displays the risk cost and risk score of sensitive data. The risk cost is the total monetary
loss that the organization might have to bear if sensitive data is lost, exposed to an unauthorized user, or
becomes unavailable for use in ongoing operations. The risk score is based on a variety of risk factors
that together determine the potential of a security risk. Use the risk cost and risk score values to
prioritize and remediate security risks.
1. Define locations to identify the geographic areas where your data sources exist.
2. Define data store groups to logically group similar data stores.
3. Define data stores to access data from data sources.
4. Define data domains to include the rules that Data Privacy Management uses to identify sensitive
columns in a data store.
5. Define classification policies to classify sensitive data based on industry or organization data security
standards.
6. Define scans to determine how and when Data Privacy Management discovers and classifies sensitive
data for a data store.
7. Monitor the status of Scan jobs to verify that the jobs completed successfully.
8. View the scan results to analyze the discovery and classification of sensitive data.
9. Import user and user access metadata to display which users have access and activity on sensitive data.
10. Define security policies to detect unusual or high-risk situations.
When you create a location, you specify a pattern of IP addresses or host names to identify the data sources
that belong to the location. When you create a data store, Data Privacy Management tries to identify the data
store location based on the data store IP address or host name. Alternatively, you can manually assign a
location to the data store. After you scan data stores, you can view scan results for all data stores in the
same location.
You can create or import data stores. You can import data stores from a parent repository, such as
Informatica PowerCenter, or from a file. When you create a scan, you assign data stores to the scan. After
you run the scan, you can view the scan results for the data stores. For example, you can view scan results for
data stores by department, location, or data store group.
Data Privacy Management includes predefined data domains. You can also create data domains or import
data domains from a file.
You include data domains in classification policies. When you scan a data store, you select classification
policies. Data Privacy Management uses the data domains in the classification policy to perform data
domain discovery. The data domain discovery identifies the sensitive columns in the data store based on the
metadata or data domain conditions.
Data Privacy Management includes predefined classification policies. You can also create or import
classification policies. When you create a classification policy, you assign data domains and define data
domain match conditions. When you create a scan, you select the classification policies and data stores that
you want to evaluate. Data Privacy Management uses the data domain match conditions to evaluate if the
data stores include all of the sensitive data domains as defined in the classification policy.
After you scan data stores, you can view scan results for data stores that match a classification policy.
When you create a scan, you specify the type of scan to run and the data stores to scan. You assign
classification policies to the scan and schedule when the scan runs.
Data Privacy Management runs data domain discovery to identify the sensitive columns in the data store.
Data Privacy Management compares the data domains in the classification policy against the metadata and
data from the data store. Then, Data Privacy Management uses the data domain match conditions in the
classification policy to evaluate whether the data store contains the sensitive data that the policy specifies.
You can import metadata about users, groups, and group memberships from LDAP directory services and
from CSV files. You can import user access and user alias metadata from CSV files. You can import users,
user access, and user activity from Salesforce.
When you import user metadata, you schedule an Import job. The Import job adds the user metadata to the
Data Privacy Management repository. The Data Privacy Management Service augments scan results and user
activity data with the imported user metadata. On the Overview workspace, you can view users that have
permission to access sensitive data in data stores. When you view user activity on data stores, you can also
view information about the user account that performed an event on the sensitive data.
The Data Privacy Management user interface consists of a header, workspace tabs, and the workspace area.
Header
The header appears at the top of the user interface. Use the header icons to define system settings, view
subject registry, view tasks, view security policy violations, view anomalies, filter a workspace list page, and
view online help. Use the header menus to view summaries of scan results, open workspaces, change
passwords, and log out of Data Privacy Management.
You can perform the following actions on the header:
- Settings icon: Open the Settings workspace and configure the background color for the user interface, sensitivity levels, risk score factor weights, anomaly factor weights, risk cost and conformance score criteria, anomaly severity levels, and an Axon resource name.
- Subject Registry icon: Open the Subject Registry workspace and view subject details and history. Create an automated DSAR report for a data subject.
- Tasks icon: Open the Tasks workspace and configure, manage, and run tasks.
- Security Policy Violations icon: Open the Security Policy Violations workspace and view a list of incidents that met the conditions specified in security policies.
- Anomaly Detection icon: Open the Anomaly Detection workspace and view incidents of unusual activity.
- Summary Analytics menu: Open a workspace that shows scan result summaries for data stores, departments, or user activity.
- Manage menu: Open a workspace for actions, always sensitive/never sensitive fields, classification policies, data domains, data store groups, data stores, encryption rules, extensions, jobs, locations, remote agents, risk simulation plans, scans, security policies, security policy groups, suppression rules, and users.
- <User name> menu: Change your password and log out of Data Privacy Management.
- Help icon: Access online help for the current workspace and access the Start Up window.
Workspaces
A workspace is a page that contains all data and functions related to an aspect of sensitive data in your
organization.
You can access workspaces from the main menu or from icons in the header. You can also access
workspaces from tasks that you perform in other workspaces, such as in the following examples:
• When you select a data store on the Data Store Groups workspace, the Data Store workspace opens in a
new tab.
• When you click a blue object in a column, such as a number in the Sensitive Fields column, a user's name,
or a number of classification policies, the associated workspace opens in a new tab at the specific page or
list.
To view any workspace, you must have access privileges. To perform tasks in a workspace, you must have
privileges for specific types of tasks. Multiple open workspaces display as tabs. Use the tabs to navigate
between workspaces.
Data Privacy Management includes the following workspaces:

Actions: Create reusable custom, email, service management, and system log actions. After you create a reusable action, you can include it in security policies that allow the action type. Reusable actions standardize the task that Data Privacy Management must perform when security policy violations that include the actions occur or when the conditions of the policy are met. Reusable actions also decrease the amount of time it takes to create a security policy.

Always Sensitive/Never Sensitive: View and manage Always Sensitive and Never Sensitive rules that specify column sensitivity for a specific data domain, or for any data domain. Import or export the global sensitivity status of columns to the Data Privacy Management repository.

Anomaly Detection: View a generated list of unusual incidents that Data Privacy Management detects. For example, view a list of users who accessed a large volume of sensitive data during non-work hours.

Classification Policies: Create, import, view, and manage classification policies to classify groupings of sensitive data according to industry or organizational data security standards.

Data Domains: Create, import, view, and manage data domains to identify the semantics of sensitive columns based on the column name or column data. Manage data domains that exist in the Model repository.

Data Store Groups: Create, view, and manage data store groups. Logically group data stores into hierarchical groupings with parent and child levels, and then drag data stores into the groups.

Data Stores: Create data stores or import data stores from a file to access data from all supported data sources. View a list and a summary of all data stores, view or edit data store properties, test data store connections, merge duplicate data stores, synchronize users, update the username and password associated with a data store, upload configuration files to fix data stores that have incomplete or missing connection properties, import or export data store information, export data breach reports, import protection status details for sensitive fields in a data store, and import catalog metadata and resources.

Encryption Rules: Create and manage encryption rules and associated encryption keys to protect sensitive fields in Cloudera Hive data stores. Define encryption rules with the following encryption methods:
- Change metadata
- Preserve format and metadata
- Preserve metadata
Specify encryption rule definitions and keys in data domains and encryption protection task properties.

Extensions: View, configure, and manage extension connection properties to simplify the use of these properties in actions and tasks. Specify extensions in actions you attach to security policies that run tasks when the policies are violated or when the policy conditions are met. You must specify an extension when you manually configure tasks. You can only specify extensions if the extensions have an Active status and your role includes privileges to manage extensions.
Data Privacy Management includes the following extensions:
- Custom
- Email
- Protection: Encryption, Persistent Data Masking - Big Data, and Persistent Data Masking - Remote Domain
- Service Management: ServiceNow
- System Log

Jobs: View and manage past, present, and future jobs. For each job, view the job type, job properties, the status of jobs and job steps, and the user who scheduled the job. View or export job logs that contain informational, warning, and error messages that the job encountered. Pause, resume, and terminate jobs and job steps.

Locations: Create, view, and manage locations to identify the geographic areas for each data source. View a list and a summary of all locations. For example, view how many data stores are assigned to a location.

Overview: In the Privacy Dashboard, which is the default view when you log in to Data Privacy Management, view and access enterprise-level information about subject data in the Subject Registry, such as the number of subjects, subject requests, subject types, and data stores.
In the Security Dashboard, view enterprise-level metrics such as the percentage of scanned data stores, and the risk score, protection status, sensitivity level, number of policy impressions, and risk cost of scanned data stores. View summary results based on elements such as geography, sensitive data proliferation, departments, data domains, data stores, data store groups, user activity, and risk score. Drill down to view details that help identify and remediate sensitive and privacy data risks.
To customize the indicators that display on the Overview workspace, select Actions > Customize Dashboard. To change the dashboard view from the Privacy Dashboard to the Security Dashboard, select the option on the Settings workspace.

Remote Agents: Create and manage remote connections to data sources with sensitive data you encrypt or with subject registry information. Associate a subject registry remote agent with unstructured data stores to perform scans on unstructured data sources. Configure the host and administrator port, associate the remote agent with data stores, test the connection, export connection details to a CSV file, and publish the data store connection details to the remote agent.

Risk Simulation Plans: Model the projected effects of data protection, including the estimated risk score, protection status, and estimated residual risk cost. For example, model the outcomes of supporting a budget request for a masking solution, create a risk simulation plan to model the projected ROI of prioritizing a data protection initiative, or estimate the financial risk of unprotected sensitive data in the event of a data breach.

Scans: Create, view, and manage scans to configure how and when Data Privacy Management discovers and classifies sensitive data. View a list and a summary of all scans, including scans that are in progress or scans that failed. View the scan properties, scan status, and a summary of the scan results.

Security Policies: Create, view, and manage security policy rules to identify specific situations. Data Privacy Management includes the following security policy types:
- Anomaly
- Data Store
- Decryption
- User Activity
For example, create a security policy to determine when sensitive data moves across locations or when a user accessed a table in a high-risk data store.

Security Policy Groups: Create security policy groups to organize related security policies. For example, create a security policy group for all PCI-related security policies. Then, use security policy groups as a filter option to quickly locate a violation on the Security Policy Violations workspace.

Security Policy Violations: View a list of violations that Data Privacy Management generates when the conditions in a security policy are met.

Settings: Specify the Security or Privacy dashboard view, the background color for the user interface, an Axon resource name, and the default settings for remote agent scans, data privacy, sensitivity levels, risk score factor weights, anomaly factor weights, risk cost and conformance score criteria, and anomaly severity levels.

Subject Registry: Locate and manage sensitive information related to a data subject. From the Subject Details page, prepare and download an automated DSAR report. View details about sensitive data held for a data subject, including whether the information is shared with third parties. Apply and clear legal holds for data subjects. Create custom, email, and service management tasks to fulfill data subject requests.

Suppression Rules: View and manage the rules that prevent specific anomalous factors from triggering an anomaly.

Tasks: View and manage tasks that are associated with data stores to which you have access. Custom, DSAR, email, encryption, protection, service management, and system log tasks run when you create the tasks, when you schedule the tasks to run, or when the conditions of a security policy that contains an action to perform the tasks are met. Configure and schedule protection tasks on the Tasks workspace, and optionally roll back or decrypt a closed encryption task that ran successfully. Run a DSAR task when you create a DSAR report for a data subject on the Subject Details page.

User Activity Summary: View scan results based on user activity. For example, view the number of sensitive data impressions that users accessed by data domain, sensitivity level, and day of the week.

Users: Import metadata about users and user access to display user activity on sensitive data. Import metadata about users, user groups, and group memberships from LDAP directory services and from CSV files. Import user access and user alias metadata from CSV files. Import users, user access, and user activity from Salesforce.
The administrator can log in to Data Privacy Management directly from the service URL in Informatica
Administrator. When you log in to Data Privacy Management, you enter a user name and password.
• System Settings, 36
• Locations, 51
• Data Store Groups, 60
• Extensions, 65
• Remote Agents, 76
Chapter 2
System Settings
This chapter includes the following topics:
• Settings Workspace, 36
• Settings Workspace Properties, 39
• Changing the Default Background Color of the User Interface, 40
• Changing the Default Dashboard, 41
• Configuring Agent Scan Settings, 41
• Configuring Data Privacy Settings, 41
• Specifying Data Store Sensitivity Levels, 42
• Changing Risk Score Factor Weights, 42
• Adding the Maximum Risk Score to Data Store Details, 43
• Adding Custom Risk Score Factors, 43
• Changing Anomaly Factor Weights, 44
• Changing the Default Risk Cost, 45
• Changing the Default Conformance Ranges, 45
• Changing the Default Anomaly Severity Ranges, 46
• Configuring Axon Data Governance Integration, 47
• Configuring a Brand Logo, 47
Settings Workspace
On the Settings workspace, you can customize system settings that determine the color theme, default
dashboard, remote agent scan settings, data privacy settings, data store sensitivity levels, risk score and
anomaly factor weighting criteria, risk cost and conformance ranges, and severity ranges for anomalous user
behavior. You can also specify an Axon resource name.
To access the Settings workspace, click the Settings icon on the header.
Settings Workspace Properties
Customize the Settings workspace properties to specify global settings in Data Privacy Management that
best meet the needs of your organization or role.
Defaults
Dashboard: The dashboard that opens when you log in to Data Privacy Management. Default is Privacy Dashboard.
Theme: The background color for the user interface. Options are black and white. Default is a black background.

Agent Scan Settings
The maximum file size that Data Privacy Management scans for unstructured data stores associated with a remote agent. When you scan an unstructured data store, Data Privacy Management skips files that exceed the specified size, in MB. Default is 200 MB.
You can enable Optical Character Recognition in unstructured scans. Default is disabled. You cannot disable OCR after you enable and save the change.

Data Privacy Settings
In the Subject Data Report Purge Interval property, specify the number of days between each Subject Data Report Purge job. Default is 2.
In the Subject Data Report Retention Period property, specify the number of days to retain DSAR reports. To always retain the subject data, enter 0. Default is 0.

Security Settings
Sensitivity Levels: The degrees of data sensitivity. You can define a maximum of five sensitivity levels that you select when you define classification policies. If a data store matches a classification policy, Data Privacy Management assigns the sensitivity level of the classification policy to the data store.
The default sensitivity levels from lowest to highest sensitivity are: Public, Internal, Confidential, and Restricted. You can rename the defaults, add one sensitivity level, or delete a sensitivity level if it is not associated with a classification policy.

Risk Score
When you scan data stores, Data Privacy Management calculates a risk score for each data store based on the weight of risk score factors. You can customize the risk score factor weights. At least one factor must have a weight that is greater than zero. A pie chart dynamically adjusts the weight of each factor when you move a slider.
The Settings workspace includes the following default risk score factors and weights:
- Sensitivity Level. Default weight is 15 percent.
- Protection Status. Default weight is 15 percent.
- Number of Sensitive Fields/Files. Default weight is 7 percent.
- Policy Impressions. Default weight is 7 percent.
- Number of Targets. Default weight is 15 percent.
- Residual Risk Cost. Default weight is 15 percent.
- User Access Count. Default weight is 15 percent.
- Impressions. Default weight is 7 percent.
To add a maximum of five new factors, click Add Custom Risk Score Factor and define settings for each new factor.
Choose View Max Risk Score to view the maximum risk score value on data store details pages in addition to the average risk score. The average risk score appears by default.

Anomaly Factor Weights
Anomaly detection identifies irregular or unusual patterns of user activity on sensitive data. Data Privacy Management detects anomalies based on anomalous factors and assigns a score to each anomaly. The Settings workspace includes the following anomalous factors:
- Impressions
- Data Domains
- Sensitive Events
- Time Of Day
- Day Of Week
- Data Stores
- Unexpected Data Store
- Relocation
The default weight of each factor is 12 percent. You can customize the anomaly score factor weights. At least one factor must have a weight that is greater than zero. A pie chart dynamically adjusts the weight of each factor when you move a slider.

Risk Cost and Conformance
Risk Cost: An estimate of the default monetary cost to an organization if sensitive data is lost, exposed to an unauthorized user, or becomes unavailable for use in ongoing operations. The default cost is $50 for each occurrence of sensitive data. You can change the default currency unit and amount.
Conformance: When a Scan job runs data profiling, Data Privacy Management calculates a conformance score based on the percentage of column values that must match a data domain for fields to be identified as sensitive, not sensitive, or requiring validation. The default conformance matching ranges are:
- Reject: 0-39%
- Validate: 40-79%
- Accept: 80-100%
You can move the sliders to change the conformance limits. To automatically identify fields in the Validate range as sensitive when you scan a data store, select Mark columns within the validate range as sensitive. You can also change the default conformance limits in data domain properties.

Anomaly Severity Levels
Severity Range: Based on anomaly scores, Data Privacy Management identifies anomalies as high, medium, or low severity. Default anomaly severity ranges are:
- Low. Anomaly scores of 1-39.
- Medium. Anomaly scores of 40-79.
- High. Anomaly scores of 80-100.

Axon Resource
Axon Resource Name: Integrates Data Privacy Management with Axon Data Governance. Enter the name of the Axon resource created in Enterprise Data Catalog.
To change the default dashboard, your user account must fulfill any of the following requirements:
When you create a classification policy, you select one of the sensitivity levels defined on the Settings
workspace. When a data store matches a classification policy, Data Privacy Management assigns the
sensitivity level of the classification policy to the data store.
Related Topics:
• “Sensitivity Level Indicator” on page 515
• “Departments Page” on page 538
• “Creating a Classification Policy” on page 226
Related Topics:
• “Risk Score Indicator” on page 512
3. Move the slider to select a weight for the custom risk score factor.
When you save the custom risk score factor, Data Privacy Management performs the following actions:
• Calculates the risk scores of data stores in future Scan jobs using the new settings.
• Starts the Recalculate Risk Scores job to recalculate and update the risk scores of previously scanned
data stores.
• Updates the new risk scores on the Security Dashboard, the Scan Details page, and the Data Stores list
page.
• Displays the custom risk score factor on the Data Store Details page for existing and future data stores.
• Displays the default value of the custom risk score factor on the Data Store Details page for future data
stores.
Related Topics:
• “Risk Score Indicator” on page 512
Related Topics:
• “Anomalous Factors” on page 358
Related Topics:
• “Residual Risk Indicator” on page 515
• “Classification Policy Properties” on page 220
3. To automatically identify fields in the Validate range as sensitive when you scan a data store, select
Mark columns within the validate range as sensitive.
4. Click Save.
When you create a data domain, the conformance properties on the Settings workspace are the default
settings on the Data Match tab. You can change the default settings for each data domain.
Related Topics:
• “Downloading Scan Job Reports” on page 306
• “Conformance Score” on page 209
Before you enter the Axon resource name, log in to Enterprise Data Catalog and create an Axon resource.
You can customize the logo that appears in the following locations:
• Favicon. The icon that appears with the Data Privacy Management web page.
• Login page
• Splash screen
• Startup screen
• Main page. The icon that appears on the main header on all pages.
You can choose to use the same image in all locations or use different images in specific locations. For
example, if the background color differs on the splash screen or startup screen, an image created for a
different background color might not appear properly on these screens. You can create different image files
with appropriate colors to appear on these screens.
Read the reference information in the Image File Specifications section to understand recommended file
sizes for each location.
You can customize the following image files:

favicon: Appears with the name of the web page. You can use an .ico, .png, or .svg file. The recommended file type is .ico.

logo: Appears on the login page, the main header on all pages, the startup screen, and the splash screen unless an override image file is created for the page. You can use a .png or .svg file. The recommended file type is .svg.

loginPageLogo: Appears on the login page. You can use a .png or .svg file. The recommended file type is .svg.

mainHeaderLogo: Appears in the main header on all pages. You can use a .png or .svg file. The recommended file type is .svg.

startupScreenLogo: Appears on the startup screen. You can use a .png or .svg file. The recommended file type is .svg.

splashScreenLogo: Appears on the splash screen. You can use a .png or .svg file. The recommended file type is .svg.
The image for each location has the following specifications:
- Login page: Container Height 30 px; Background #000000 (Black)
- Splash screen: Container Height 40 px; Background #F2F2F2 (Light)
- Startup screen: Container Height 33 px; Background #F2F2F2 (Light)
- Main header: Container Height 26 px; Background #1d1d1d (Dark)
Chapter 3
Locations
This chapter includes the following topics:
• Locations Overview, 51
• Locations Workspace, 51
• Location Properties, 52
• Location Management, 54
Locations Overview
A location is a geographic area of a data center, a geographic area from which user activity originates, or
both. For example, you can create a location to indicate the country, state, or city of the data center or user
activity.
Create a location for each data center that hosts the data sources that you want to scan. When you create a
location, you specify a pattern of host names or IP addresses to identify the data sources that belong to the
location. When you create or import a data store, Data Privacy Management tries to identify the data store
location based on the location configuration. If there is a match, Data Privacy Management assigns the
location to the data store. You can also manually assign a location to the data store when you update data
store properties.
Create a location for each geographic area from which user activity originates. When you create a location,
you specify an IP address subnet to identify a geographic origin of user activity. When evaluating user activity
events for anomalous behavior, Data Privacy Management tries to identify the event location by matching the
IP address in the event to the IP address range in the location configuration.
After you scan data stores, you can access the dashboard or the data store summary to view scan results for
all data stores assigned to a location. For example, you can view the scan results for all data stores in the
same country or state. You can view user activity anomalies based on the location or relocation anomalous
factors on the Anomaly Detection workspace.
Locations Workspace
You can manage locations on the Locations workspace. The Locations workspace displays a list of all
locations. You can expand a location to view the location details.
The workspace includes the locations list, the filter and filter conditions with Clear Filter and Clear Filter Condition icons, the location details and location summary, the Actions menu, and the Refresh icon.
Location Properties
Configure the location properties to identify the geographic areas of data sources and user activity.
Name: Required. The name is not case sensitive and must be unique within the Data Privacy Management repository. The name cannot exceed 255 characters, contain spaces, or include the following special characters: ~!$%^&*()+

Description: Optional. Enter a long description of the location that is a maximum of 255 characters.

Regular Expression: Optional. Enter a Java-based regular expression that describes a pattern of values for the host name or IP address of the data stores that belong to the location. Data Privacy Management validates the syntax of the expression when you save a location, and evaluates the expression to assign a default value for the data store location when you create, edit, or import a data store.
If a data store matches more than one location, Data Privacy Management assigns the location of the first matched regular expression. If a data store does not match a location, the location is shown as Unknown. You can change the location if you do not want to use the matched location.
For example, enter 10\\.65\\.123.* to match data stores with an IP address that starts with 10.65.123. Enter injrlx.* to match data stores with a host name that starts with injrlx. The host name match is not case sensitive.
The regular expression only identifies the location of data stores. To specify the location for user activity, configure the IP Address / Range property.
For information about creating regular expressions, see tutorials and documentation for regular expressions on the internet, such as http://www.regular-expressions.info/tutorial.html.

Country: Optional. Select a country from the list of countries for the selected region.

State: Optional. For Canada and the United States, select a state from the list.

City: Optional. For Canada and the United States, select a city from the list.

IP Address / Range: Optional. Enter an IP address range or subnet that identifies a geographic location from which a user performs an event on a data store. Enter IP addresses in CIDR format. For example, 10.10.10.0/6. If you enter a format that is not CIDR, you receive an error when you save the location. To add multiple entries, click the + icon. Data Privacy Management evaluates the IP address range to assign a location to user activity events and detect anomalous user behavior.
Data Privacy Management evaluates user activity events for anomalous behavior and attempts to identify the event location by matching the IP address in the event to the IP address range in locations. If there is a match, Data Privacy Management assigns a location to the event. If there are no matches, the event location is shown as Unknown.
Locations help you detect anomalous behavior based on the origin of user activity. For example, a user performs an event from one location. An hour later, the user performs an event from a location that is too far to travel to within an hour. Data Privacy Management detects an anomaly to indicate potential shared or compromised credentials.
To specify the location of data stores, configure the Regular Expression property.

VPN: Optional. Specifies whether the IP Address / Range value is from a VPN provider. If enabled, Data Privacy Management does not detect anomalies based on distance for the Relocation anomaly factor. If disabled, Data Privacy Management detects anomalies based on distance for the Relocation anomaly factor.
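For illustration only, a location named Virginia-DC for a United States data center might combine both matching mechanisms. The location name and the IP subnet are hypothetical; the regular expression is the example from the property description above:
Regular Expression: 10\\.65\\.123.*
IP Address / Range: 10.65.123.0/24
VPN: disabled
With these values, data stores whose IP addresses start with 10.65.123 are assigned to Virginia-DC, and user activity events that originate from the 10.65.123.0/24 subnet are attributed to the same location.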
Location Management
You can create and manage locations on the Locations workspace. Location information is stored in the Data
Privacy Management repository.
You can create, view, edit, copy, or delete locations. You can import or export information about locations.
Creating a Location
Create a location for each geographic area of a data center or geographic area from which user activity
originates. For example, you can create a location to indicate the country, state, or city of the data center or
user activity.
Exporting Locations
You can export a list of locations to a CSV file. The export file includes a list of all locations and the
corresponding location properties. If you filtered the location list, the export file contains the locations that
match the active filter conditions.
You can use the exported CSV file to create, update, or delete locations in the Data Privacy Management
repository. The export file contains the same format that is required to import locations. You can edit and
then import the file to update location information.
Importing Locations
When you import location information from a CSV file, the Import job runs immediately, using the details in
the file to create, update, or delete a location in the Data Privacy Management repository.
After the Import job finishes, you can view a list of the imported locations on the Locations workspace.
For more detailed descriptions of the column values, see “Location Properties” on page 52. The following
table lists the order of the columns, column headings, and the format for the values in the CSV file:
1. Name: Required. The name is not case sensitive and must be unique within the repository. The name cannot exceed 255 characters, contain spaces, or include the following special characters: ~!$%^&*()+
If blank, the Import job rejects the row.

2. Description: Optional. Enter a long description of the location that is a maximum of 255 characters.

3. Regular Expression: Optional. Enter a Java-based regular expression that describes a pattern of values for the host name or IP address of the data stores that belong to the location. Data Privacy Management validates the syntax of the expression when you save a location, and evaluates the expression to assign a default value for the data store location when you create, edit, or import a data store.
If the syntax is not valid, the Import job rejects the row.

4. Region: Required. Region for the location. Enter one of the following values:
- APJ
- EMEA
- LATAM
- NA
If blank, the Import job rejects the row.

5. Country: Optional. Enter a valid country value. To view a list of valid values, create a new location in the user interface. If the entry is not valid, the Import job rejects the row.
If blank, the Import job inserts an empty string.

6. State: Optional. For Canada and the United States, enter a valid state value. To view a list of valid values, create a new location in the user interface. If the entry is not valid, the Import job rejects the row.
If blank, the Import job inserts an empty string.

7. City: Optional. For Canada and the United States, enter a valid city value. To view a list of valid values, create a new location in the user interface. If the entry is not valid, the Import job rejects the row.
If blank, the Import job inserts an empty string.

8. IP Address Range/VPN: Optional. Enter an IP address range or subnet that identifies a geographic location from which a user performs an event on a data store. Data Privacy Management evaluates the IP address range to assign a location to user activity events and detect anomalous user behavior.
Use the following syntax to enter values:
<IP address in CIDR format>,<VPN>
Where:
- <IP address in CIDR format> includes the range of IP addresses. For example, 10.10.10.0/6.
- <VPN> specifies if the IP address or range is from a VPN provider. Enter Y to indicate the IP address is from a VPN provider. Enter N to indicate the IP address is not from a VPN provider.
If the entry is a VPN address, anomalies are not detected based on distance for the Relocation anomaly factor.
Use a semicolon (;) to separate multiple entries. For example:
100.100.0.1/32,Y;10.10.10.3/9,N;1.1.1.1/2,N
If the entry is not in CIDR format, the Import job rejects the row.

9. Action: Optional. Instructs the Import job to create, update, or delete a location in the repository. Enter one of the following values:
- U. Creates or updates a location in the repository.
If the location does not exist, the Import job creates the location.
If the location exists and the Replace Duplicates with Items Imported option is enabled, the Import job updates the location.
If the location exists and the Replace Duplicates with Items Imported option is not enabled, the Import job skips the row.
Warning: If a column in the CSV file contains a null value, the Import job replaces the value in the repository with a null value.
- D. Deletes a location from the repository. If the location does not exist, the Import job rejects the row.
If blank, the default is U. Creates or updates a location in the repository.
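The following is a hypothetical example of a location import CSV file with two data rows. All names, patterns, and IP ranges are illustrative, and the column headings must match the format that Data Privacy Management expects; one way to confirm the exact headings is to export existing locations and edit the resulting file. Depending on your CSV editor, you might need to enclose values that contain dots or commas, such as the regular expression and the IP Address Range/VPN value, in quotation marks:
Name,Description,Regular Expression,Region,Country,State,City,IP Address Range/VPN,Action
Virginia-DC,Primary US data center,"injrlx.*",NA,United States,Virginia,Ashburn,"10.10.10.0/24,N",U
EMEA-VPN,VPN exit point for EMEA users,,EMEA,,,,"100.100.0.1/32,Y",U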
Consider the following rules and guidelines when you import locations from a CSV file:
• The first row of the CSV file must contain case-sensitive column headings in the required sequence. If the
CSV file does not contain all of the column headings in the required sequence and case, the Import job
fails.
• Each subsequent row after the column headings defines a location. The Name column defines a unique
location. If the Name column in one row is the same as the Name column in a second row, the Import job
treats the second row as a duplicate of the first row.
• If the CSV file contains duplicate entries, the Import job processes the entries in the order in which the
rows appear in the file.
• The column values cannot contain the following keywords or terms:
SELECT FROM
SELECT UNION
DELETE FROM
UPDATE SET
<SCRIPT
</SCRIPT>
ALERT(
"/>"
If a row contains keywords or characters that are not valid, the Import job rejects the row. The Import job
logs an error in the job log and processes the remaining rows.
• The column values cannot exceed 255 characters. If a row contains more than 255 characters in any
column, the Import job rejects the row.
• Enclose names that include a dot (.) or comma (,) within quotes for any column value that includes a
name. If a value for a name column includes a dot or comma not enclosed within quotes, the Import job
might not process the column values correctly. For example, the Import job processes a comma as a
delimiter between column values.
Editing a Location
When you edit a location, you can change all of the location properties.
Changes to the location properties do not retroactively change objects that are associated with the location.
For example, if you change the regular expression, data stores that are assigned to the location do not
change. If you change the IP address range, user activity events that matched the IP address do not change.
Changes apply the next time you create or import a data store, or the next time Data Privacy Management
evaluates anomalous behavior.
Copying a Location
To quickly create a location that has similar properties to an existing location, you can copy a location and
then edit the location properties.
Changing Data Store Location Assignments
When you create a data store, you assign a location to the data store. When you import data stores from a
parent data store, such as Informatica PowerCenter, or from a file, Data Privacy Management assigns a
location to the data stores.
You can change location assignments for data stores manually or in bulk.
To update data store locations manually, go to the Data Stores workspace. Select the data store with a
location you want to change. From the Actions menu, select Edit and select a different value in the Location
property.
To update data store locations in bulk, go to the Data Stores workspace. From the Actions menu, select
Export > Data Stores. The export CSV file includes a list of all data stores and the corresponding data store
properties. Update the location assignment in the file. Then, select Import from the Actions menu to import
the file.
Related Topics:
• “Editing a Data Store” on page 94
• “Importing Data Stores” on page 97
Deleting a Location
You can delete a location if the location is not assigned to any data stores. To delete a location that is
assigned to data stores, first change the location assignment for the data stores associated with the
location.
Chapter 4
Data Store Groups
Data Privacy Management includes system-defined data store groups to help you manage data stores and
data store group assignments.
You can also create data store groups and define the hierarchy of the groups. For example, you can group
data stores by application type or line of business. When you create a data store, you can assign a data store
group to the data store.
After you scan data stores, you can view the scan results for all data stores assigned to a data store group on
the Overview workspace or the Data Stores Summary workspace. For example, you can view the scan results
for all data stores in a Finance data store group.
To access the Data Store Groups workspace, click Manage > Data Store Groups.
Data Privacy Management includes the following system-defined data store groups:

Default: When you create a data store, you must assign a data store group. You might want to temporarily assign a data store to the Default data store group until you set up data store groups. When you import data stores from a parent repository, such as Informatica PowerCenter, the data stores are assigned to the Default data store group.
You can add or remove data stores from the Default data store group, but you cannot edit or delete the Default data store group. Also, you cannot nest another data store group within the Default data store group.

All: The All data store group displays a list of data stores across all data store groups. You can use the All data store group to view data stores or to change the data store group assignment for multiple data stores. When you click the All data store group, a list of all data stores appears. To change the data store group assignment, select multiple data stores. Then, drag the data stores to another data store group.
The All data store group only appears in the Data Store Groups workspace. You cannot edit or delete the All data store group, assign data stores to the All data store group, or nest another data store group within the All data store group.
You can use the data store group toolbar icons to perform the following tasks:
You can change data store group assignments manually on the Data Store Groups workspace. You can
change data store group assignments manually or in bulk on the Data Stores workspace.
To change data store group assignments manually on the Data Store Groups workspace, perform the
following steps:
To change data store group assignments in bulk on the Data Stores workspace, select Export > Data Stores
from the Actions menu. The export CSV file includes a list of all data stores and the corresponding data store
properties. Update the data store group assignment in the file. Then, select Import from the Actions menu to
import the file.
Chapter 5
Extensions
This chapter includes the following topics:
• Extensions Overview, 65
• Extension Types, 66
• Supported Data Store Types for Protection Extension Plugins, 66
• Extensions Workspace, 67
• Extension Properties, 68
• Extension Management, 73
Extensions Overview
Data Privacy Management stores extension details so you can configure connection properties one time, and
then re-use the connection properties when you specify the extension name in actions, security policies, or
tasks.
When you add an action to a security policy or create an action to run a task, you specify an extension
associated with the type of action. For example, if you add an action to send an email in a user activity
security policy, you select a configured email extension.
Note: To specify an extension in actions and tasks, the extension must have an Active status.
Data Privacy Management includes the following extension categories:
• Custom
• Email
• Protection
• Service management
• System log
Data Privacy Management provides plugins that correspond to the extension types. You select the plugin
when you create an extension.
Extension Types
Data Privacy Management includes extensions that you can use in custom, email, protection, service
management, and system log actions and tasks.
You can create or edit the following types of extensions if your role includes privileges to manage extensions:
Custom (plugins: Custom Plugin V1, DSR Custom Plugin): Specifies the file path to a directory that contains an executable file, such as a script file, that performs a custom task when the task runs. The executable file must be located on the same server on which Data Privacy Management is installed.

Email (plugins: DSR Email Plugin, Email plugin V1): Configures the server connection and security settings for actions that send an email to specified recipients when the email task runs.

Protection (plugins: Encryption, Persistent Data Masking - Big Data, Persistent Data Masking - Remote Domain): Determines the way that tasks will protect sensitive data. Protection tasks are associated with one data store. You can add protection extensions to data domains to define the default protection rules for sensitive fields. For example, if you create a protection task for sensitive fields in a data domain and the sensitive fields exist in six data stores, six protection tasks appear on the Tasks workspace.

Service Management (plugins: DSR ServiceNow Plugin, ServiceNow Plugin V1): Creates third-party service management tickets for issues that users or security policy violations identify, or to fulfill data subject requests. Data Privacy Management supports service management extensions for the ServiceNow application.

System Log (plugin: System Log plugin V1): Creates a log message that Data Privacy Management writes to a remote system log server when system log tasks run.
Each protection extension plugin supports a specific set of data store types. All other data store types are not
supported for protection extension plugins.
Extensions Workspace
On the Extensions workspace, you can view and manage custom, email, protection, service management, and
system log extensions. The workspace includes a page that displays a list of extensions and a detail page for
each extension.
The workspace lists extensions by name, category, status, and plugin. You can view extensions that are
associated with at least one data store that you can access. You can also create new extensions and edit or
delete existing extensions if your role includes privileges to manage extensions.
Extensions List Page
The Extensions list page displays the extensions that you have access to view or manage. By default, you
view the Extensions list page when you access the Extensions workspace.
The page includes the extensions list and extension count, the Filter icon and filter conditions with Clear Filter and Clear Filter Condition icons, the extension properties, the New button, the Actions menu, and the Refresh icon.
You can access extension properties from this page by clicking an extension name.
Extension Properties
Extension properties configure details such as the extension category, plugin, status, connection information,
and user credentials. Data Privacy Management re-uses the connection properties when you select an
extension name in actions, security policies, and tasks.
Name: Required. Unique name for the extension. The name cannot exceed 255 characters.

Description: Optional. Long description of the extension that cannot exceed 255 characters.

Plugin: Required. The name that specifies the sub-category for each extension. When you select an extension category, the plugin or list of plugins configured for the category appears. You cannot edit plugin names.

Active: Optional. Select to indicate that the extension is currently in use. To specify an extension in actions, security policies, and tasks, the extension must have an Active status.
Note: You cannot delete extensions that are associated with Active tasks.
Important: Before you create a custom extension, you must first create a script file that you can specify in the
extension properties.
You can specify custom extensions in actions, security policies, and custom tasks.
Script File Path: Required. Enter the file path for the executable script file. For example: tmp/updateLdapUserAtrribute.sh
The script file must be located on the same server where Data Privacy Management is installed. When you add a custom extension to a custom action in a security policy, Data Privacy Management will run the file to perform customized tasks in the event of a security policy violation. You can also add a custom extension to a new or existing custom action to run the file immediately, or you can save the action as a task to run manually later.
For example, you create a script that adds users who trigger security policy violations to a monitored LDAP user group named HighRiskUsers. You provide the script file path in the custom extension properties. When a violation of a security policy that contains the custom action occurs, the file adds the user to the HighRiskUsers LDAP user group.
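The contents of the script depend entirely on the task you want to automate. The following is a minimal hypothetical sketch in Python; the file name, the log path, and the handling of command-line arguments are illustrative assumptions, not part of the product documentation:

#!/usr/bin/env python3
# Hypothetical custom extension script (for example, /opt/scripts/dpm_custom_action.py).
# It appends a timestamped record of each invocation, including any arguments passed
# to the script, to a local log file.
import sys
import datetime

LOG_FILE = "/tmp/dpm_custom_action.log"  # illustrative path

with open(LOG_FILE, "a") as log:
    timestamp = datetime.datetime.now().isoformat()
    log.write("%s custom action invoked with args: %r\n" % (timestamp, sys.argv[1:]))

Make the file executable, place it on the Data Privacy Management server, and enter its path in the Script File Path property.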
You can specify email extensions in actions, security policies, and email tasks.
Server Host Name: Required. The name of the host server that sends emails.

User Name: Required. The user name with credentials for the specified host server.

Password: Required. The password for the user at the specified host server. To change the password, select Modify and enter the new password.

Authentication Enabled: Optional. Select to indicate that secure authentication is required to connect to the specified server.

Use Security: Optional. Select to enable encryption for the connection to the specified server.

Security Protocol: Required if you select Use Security. Select SSL (Secure Sockets Layer) or TLS (Transport Layer Security).

Sender Email Address: Required. The email address that displays in the From field by default in email actions that use this extension.
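As an illustration only, an email extension for a corporate SMTP relay might use values like the following; the host name, account, and addresses are hypothetical:
Server Host Name: smtp.example.com
User Name: dpm_notifications
Password: ********
Authentication Enabled: selected
Use Security: selected
Security Protocol: TLS
Sender Email Address: dpm-alerts@example.com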
You can specify encryption protection extensions in data domains, decryption security policies, and
protection tasks.
Mark column as protected after execution: Optional. Select to mark sensitive fields in data domains as protected after a task configured with the extension runs.

Port: Required. The port number of the Persistent Data Masking Service.

Username: Required. The user name with credentials for the specified host server.

Password: Required. The password for the user specified for the Persistent Data Masking Service.

Is SSL Enabled: Optional. Select to enable Secure Sockets Layer (SSL) encryption for the connection to the specified Persistent Data Masking Service. Default is selected.

Max Parallel Sessions: Optional. The maximum number of parallel sessions of the Persistent Data Masking Service that can run concurrently. Default is 5.

Locale Code: Optional. Select a language and associated country. Default is English (United States).
You can specify Persistent Data Masking - Big Data protection extensions in data domains, security policies,
and protection tasks.
The following table describes the Persistent Data Masking - Big Data protection extension properties:
Mark column as protected after execution: Optional. Select to designate columns in data domains that use the extension as protected after an associated protection task runs.

Port: Required. The port number of the Persistent Data Masking Service.

Username: Required. The user name with credentials for the specified Persistent Data Masking Service.

Password: Required. The password for the user specified for the Persistent Data Masking Service.

Is SSL Enabled: Optional. Select to enable Secure Sockets Layer (SSL) encryption for the connection to the specified Persistent Data Masking Service.

Max Parallel Sessions: Optional. The maximum number of parallel sessions of the masking service that can run concurrently. Default is 5.

Locale Code: Optional. Select a language and associated country. Default is English (United States).
You can specify Persistent Data Masking - Remote Domain protection extensions in data domains, security
policies, and protection tasks.
The following table describes the Persistent Data Masking - Remote Domain protection extension properties:
Property Description
Mark column as Optional. Select to designate columns in data domains that use the extension as protected
protected after after an associated protection task runs.
execution
Port Required. The port number of the Persistent Data Masking Service.
Username Required. The username with credentials for the specified Persistent Data Masking Service.
Password Required. The password for the user specified for the Persistent Data Masking Service.
Is SSL Enabled Optional. Select to enable Secure Sockets Layer (SSL) encryption for the connection to the
specified Persistent Data Masking service.
Stop on Error Optional. Specifies whether a protection task based on this protection extension will stop
running if it encounters the specified number of errors. Default is 1.
Extension Properties 71
Property Description
Suspend on Error Optional. Select to indicate that a protection task based on this protection extension will
suspend running if it encounters an error.
Rollback Transactions Optional. Select to indicate that a data source that uses this protection extension will
on Error rollback transactions if it encounters an error. Default is selected.
Enable High Availability Optional. Select to enable high availability on recovery of data sources associated with the
on Recovery protection extension.
Commit Interval Optional. Specify the interval, in milliseconds, between commits on the data domains
associated with the protection extension. Default is 10000.
Error Log Type Optional. Select the error log file format. Default is None.
Error Log File Directory Required if you select an error log type. The Data Privacy Management directory that
contains the error log file for the data domains that use the protection extension.
Max Parallel Sessions Optional. The maximum number of parallel sessions of the masking service that can run
concurrently. Default is 5.
DTM Buffer Size Optional. Specifies the amount of buffer memory that the Integration Service uses when the
Data Transformation Manager (DTM) processes a session. Default is Auto.
Default Buffer Block Size Optional. Specifies the amount of buffer memory used to move a block of data from the source to the target. Default is Auto.
Locale Code Optional. Select a language and associated country. Default is English (United States).
Target Load Type Optional. The method that the masking service uses to load data to targets. Options are:
- Bulk. Loads data to targets without creating a database log. This option improves
session performance but prevents the target database from recovering from an
incomplete session. Target tables cannot contain key constraints in bulk loading.
- Normal. Creates database logs to enable the target database to recover from an
incomplete session, but slows performance.
Default is Normal.
Batch Update Optional. Indicates whether to update protected records in batches. Default is No.
Batch Size Optional if you select to perform batch updates. The number of records to update in each
batch update operation.
You can specify service management extensions in actions, security policies, and service management tasks.
The following table describes the service management extension properties:
Property Description
Username Required. The user name with credentials for the specified host server.
Password Required. The password for the user at the specified host server. To change the password, select
Modify and enter the new password.
Table Name Required. The name of the ServiceNow table in which Data Privacy Management creates a record of the
service management ticket when tasks configured with the extension run. Enter one of the following
table names in the format shown, based on the ticket type:
- incident
- problem
- change_request
Default is incident.
You can specify system log extensions in actions, security policies, and system log tasks.
Property Description
Protocol Required. The protocol that Data Privacy Management will use to connect to the system log server.
Options are: TCP, TCP/SSL, and UDP.
To use TCP/SSL protocol, first import the SSL certificate for the system log server to the Informatica
domain truststore, and then restart the Informatica domain.
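For example, assuming the system log server certificate is saved as syslog_server.pem and the Informatica domain truststore file is infa_truststore.jks (both file names are illustrative), you might import the certificate with the Java keytool utility:
keytool -importcert -trustcacerts -alias syslogserver -file syslog_server.pem -keystore infa_truststore.jks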
Extension Management
You can create and manage extensions on the Extensions workspace.
When you update an extension on the Extensions workspace, actions and tasks that include the extension
also update with the new connection details.
Adding or Updating a Protection Extension on the Data Domain
Details Page
You can add a new protection extension to a data domain or update the existing protection rules that are
associated with a data domain.
Related Topics:
• “Data Domain Properties” on page 202
Creating an Extension
Create an extension to save the connection details for easy re-use in actions, security policies, and tasks.
8. Configure the properties for the extension.
9. Click Save.
Editing an Extension
When you edit an extension, the tasks associated with the extension update with the new properties.
Deleting an Extension
You can delete an inactive extension if there are no tasks or jobs in progress that are associated with the
extension. If jobs are running, go to the Jobs workspace. Terminate the jobs, delete the extension, and then
configure any associated tasks with a different extension.
Chapter 6
Remote Agents
This chapter includes the following topics:
Data Privacy Management includes remote agents for protection and subject registry.
You can create, edit, delete, copy, export, import, test connections, and view remote agent details on the
Remote Agents workspace.
You can also publish associated data store details and port connection information to the protection remote
agent.
After you create a subject registry remote agent, create and run a scan on the Scans workspace. You can run
a Subject Registry scan to build the index and identity map. You can run a Domain Discovery scan to identify
sensitive fields and files that match data domain conditions and classification policies.
Guidelines for Remote Agents
Consider the following guidelines when you configure remote agents:
• Configure the machines that host remote agents to use 1 GB of memory for each thread. Edit the -Xmx JVM parameter in the siagent.sh file to specify the correct amount of memory to use for each thread. You can view the result in the catalina.out file in the Tomcat Logs folder on the remote agent machine.
For example, if the remote agent runs on an 8-core machine, the recommended memory heap size setting is 10 GB. A sample setting appears after this list.
• For Hive sources that you plan to associate with protection remote agents to encrypt and decrypt sensitive data, set the hive.fetch.task.conversion Hive parameter to more after you configure connectivity to the Hive source. See the example after this list.
• For Subject Registry remote agents that connect to Microsoft SharePoint or OneDrive data stores, Scan
jobs finish successfully but the jobs report an error scanning folders with more than 5,000 files. The Scan
job error occurs because the file limit is a default SharePoint and OneDrive setting.
To prevent the error during a scan, you can split the data store folder into multiple folders or site
locations, move files into the new folders until each folder includes fewer than 5,000 files, and scan the
data store again.
• Configure properties in the siagent.properties file to manage the agent job. You can enable the following
properties:
- si.fileProcessingTimeout. A time limit for file processing. If the processing does not complete, the
specific file processing is canceled and reported in the Exception report. Enter the time in seconds. For
example, si.fileProcessingTimeout=1800 indicates a timeout value of 30 minutes.
- si.sentencePartitionLength. Block size of a sentence for file processing. Binary or unknown file types
might take a long time to process because of missing delimiters in the content. Enable this property to
ensure that sentences of a specific file type are divided into chunks based on the value specified in
bytes. For example: si.sentencePartitionLength=1000
Note: The property does not apply to file types that the agent supports. To include a file type, delete the
extension from the si.sentencePartitionExcludeList property.
- si.sentencePartitionExcludeList. A list of file types that do not apply to the si.sentencePartitionLength
property. Enter the file extensions separated by commas. For example:
si.sentencePartitionExcludeList=NO_EXTN,txt,zip,json,xml.
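The following illustrative settings show the heap size, Hive parameter, and agent job properties described in this list. The values are examples only; adjust them for your environment.
In siagent.sh, set the heap size through the -Xmx JVM parameter. For example: -Xmx10g
In a Hive session, set the fetch task conversion parameter. For example: SET hive.fetch.task.conversion=more;
In siagent.properties, enable the agent job properties. For example:
si.fileProcessingTimeout=1800
si.sentencePartitionLength=1000
si.sentencePartitionExcludeList=NO_EXTN,txt,zip,json,xml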
The workspace displays remote agents by name, host, remote agent type, and the user name of the user who
created the remote agent. You can create new remote agents and view, edit, copy, or delete existing remote
agents. You can also export and import remote agent details, test remote agent connections, and publish
associated data store details and port connection information to protection remote agents.
To access the Remote Agents workspace, click Manage > Remote Agents.
Name Enter a full or partial remote agent name. The workspace displays a list of remote agents that
match the name or name fragment.
Host Enter a full or partial remote agent host address. The workspace displays a list of remote
agents that match the host or host fragment.
Data Store Select one or more data stores to filter the list by remote agents associated with the selected
data stores.
• Click the name of a protection remote agent. To edit the properties, click Edit on the Remote Agent
Details page.
• Click the name of a subject registry remote agent. To edit the properties, click Edit on the Remote Agent
Details page.
• Select the check box next to a subject registry remote agent and then select Open or Edit from the
Actions menu.
Protection and subject registry remote agents include the following properties:
Property Description
Name Required. Unique name for the remote agent. The name cannot exceed 255 characters,
Description Optional. Long description of the remote agent that contains no more than 255 characters.
Admin Port Required. The admin port number of the remote agent at the specified host address.
Test Remote Agent Optional. Verifies that the connection to the remote agent server is available and that the specified host and admin port properties successfully connect to the remote agent server. When you create or edit a remote agent, Data Privacy Management does not automatically validate the connection details.
Data Store Optional. To associate a data store with the remote agent, select a data store. To associate
additional data stores, click the plus sign (+).
Port Required if you specify a data store. Enter the port number for the data store.
After you create a subject registry remote agent, run a Subject Registry scan to build the index and identity
map. You can also run a Domain Discovery scan to discover and classify sensitive data in unstructured data
stores with the remote agent scanner.
The export file includes a list of remote agents and the corresponding properties. You can export all remote
agents, a filtered list of remote agents, or individually selected remote agents.
Export File Column Export File Column Name Remote Agent Property Name Description
G Action - Optional.
By default, Replace Duplicates with Items Imported is enabled. This option prevents creating duplicate
remote agents by updating existing agents with new details from the imported file.
3. Click in the empty field.
The Open dialog box appears.
4. Browse to the directory that contains the CSV file with the remote agent details you want to import,
select the file, and then click Open.
The file name appears in the field on the Import Remote Agents page. A message indicates the number
of remote agents that Data Privacy Management found in the CSV file.
5. Click Select.
6. Click Import.
Before you can publish a protection remote agent, you must copy the keystore file from the Data Privacy
Management instance to the remote agent machine, and then import the keystore file to the truststore that
the remote agent uses.
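For example, assuming the keystore file copied from the Data Privacy Management instance is named dpm_keystore.jks and the remote agent truststore is named agent_truststore.jks (the file names and passwords are illustrative), the import with the Java keytool utility might look like the following command:
keytool -importkeystore -srckeystore dpm_keystore.jks -destkeystore agent_truststore.jks -srcstorepass <source password> -deststorepass <destination password>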
When you publish a protection remote agent, Data Privacy Management pushes the connection details of
data stores associated with the remote agent to the remote agent machine. You can publish one or more
protection remote agents at a time.
Related Topics:
• “Security Policy Properties” on page 450
• “Decryption Policy Example” on page 448
• "Data Stores" on page 86
• "Data Store Properties" on page 125
• "Data Domains" on page 199
• "Classification Policies" on page 218
• "Column Sensitivity" on page 232
• "Scans" on page 243
• "Scan Options" on page 257
• "Jobs" on page 280
Chapter 7
Data Stores
This chapter includes the following topics:
You can create a data store that connects to one of the following data source types:
• SAP applications
• Big Data applications
• Cloud applications
• Data Integration services
• Databases
• File Management systems
• NoSQL
• Email Server
You can create data stores manually or you can import data stores from a parent repository, such as
Informatica PowerCenter, and from CSV files.
When you create a data store, define or import data store metadata. For example, assign a data store group
and security group to the data store, or import Catalog metadata or protection status from a CSV file for the
data store from the Actions menu.
For most data stores, you perform a scan to identify sensitive data. When you scan a data store that
connects to a parent repository, such as Informatica PowerCenter, a Scan job imports data stores.
If you scan a data store using Enterprise Data Catalog and then edit the data store to enable the Scan with
Remote Agent property, the scan history and sensitive data information is deleted from Data Privacy
Management but remains in Enterprise Data Catalog.
You can manage the data stores that the Scan job imported. For example, you might need to merge data
stores that appear to connect to the same data source. Also, you might need to update data stores that have
null or parameterized connection properties.
For Cloudera Hive data stores that you want to protect with encryption extensions, rules, and keys, you import
Catalog metadata and protection status for the data store on the Data Stores workspace.
After you scan a data store, you can view the data store to see a summary of the scan results.
The workspace displays the list of data stores by data store name, user name, data store type, risk score, the
number of scans, and the percentage of protected sensitive fields. You can filter the list by all data stores,
potential duplicates, incomplete data stores, data stores that are not valid, or data stores that are not
scanned or partially scanned.
You can configure Risk Score settings on the dashboard to display the maximum risk score in addition to the
average risk score that appears by default.
To access the Data Stores workspace, click Manage > Data Stores.
Access the Data Store Details page in one of the following ways:
• Click the name of a data store. To edit the properties, click Edit on the Data Store Details page.
• Select the check box next to a data store and then select Open or Edit from the Actions menu.
The following image shows an example Data Store Details page for a file system data store:
If the Associated Remote Agents property lists a remote agent, you can click the remote agent name to view
the details if you have the View Proxies privilege.
Related Topics:
• “Discovery Bar Indicator” on page 512
• “Proliferation Page” on page 539
• “Adding Custom Risk Score Factors” on page 43
Data Stores Default filter. Displays all data stores in the Data Privacy Management repository that you can
access.
Potential Duplicates Displays groups of data stores that might connect to the same data source because more than one data store contains the same connection details.
Review the list of data stores and verify the connection details. Edit incorrect connection details,
and merge data stores that are duplicates. If you do not merge or correct duplicate data stores, the
Data Stores Summary page will contain an incorrect data store count.
Not Complete Displays data stores with required connection properties that are not configured. You cannot scan
data stores with incomplete connection properties.
To fix an incomplete data store, edit the connection properties. For data stores that were imported
from the PowerCenter repository, you can also select Fix Data Stores from the Actions menu on the
Data Stores workspace to upload configuration files.
Invalid Displays data stores to which Data Privacy Management cannot connect for one of the following
reasons:
- The data source machine or server is down for maintenance.
- The user or password for the data source has changed.
- Data Privacy Management cannot determine the data store type, and the type value is Any.
You cannot scan data stores that are not valid. Determine the reason for the Invalid status, and edit
the data store user, password, or type if required.
Not Scanned and Partially Scanned Displays data stores that are not scanned and partially scanned. A data store is not scanned if Data Privacy Management has not completed at least one successful Scan job. A data store is partially scanned if an Informatica PowerCenter Scan job imported the data store.
Related Topics:
• “Data Stores Summary Workspace” on page 578
The Data Privacy Management Service updates the data store summary metrics when a Scan job finishes or
when you import Catalog metadata or protection status information for a data store.
Risk Score Summarizes the risk level of the data store based on the Risk Score and Anomaly Factor Weights
configurations on the Settings workspace. The Scan job determines the risk score based on the
risk scoring configuration.
Scans Lists the number of times the data store has been scanned. The number includes completed scans, scans that are in progress, scans that are scheduled to run in the future, and failed scans.
Sensitive Domains Discovered Lists the number of sensitive data domains in the data store identified in a Scan job or from a manual import of Catalog metadata and protection status information on the Data Stores workspace.
One data domain can include multiple columns in a data store. For this reason, the number of
sensitive data domains can be different from the number of sensitive fields, or columns, in a data
store.
Sensitivity Level Indicates the sensitivity level of the classification policy that matches the sensitive fields in data
domains that are included in the data store. The sensitivity levels are defined in the Security
Settings pane on the Settings workspace. If a data store matches more than one classification
policy, Data Privacy Management uses the most sensitive level.
Protection Status Indicates whether the sensitive data domains in the data store are Unprotected, Partially Protected, or Protected.
After you create data stores, the next step is to discover sensitive columns using one of the following
methods:
• On the Scans workspace, run a Scan job for the data store.
• On the Data Stores workspace, import Catalog metadata and then import protection status.
After you identify the sensitive fields in data stores and determine if the fields are protected or unprotected,
you can perform the following tasks:
• Merge data stores that appear to connect to the same data source.
• Upload configuration files to fix data stores that have incomplete or missing connection properties.
• Update the user name and password associated with a data store.
• Synchronize user details.
• Import connection details, data store aliases, data owners, protection status, data lineage information,
connection assignments, Catalog metadata, and Catalog resources.
• Export connection details, data store aliases, and breach reports as CSV files. Export data store
configuration details as a text file to configure data store connectivity.
• View, copy, edit, or delete data stores.
Related Topics:
• “Data Store Properties Overview” on page 126
Verify that you configured connectivity to the data store. For example, if you create an Oracle data store,
configure the tnsnames.ora file on the machine that hosts Enterprise Data Catalog. If you do not configure
the connectivity, the connection test and the scan job will fail.
If you create an unstructured data store and enable the Scan with Remote Agent option, you can test the
connectivity to the data store after you associate the data store with a remote agent on the Remote Agents
workspace.
If you configure an Amazon S3, File System, HDFS, Microsoft Azure Blob Storage, Microsoft OneDrive, or Microsoft SharePoint data store to run scans with a remote agent, and you have not associated the data store with a remote agent, Data Privacy Management tests the connection with Enterprise Data Catalog.
If the connection test is successful, a success message indicates that the connection test was successful
but that the data store must be associated with a remote agent before you can run a scan job.
When an unstructured data store is associated with multiple remote agents, Data Privacy Management tests
the connection through all associated remote agents. If any of the associated agents can connect to the
source, the connection test is successful.
If you configure a Google Drive or Office 365 Outlook data store to run scans with a remote agent, and you
have not associated the data store with a remote agent, a message informs you that the connection test will
not work until you associate the data store with a remote agent.
The following table lists the file extensions that Data Privacy Management can scan with a remote agent for
each file type:
Compressed Files gz, tgz, xz, 7z, Z, gtar, zip, bz2, tbz2, tar, tar.bz2, tar.gz, tar.xz, tgz2
Other rtf
PDF pdf
Webpage Files chm, oth, xhtml, xht, mhtml, html, htm, ihtml
Verify that the data store that you want to copy has the same category and data store type as the data store
that you want to create.
You can use the Not Valid filter to find data stores that do not have valid connection properties, and you can
use the Not Complete filter to find data stores with incomplete connection properties.
When you scan an Informatica PowerCenter data store, the Scan job imports data stores and associated
connection strings from the PowerCenter repository. The Scan job uses values from the database
configuration files or PowerCenter parameter files to update the remaining data store connection properties,
such as the host name and port.
The Scan job reads the files from the machine on which the PowerCenter Repository Service runs. If the Scan
job cannot access or read the files, the data stores can contain null or parameterized values for the
connection properties. You cannot scan data stores that do not have complete connection properties.
Consider the following rules and guidelines when you fix a data store:
• You can fix data stores that a Scan job imported from a PowerCenter repository. You cannot fix data
stores that you manually created or imported from a CSV file.
• When you fix data stores, you specify a PowerCenter repository from which a Scan job imported the data
stores. You can specify one PowerCenter repository. Run the procedure again to fix data stores that a
Scan job imported from another PowerCenter repository.
• When you fix data stores, you can specify multiple files. For example, you can select a tnsnames.ora file and an odbc.ini file. The Data Privacy Management Service creates one Import job for each file.
• To fix data stores that connect to IBM Db2, Oracle, and Sybase databases, import the corresponding
database configuration files. To fix data stores that connect to all other supported relational databases,
import the ODBC configuration files.
• When you fix a data store that connects to an IBM Db2 database, you must import the db2cli.ini,
db2.out, and node_db2.out database configuration files in sequential order. Run the procedure once for
each file. If you try to import the files at the same time, the Import job does not update the data store
connection properties.
• If you specify multiple files that contain the same connection string, the Data Privacy Management
Service updates the data store with the connection properties in the last file that the service imports.
• The Import job updates connection properties for data stores that exist in the repository and replaces
existing connection properties with the values from the files.
• The Import job does not create data stores. If the files contain information for a data store that does not
exist in the repository, the Import job skips the row.
You must have the Delete Data Stores privilege to reset the classification results of a data store.
If you made changes to the data after you scanned the data store, you might want to run another scan to view
the sensitive fields. A reset purges all sensitivity results and retains the data store information, including
• You cannot reset a data store if it has a task in progress, failed, or completed state. The data store must
be in closed state.
• You cannot reset an encrypted data store.
• You cannot stop, terminate, or pause a reset job.
• You cannot reset multiple data stores. Run a reset on individual data stores.
• You must have the Delete Data Stores privilege to reset a data store.
If a selected data store is included in an active Scan job, the update fails for all data stores. An active Scan
job is a job that does not have a completed or terminated status.
Add new users and configure roles, privileges, and assign the users to user groups in the Administrator tool
before synchronizing users. For more information, see the Informatica Data Privacy Management
Administrator Guide.
After the Synchronize Users job completes successfully, you must log out of Data Privacy Management and
log in again for the new user details to appear.
You can import information for Application, Big Data, Cloud, Database Management, and File Management
data stores. You cannot import Cloudera Navigator or Data Integration data stores.
You might want to import data stores from a CSV file to analyze data sources that are not included in a
parent repository. For example, you have an asset management system that contains connection details for
all data sources in an organization. You can export the list of data sources in which you want to identify
sensitive data. Data Privacy Management creates one data store for each entry in the CSV file.
You can also create a CSV file to update an existing data store. Data Privacy Management replaces the
details of the existing data store with the details in the CSV file.
Related Topics:
• “Import Job” on page 296
Each column is equivalent to a data store property. The description for the data store property depends on
the category. For more information, see the Data Store Properties section to view the properties for the
corresponding category and data store type.
2 RepoType Required. Category to which you create a connection. Enter one of the
following values:
- Application. SAP
- Cloud
- DB. Database Management
- FS. File Server
- Hadoop. Big Data
- NoSQL. Apache Cassandra
You cannot import data stores for Cloudera Navigator or data store
types included in the Data Integration category.
3 DBType Required. Data store type for the data store category. The required entry
depends on the value you specified for the RepoType property.
- Application. Enter SAP.
- Cloud. Enter one of the following values: Amazon Redshift, Azure
Data Lake, Microsoft Azure SQL Data Warehouse, Microsoft Azure
SQL Database, Salesforce, or Snowflake.
- Database Management. Enter one of the following values: DB2,
DB2i5OS, DB2zOS, JDBC, Microsoft SQL Server, Netezza, Oracle, SAP
HANA, Sybase, or Teradata.
- File Server. Enter one of the following values: Amazon S3, AzureBlob,
Google Drive, HDFS, Microsoft OneDrive, Office365Outlook, or
Microsoft SharePoint.
- Hadoop. Enter Hive.
- NoSQL. Enter Apache Cassandra.
4 Host/URL/Account Optional. Description depends on the value you specified for the
RepoType property.
Application. The SAP host.
Cloud:
- Amazon Redshift. The host name or IP address for the Redshift server.
- Azure Data Lake. The OAuth 2.0 token endpoint URL in the Azure
portal.
- Microsoft Azure SQL Data Warehouse. The host name or IP address
for the Azure server.
- Microsoft Azure SQL Database. The host name or IP address for the
Azure server.
- Salesforce. The Salesforce Service URL.
- Snowflake. The name of the Snowflake account. In the Snowflake
URL, the account name is the first segment in the domain.
Database Management. Host name or IP address of the database server
for the following data store types: DB2, DB2i5OS, DB2zOS, JDBC,
Microsoft SQL Server, Netezza, Oracle, SAP HANA, Sybase, and Teradata
File Management:
- Amazon S3. The Amazon Web Services bucket URL.
- Azure Blob. The Microsoft Azure Blob Storage URL to access a
container.
- File Server. The primary name node URI.
- Google Drive. The Client ID.
- HDFS. The URI to the active HDFS NameNode.
- Microsoft OneDrive. The URL to access OneDrive.
- Microsoft SharePoint. The SharePoint URL.
Hadoop. JDBC connection URL that accesses the Hadoop server. Use
one of the following formats:
- Non-Kerberos source: jdbc:hive2://<host>:<port>/<DB>
- Kerberos source: jdbc:hive2://<host>:<port>/
<DB>;principal=<serviceName>/<host>@realm
NoSQL. The name of the Apache Cassandra host.
5 SchemaOption/KeyspaceOption Required for File Management data stores and Cassandra data stores. Specifies how you add schemas to the data store. Enter SCHEMA to enter a schema name. Leave the column blank to enter a schema path.
6 Schema/Keyspace/Path Required for File Management data stores and Cassandra data stores. If
you entered SCHEMA in the SchemaOption property, enter the schema
name. If you left the property blank, enter the file path to the schema.
For example: /mount_cifs/Root_Folder_Unstructured/
FileTypes
7 Port Optional. Port number for the data store server. Valid for the following
data store types:
- Amazon Redshift
- Db2
- Db2 for z/OS
- File Server
- JDBC
- Microsoft Azure SQL Database
- Microsoft Azure SQL Data Warehouse
- Microsoft SQL Server
- Netezza
- Oracle
- SAP HANA
- Sybase
- Apache Cassandra
8 ConnectString Optional. Native connection string used to access data from the data
store. Valid for the following data store types:
- Db2
- Db2 for i5/OS
- JDBC
- Microsoft Azure SQL Database
- Microsoft SQL Server
- Netezza
- Oracle
- SAP HANA
- Sybase
- Teradata
9 Role Required for Snowflake data stores. Enter the name of the Snowflake
role assigned to the user.
10 Warehouse Required for Snowflake data stores. Enter the name of the Snowflake
warehouse.
11 AdditionalParam Required for Snowflake data stores. Specify one or more JDBC
parameters in the following format:
<param1>=<value>&<param2>=<value>&<param3>=<value>....
12 DatabaseName Optional. Database name. Valid for the following data store types:
- Amazon Redshift
- Db2
- Db2 for z/OS
- Microsoft Azure SQL Database
- Microsoft Azure SQL Data Warehouse
- Microsoft SQL Server
- Netezza
- SAP HANA
- Sybase
- Teradata
14 CyberArkSafe Optional. Name of the CyberArk safe that contains the database
password.
15 DataOwnerSecurityDomain Valid for Database Management data stores. The value of the Profile Execution Engine property on the Data Store Details page. Enter Native or Hadoop.
17 ApplicationGroup Optional. Name of an existing data store group. The data store group
name cannot include a forward slash (/) or exceed 100 characters.
18 LocationName Optional. Geographical mapping of the data store. You can assign one
location to a data store.
19 UserName Optional. User name that the Scan job uses to connect to the data
source.
20 SecurityGroups Optional. Grants users access to data stores. For example, the security
group determines which data stores users see when users manage data
stores and view scan results.
21 Tags Optional. Keywords that add to the data store description. For example,
to indicate the data store environment, you can assign tags such as
production, development, or test to a data store.
23 StorageType Type of data source in the selected category. Applicable for HDFS
connections.
26 FileTypes File types that the Scan job reads from the location configured in the
Path property. Enter ALL or SELECT.
27 SelectedFileTypes Applicable if you enter Select in the File Types property. Enter file
types separated by commas. For example: text,xml,json,pdf
29 S3BucketName Name of the Amazon S3 bucket that contains the semi-structured data
to scan. If blank, the service sets the data store to Not Complete. You
cannot scan incomplete data stores.
30 SourceConnectionName Connection name in the Informatica domain. Do not enter a value for
File Management data stores.
31 Memory Memory that the Scan job uses when you scan the data store. Enter
High, Medium, or Low.
32 Action Optional. Instructs the Import job to create, update, or delete a data
store in the Data Privacy Management repository. Enter one of the
following values:
- U. Creates or updates a data store in the repository. The action that
the job performs depends on whether the data store exists in the
repository and whether you select the Replace Duplicates with Items
Imported option when you import the CSV file.
If the data store does not exist, the Import job creates the data store.
If the data store exists and the Replace Duplicates with Items
Imported option is enabled, the Import job updates the data store.
Warning: If a column in the CSV file contains a null value, the Import
job replaces the value in the repository with a null value.
If the data store exists and the Replace Duplicates with Items
Imported option is not enabled, the Import job skips the row.
- D. Deletes a data store from the repository. If the data store does not
exist, the Import job rejects the row.
If blank, default is U. Creates or updates a data store in the repository.
33 ThirdPartyShare Optional. Specifies that the data in the data store is shared with a third
party. Enter TRUE or FALSE.
34 ThirdPartyName Optional. Applicable if the data store is shared with a third party. Enter
the name of the third party.
35 RemoteAgentScanning Required. Indicates if the data store is associated with a remote agent.
Enter TRUE or FALSE.
36 AssociatedRemoteAgents Do not enter a value for this property. The Import job does not import
remote agent associations. After you import the data store and save the
configuration details, associate the data store with a remote agent on
the Remote Agents workspace.
37 FolderOption Required for unstructured data stores that are associated with a remote
agent. Enter All, FolderSpecific, or Regex.
41 TrustorePath Required for Cassandra data stores. Enter the location of the database
Truststore.
42 LocalDatacenter Required for Cassandra data stores. Enter the name of the datacenter
that contains the required node.
43 KeystorePath Required for Cassandra data stores. Enter the location of the database
Keystore.
44 SSLEnabled Required for Cassandra data stores. Option to enable SSL. Enter TRUE or
FALSE.
45 ClientId Applicable for Office 365 Outlook data stores. Client ID of the Azure
Active Directory application.
47 IncludeDepartments Applicable for Office 365 Outlook data stores. Enter the name of Office
365 departments that you want to include in scans. Use commas to
separate multiple values.
48 IncludeUsers Applicable for Office 365 Outlook data stores. Enter the email IDs of
users or accounts that you want to scan. Use commas to separate
multiple values.
49 ExcludeUsers Applicable for Office 365 Outlook data stores. Enter the email IDs of
users or accounts to exclude from the scan. Use commas to separate
multiple values. The Exclude Users values take precedence over the Include Users values.
50 EmailGroups Applicable for Office 365 Outlook data stores. Enter the email groups to
include. Use commas to separate multiple values.
51 EmailsWithAttachment Applicable for Office 365 Outlook data stores. Enter TRUE or FALSE.
Enter TRUE to include only emails with attachments. Scans ignore
emails without attachments.
If you enter TRUE, scans include the content body of all emails that have
attachments and the attachments.
If you enter TRUE and specify attachment types, the scan includes the
content body of all emails that have attachments, and the attachment
types that you specify.
If you enter FALSE, scans include the content body of all emails with and
without attachments and the attachments.
If you enter FALSE and you specify attachment types, the scan includes
the content body of all emails with and without attachments, and the
attachment types that you specify.
52 AttachmentFileType Applicable for Office 365 Outlook data stores. Specify attachment types
to include. Use commas to separate multiple values.
Choose from the following attachment types:
- Avro
- Compressed Files
- Delimited and Text
- Email
- Image
- JSON
- Microsoft Excel
- Microsoft PowerPoint
- Microsoft Word
- PDF
- Parquet
- Webpage Files
- XML
Note: If you enable Optical Character Recognition (OCR) and you select
the PDF file type, PDF content is processed as optical characters.
To choose the Image file type, OCR must be enabled for unstructured
scans in the System Settings.
53 IncludeMailFolders Applicable for Office 365 Outlook data stores. Folders to include in the
scan. You can enter top-level folder names. Use commas to separate
multiple values.
54 ExcludeMailFolders Applicable for Office 365 Outlook data stores. Folders to exclude in the
scan. You can enter top-level folder names. Use commas to separate
multiple values.
• The first row of the CSV file must contain case-sensitive column headings in the required sequence. If the
CSV file does not contain all of the column headings in the required sequence and case, the Import job
fails.
• Each subsequent row after the column headings defines a data store.
• If the CSV file contains duplicate entries, the Import job processes the entries in the order in which the
rows appear in the file.
• The column values cannot contain the following keywords or terms:
SELECT FROM
SELECT UNION
DELETE FROM
UPDATE SET
<SCRIPT
</SCRIPT>
ALERT(
"/>"
If a row contains keywords or characters that are not valid, the Import job rejects the row. The Import job
logs an error in the job log and processes the remaining rows.
• The column values cannot exceed 255 characters. If a row contains more than 255 characters in any
column, the Import job rejects the row.
User activity information is gathered from user actions taken on the source database or a parent repository
such as Informatica PowerCenter. If a data store name differs from the original source name, you might want
to give the data store an alias name. When you provide a data store alias name that matches the original
source name, the user activity information for that source is linked to the corresponding data store. You can
create an alias for any data store, whether you created the data store in Data Privacy Management or
imported the data store from a parent repository.
The following table lists the order of the columns, column headings, and the format for the values in the CSV
file:
1 DataStoreName Required. Name of the data store as it exists in the Data Privacy Management
repository.
The combination of the DataStoreName and AliasName columns identifies a unique
data store for which to create or delete an alias.
If blank, the Import job rejects the row.
3 Action Optional. Instructs the Import job to create or delete an alias in the repository.
Enter one of the following values:
- U. Creates an alias in the repository.
- D. Deletes an alias from the repository.
If blank, default is U. Creates an alias in the repository.
Consider the following rules and guidelines when you import data store aliases from a CSV file:
• The first row of the CSV file must contain case-sensitive column headings in the required sequence. If the
CSV file does not contain all of the column headings in the required sequence and case, the Import job
fails.
• Each subsequent row after the column headings defines one alias for one data store.
The combination of the DataStoreName and AliasName columns defines a unique data store. If two rows
contain the same values in the DataStoreName and AliasName columns, the Import job creates two
aliases for the data store.
• The column values cannot contain the following keywords or terms:
SELECT FROM
SELECT UNION
DELETE FROM
UPDATE SET
<SCRIPT
</SCRIPT>
ALERT(
"/>"
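For example, a CSV file that creates one alias might contain the following two lines. The heading of the second column is assumed to be AliasName, as referenced in these guidelines, and the data store and alias names are illustrative:
DataStoreName,AliasName,Action
HR_Oracle,HR_PROD,U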
Data Privacy Management sets or updates the data owner for each row in the CSV file that contains data
owner details. If the file contains details for a data store that already exists, Data Privacy Management
replaces the details of the existing data owner with the details in the file. To update the data owner, you must
select the Replace Duplicates with Items Imported option when you upload the CSV file.
Related Topics:
• “Import Job” on page 296
Create the case-sensitive column headings in the first row of the CSV file. Each subsequent row that contains
values defines the data owner of one data store.
1 RepoName Short description of the data store. Required. The name is case-sensitive
and must be unique within the Data Privacy Management repository. The
name cannot exceed 100 characters, contain spaces, or include the
following special characters: \ ~ ! $ % ^ & * ( ) +
If the RepoName value contains a comma (,) character, you must enter the
value surrounded by double quotation marks. For example, enter
"Repo1,Name" for a data store named Repo1,Name.
If the RepoName value contains a quotation mark ("), you must enter the
value surrounded by double quotation marks with an additional double
quotation mark placed after the quotation mark that exists in the value. For
example, enter "Repo1""Name" for a data store named Repo"1Name.
2 DataOwnerSecurityDomain Optional. Name of the collection of user accounts and groups in the
Informatica domain that includes the specified data owner. Valid values are
Native and LDAP. The Native security domain contains the users and
groups created and managed in the Administrator tool. An LDAP security
domain contains users and groups imported from an LDAP directory service.
Note: To update an existing data owner, verify that the name of the data store in the RepoName column
matches the name of the data store in the Data Privacy Management repository.
Important: Before you import protection status for new sensitive fields in a data store, you must first import
Catalog metadata for the sensitive fields.
If you know that a field in a data store contains sensitive information, you can import a CSV file that identifies
the field as sensitive. For example, you scan a data store and the Scan job cannot automatically accept or
reject the sensitive field as a match to a data domain. In this case, you must upload a CSV file to manually
identify the field as sensitive.
You can also import information about whether or not the field is protected, and specify the protection
method. For example, an Informatica PowerCenter Scan job can only verify that a sensitive field is masked if
the field is included in a PowerCenter mapping with a data masking transformation. If you use a different
protection product to mask sensitive data, you must upload a CSV file to manually identify the sensitive field
as protected and specify the protection method.
After the Import job is finished, you can view the protection status information that you imported on the Data
Stores workspace.
Related Topics:
• “Import Job” on page 296
• “Protection Status Indicator” on page 514
You can create the CSV file or update one of the following files that have the required format:
• The Scan report includes fields that the Scan job could not automatically reject or identify as sensitive.
To download the report, go to the Jobs workspace and click the job ID of the Scan job. Then, click the icon
in the Download Report column of the Profiling job step.
• The DataStoreDetails.csv file includes columns that the Scan job identified as sensitive.
To download the file, navigate to the data store Sensitive Fields page, select Actions > Export Data Store
Details, and download the .zip file. Then extract the DataStoreDetails.csv file.
1 RepoName Required. Short description of the data store. The name is case-
sensitive and must be unique within the Data Privacy Management
repository. The name cannot exceed 100 characters, contain spaces, or
contain the following special characters: \ ~ ! $ % ^ & * ( ) +
2 SchemaName/FolderName Required. Name of the schema or folder that contains the field with protection status information.
3 Object Required. Name of the table or object that contains the field with
protection status information.
4 FieldName/FileType Required. Name of the field that has protection status information.
5 IsSensitive Optional. Indicates whether the field contains sensitive data. Enter one
of the following values:
- Y
- N
6 DomainName Optional. Name of the data domain that contains the sensitive field.
7 ConformanceMatch Optional. Specifies the minimum percent of sensitive records for Data
Privacy Management to automatically accept a field as sensitive.
8 ProtectionStatus Optional. Indicates whether or not the field is protected. Enter one of the
following values:
- Protected
- Unprotected
• The first row of the CSV file must contain case-sensitive column headings in the required sequence. If the
CSV file does not contain all of the column headings in the required sequence and case, the Import job
fails.
• Each subsequent row after the column headings defines the sensitivity and protection status of one field
in a data store.
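For example, a CSV file that marks one database column as sensitive and protected might contain the following two lines. The headings are assumed to match the column names listed in the table above, and the data store, schema, object, field, and data domain names are illustrative:
RepoName,SchemaName/FolderName,Object,FieldName/FileType,IsSensitive,DomainName,ConformanceMatch,ProtectionStatus
HR_Oracle,HR,EMPLOYEES,SSN,Y,Social Security Number,80,Protected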
Data lineage describes the flow of data between data stores. Data Privacy Management augments the data
lineage information you import with scan results for the data stores. To import custom data store lineage,
perform the following steps:
After the Import job finishes, you can access the Proliferation page from the Overview workspace to view the
lineage. The Overview workspace shows lineage for scanned data stores. If you imported lineage for an
unscanned data store, scan the data store to view the lineage.
The following table lists the order of the columns, column headings, and the format for the values in the CSV
file:
1 EtlTool Required. Name of the tool from which you extracted the lineage. If blank, the
Import job rejects the row.
2 FromDataStore Required. Name of the source data store from which the data moves. Enter the
exact name of the data store in the Data Privacy Management repository.
The job rejects the row if the value is blank or if the data store name does not exist
in the repository.
3 ToDataStore Required. Name of the target data store to which data moves. Enter the exact name
of the data store in the repository.
The job rejects the row if the value is blank or if the data store name does not exist
in the repository.
4 FromSchema Optional. Name of the schema in the source data store from which data moves.
5 ToSchema Optional. Name of the schema in the target data store to which data moves.
6 FromTable Required. Name of the table in the source data store from which data moves. If
blank, the Import job rejects the row.
7 ToTable Required. Name of the table in the target data store to which data moves. If blank,
the Import job rejects the row.
8 FromColumn Required. Name of the column in the source data store from which data moves. If
blank, the Import job rejects the row.
9 ToColumn Required. Name of the column in the target data store to which data moves. If
blank, the Import job rejects the row.
10 ProtectionStatus Optional. Indicates if the column data is protected. Enter one of the following
values:
- Yes. Data is protected.
- No. Data is not protected.
If blank, default is No.
Consider the following rules and guidelines when you import lineage from a CSV file:
• The first row of the CSV file must contain case-sensitive column headings in the required sequence. If the
CSV file does not contain all of the column headings in the required sequence and case, the Import job
fails.
• Each subsequent row after the column headings defines one level of lineage for a column. For example,
the social security number column flows from data store A to data store B.
The combination of the EtlTool, FromDataStore, ToDataStore, FromSchema, ToSchema, FromTable, ToTable, FromColumn, and ToColumn columns defines a unique row of lineage. If the columns in two rows are identical, the Import job treats the rows as duplicates.
• If the CSV file contains duplicate entries, the Import job processes the entries in the order in which the
rows appear in the file.
• The column values cannot contain the following keywords or terms:
SELECT FROM
SELECT UNION
DELETE FROM
UPDATE SET
<SCRIPT
</SCRIPT>
ALERT(
"/>"
If a row contains keywords or characters that are not valid, the Import job rejects the row. The Import job
logs an error in the job log and processes the remaining rows.
• The column values cannot exceed 255 characters. If a row contains more than 255 characters in any
column, the Import job rejects the row.
• Enclose names that include a dot (.) or comma (,) within quotes for any column value that includes a
name. If a value for a name column includes a dot or comma not enclosed within quotes, the Import job
might not process the column values correctly. For example, the Import job processes a comma as a
delimiter between column values.
• The data stores in the CSV file must exist in the Data Privacy Management repository. If you import
lineage for a data store that does not exist, the Import job rejects the row.
• The data stores in the CSV file do not require scan results to import the lineage. However, scan results are
required to view the data store proliferation on the Overview workspace.
• When you import the file, the Import job rejects the row if a data store is included in a running scan. Wait
to import the file until the Scan job completes or terminates.
• To add new or changed lineage, append changes to the original file. When the Import job runs, the job
deletes all previously imported lineage. Then, the job imports the information from the current file.
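For example, a CSV file that defines one level of lineage for a social security number column might contain the following two lines. The headings are assumed to match the column names listed in the table above, and the tool, data store, schema, table, and column names are illustrative:
EtlTool,FromDataStore,ToDataStore,FromSchema,ToSchema,FromTable,ToTable,FromColumn,ToColumn,ProtectionStatus
PowerCenter,HR_Oracle,HR_Warehouse,HR,DWH,EMPLOYEES,DIM_EMPLOYEE,SSN,SSN,No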
Data Privacy Management sets or updates the connection assignments for each row in the CSV file that
contains connection assignment details. If the CSV file contains details for a data store that already exists,
Related Topics:
• “Import Job” on page 296
For Snowflake databases, you must enter the values in the SchemaName column in the following format:
<database name>.<schema name>
This is required to differentiate between schemas with the same name across different databases. The rows
are rejected if you do not enter the values in the correct format.
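For example, for a hypothetical PUBLIC schema in a SALES_DB database, enter SALES_DB.PUBLIC in the SchemaName column.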
4. In the Display menu, select Missing Data Stores or All Data Stores.
5. In the Classification Policies field, select the classification policies to evaluate in the data stores.
6. Optionally, select the Include row count check box.
Note: Data Privacy Management includes row counts when you import Catalog resources only if you
clear the Automatically update scan results after incremental scans check box.
7. Optionally, clear the Automatically update scan results after incremental scans check box. Default is
selected.
• When enabled, the Sync Catalog Updates job runs when you import the resources. The Sync Catalog
Updates job imports only the updated columns that are included in the data domain resources that
match the selected classification policies and in the data store scan options. The Sync Catalog
Updates job does not include row counts.
• When disabled, the Import Catalog Resources job runs when you import the resources. The Import
Catalog Resources job imports the columns that are included in the data domain resources that
match the selected classification policies and in the data store scan options.
8. Select one or more data stores to import Enterprise Data Catalog resources.
9. Click Import.
Data Privacy Management imports the data domain resources that match the selected classification
policies for the selected data stores.
Related Topics:
• “Import Job” on page 296
To export a breach report, you must have permissions to manage data stores. Also, a Subject scan and data
profiling must complete successfully before breach reports are available to export.
If a data store does not contain information for a column, the column value is empty in the data breach
report.
The following table lists the order of the columns, column headings, and the format for the values in the CSV
file:
1 Name Name of the data store as it appears in the Name column on the Data Stores
workspace.
2 Data Store Type Type of data store as it appears in the Data Store Type column on the Data Stores
workspace.
3 Data Owner Name of the individual, group, or organization that is responsible for the accuracy
and integrity of the source data.
4 Location If the data store is associated with a location defined on the Locations workspace,
the location name.
6 Time Date and time that you export the data breach report, in the India Standard Time
(IST) zone. The column value uses the following format:
mm/dd/yyyy hh:mm:ss <AM_PM>
For example: 2/22/2020 12:10:16 AM
8 Fields/Files The number of sensitive fields or files in the data store that were impacted by the
breach.
9 Sensitive Fields Protected The percentage of sensitive fields in the data store that are protected. For data store types that protection extensions do not support, the value is Not Applicable.
10 Protection Status Indicates whether the sensitive fields in the data store are Protected or Unprotected.
For data store types that protection extensions do not support, the value is
Unprotected.
11 Protection Extensions If sensitive fields are protected, lists the names of the protection extensions configured on the Extensions workspace. Multiple values are separated by a comma (,).
For data store types that protection extensions do not support, the value is Not
Applicable.
12 Categories Names of the categories associated with data domains in the data stores. Multiple
values are separated by a comma (,).
For unstructured data stores configured for Subject Registry, data domains are not
associated with categories. The value for unstructured data stores is Not
Applicable.
13 Purposes The reason that you collected the sensitive data about the individual. Multiple
values are separated by a comma (,).
Unstructured data stores configured for Subject Registry cannot specify a purpose.
The value for unstructured data stores is Not Applicable.
14 Shared with Third Party Indicates that the data is shared with a third party such as a third-party organization or vendor. Values are Yes or No.
15 Third-Party Description If the data is shared with a third party, lists the names of the third parties. Multiple values are separated by a comma (,).
Before you export data store connection information, scan the data store.
To configure the connectivity for all data stores except for IBM Db2, add an entry for the data stores in the
corresponding configuration files. To configure the connectivity for IBM Db2 data stores, run a set of
commands from a command prompt. The configuration export file includes the configuration entries to copy
and the IBM Db2 commands to run.
For example, a Scan job imported data stores that connect to Oracle databases. Update the tnsnames.ora
configuration file to include an entry for each data store that the Scan job imported.
orasas =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = IRL64ILM04)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = sas.informatica.com)
)
)
INVR268 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = INVR28ILM268)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = orcl.informatica.com)
)
)
################################################################################
# odbc.ini #
################################################################################
################################################################################
# db2 #
################################################################################
################################################################################
1. Names of the database configuration files on the machines that host the services.
2. Entries to add to the database configuration files on the machines that host the services.
3. Database configuration file entries that you copied from the export file.
In this example, you copy the orasas and INVR268 entries from the export file to the tnsnames.ora file.
You copy the SQL Server Wire Protocol 1 entry from the export file to the odbc.ini file.
7. For IBM Db2 data stores, access the command line on the machine that hosts the Data Integration
Service.
8. Copy the IBM Db2 commands from the export file and run the commands. If the services are hosted on
different machines, run the commands on each machine.
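The exact commands to run are included in the export file. As an illustration only, IBM Db2 catalog commands generally resemble the following lines, where the node, host, port, and database names are assumptions:
db2 catalog tcpip node dpmnode remote db2host.example.com server 50000
db2 catalog database SALESDB at node dpmnode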
Duplicate data stores can have different metadata such as location, owner, and data store group. Data
Privacy Management only uses the connection properties to determine duplicate data stores.
The following table lists sample data store connection properties for an Oracle database:
Data Store Name Data Store Type Host Port Service Name
• Two Oracle 11g Data Stores [ilm600:1521]. This group includes Data Store 1 and Data Store 2.
• Two Oracle 11g Data Stores [ilm600:1521:orcl]. This group includes Data Store 3 and Data Store 4.
Data Store 5 has the same data store type, host, and port as the other data stores. However, the service name
is unique, so Data Store 5 is not a duplicate data store.
You might merge data stores for one or more of the following purposes:
Consolidate duplicate data stores that you imported from a parent repository.
When a Scan job imports data stores from a parent repository, such as Informatica PowerCenter, the
Scan job can create duplicate data stores. For example, when you scan an Informatica PowerCenter data
store, the Scan job creates one data store for each source and target connection in the PowerCenter
repository mappings. You can have duplicate data stores if a mapping includes the same connection as
a source and a target.
Consolidate duplicate data stores that you imported from CSV files.
When you import data stores from CSV files, the Import job creates one data store for each connection
in the file. You can have duplicate data stores if the CSV file includes multiple data stores with the same
connection properties.
Consolidate duplicate data stores that you created with a combination of methods.
You might have duplicate data stores if you use a combination of import and creation methods to create
data stores. For example, you manually create data stores, import data stores from a CSV file, and
import data stores from Informatica PowerCenter. The CSV file might contain one or more data stores
that you imported from Informatica PowerCenter or that you manually created.
You might want to delete data stores that connect to the same database. When you create or import data
stores, Data Privacy Management does not validate if a data store with the same connection properties
exists. Data Privacy Management identifies data stores that contain the same connection properties as
duplicate data stores.
Review the rules and guidelines to learn when you can delete data stores and what other options you might
need to consider. You might want to merge data stores instead of deleting them.
• You cannot delete data stores that are included in a completed scan, a scan that is currently running, or a
scan that is scheduled to run in the future.
• When you delete data stores, you delete the data store from the Data Privacy Management repository. You
also delete the corresponding audit trail for the data store. If the data store that you want to delete is a
Data Store Properties Overview
When you create a data store or import data stores from Informatica PowerCenter or Informatica Data
Engineering Integration, you configure connection properties to the data source and additional properties that
contain metadata about the data store.
When you view a Data Store Details page, the page contains sections for the following property types:
Connection properties
Contain the following information for all data store types: data store name, description, category, and
data store type.
Additional properties
Contain the following information: metadata such as location and tags, third party share status, the
remote agent associated with the data store, and whether the Scan job will automatically synchronize with
Enterprise Data Catalog.
The additional properties you can configure depend on the data store type.
Property Description
Name Required. Short description of the data store. The name is not case sensitive and must be unique
within the Data Privacy Management repository. The name cannot exceed 255 characters or contain
spaces or the following special characters: \~!$%^&*()+
Description Optional. Long description of the data store that does not exceed 255 characters.
Data Store Required. Select one of the following options, depending on the category:
Type - Application: SAP and Active Directory.
- Big Data: Cloudera Navigator, Hadoop Distributed File System, and Hive.
- Cloud: Amazon Redshift, Amazon S3, Azure Data Lake, Microsoft Azure Blob Storage, Microsoft
Azure SQL Data Warehouse, Microsoft Azure SQL Database, Google BigQuery, Salesforce, and
Snowflake.
- Data Integration: Informatica Cloud, Informatica Data Engineering Integration, Informatica
PowerCenter on IBM Db2, Informatica PowerCenter on Microsoft SQL Server, Informatica
PowerCenter on Oracle, and SQL Server Integration Services.
- Database Management: IBM Db2, IBM Db2 for i5/OS, IBM Db2 for z/OS, JDBC, Microsoft SQL
Server, Netezza, Oracle, SAP HANA, Sybase, and Teradata.
- File Management: File System, Google Drive, Microsoft OneDrive, and Microsoft SharePoint.
- NoSQL: Apache Cassandra
- Email Server: Office 365 Outlook
After you save the data store, you cannot change the data store type.
Additional Properties
Configure additional properties related to the data store metadata. Configure the data owner and department
for the data store. Assign data stores to locations, data store groups, and security groups. You can use the
additional properties for drill-down analysis of the scan results.
The following table describes the additional properties you can configure for all data store types:
Connection Description
Property
Data Store Required. Select a logical grouping of related data stores from the list of groups defined on the
Group Data Store Groups workspace. You can assign a data store to one data store group.
Security Groups Required. Select a security group from the list of all security groups in the Informatica domain,
including native and LDAP groups. Native groups display the group name. For example,
Administrator. LDAP groups include the security domain and the group name. For example,
<security domain>/<LDAP group name>.
Tags Optional. Keywords that add to the description. Create a new tag or select an existing tag from the
list. You can assign multiple tags to a data store.
After you assign a tag to a data store, you can use tags in alert rules to improve the specificity of
the alert. For example, you want to be alerted when the risk score of any production data store
increases by 20%. However, you do not want to be alerted for non-production data stores. You can
assign the tag PROD to the production data stores. Then, configure the alert conditions to alert you
when the risk score increases by 20% for data stores that have the PROD tag.
When you delete a data store, if a tag was only assigned to the deleted data store, the Data Privacy
Management Service deletes the tag from the repository.
Department Optional. Enter the department that is responsible for the maintenance or ownership of the data
store. The department can be a line of business, such as investment banking or wealth
management, or a corporate department, such as Sales or Marketing. The name is case-sensitive
and cannot exceed 100 characters.
After you scan data stores, you can view a summary of scan results for all data stores by
department. You can filter the scan results by location on the Overview workspace.
Data Owner Optional. Select the individual, group, or organization name that is responsible for the accuracy and
integrity of the source data on which you run a classification profiling scan.
Shared with Optional. Indicates that the data in the data store is shared with a third party such as a third-party
Third Party organization or vendor.
Third-Party Optional. Appears if you select the Shared with Third Party check box. Enter a description of the
Description third party. You can enter up to 2000 characters and use the following special characters: , _ @
Location Optional. Select a location that is defined on the Locations workspace. If you do not select a
location, Data Privacy Management assigns the Unknown location to the data store.
Auto Sync Optional. Valid only for data stores that are not scanned with remote agents. Determines if the
Catalog Scan job automatically synchronizes data source information in Enterprise Data Catalog with the
corresponding data store information in Data Privacy Management. Default is disabled.
Custom Risk Optional. If you created a custom risk score factor on the Settings workspace, the factor appears
Score Factor as a property. For existing data stores, you can select a value or range. For new data stores, the
property shows the default value or range that you specified on the Settings workspace. You can
select another value or range from the list.
Property Description
URL Enter the JDBC connection URL used to access the server.
Password Optional. Password for the database user name. When you save the data store, Data Privacy
Management encrypts the password.
Agent URL Optional. Enter the URL of the Enterprise Data Catalog agent in the following syntax: http://<host
name or IP address>:<port>/<directory>
Connection Optional. Enter the same value that you entered for the Agent URL property.
String
Secure JDBC Optional. Database parameters to access databases that are secured with the SSL protocol or Azure
Parameters Key Vault. The parameter string cannot exceed 500 characters. When you save the data store, Data
Privacy Management encrypts the parameter string.
Enter the parameters as name=value pairs separated by semicolon characters (;).
Data Privacy Management appends the secure JDBC parameters to the connection string. A sample
parameter string appears after this table.
Schema Enter one or more schemas that Data Privacy Management reads when you scan the data store. The
schema is case-sensitive. Use a comma to separate multiple entries. If empty, Data Privacy
Management reads all schemas from the database.
Source Optional. Default is All. A filter to specify tables that you want to include or exclude in the scan. Enter
Metadata a regular expression to include multiple filter parameters. For example: Table1, NOT Table2,
Filter SUB%, %EMP, %DATA%. The filter includes Table1, tables with names that start with SUB, names that
end with EMP, and tables that include DATA anywhere in the name. The filter excludes Table2.
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
Custom JVM Optional. JVM parameters that configure the scanner container. Use the following arguments to
Options configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Agent Optional. Enter the Agent configuration options to include in the scan.
Options
CyberArk Optional. Name of the CyberArk safe that contains the database password. When you test the data
Safe store connection or you scan the data store, Data Privacy Management retrieves the password from
CyberArk.
Include Views Optional. Determines whether a Scan job evaluates data in views. If enabled, a Scan job identifies
in the Scan sensitive data in views during metadata and data profiling. The Scan job evaluates data in all views
for the schemas specified in the Schema property. If the property is blank, the Scan job evaluates
data in all views of all schemas.
If disabled, the Scan job does not evaluate data in views. If you disable the property after a Scan job
completes, any scans that you run in the future will not evaluate data in views. However, the
completed Scan job results remain.
Source Optional. Connection name in the Informatica domain that the scan job uses to connect to the source
Connection database. If the Scan job cannot connect to the source, the job fails.
Name
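For example, the Secure JDBC Parameters property described in the table above might be entered for an
SSL-secured source as a single string of name=value pairs separated by semicolons, similar to the following.
The truststore path and password are placeholders:
EncryptionMethod=SSL;ValidateServerCertificate=True;HostNameInCertificate=<database host name>;TrustStore=<path to truststore file>;TrustStorePassword=<truststore password>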
When you test a connection to Amazon Redshift, the Data Privacy Management Service tests the
combination of the User, Host, Port, and Database properties. If any of the property values are not valid, the
test connection fails.
Property Description
User Required. Name of the user account to log in to Amazon Redshift. The user must have read
permission to the source files and have access to the IP address.
Password Optional. Password for the user account to log in to Amazon Redshift. When you save the data store,
Data Privacy Management encrypts the password.
AWS Access Required. Access key ID to access the Amazon account resources. For example,
Key ID AKIAIOSFODNN7EXAMPLE. When you save the data store, the Data Privacy Management Service
encrypts the value in the repository.
If blank, the data store has a status of Not Complete. You cannot scan incomplete data stores.
AWS Secret Required. Secret access key to access the Amazon account resources. This value is associated with
Access Key the access key ID and uniquely identifies the account. For example, wJalrXUtnFEMI/K7MDENG/
bPxRfiCYEXAMPLEKEY. When you save the data store, the Data Privacy Management Service
encrypts the value in the repository.
If blank, the data store has a status of Not Complete. You cannot scan incomplete data stores.
AWS Master Optional. Provide a 256-bit AES encryption key in the Base64 format. You can generate a key using a
Symmetric third-party tool.
Key
Schema Optional. Select one of the following options to determine how you add schemas to the data store:
Option - All. Includes all schemas that you can access.
- Select From List. Displays a list of schemas in the Schema property.
- Specify Regex. Enables the Regular Expression for Schema property. Select this option if you
want to enter schemas as a regular expression.
Regular Optional. Available if you select the Specify Regex option in the Schema Option property. Enter a
Expression for regular expression to specify schemas that you want to include or exclude from the scan. Schema
Schema names are case-sensitive. A sample expression appears after this table.
AWS Cluster Optional. Node type of the Amazon Redshift cluster. Select one of the following options:
Node Type - dc1.8xlarge
- dc1.large
- ds1.8xlarge
- ds1.xlarge
- ds2.8xlarge
- ds2.xlarge
For more information about nodes in the cluster, see the Amazon Redshift documentation.
AWS Total Required. Number of nodes in the Amazon Redshift cluster. For more information about nodes in the
Nodes in cluster, see the Amazon Redshift documentation.
Cluster
Schema Optional. Displays a list of available schemas to include in the scan if you chose Select From List in
the Schema Option property.
S3 Bucket Required. Name of the Amazon S3 bucket that contains the semi-structured data to scan. If blank,
Name the data store has a status of Not Complete. You cannot scan incomplete data stores.
Case Optional. Select to indicate that the data store is configured for case sensitivity. Default is cleared.
Sensitive
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a
large data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values
such as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Source Optional. Connection name in the Informatica domain that the Scan job uses to connect to the
Connection Amazon Redshift source.
Name
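For example, if you select Specify Regex in the Schema Option property for an Amazon Redshift data store,
you might enter a regular expression similar to the following to include all schemas whose names begin with
HR or FIN. The schema name patterns are illustrative:
HR.*|FIN.*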
Create a data store for each Amazon Web Services (AWS) bucket. You can create multiple data stores that
connect to the same AWS bucket, but the Source Directory property must be unique for each data store.
Property Description
Scan with Select to associate the data store with a remote agent when you run scans to discover personal and
Remote Agent sensitive data or build a Subject Registry index.
Important: If you select this option, you must associate the data store with a remote agent before
running a scan. After you save the data store, you cannot change this property.
S3 Supported Specify if the data store connects to a storage type that is compatible with Amazon S3 and can be
Type accessed by Amazon Web Services REST APIs.
AWS Bucket Required if you select No in the S3 Supported Type field. Virtual-hosted-style URL for the AWS bucket
URL that contains the semi-structured data to scan. Use the following syntax for the URL:
<Bucket_Name>.s3.amazonaws.com
For example, if the bucket name is DPM_QA, enter DPM_QA.s3.amazonaws.com. Do not include the
HTTP protocol in the value.
If another data store has the same value for the AWS bucket URL, Data Privacy Management marks
the data store as a potential duplicate.
REST Required if you select Yes in the S3 Supported Type field. Enter the REST endpoint URL of the
EndPoint URL Amazon S3 compatible source. For example, the Scality Ring REST endpoint URL is: http://
s3.isv.scality.com
AWS Session Required if the credentials are temporary. Enter the AWS token for the session.
Token
AWS Access Optional. Access key ID to access the Amazon account resources. For example:
Key ID AKIAIOSFODNN7EXAMPLE. When you save the data store, the Data Privacy Management Service
encrypts the value in the repository.
If blank, the data store has a status of Not Complete. You cannot scan incomplete data stores.
AWS Secret Optional. Secret access key to access the Amazon account resources. The secret access key is
Access Key associated with the access key ID and uniquely identifies the account. For example,
wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY. When you save the data store, the Data Privacy
Management Service encrypts the value in the repository.
If blank, the data store has a status of Not Complete. You cannot scan incomplete data stores.
AWS Bucket Optional. Name of the Amazon S3 bucket that contains the semi-structured data to scan. Use the
Name bucket name configured in the AWS Bucket URL property. For example, if the AWS Bucket URL value
is DPM_QA.s3.amazonaws.com, enter DPM_QA.
If the bucket name does not match the bucket name in the AWS Bucket URL property, the Scan job
fails.
If blank, the data store has a status of Not Complete. You cannot scan incomplete data stores.
Source Optional. Directory path that contains the files to scan. To scan all root directories, enter a forward
Directory slash /. To scan a specific directory, use the following syntax: /<Directory_Name>
For example, if the directory name is Folder1, enter /Folder1. Do not include a forward slash / at
the end of the directory. The Scan job reads files from the last directory specified. For example, if
you enter /HumanResource/Employees/US, the Scan job only reads files from the US directory.
The scan reads files from the specified directory only. To enable a scan to read from subdirectories,
enable the Include Sub Directory property.
If blank, the data store has a status of Not Complete. You cannot scan incomplete data stores.
Region Name Optional. The name of the geographical region for the Amazon S3 bucket. The scan uses the region
name to create a connection in the Administrator tool. Select the same region that is associated with
the Amazon S3 bucket in Amazon S3. If you select a different region, the Cloud job fails during the
Profiling step.
If blank, the data store has a status of Not Complete. You cannot scan incomplete data stores.
Folder Required if you enable the Scan with Remote Agent property. Folders that the scan reads from the
Options location configured in the Source Directory property. Select one of the following options:
- All
- Select Specific Folders
- Use Regular Expression
Default is All.
Include Appears when you enable the Scan with Remote Agent property and choose the Select Specific
Folders Folders option in the Folder Options property.
Enter the file paths for at least one folder that you want to include in the scan. You can also skip this
property and list folders in the Exclude Folders property. In this case, Data Privacy Management
scans the folders in the data store except the excluded folders.
Exclude Optional if you choose the Select Specific Folders option in the Folder Options property.
Folders Enter the file paths for at least one folder that you want to exclude from the scan.
Folder Required if you enable the Scan with Remote Agent property and you choose the Use Regular
Regular Expression option in the Folder Options property.
Expression Enter a regular expression for at least one folder to include in the scan.
File Types Required if you enable the Scan with Remote Agent property. File types that the scan reads from the
location configured in the Source Directory property.
Choose All to scan all file types. Choose Select to select specific file types to scan.
Default is All.
Selected File Required if you choose Select in the File Types property. Click Select All to choose all file types, or
Types select one or more of the following options:
- Avro
- Compressed Files
- Delimited and Text
- Image
- JSON
- Microsoft Excel
- Microsoft PowerPoint
- Microsoft Word
- PDF
- Parquet
- Webpage Files
- XML
Note: If you enable Optical Character Recognition (OCR) and you select the PDF file type, the PDF file
content is processed as optical characters.
To choose the Image file type, OCR must be enabled for unstructured scans in the System Settings.
Enter File Optional. Available if you do not enable the Scan with Remote Agent property. Character that
Delimiter separates entries in the file. Specify the file delimiter if the file from which you extract metadata uses
a delimiter other than the following list of delimiters:
- Comma (,)
- Horizontal tab (\t)
- Semicolon (;)
- Colon (:)
- Pipe symbol (|)
Verify that you enclose the delimiter in single quotes. For example, '$'. Use a comma to separate
multiple delimiters. For example, '$','%','&'. Default is a comma (,).
First Level Optional. Available if you do not enable the Scan with Remote Agent property. Name of the folder
Directory that you want to scan. For example: subfolderA.
The folder must be an immediate subfolder of the directory that you specified in the Source Directory
property. If you do not provide a folder name, Data Privacy Management will scan the files in the
source directory.
To specify multiple folders, separate the folder names with a comma (,). For example: subfolderA,
subfolderB
Include Sub Optional. Available if you do not enable the Scan with Remote Agent property. If enabled, the scan
Directory reads data from the directory and all subdirectories of the location configured in the Source
Directory property or the First Level Directory property, if specified.
Default is disabled.
Memory Optional. Memory that the scanner uses when you run a scan on the data store. Select High, Low, or
Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Source Optional. Connection name in the Informatica domain that the scan uses to connect to the Amazon
Connection S3 source.
Name
When you test a connection to Apache Cassandra, the Data Privacy Management Service tests the
combination of the User Name, Password, Host, Port, and Local Data Center properties. If any of the property values are
not valid, the test connection fails.
Property Description
Port Optional. Port number of the Apache Cassandra server. Default port number is 9042.
Local Data Name of the datacenter that contains the required node.
Center
User Name Optional. Name of the user account to log in to the Apache Cassandra server. The user must have
read permission to the source files and have access to the Cassandra server.
Password Optional. Password for the user account to log in to the server. When you save the data store, Data
Privacy Management encrypts the password.
Keyspace A keyspace is the Apache Cassandra equivalent of a database schema. Select one of the following
Option options to determine how you add keyspaces to the data store:
- All. Includes all keyspaces that you can access.
- Select From List. Displays a list of keyspaces in the Keyspace property.
- Specify Regex. Enables the Regular Expression for Keyspace property. Select this option if you
want to enter keyspaces as a regular expression.
Regular Optional. Available if you select the Specify Regex option in the Keyspace Option property. Enter a
Expression regular expression to specify the keyspaces that you want to include or exclude from the scan.
for Keyspace
Keyspace Optional. Displays a list of available keyspaces to include in the scan if you chose Select From List in
the Keyspace Option property.
Case Optional. Indicates that the data store is configured for case sensitivity. Default is cleared.
Sensitive Note: You cannot include keyspaces, tables, or columns that include uppercase characters in a
domain discovery scan on an Apache Cassandra data store.
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
Custom JVM Optional. JVM parameters that configure the scanner container. Use the following arguments to
Options configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
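For example, to raise the scanner log level and allocate two cores to the scanner container, you might enter
Custom JVM Options similar to the following. This sketch assumes the arguments are entered as standard
JVM system properties with a leading -D and separated by spaces; the values are illustrative:
-Dscannerloglevel=DEBUG -Dscanner.container.core=2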
When you test a connection to Cloudera Navigator, the Data Privacy Management Service tests the
combination of the Navigator URL, User, and Password properties. The Data Privacy Management Service
uses the host name specified in the Navigator URL property to identify duplicate data stores.
Property Description
Navigator Required. Enter the URL of the Cloudera Navigator Metadata Server. The Cloudera Navigator Scan job
URL identifies proliferation between scanned Hive data stores that are on the same host as Cloudera
Navigator.
User Required. Name of the user account to log in to Cloudera Navigator. The Cloudera Navigator Scan job
requires a Cloudera Manager user account that includes the Full Administrator or Navigator
Administrator role. The roles include authorization to connect to and retrieve metadata from Cloudera
Navigator. If the user account does not have one of these roles, the Scan job fails.
Password Required. Password for the user account to log in to Cloudera Navigator. When you save the data
store, Data Privacy Management encrypts the password.
Disable SSL Optional. Determines whether Data Privacy Management validates the SSL certificate, truststore,
Validation keytab file, and other SSL details for a Cloudera Navigator data store encrypted with the SSL protocol.
When the check box is selected, Data Privacy Management does not validate the SSL details. When the
check box is cleared, Data Privacy Management validates the SSL details. Default is cleared.
Hive Optional. Hive databases for which the Cloudera Navigator Scan job identifies lineage. To determine
Database lineage for all databases in scanned Hive data stores that are on the same host as Cloudera
Navigator, do not enter a value in this property. The Hive database name is case-sensitive. Use a
comma to separate multiple entries.
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner container.
The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Enable Optional. Select this option to extract metadata about resources in Enterprise Data Catalog that the
Reference data store references. Default is disabled.
Resources
Retain Optional. Appears when you select the Enable Reference Resources property. Reference assets are
Unresolved assets from reference resources in Enterprise Data Catalog, such as reference data sources and
Reference reference data sets. Select this option to keep unresolved reference assets. Default is enabled.
Assets
You can use the JDBC type with the corresponding CDATA connectors to create data stores for MongoDB,
Workday applications, and REST Services. You can perform domain discovery and Subject Registry scans on
MongoDB, Workday applications, and REST services.
The connection properties depend on the database. The following table describes the database management
connection properties:
Property Description
Authentication Required for Microsoft SQL Server data stores only. Select SQL Server Authentication or Windows
Mode Authentication. Default is SQL Server Authentication.
Driver Class Required for JDBC data stores only. Name of the JDBC driver class.
URL Required for JDBC data stores only. Enter the JDBC connection URL used to access the Hadoop
server in one of the following formats:
- Non-Kerberos source: jdbc:hive2://<host>:<port>/<DB>
- Kerberos source: jdbc:hive2://<host>:<port>/<DB>;principal=<serviceName>/
<host>@realm
For Workday applications, the URL must contain the following:
- User name and password
- Tenant
- Host
- Service
Optionally, you can add views in the URL.
User Optional for SAP HANA data stores only. Required for other database management data store
types. Name of the user account to log in to the database.
Password Optional. Password for the database user name. When you save the data store, Data Privacy
Management encrypts the password.
Sub System ID Required for IBM Db2 for z/OS data stores only. Enter the subsystem ID (SSID) for the Db2 system.
Agent URL Optional for IBM Db2 for i5/OS and JDBC data stores only. Enter the URL of the Enterprise Data
Catalog agent in the following syntax: http://<host name or IP address>:<port>/
<directory>
Host Required for IBM Db2, IBM Db2 for i5/OS, Microsoft SQL Server, Netezza, Oracle, SAP HANA,
Sybase, and Teradata data stores. Optional for IBM Db2 for z/OS data stores. Host name or IP
address for the database server.
Port Required for IBM Db2, IBM Db2 for i5/OS, Microsoft SQL Server, Netezza, Oracle, SAP HANA, and
Sybase data stores. Optional for IBM Db2 for z/OS data stores. Port number for the database
server.
DDF Location Appears for IBM Db2 for z/OS. Optional. Location of the DB2 subsystem on IBM Db2 for z/OS. Use
the DB2 'DISPLAY DDF' command to identify the value.
Database Option Appears for IBM Db2 for z/OS. Optional. Select one of the following options to determine how you
add IBM DB2 for z/OS databases to the data store:
- All. Includes all databases that you can access.
- Select From List. Displays a list of databases in the Database property.
- Specify Regex. Enables the Regular Expression for Database property. Select this option if you
want to enter databases as a regular expression.
Database Required for IBM Db2, IBM Db2 for i5/OS, Microsoft SQL Server, Netezza, Oracle, SAP HANA, and
Sybase data stores. Optional for IBM Db2 for z/OS and Teradata data stores. The name of the
database instance.
Service Required for Oracle data stores only. The name of the database service.
Jdbc Username Optional for IBM Db2 for z/OS data stores only. The user name for the PowerExchange user
account.
Jdbc Password Optional for IBM Db2 for z/OS data stores only. The password for the PowerExchange user
account.
Pwx Username Optional for IBM Db2 for i5/OS data stores only. The user name for the PowerExchange user
account.
Pwx Password Optional for IBM Db2 for i5/OS data stores only. The password for the PowerExchange user
account.
Connection Required for IBM Db2 for z/OS data stores. Optional for other database management data store
String types. The connection string cannot exceed 32 characters. Enter the connection string using the
following syntax:
- IBM: <database_name>
- JDBC: <JDBC_data_source_name>
Note: For MongoDB, Workday applications, and REST Services, do not enter a connection string.
- Microsoft SQL Server: <host>,<port>@<database_name> or for named instances:
<host_name\instance_name>@<database_name>
- Oracle: The connection string is the TNSNAMES entry: <database_name>
- Netezza: <ODBC_data_source_name>
- Sybase: <server_name>@<database_name>
- Teradata: <ODBC_data_source_name>
Sample connection strings appear after this table.
Secure JDBC Optional. Database parameters to access databases that are secured with the SSL protocol or
Parameters Azure Key Vault. The parameter string cannot exceed 500 characters. When you save the data
store, Data Privacy Management encrypts the parameter string.
Enter the parameters as name=value pairs separated by semicolon characters (;). You can enter
the following secure database parameters:
SSL Protocol:
- EncryptionMethod. Required. Indicates whether data is encrypted when transmitted over the
network. This parameter must be set to SSL.
- HostNameInCertificate. Optional. Host name of the machine that hosts the secure database. If
you specify a host name, Data Privacy Management validates the host name included in the
connection string against the host name in the SSL certificate.
- TrustStore. Required. Path and file name of the truststore file that contains the SSL certificate
for the database.
- TrustStorePassword. Required. Password for the truststore file.
- KeyStore. Directory of the keystore file.
- KeyStorePassword. Password used to access the keystore file.
- KeyPassword. Password used to access individual keys in the keystore file.
- ValidateServerCertificate. Optional. Set to True or False. If True, Data Privacy Management
validates the certificate that the database server sends. If you specify the
HostNameInCertificate parameter, Data Privacy Management also validates the host name in
the certificate. If False, Data Privacy Management does not validate the certificate that the
database server sends. Data Privacy Management ignores any truststore information that you
specify.
Data Privacy Management appends the secure JDBC parameters to the connection string.
Use the following JDBC parameters if the database uses Azure Key Vault to encrypt the data:
- columnEncryption. Indicates whether the driver is enabled for Always Encrypted functionality
when accessing data from encrypted columns.
- AEKEYSTOREPRINCIPALID. The principal ID used to authenticate against the Azure Key Vault.
Required when you enable Always Encrypted.
- AEKEYSTORECLIENTSECRET. The Client Secret used to authenticate against the Azure Key
Vault. Required when you enable Always Encrypted.
Data Privacy Management appends the secure JDBC parameters to the connection string.
Use DSN Optional for Microsoft SQL Server data stores only. Determines whether Data Privacy Management
uses the Data Source Name for the connection. If enabled, sets the Database and Instance
properties to the database name and server name from the DSN. If you do not select this option,
you must manually enter the database name and server name.
Required if the database is encrypted using Azure Key Vault.
Before you scan the data store encrypted using Azure Key Vault, add the following properties to
the database entry in the $ODBC_HOME/odbc.ini file:
ColumnEncryption=<Enabled>
AEKeystorePrincipalId=<principal ID>
Schema Option Optional for IBM Db2, IBM Db2 for z/OS, Microsoft SQL Server, Oracle, SAP HANA, Sybase, and
Teradata data stores only. Select one of the following options to determine how you add schemas
to the data store:
- All. Includes all schemas that you can access.
- Select From List. Displays a list of schemas in the Schema property.
- Specify Regex. Enables the Regular Expression for Schema property. Select this option if you
want to enter schemas as a regular expression.
Schema Optional for IBM Db2 for i5/OS, JDBC, and Netezza data stores. Enter one or more schemas that
Data Privacy Management reads when you scan the data store. The schema is case-sensitive. Use
a comma to separate multiple entries. If empty, Data Privacy Management reads all schemas from
the database.
Required for IBM Db2, IBM Db2 for z/OS, Microsoft SQL Server, Oracle, SAP HANA, Sybase, and
Teradata data stores if you select Select From List in the Schema Option property.
Regular Required for IBM Db2, IBM Db2 for z/OS, Microsoft SQL Server, Oracle, SAP HANA, Sybase, and
Expression for Teradata data stores if you select Specify Regex in the Schema Option property. Enter a regular
Schema expression to specify schemas that you want to include or exclude from the scan. Schema names
are case-sensitive.
Fetch Views Optional for Teradata data stores only. Select this option to enable the Scan job to retrieve the
Data Types data types in source columns from Teradata views.
Source Metadata Optional. Default is All. A filter to specify tables that you want to include or exclude in the scan.
Filter Enter a regular expression to include multiple filter parameters. For example: Table1, NOT
Table2, SUB%, %EMP, %DATA%. The filter includes Table1, tables with names that start with
SUB, names that end with EMP, and tables that include DATA anywhere in the name. The filter
excludes Table2.
Stored Optional for Microsoft SQL Server data stores only. The names of stored procedures or functions
Procedures, to include in the scan.
Functions
Case Sensitive Optional for IBM Db2, IBM Db2 for z/OS, Microsoft SQL Server, Netezza, Oracle, SAP HANA,
Sybase, and Teradata data stores. Select to indicate that the data store is configured for case
sensitivity. Default is cleared.
Case Sensitivity Required for IBM Db2 for i5/OS data stores only. Select one of the following options: Auto, Case
Insensitive, or Case Sensitive. Default is Auto.
Import Private Optional for Oracle data stores only. Select this option to import Oracle private and public
and Public synonyms for the schemas included in the data store scan.
Synonyms
Import Optional for Microsoft SQL Server data stores only. Select this option to import table synonyms
Synonyms from the source data store.
Import Database Optional for Oracle data stores only. Select to import database names from Enterprise Data
Names Catalog when the scan job runs.
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a
large data store, you do not specify the schemas for a data store, or you scan multiple data stores
in parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values
such as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>.
Increases the scanner container memory when pmem is enabled. Default value is 1.
Auto Assign Optional for Microsoft SQL Server data stores only. Select to automatically assign the schemas in
Connections Enterprise Data Catalog to the data store. Default is cleared.
Enable Optional for Microsoft SQL Server data stores only. Select to extract metadata about resources in
Reference Enterprise Data Catalog that the data store references. Default is cleared.
Resources
Retain Optional for Microsoft SQL Server data stores only. Appears when you select the Enable
Unresolved Reference Resources property. Reference assets are assets from reference resources in
Reference Enterprise Data Catalog, such as reference data sources and reference data sets. Select this
Assets option to keep unresolved reference assets. Default is enabled.
Agent Options Optional for IBM Db2 for i5/OS, JDBC, Microsoft SQL Server, Sybase, and Teradata data stores.
Enter the Agent configuration options to include in the scan.
CyberArk Safe Optional. Name of the CyberArk safe that contains the database password. When you test the data
store connection or you scan the data store, Data Privacy Management retrieves the password
from CyberArk.
Include Views in Optional. Determines whether a Scan job evaluates data in views. If enabled, a Scan job identifies
the Scan sensitive data in views during metadata and data profiling. The Scan job evaluates data in all
views for the schemas specified in the Schema property. If the property is blank, the Scan job
evaluates data in all views of all schemas.
If the data store was imported from a parent data store, such as Informatica PowerCenter, the
Scan job for the parent data store can identify the lineage of sensitive data in views. For example,
you ran an Informatica PowerCenter Scan job that imported a data store that included views. You
ran a Database Scan job on the data store that identified sensitive data in the views. When you run
an Informatica PowerCenter Scan job to identify data proliferation, the Scan job identifies lineage
for the sensitive data in the views.
If disabled, the Scan job does not evaluate data in views. If you disable the property after a Scan
job completes, any scans that you run in the future will not evaluate data in views. However, the
completed Scan job results remain.
Profile Execution Optional. Select one of the following profiling engine modes:
Engine - Hadoop. Data Privacy Management uses the Blaze engine to run profiles.
- Native. Data Privacy Management uses the Informatica Data Integration Service engine to run
profiles.
Default is Native.
Hadoop If you select Hadoop in the Profile Execution Engine property, enter the name of the Hadoop
Connection connection listed in the Administrator tool, Connections tab.
Name
Source Optional. Connection name in the Informatica domain that the scan job uses to connect to the
Connection source database. If the Scan job cannot connect to the source, the job fails.
Name
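For example, connection strings that follow the syntax listed in the Connection String property above might
look like the following for some common data store types. The host, port, instance, server, and database
names are placeholders, and the Oracle value refers to the TNSNAMES entry name:
- Microsoft SQL Server: sqlhost,1433@salesdb
- Microsoft SQL Server named instance: sqlhost\SQLINST1@salesdb
- Oracle: orasas
- Sybase: sybserver@salesdb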
The following table describes the Office 365 Outlook connection properties:
Property Description
Include A filter to include departments. Enter the names of the Office 365 departments that you want to
Departments include in scans. Use commas or press Enter to separate multiple values.
Include Users A filter to include users. Enter the email IDs of users or accounts that you want to scan. Use
commas or press Enter to separate multiple values.
Exclude Users A filter to exclude users. Enter the email IDs of users or accounts to exclude from the scan. Use
commas or press Enter to separate multiple values.
Note: The Exclude Users values take precedence over the Include Users values.
Include Email A filter to include users based on group membership. Enter the group names. Use commas or
Groups press Enter to separate multiple values.
Include Mail A filter to include specific mail folders during a scan. You can enter top-level folder names.
Folders
Exclude Mail A filter to exclude specific mail folders during a scan. You can enter top-level folder names.
Folders
Include Only Emails Choose this option to include only emails with attachments. Scans ignore emails without
with Attachments attachments.
If you select the option, scans include the content body of all emails that have attachments and
the attachments.
If you select the option and specify attachment types, the scan includes the content body of all
emails that have attachments, and the attachment types that you specify.
If you do not select the option, scans include the content body of all emails with and without
attachments and the attachments.
If you do not select the option and you specify attachment types, the scan includes the content
body of all emails with and without attachments, and the attachment types that you specify.
Attachment Types You can choose to include specific types of attachments in a scan.
Choose from the following attachment types:
- Avro
- Compressed Files
- Delimited and Text
- Email
- Image
- JSON
- Microsoft Excel
- Microsoft PowerPoint
- Microsoft Word
- PDF
- Parquet
- Webpage Files
- XML
Note: If you enable Optical Character Recognition (OCR) and you select the PDF file type, PDF
content is processed as optical characters.
To choose the Image file type, OCR must be enabled for unstructured scans in the System
Settings.
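For example, the filter properties described above might contain values similar to the following. The
department names, email addresses, and group name are illustrative:
Include Departments: Finance, Human Resources
Include Users: [email protected], [email protected]
Exclude Users: [email protected]
Include Email Groups: Payroll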
When you test a connection to a file system data source, the Data Privacy Management Service verifies if the
path you enter in the Path property exists on the Domain node.
Property Description
Scan with Select to associate the data store with a remote agent when you run Scan jobs to discover personal
Remote and sensitive data or build a Subject Registry index.
Agent Important: If you select this option, you must associate the data store with a remote agent before
running a scan. After you save the data store, you cannot change this property.
File Protocol Required. Select one of the following file transfer protocols: Local File, SFTP, and SMB/CIFS. Default
is Local File.
User Required if you select SFTP or SMB/CIFS in the File Protocol property. Name of the user account to
access the data store.
Password Required if you select SFTP or SMB/CIFS in the File Protocol property. Password for the user account
that connects to the data store.
Host Required if you select SFTP or SMB/CIFS in the File Protocol property. Host name for the data store
server.
Port Required if you select SFTP in the File Protocol property. Port number for the data store server.
Path Required. The CIFS or NFS mount directory you want to scan. For example: /home/infa/Folder1
If you selected Scan with Remote Agent, make sure that the agent machine and Data Privacy
Management can access the directory.
Note: You must mount the directory on a path present on the Data Privacy Management domain and
on the cluster nodes. Use the CIFS mount option to mount a Windows directory. Use the NFS mount
option to mount a Linux or network directory. Sample mount commands appear after this table.
Folder Required if you enable the Scan with Remote Agent property. Folders that the Scan job reads from the
Options location configured in the Path property. Select one of the following options:
- All
- Select Specific Folders
- Use Regular Expression
Default is All.
Include Appears if you enable the Scan with Remote Agent property and choose the Select Specific
Folders Folders option in the Folder Options property.
Enter the file paths for at least one folder that you want to include in the scan. You can also skip this
property and list folders in the Exclude Folders property. In this case, Data Privacy Management
scans the folders in the data store except the excluded folders.
Exclude Optional if you enable the Scan with Remote Agent property and choose the Select Specific Folders
Folders option in the Folder Options property. Enter the file paths for at least one folder that you want to
exclude from the scan.
Folder Required if you enable the Scan with Remote Agent property and choose the Use Regular Expression
Regular option in the Folder Options property. Enter a regular expression for at least one folder that you want
Expression to include in the scan.
File Types Required. File types that the scan reads from the location configured in the Path property. Select All
to scan all file types. Choose Select to select specific file types to scan.
Default is All.
Selected File Required if you choose Select in the File Types property. Click Select All to choose all file types, or
Types select one or more of the following options:
- Avro
- Compressed Files
- Delimited and Text
- Image
- JSON
- Microsoft Excel
- Microsoft PowerPoint
- Microsoft Word
- PDF
- Parquet
- Webpage Files
- XML
Note: If you enable Optical Character Recognition (OCR) and you select the PDF file type, the PDF file
content is processed as optical characters.
To choose the Image file type, OCR must be enabled for unstructured scans in the System Settings.
Enter File Optional. Available if you do not enable the Scan with Remote Agent property. Character that
Delimiter separates entries in the file. Specify the file delimiter if the file from which you extract metadata uses
a delimiter other than the following list of delimiters:
- Comma (,)
- Horizontal tab (\t)
- Semicolon (;)
- Colon (:)
- Pipe symbol (|)
Verify that you enclose the delimiter in single quotes. For example, '$'. Use a comma to separate
multiple delimiters. For example, '$','%','&'. Default is a comma (,).
First Level Optional. Available if you do not enable the Scan with Remote Agent property. Name of the folder that
Directory you want to scan. For example: subfolderA.
To specify multiple folders, separate the folder names with a comma (,). For example: subfolderA,
subfolderB
Include Sub Optional. Available if you do not enable the Scan with Remote Agent property. If enabled, the scan
Directory reads data from the directory and all subdirectories of the location configured in the Path property or
the First Level Directory property, if specified.
Default is disabled.
Memory Optional. Memory to use when you scan the data store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Hadoop If you select Hadoop in the Profile Execution Engine property, enter the name of the Hadoop
Connection connection listed in the Administrator tool, Connections tab.
Name
Source The connection name in the Informatica domain. Do not enter a value for File Management scans.
Connection
Name
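The directory that you specify in the Path property of a file system data store must be mounted on the Data
Privacy Management domain and cluster nodes, as described above. As an illustration only, commands similar
to the following might be used on Linux to mount a Windows share with CIFS or a network directory with NFS.
The server, share, export, and credential values are placeholders, and the exact mount options depend on
your environment:
mount -t cifs //<windows_server>/<share_name> /home/infa/Folder1 -o username=<user>,password=<password>
mount -t nfs <nfs_server>:/<export_path> /home/infa/Folder1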
Property Description
Project ID Name of the Google Cloud Platform project that you want to access.
Private Key The private key associated with the service account.
Client Email The client email address associated with the service account.
Dataset Option Datasets that the scan job reads from the location configured in the Project ID property. Select one
of the following options:
- All
- Select from list
- Specify Regex
Default is All.
Dataset Available if you select the Select from list option in the Dataset Option property. Select the datasets
that you want to include in the scan from the list.
Dataset Available if you select the Specify Regex option in the Dataset Option property. Enter a regular
Regular expression to specify datasets that you want to include or exclude from the scan. Dataset names
Expression are case-sensitive.
Source You can include or exclude tables and views from the resource run. Enter the filter query as a Java
Metadata Filter regular expression. For example, to include all tables that have CONF or DEMO in the name, enter
the following expression:
.*CONF.*|.*DEMO.*
Memory Specifies the memory required to run the Enterprise Data Catalog scanner job. Select one of the
following values based on the data set size that you plan to create:
- Low
- Medium
- High
Custom JVM Optional. JVM parameters that configure the scanner container. Use the following arguments to
Options configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values
such as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
To scan a Google Drive data store, you must associate the data store with a remote agent on the Remote
Agents workspace after you save the data store. Google Drive data stores do not connect to Enterprise Data
Catalog. When you scan the data store, the remote agent scanner maps sensitive files to data domains and
adds subject data in the subject registry.
Property Description
Scan with Remote Read-only. Selected. Indicates that you will associate the data store with a remote agent to run
Agent the Scan job.
Folder Options Required. Folders that the File Management job reads from the location configured in the Path
property. Select one of the following options:
- All
- Select Specific Folders
- Use Regular Expression
Default is All.
Include Folders Required if you choose the Select Specific Folders option in the Folder Options property. Enter
the file paths for at least one folder that you want to include in the scan.
Exclude Folders Optional if you choose the Select Specific Folders option in the Folder Options property. Enter
the file paths for at least one folder that you want to exclude from the scan.
Folder Regular Required if you choose the Use Regular Expression option in the Folder Options property. Enter a
Expression regular expression for at least one folder that you want to include in the scan.
File Types Required. File types that the File Management job reads from the location configured in the Path
property. Select All to scan all file types. Choose Select to select specific file types to scan.
Default is All.
Selected File Required if you choose Select in the File Types property. Click Select All to choose all file types,
Types or select one or more of the following options:
- Avro
- Compressed Files
- Delimited and Text
- Image
- JSON
- Microsoft Excel
- Microsoft PowerPoint
- Microsoft Word
- PDF
- Parquet
- Webpage Files
- XML
Note: If you enable Optical Character Recognition (OCR) and you select the PDF file type, the PDF
file content is processed as optical characters.
To choose the Image file type, OCR must be enabled for unstructured scans in the System
Settings.
Client Secret Required. The Client Secret from Google Developer Console.
Access Token Required. The Access Token received after you exchange the authorization code from the Google
Authorization Server.
Refresh Token Required. The Refresh Token received after you exchange the authorization code from the Google
Authorization Server.
You can create multiple data stores that connect to HDFS on the same cluster, but the Source Directory
property must be unique for each data store.
When you test the connection to HDFS, Data Privacy Management tests the combination of the Name Node
URI 1, User Name, and Source Directory properties. If any property values are not valid, the connection test
fails.
Property Description
Scan with Select to associate the data store with a remote agent when you run scan jobs to discover personal
Remote and sensitive data or build a Subject Registry index. Default is selected.
Agent Important: If you select this option, you must associate the data store with a remote agent before
running a scan. After you save the data store, you cannot change this property.
Cluster Required. Select one of the following options to provide configuration details about the cluster:
Configuration - Load From Configuration Archive File. You must select this option if you want to specify the
Details Microsoft Azure Data Lake Store storage type. You can select this option for other storage types.
- Provide Configuration Details. Select this option if you want to specify a storage type of
Distributed File System or Windows Azure Blob Storage.
Default is Provide Configuration Details.
Storage Type Required if you select Provide Configuration Details in the Cluster Configuration Details property.
Select one of the following options:
- ABFS. Azure Blob File System.
- DFS. Distributed File System.
- WASB. Windows Azure Blob Storage.
Default is DFS.
Distribution Required if you select Load From Configuration Archive File in the Cluster Configuration Details
Type property. Select one of the following options:
- Amazon Emr
- Azure HDInsight
- Cloudera
- Hortonworks
- IBM BigInsights
- MapR FS
Default is Azure HDInsight.
Azure Storage Account URI: Required if you select ABFS or WASB in the Storage Type property. The fully qualified URI to access
data stored in ABFS or WASB. You can get the URI from the fs.defaultFS property in the
core-site.xml file.
Example: wasb://secureatsource-2017-08-08t06-13-20-522z@secureatsource.blob.core.windows.net
Azure Storage Account Name: Required if you select ABFS or WASB in the Storage Type property. Name of the Microsoft Azure
Storage account to stage the files. You can get the name from the FS URI. The name is the first term
after the forward slash and before the time stamp.
For example, in the following URI, the name of the storage account is dataprivacy: wasb://dataprivacy-2019-08-08t06-13-20-522z@dataprivacy.blob.core.windows.net
Azure Storage Account Key: Required if you select ABFS or WASB in the Storage Type property. Key to access the Microsoft Azure
Storage account. Use the decrypted version of the key. To get the decrypted version of the key,
perform the following steps:
1. Get the key from the fs.azure.account.key.secureatsource.blob.core.windows.net
property in the core-site.xml file. The value is the encrypted version of the key.
2. Get the file path of the decrypt script from the fs.azure.shellkeyprovider.script property
in the core-site.xml file.
Example of the decrypt script file path: /usr/lib/hdinsight-common/scripts/decrypt.sh
3. Navigate to the decrypt script file path.
4. Run the decrypt.sh file with the encrypted key as an argument. The value returned is the
decrypted version of the key.
5. Enter the decrypted version of the key.
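For example, on an HDInsight head node you might run the following commands, where <encrypted key> is the value that you copied in step 1 and the script path is the example path from step 2:
cd /usr/lib/hdinsight-common/scripts
./decrypt.sh '<encrypted key>'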
Name Node URI 1: Required if you select Provide Configuration Details in the Cluster Configuration Details property.
URI to the active HDFS NameNode. The active HDFS NameNode manages all the client operations in
the cluster. Use the following format to specify the NameNode URI: hdfs://<NameNode>:<port>
- <NameNode> is the host name or IP address of the NameNode.
- <port> is the port on which the NameNode listens for remote procedure calls (RPC).
Use the same value that is configured for the property fs.defaultFS in the following file on the
Hadoop cluster: core-site.xml
If another data store has the same value, Data Privacy Management identifies the data store as a
potential duplicate. If blank, the data store has a status of Not Complete. You cannot scan
incomplete data stores.
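For example, if core-site.xml on the cluster contains the following entry (the host name and port are illustrative; 8020 is a common NameNode RPC port), enter hdfs://namenode01.example.com:8020:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode01.example.com:8020</value>
</property>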
HA Cluster Optional. Appears if you select Provide Configuration Details in the Cluster Configuration Details
property. Indicates whether the HDFS cluster is highly available (HA). Select Yes or No. Default is
Yes.
Name Node URI 2: Required if you select Yes in the HA Cluster property. URI to access the secondary NameNode for
HDFS. Use the following syntax: hdfs://<SecondaryNameNode>:<port>
- <SecondaryNameNode> is the host name or IP address of the Secondary NameNode.
- <port> is the port on which the NameNode listens for remote procedure calls (RPC).
You can find the host name, IP address, and port in the dfs.namenode.rpc-address.<ClusterName>.<Name_NodeID>
property in the hdfs-site.xml file on the Hadoop cluster.
The dfs.ha.namenodes.<ClusterName> property includes a list of the NameNode IDs. In Ambari, you
can get the host name or IP address from the Standby NameNode property for HDFS.
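For example, if dfs.ha.namenodes.mycluster lists nn1,nn2 (the cluster name and NameNode IDs are illustrative), the secondary NameNode address might come from an hdfs-site.xml entry such as the following, and you would enter hdfs://namenode02.example.com:8020:
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>namenode02.example.com:8020</value>
</property>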
HDFS Service Required if you select Yes in the HA Cluster property. Logical name of the NameNode cluster. Use the
Name same value that is configured for property dfs.nameservices in file hdfs-site.xml on the
Hadoop cluster.
User Name/User Principal: Required. User that connects to the Hadoop cluster. The user must have read permission to the
directory configured in the Source Directory property and to all files in the directory. For highly
available clusters, specify the User Name. For Kerberos-enabled clusters, specify the User Principal.
Source Required. Directory path that contains the files to scan. To scan all root directories, enter a forward
Directory slash /. To scan a specific directory, use the following syntax: /<Directory_Name>
For example, if the directory name is Folder1, enter /Folder1. Do not include a forward slash / at
the end of the directory. The scan job reads files from the last directory specified. For example, if you
enter /HumanResource/Employees/US, the scan job only reads files from the US directory.
If blank, the data store has a status of Not Complete. You cannot scan incomplete data stores.
HDFS Required. Select Yes to indicate that transparent encryption is enabled for HDFS. Select No to
Transparent indicate that transparent encryption is not enabled for HDFS.
Encryption
Key Management Server Provider URI: Required if you select Yes in the HDFS Transparent Encryption property. The fully qualified URI to the
Key Management Server key provider. Used to interact with encryption keys while reading and writing
to an encryption zone.
You can get the URI from the dfs.encryption.key.provider.uri property in the hdfs-site.xml file.
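For example, a dfs.encryption.key.provider.uri value might take the following form, where the KMS host name and port are illustrative:
kms://http@kmshost.example.com:9600/kms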
Kerberos Optional. Select Yes to indicate that the cluster uses Kerberos authentication. Select No to indicate
Cluster that the cluster does not use Kerberos authentication. Default is No.
Hadoop RPC Protection: Appears if you select Yes in the Kerberos Cluster property. Security infrastructure for Hadoop Remote
Procedure Call (RPC). Select one of the following options:
- Authentication. Authentication only. The client and server mutually authenticate during connection
setup.
- Integrity. Authentication and integrity. Guarantees the integrity of data exchanged between the
client and server and the integrity of authentication.
- Privacy. Authentication, integrity, and confidentiality. Guarantees that data exchanged between the
client and server is encrypted.
Default is Authentication.
HDFS Service Required if you select Yes in the Kerberos Cluster property. Service Principal Name (SPN) of the
Principal HDFS service. If blank, the Scan job fails.
KeyTab File Required if you select Yes in the Kerberos Cluster property. Path and file name of the Service
Principal Name (SPN) keytab file for the user account to impersonate when connecting to the
Kerberos-enabled cluster.
The keytab file must be generated from the Kerberos-enabled cluster and copied to a directory on the
machine where the Data Privacy Management Service runs. For example: /<Data Privacy
Management installation directory>/hdfskeytab/admin.keytab
When you save the data store, the Data Privacy Management Service encrypts the value. If blank, the
Scan job fails.
Cluster Configuration ID: Optional. Identifier for the nodes in the cluster. To get the cluster configuration ID, see
the value for the ID property of the cluster configured in the Administrator tool. Enter the ID in
lower case.
Folder Required if you enable the Scan with Remote Agent property. Folders that the scan job reads from the
Options location configured in the Source Directory property. Select one of the following options:
- All
- Select Specific Folders
- Use Regular Expression
Default is All.
Include Required if you enable the Scan with Remote Agent property and choose the Select Specific Folders
Folders option in the Folder Options property. Enter the file paths for at least one folder that you want to
include in the scan.
Exclude Optional if you enable the Scan with Remote Agent property and choose the Select Specific Folders
Folders option in the Folder Options property. Enter the file paths for at least one folder that you want to
exclude from the scan.
Folder Regular Expression: Required if you enable the Scan with Remote Agent property and choose the Use Regular Expression
option in the Folder Options property. Enter a regular expression for at least one folder that you want
to include in the scan.
File Types Required. File types that the scan job reads from the location configured in the Source Directory
property. Select All to scan all file types. Choose Select to select specific file types to scan.
Default is All.
Selected File Required if you choose Select in the File Types property. Click Select All to choose all file types, or
Types select one or more of the following options:
- Avro
- Compressed Files
- Delimited and Text
- Image
- JSON
- Microsoft Excel
- Microsoft PowerPoint
- Microsoft Word
- PDF
- Parquet
- Webpage Files
- XML
Note: If you enable Optical Character Recognition (OCR) and you select the PDF file type, the PDF file
content is processed as optical characters.
To choose the Image file type, OCR must be enabled for unstructured scans in the System Settings.
Enter File Optional. Available if you do not enable the Scan with Remote Agent property. Character that
Delimiter separates entries in the file. Specify the file delimiter if the file from which you extract metadata uses
a delimiter other than the following list of delimiters:
- Comma (,)
- Horizontal tab (\t)
- Semicolon (;)
- Colon (:)
- Pipe symbol (|)
Verify that you enclose the delimiter in single quotes. For example, '$'. Use a comma to separate
multiple delimiters. For example, '$','%','&'. Default is a comma (,).
First Level Optional. Available if you do not enable the Scan with Remote Agent property. Name of the folder that
Directory you want to scan. For example: subfolderA.
To specify multiple folders, separate the folder names with a comma (,). For example: subfolderA,
subfolderB
Include Sub Optional. Available if you do not enable the Scan with Remote Agent property. If enabled, the scan
Directory reads data from the directory and all subdirectories of the location configured in the Source Directory
property or the First Level Directory property, if specified.
Default is disabled.
Memory Optional. Available if you do not enable the Scan with Remote Agent property. Memory that the
Enterprise Data Catalog scanner uses when you run a scan on the data store. Select High, Low, or
Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. Available if you do not enable the Scan with Remote Agent property. JVM parameters that
configure the scanner container. Use the following arguments to configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Source Optional. Available if you do not enable the Scan with Remote Agent property. The connection name
Connection in the Informatica domain that the Scan job uses to connect to HDFS. To determine the name, go to
Name the Administrator tool and click the cluster configuration object that created the HDFS connections
automatically. The name of the HDFS connection is the source connection name. For example:
HDFS_cco
If you create multiple data stores that connect to the same cluster, you can use the same connection
name for all data stores.
You must enter a value for Source Connection Name if you selected the Load from Configuration
Archive File option in the Cluster Configuration Details property.
When you test a data store connection, the Data Privacy Management Service tests the combination of the
Hadoop Distribution and JDBC URL properties. When Kerberos is used for Hive authentication, the Data
Privacy Management Service also tests the Keytab File, Kerberos Configuration File Path, and User Proxy
properties.
When you save a data store, the Data Privacy Management Service uses the host and port specified in the
URL and User properties to identify duplicate data stores.
For Cloudera Hive data stores that you want to protect with encryption extensions, rules, and keys, Data Privacy
Management does not perform Scan jobs. After you configure the connection to a Cloudera Hive data store
for encryption, instead of running a scan, you import Catalog metadata and protection status for the data
store on the Data Stores workspace.
Property Description
MapR Home Required if you select MapR in the Hadoop Distribution property. The file path for the MapR home
directory for the specified user.
URL Required. JDBC connection URL used to access the Hadoop server. Use one of the following
formats:
- Non-Kerberos source: jdbc:hive2://<host>:<port>/<DB>
- Kerberos source: jdbc:hive2://<host>:<port>/<DB>;principal=<serviceName>/
<host>@realm
If blank, the data store status is Not Complete. You cannot scan incomplete data stores.
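For example (the host, port, database, and realm names are illustrative; 10000 is the default HiveServer2 port):
- Non-Kerberos: jdbc:hive2://hive01.example.com:10000/sales
- Kerberos: jdbc:hive2://hive01.example.com:10000/sales;principal=hive/[email protected]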
User Optional. The Hive user name for the data store connection. The Big Data Scan job requires a user
account that has read, write, and execute privileges. If the user does not have the required
privileges, the Scan job fails.
Kerberos Required for Hive sources that use Kerberos authentication. Common path of the krb5.conf file
Configuration located in both the domain and the Hadoop cluster.
File
Keytab File Required for Hive sources that use Kerberos authentication. Common path of the keytab file
located in both the domain and the Hadoop cluster.
User Proxy Required for Hive sources that use Kerberos authentication. Specifies a user proxy setting.
Include Views in the Scan: Optional. Determines whether a Scan job evaluates data in views. If enabled, a Scan job identifies
sensitive data in views during metadata and data profiling. The Scan job evaluates data in all views
for the schemas specified in the Schema property. If the property is blank, the Scan job evaluates
data in all views of all schemas.
The Cloudera Navigator Scan job can identify the lineage of sensitive data in views. For example,
you ran a Hadoop Scan job on the data store that identified sensitive data in views. When you run a
Cloudera Navigator Scan job to identify data proliferation, the Scan job identifies the lineage for the
sensitive data in views.
If disabled, the Scan job does not evaluate data in views. If you disable the property after a Scan
job completes, any scans that you run in the future will not evaluate data in views. However, the
completed Scan job results remain.
Server Name Optional. The symbolic name for the service that enables Sentry authorization for the Hive service.
for Sentry This property appears if you select Cloudera in the Hadoop Distribution property. To get the server
Authorization name for Sentry authorization, perform the following steps:
1. In Cloudera Manager, go to the Hive cluster.
2. Click Configuration in the header menu.
3. In the Filters pane, expand Scope and select Hive (Service-Wide).
4. Expand Category and select Advanced. The name appears in the Server Name for Sentry
Authorization field.
Ranger Service Name of the service configured in Ranger Service Manager for the Hive service properties. This
Name property appears if you select Hortonworks in the Hadoop Distribution property.
Cluster Configuration ID: Optional. Identifier for the nodes in the cluster. To get the cluster configuration ID, see
the value for the ID property of the cluster configured in the Administrator tool. Enter the ID in
lower case.
Schema Option Optional. Select one of the following options to determine how you add schemas to the data store:
- All. Includes all schemas that you can access.
- Select From List. Displays a list of schemas in the Schema property.
- Specify Regex. Enables the Regular Expression for Schema property. Select this option to enter
schemas as a regular expression.
Schema Displays a list of available schemas to include in the scan if you select Select From List in the
Schema Option property.
A Hadoop Scan job can run metadata or data profiling on multiple schemas in the same data store.
However, a Cloudera Navigator Scan job determines lineage of sensitive data between Hive data
stores. If one Hive data store contains multiple schemas and sensitive data moves between the
schemas, the Cloudera Navigator Scan job cannot show sensitive data lineage between schemas in
the same data store. To show complete lineage between each schema, create one Hive data store
for each schema.
Regular Expression for Schema: Displays if you select Specify Regex in the Schema Option property. Enter a regular expression to
specify schemas that you want to include or exclude from the scan. Schema names are case-sensitive.
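For example, to include only schemas whose names begin with HR or FINANCE (the schema names are illustrative), you might enter:
^(HR|FINANCE).*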
Table Specifies a Hive table to scan. Enter one table or leave blank to scan all tables. If you enter more
than one table, the Scan job fails.
SerDe JARs List Specifies a list of fully-qualified path names, separated by semicolons, to the SerDe JAR files that
the bridge uses to run remotely on the Hive system. If the Hive database contains customized
tables, you must provide a custom SerDe JAR file list that will be used to serialize and deserialize
the customized tables.
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a
large data store, you do not specify the schemas for a data store, or you scan multiple data stores
in parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values
such as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Auto Assign Optional. Select to automatically assign the schemas in Enterprise Data Catalog to the data store.
Connections Default is cleared.
Enable Optional. Select to extract metadata about resources in Enterprise Data Catalog that the data store
Reference references. Default is cleared.
Resources
Retain Optional. Appears when you select the Enable Reference Resources property. Reference assets are
Unresolved assets from reference resources in Enterprise Data Catalog, such as reference data sources and
Reference reference data sets. Select this option to keep unresolved reference assets. Default is enabled.
Assets
Hadoop If you select Hadoop in the Profile Execution Engine property, enter the name of the Hadoop
Connection connection listed in the Administrator tool, Connections tab. The Hadoop connection is created in
Name the Administrator tool when you create a cluster configuration. Enter the name of the Hadoop
connection that corresponds to the cluster configuration name.
You must enter the Hadoop connection name for Cloudera Hive data stores that you want to protect
with encryption extensions and rules.
Hive If you select Hive in the Profile Execution Engine property, enter the name of the Hive connection.
Connection You can find the Hive connection name in the Administrator tool, Connections tab.
Name
Source The Hive connection name in the Informatica domain that Data Privacy Management uses to
Connection connect to Hive.
Name Run the Hadoop Configuration Manager to create the Hive connection in the Informatica domain.
Specify the connection name that was configured in the utility. If you create multiple data stores
that connect to the same cluster, you can use the same connection for all data stores.
If blank, Data Privacy Management creates a Hive connection in the Informatica domain and uses
the user name specified in the data store connection properties.
When you test a connection to Informatica Cloud, Data Privacy Management tests the combination of the
URL, User Name, and Password properties.
Property Description
Cloud URL Required. Base URL of the Informatica Cloud server. For example, https://sts1.company.com/
User Name Required. User name for the Informatica Cloud server.
Detailed Lineage: Optional. Select this option to enable scanning the complete data store lineage in Enterprise Data
Catalog. Default is disabled.
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Enable Optional. Select to extract metadata about resources in Enterprise Data Catalog that the data store
Reference references. Default is cleared.
Resources
Retain Optional. Appears when you select the Enable Reference Resources property. Reference assets are
Unresolved assets from reference resources in Enterprise Data Catalog, such as reference data sources and
Reference reference data sets. Select this option to keep unresolved reference assets. Default is enabled.
Assets
The following table describes the Informatica Data Engineering Integration connection properties:
Property Description
Execute On Required. Select a server where the Data Engineering Integration data store runs. Default is Default
Catalog Server.
Target Version Required. Select one of the following Data Engineering Integration product versions:
- 10.0
- 10.1
- 10.1.1
- 10.1.1.HF1
- 10.2
- 10.2.1
- 10.2.2
- 10.2.2.HF1
- 10.2.HF1
- 10.2 HF2
- 10.4.0
- 10.4.1
- 10.5.0
- 9.6.1.HF3
- 9.6.1.HF4
Default is 9.6.1.HF3.
Note: For Snowflake databases, select 10.4.0 or later. On versions earlier than 10.4.0, lineage
information is not extracted.
MRS Domain Required. Name of the Informatica domain on which the Model Repository Service runs.
Name
MRS User Name Required. User name to access the Model Repository Service database.
MRS Password Required. Password to access the Model Repository Service database.
MRS Security Optional. Name of the security domain to which the Model Repository Service user belongs.
Domain Default is Native.
MRS Domain Required. Host name or IP address of the machine that hosts the Model Repository Service
Host database.
MRS Domain Required. Port number of the machine that hosts the Model Repository Service database.
Port
Os Profile Name Optional. Name of the operating system profile if the Data Integration Service is enabled to use
operating system profiles.
Param Set For Mappings in Application: Optional. Name of the parameter set in the Model Repository that is mapped to the application
specified in the property.
Select Required. Select None, Parameter Set, or Parameter File. Default is None.
Parameter Set
or Parameter
File
Parameter Sets Optional. Name of parameter sets that run mappings and workflows when the associated
application is deployed.
Parameter Sets Optional. Name of the file that lists parameter sets.
File
Parameter File Optional. Name of the XML file that lists user-defined parameters and their assigned values.
Node Name Optional. Name of the node on which the Model Repository Service runs.
MRS Node Host Optional. Host name or IP address of the node on which the Model Repository Service runs.
Name
MRS Node Port Optional. Port number of the node on which the Model Repository Service runs.
MRS is SSL Optional. If selected, indicates that the Model Repository Service uses the Secure Sockets Layer
Enabled security protocol.
Detailed Lineage Optional. If you selected 10.2.1 or a later version in the Target Version property, you can select
(Applicable for this option to scan the complete data store lineage in Enterprise Data Catalog. Default is disabled.
Target version
10.2.1 onwards)
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Medium.
You might want to increase the scanner memory in the following example scenarios: you scan a
large data store, you do not specify the schemas for a data store, or you scan multiple data stores
in parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values
such as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Enable Optional. For Data Engineering Integration version 10.4.0 or later, select this option to extract
Reference metadata about resources in Enterprise Data Catalog that the data store references. Default is
Resources cleared.
Retain Optional. Appears when you select the Enable Reference Resources property. Reference assets
Unresolved are assets from reference resources in Enterprise Data Catalog, such as reference data sources
Reference and reference data sets. Select this option to keep unresolved reference assets. Default is
Assets selected.
Preferred DIS Optional. Select a preferred Data Integration Service from the list.
Source Optional. The connection name in the Informatica domain that the Scan job uses to connect to
Connection HDFS. Use the same name that was configured as the connection name in the Administrator tool.
Name If you create multiple data stores that connect to the same cluster, you can use the same
connection name for all data stores.
If blank, the Scan job creates a connection to HDFS on a cluster that is not highly available or
Kerberos-enabled. The connection might not work for a highly available or Kerberos-enabled
cluster. If the Scan job cannot connect to the source, the job fails.
Configure connection details for the PowerCenter repository that includes workflows to extract, load, and
transform data between databases. When you run an Informatica PowerCenter scan, the Scan job identifies
sensitive data, sensitive data protection, and sensitive data proliferation for columns in PowerCenter
workflows. Do not confuse with the PowerCenter repository that exists on the same domain as the Data
Privacy Management Service.
The following table describes the Informatica PowerCenter on IBM Db2 connection properties:
Property Description
Gateway Host Required. Host name or IP address of the machine that hosts the PowerCenter repository database.
Name or
Address
Gateway Port Required. Port number of the machine that hosts the PowerCenter repository database.
Number
Informatica Optional. A collection of user accounts and groups in the Informatica domain. Enter the LDAP
Security security domain name, if one exists. Otherwise, enter: Native. Default is Native.
Domain
Repository User Required. User that accesses the PowerCenter repository. The user must have PowerCenter
Name Repository Service privileges for the folders, runtime objects, sources and targets, and design
objects privilege groups.
Repository User Required. Password for the PowerCenter repository user. When you save the data store, Data
Password Privacy Management encrypts the password.
Connection Optional. Native connection string the PowerCenter Repository Service uses to access the
String PowerCenter repository database. The connection string cannot exceed 32 characters. Enter the
connection string using the following syntax: <database_name>
Secure JDBC Required database parameters to access databases that are secured with the SSL protocol. The
Parameters parameter string cannot exceed 500 characters. When you save the data store, Data Privacy
Management encrypts the parameter string.
Enter the parameters as name=value pairs separated by semicolon characters (;). You can enter
the following secure database parameters:
- EncryptionMethod. Required. Indicates whether data is encrypted when transmitted over the
network. This parameter must be set to SSL.
- HostNameInCertificate. Optional. Host name of the machine that hosts the secure database. If
you specify a host name, Data Privacy Management validates the host name included in the
connection string against the host name in the SSL certificate.
- TrustStore. Required. Path and file name of the truststore file that contains the SSL certificate
for the database.
- TrustStorePassword. Required. Password for the truststore file.
- KeyStore. Directory of the keystore file.
- KeyStorePassword. Password used to access the keystore file.
- KeyPassword. Password used to access individual keys in the keystore file.
- ValidateServerCertificate. Optional. Set to True or False. If True, Data Privacy Management
validates the certificate that the database server sends. If you specify the
HostNameInCertificate parameter, Data Privacy Management also validates the host name in the
certificate. If False, Data Privacy Management does not validate the certificate that the database
server sends. Data Privacy Management ignores any truststore information that you specify.
Data Privacy Management appends the secure JDBC parameters to the connection string.
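For example, a minimal parameter string might look like the following, where the truststore path and password are illustrative:
EncryptionMethod=SSL;TrustStore=/opt/ssl/db2_truststore.jks;TrustStorePassword=<password>;ValidateServerCertificate=True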
Database Host Optional. Database host name for the PowerCenter repository service.
Database Port Optional. Database port number for the PowerCenter repository service.
Database User Optional. User name for the PowerCenter repository database. The user account must have the
CREATE_VIEW privilege. If you use a read-only user, the user account must have the
CREATE_SYNONYM privilege.
Database Optional. Password for the PowerCenter repository database user. When you save the data store,
Password Data Privacy Management encrypts the password.
Informatica Optional. Name of the Informatica domain on which the PowerCenter Repository Service runs.
Domain
PCIS Name Optional. Name of the PowerCenter Integration Service (PCIS) that is assigned to the PowerCenter
repository.
PowerCenter Optional. Name of the folder that Data Privacy Management creates in the PowerCenter repository.
Folder Name The folder includes a workflow that reads connection properties from database configuration files
and PowerCenter parameter files. Data Privacy Management requires the workflow to import the
properties for all connections in the PowerCenter repository.
Site Key Full directory path of the encryption key for the Informatica domain on which the PowerCenter
Location Repository Service runs. The encryption key is stored in a file named siteKey. When you scan a
PowerCenter repository, the Scan job imports connections and connection properties. The Scan job
creates a data store in the Data Privacy Management repository for each connection in the
PowerCenter repository.
To read encrypted passwords in the PowerCenter repository, Data Privacy Management uses the
encryption key for the Informatica domain on which the PowerCenter Repository Service runs. To
encrypt the passwords in the Data Privacy Management repository, Data Privacy Management uses
the encryption key for the Informatica domain on which the Data Privacy Management Service runs.
Required for PowerCenter repositories on Informatica version 9.6x or later. If the encryption key is
not on the machine that hosts the Data Privacy Management installation, create a mount point. If
you do not configure the encryption key location, Data Privacy Management cannot read the
passwords from the PowerCenter repository. After you scan the PowerCenter repository, you must
manually add the passwords for all created data stores.
Not required for PowerCenter repositories on Informatica version 9.5x or earlier. By default, Data
Privacy Management uses a static key to read the passwords in the PowerCenter repository.
PCIS OS Profile Optional. Name of the operating system (OS) profile for the PowerCenter Integration Service
(PCIS). An operating system profile is a type of security that the PowerCenter Integration Service
uses to isolate the runtime user environment.
Database Optional. Name of the database that hosts the PowerCenter repository.
Parameter File Optional. Select a database configuration file that contains parameters to use when the data store
scan runs.
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a
large data store, you do not specify the schemas for a data store, or you scan multiple data stores
in parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values
such as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Enable Optional. Select this option to extract metadata about resources in Enterprise Data Catalog that the
Reference data store references. Default is cleared.
Resources
Retain Optional. Appears when you select the Enable Reference Resources property. Reference assets are
Unresolved assets from reference resources in Enterprise Data Catalog, such as reference data sources and
Reference reference data sets. Select this option to keep unresolved reference assets. Default is selected.
Assets
Configure connection details for the PowerCenter repository that includes workflows to extract, load, and
transform data between databases. When you run an Informatica PowerCenter scan, the Scan job identifies
sensitive data, sensitive data protection, and sensitive data proliferation for columns in PowerCenter
workflows. Do not confuse this repository with the PowerCenter repository that exists on the same domain as the Data
Privacy Management Service.
The following table describes the Informatica PowerCenter on Microsoft SQL Server connection properties:
Property Description
Gateway Host Required. Host name or IP address of the machine that hosts the PowerCenter repository
Name or database.
Address
Gateway Port Required. Port number of the machine that hosts the PowerCenter repository database.
Number
Informatica Optional. A collection of user accounts and groups in the Informatica domain. Enter the LDAP
Security Domain security domain name, if one exists. Otherwise, enter: Native. Default is Native.
Repository User Required. User that accesses the PowerCenter repository. The user must have PowerCenter
Name Repository Service privileges for the folders, runtime objects, sources and targets, and design
objects privilege groups.
Repository User Required. Password for the PowerCenter repository user. When you save the data store, Data
Password Privacy Management encrypts the password.
Connection Optional. Native connection string the PowerCenter Repository Service uses to access the
String PowerCenter repository database. The connection string cannot exceed 32 characters. Enter the
connection string in the following syntax: <server_name>@<database_name>
Secure JDBC Required database parameters to access databases that are secured with the SSL protocol. The
Parameters parameter string cannot exceed 500 characters. When you save the data store, Data Privacy
Management encrypts the parameter string.
Enter the parameters as name=value pairs separated by semicolon characters (;). You can enter
the following secure database parameters:
- EncryptionMethod. Required. Indicates whether data is encrypted when transmitted over the
network. This parameter must be set to SSL.
- HostNameInCertificate. Optional. Host name of the machine that hosts the secure database. If
you specify a host name, Data Privacy Management validates the host name included in the
connection string against the host name in the SSL certificate.
- TrustStore. Required. Path and file name of the truststore file that contains the SSL certificate
for the database.
- TrustStorePassword. Required. Password for the truststore file.
- KeyStore. Directory of the keystore file.
- KeyStorePassword. Password used to access the keystore file.
- KeyPassword. Password used to access individual keys in the keystore file.
- ValidateServerCertificate. Optional. Set to True or False. If True, Data Privacy Management
validates the certificate that the database server sends. If you specify the
HostNameInCertificate parameter, Data Privacy Management also validates the host name in the
certificate. If False, Data Privacy Management does not validate the certificate that the database
server sends. Data Privacy Management ignores any truststore information that you specify.
Data Privacy Management appends the secure JDBC parameters to the connection string.
Database Host Optional. Database host name for the PowerCenter repository service.
Database Port Optional. Database port number for the PowerCenter repository service.
Database User Optional. User name for the PowerCenter repository database. The user account must have the
CREATE_VIEW privilege. If you use a read-only user, the user account must have the
CREATE_SYNONYM privilege.
Database Optional. Password for the PowerCenter repository database user. When you save the data store,
Password Data Privacy Management encrypts the password.
Informatica Optional. Name of the Informatica domain on which the PowerCenter Repository Service runs.
Domain
PCIS Name Optional. Name of the PowerCenter Integration Service (PCIS) that is assigned to the PowerCenter
repository.
PowerCenter Optional. Name of the folder that Data Privacy Management creates in the PowerCenter repository.
Folder Name The folder includes a workflow that reads connection properties from database configuration files
and PowerCenter parameter files. Data Privacy Management requires the workflow to import the
properties for all connections in the PowerCenter repository.
Site Key Full directory path of the encryption key for the Informatica domain on which the PowerCenter
Location Repository Service runs. The encryption key is stored in a file named siteKey. When you scan a
PowerCenter repository, the Scan job imports connections and connection properties. The Scan job
creates a data store in the Data Privacy Management repository for each connection in the
PowerCenter repository.
To read encrypted passwords in the PowerCenter repository, Data Privacy Management uses the
encryption key for the Informatica domain on which the PowerCenter Repository Service runs. To
encrypt the passwords in the Data Privacy Management repository, Data Privacy Management uses
the encryption key for the Informatica domain on which the Data Privacy Management Service runs.
Required for PowerCenter repositories on Informatica version 9.6x or later. If the encryption key is
not on the machine that hosts the Data Privacy Management installation, create a mount point. If
you do not configure the encryption key location, Data Privacy Management cannot read the
passwords from the PowerCenter repository.
If you select PowerCenter version 10.5 and do not enter a site key location, the scan fails.
PCIS OS Profile Optional. Name of the operating system (OS) profile for the PowerCenter Integration Service
(PCIS). An operating system profile is a type of security that the PowerCenter Integration Service
uses to isolate the runtime user environment.
Database Optional. Name of the database that hosts the PowerCenter repository.
Parameter File Optional. Select a database configuration file that contains parameters to use when the data store
scan runs.
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a
large data store, you do not specify the schemas for a data store, or you scan multiple data stores
in parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values
such as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Enable Optional. Select this option to extract metadata about resources in Enterprise Data Catalog that the
Reference data store references. Default is cleared.
Resources
Retain Optional. Appears when you select the Enable Reference Resources property. Reference assets are
Unresolved assets from reference resources in Enterprise Data Catalog, such as reference data sources and
Reference reference data sets. Select this option to keep unresolved reference assets. Default is selected.
Assets
Configure connection details for the PowerCenter repository that includes workflows to extract, load, and
transform data between databases. When you run an Informatica PowerCenter scan, the Scan job identifies
sensitive data, sensitive data protection, and sensitive data proliferation for columns in PowerCenter
workflows. Do not confuse this repository with the PowerCenter repository that exists on the same domain as the Data
Privacy Management Service.
The following table describes the Informatica PowerCenter on Oracle connection properties:
Property Description
Gateway Host Required. Host name or IP address of the machine that hosts the PowerCenter repository database.
Name or
Address
Gateway Port Required. Port number of the machine that hosts the PowerCenter repository database.
Number
Informatica Optional. A collection of user accounts and groups in the Informatica domain. Enter the LDAP
Security security domain name, if one exists. Otherwise, enter: Native. Default is Native.
Domain
Repository User Required. User that accesses the PowerCenter repository. The user must have PowerCenter
Name Repository Service privileges for the folders, runtime objects, sources and targets, and design
objects privilege groups.
Repository User Required. Password for the PowerCenter repository user. When you save the data store, Data
Password Privacy Management encrypts the password.
Connection Optional. Native connection string the PowerCenter Repository Service uses to access the
String PowerCenter repository database. The connection string cannot exceed 32 characters. Enter the
connection string using the following syntax: <database_name>
Secure JDBC Required database parameters to access databases that are secured with the SSL protocol. The
Parameters parameter string cannot exceed 500 characters. When you save the data store, Data Privacy
Management encrypts the parameter string.
Enter the parameters as name=value pairs separated by semicolon characters (;). You can enter
the following secure database parameters:
- EncryptionMethod. Required. Indicates whether data is encrypted when transmitted over the
network. This parameter must be set to SSL.
- HostNameInCertificate. Optional. Host name of the machine that hosts the secure database. If
you specify a host name, Data Privacy Management validates the host name included in the
connection string against the host name in the SSL certificate.
- TrustStore. Required. Path and file name of the truststore file that contains the SSL certificate
for the database.
- TrustStorePassword. Required. Password for the truststore file.
- KeyStore. Directory of the keystore file.
- KeyStorePassword. Password used to access the keystore file.
- KeyPassword. Password used to access individual keys in the keystore file.
- ValidateServerCertificate. Optional. Set to True or False. If True, Data Privacy Management
validates the certificate that the database server sends. If you specify the
HostNameInCertificate parameter, Data Privacy Management also validates the host name in the
certificate. If False, Data Privacy Management does not validate the certificate that the database
server sends. Data Privacy Management ignores any truststore information that you specify.
Data Privacy Management appends the secure JDBC parameters to the connection string.
Database Host Optional. Database host name for the PowerCenter repository service.
Database Port Optional. Database port number for the PowerCenter repository service.
Database Optional. TNS service name for the Oracle database in the PowerCenter repository.
Service
Database User Optional. User name for the PowerCenter repository database. The user account must have the
CREATE_VIEW privilege. If you use a read-only user, the user account must have the
CREATE_SYNONYM privilege.
Database Optional. Password for the PowerCenter repository database user. When you save the data store,
Password Data Privacy Management encrypts the password.
Informatica Optional. Name of the Informatica domain on which the PowerCenter Repository Service runs.
Domain
PCIS Name Optional. Name of the PowerCenter Integration Service (PCIS) that is assigned to the PowerCenter
repository.
PowerCenter Optional. Name of the folder that Data Privacy Management creates in the PowerCenter repository.
Folder Name The folder includes a workflow that reads connection properties from database configuration files
and PowerCenter parameter files. Data Privacy Management requires the workflow to import the
properties for all connections in the PowerCenter repository.
Site Key Full directory path of the encryption key for the Informatica domain on which the PowerCenter
Location Repository Service runs. The encryption key is stored in a file named siteKey. When you scan a
PowerCenter repository, the Scan job imports connections and connection properties. The Scan job
creates a data store in the Data Privacy Management repository for each connection in the
PowerCenter repository.
To read encrypted passwords in the PowerCenter repository, Data Privacy Management uses the
encryption key for the Informatica domain on which the PowerCenter Repository Service runs. To
encrypt the passwords in the Data Privacy Management repository, Data Privacy Management uses
the encryption key for the Informatica domain on which the Data Privacy Management Service runs.
Required for PowerCenter repositories on Informatica version 9.6x or later. If the encryption key is
not on the machine that hosts the Data Privacy Management installation, create a mount point. If
you do not configure the encryption key location, Data Privacy Management cannot read the
passwords from the PowerCenter repository. After you scan the PowerCenter repository, you must
manually add the passwords for all created data stores.
Not required for PowerCenter repositories on Informatica version 9.5x or earlier. By default, Data
Privacy Management uses a static key to read the passwords in the PowerCenter repository.
PCIS OS Profile Optional. Name of the operating system (OS) profile for the PowerCenter Integration Service
(PCIS). An operating system profile is a type of security that the PowerCenter Integration Service
uses to isolate the runtime user environment.
Parameter File Optional. Select a database configuration file that contains parameters to use when the data store
scan runs.
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a
large data store, you do not specify the schemas for a data store, or you scan multiple data stores
in parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values
such as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Enable Optional. Select this option to extract metadata about resources in Enterprise Data Catalog that the
Reference data store references. Default is cleared.
Resources
Retain Optional. Appears when you select the Enable Reference Resources property. Reference assets are
Unresolved assets from reference resources in Enterprise Data Catalog, such as reference data sources and
Reference reference data sets. Select this option to keep unresolved reference assets. Default is selected.
Assets
When you test the data store connection, the Data Privacy Management Service tests the combination of the
Blob Endpoint URL and Container Name properties. If any of the property values are not valid, the test
connection fails.
The following table describes the Microsoft Azure Blob Storage connection properties:
Property Description
Scan with Select to associate the data store with a remote agent when you run Scan jobs to discover personal
Remote Agent and sensitive data or build a Subject Registry index.
Important: If you select this option, you must associate the data store with a remote agent before
running a scan. After you save the data store, you cannot change this property.
Blob Endpoint Required. Microsoft Azure Blob Storage URL to access a container. Use the value listed for the
URL Primary BLOB Service endpoint URL in the Azure portal.
Source Optional. Directory path that contains the files to scan. To scan all root directories, enter a forward
Directory slash /. To scan a specific directory, use the following syntax: /<Directory_Name>
For example, if the directory name is Folder1, enter /Folder1. Do not include a forward slash / at
the end of the directory. The Scan job reads files from the last directory specified. For example, if
you enter /HumanResource/Employees/US, the Scan job only reads files from the US directory.
The Scan job reads files from the specified directory only.
If blank, the data store has a status of Not Complete. You cannot scan incomplete data stores.
Shared Required. A token that provides access to the Microsoft Azure Blob Storage resource. Use the value
Access for the Shared Access Signature in the Azure portal.
Signature
Token
Folder Required if you enable the Scan with Remote Agent property. Folders that the Scan job reads from
Options the location configured in the Source Directory property. Select one of the following options:
- All
- Select Specific Folders
- Use Regular Expression
Default is All.
Include Required if you enable the Scan with Remote Agent property and you choose the Select Specific
Folders Folders option in the Folder Options property.
Enter the directory for at least one folder that you want to include in the scan. The directory must
begin with a forward slash (/).
Exclude Optional if you enable the Scan with Remote Agent property and you choose the Select Specific
Folders Folders option in the Folder Options property.
Enter the directory for at least one folder that you want to exclude from the scan. The directory must
begin with a forward slash (/).
Folder Required if you enable the Scan with Remote Agent property and you choose the Use Regular
Regular Expression option in the Folder Options property.
Expression Enter a regular expression for at least one folder that you want to include in the scan.
Blob Prefix Optional. To sort the blobs based on the prefix of the Microsoft Azure Blob name, enter the prefix.
The prefix is case sensitive.
File Types Required. File types that the Scan job reads from the location configured in the Source Directory
property. Select All to scan all file types. Choose Select to select specific file types to scan.
Default is All.
Selected File Required if you choose Select in the File Types property. Click Select All to choose all file types, or
Types select one or more of the following options:
- Avro
- Compressed Files
- Delimited and Text
- Image
- JSON
- Microsoft Excel
- Microsoft PowerPoint
- Microsoft Word
- PDF
- Parquet
- Webpage Files
- XML
Note: If you enable Optical Character Recognition (OCR) and you select the PDF file type, the PDF file
content is processed as optical characters.
To choose the Image file type, OCR must be enabled for unstructured scans in the System Settings.
Enter File Optional. Available if you do not enable the Scan with Remote Agent property. Character that
Delimiter separates entries in the file. Specify the file delimiter if the file from which you extract metadata
uses a delimiter other than the following list of delimiters:
- Comma (,)
- Horizontal tab (\t)
- Semicolon (;)
- Colon (:)
- Pipe symbol (|)
Verify that you enclose the delimiter in single quotes. For example, '$'. Use a comma to separate
multiple delimiters. For example, '$','%','&'. Default is a comma (,).
Extract File Optional. Select to scan metadata from the file types not specified in the Selected File Types
Metadata property. Data Privacy Management extracts the metadata for the file types that the Profiling job
from other step does not support.
File Types
First Level Optional. Available if you do not enable the Scan with Remote Agent property. Name of the folder
Directory that you want to scan. For example: subfolderA.
To specify multiple folders, separate the folder names with a comma (,). For example: subfolderA,
subfolderB
Include Sub Optional. Available if you do not enable the Scan with Remote Agent property. If enabled, the scan
Directory reads data from the directory and all subdirectories of the location configured in the Source
Directory property or the First Level Directory property, if specified.
Default is disabled.
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Azure Blob Microsoft Azure Blob Storage access key. If you do not enter a key, the data store is incomplete. You
Account Key cannot scan incomplete data stores.
Source Optional. Connection name in the Informatica domain that the Scan job uses to connect to the
Connection Microsoft Azure Blob Storage source. If you do not enter a name, Data Privacy Management creates
Name the source connection name.
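For example, a Microsoft Azure Blob Storage data store might use the following values. The storage account
name, directory, and prefix shown here are placeholders for illustration only; copy the actual Primary BLOB
Service endpoint URL from the Azure portal:
Blob Endpoint URL: https://<storage_account_name>.blob.core.windows.net
Source Directory: /HumanResource/Employees
Blob Prefix: invoice_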
When you test the data store connection, Data Privacy Management tests the combination of the Client Id,
Client Key, Directory Name, and Auth Endpoint URL properties. If any of the property values are not valid, the
test connection fails.
Prerequisite: Before you create the data store, you must merge the certificates contained in the
<INFA_HOME>/java/jre/lib/security/cacerts file into the infa_truststore.jks file, which is located
in the following directory: <INFA_HOME>/services/shared/security/.
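One possible way to merge the certificates is with the Java keytool utility, for example the copy in
<INFA_HOME>/java/jre/bin. The following command is an illustrative sketch only: the changeit password is
the Java default for the cacerts file and the destination password is a placeholder, so substitute the
values that apply to your environment.
<INFA_HOME>/java/jre/bin/keytool -importkeystore \
  -srckeystore <INFA_HOME>/java/jre/lib/security/cacerts -srcstorepass changeit \
  -destkeystore <INFA_HOME>/services/shared/security/infa_truststore.jks -deststorepass <truststore_password>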
The following table describes the Microsoft Azure Data Lake connection properties:
Property Description
Scan with Select to associate the data store with a remote agent when you run Scan jobs to discover
Remote Agent personal and sensitive data or build a Subject Registry index.
Important: If you select this option, you must associate the data store with a remote agent before
running a scan. After you save the data store, you cannot change this property.
Account Name Optional. Name of the Microsoft Azure Data Lake account that you created on the Azure portal.
ADLS Source Required. Azure Data Lake Storage (ADLS) source type for big data analytic workloads. Select Data
Type Lake Store Gen 1 or Data Lake Store Gen 2.
Default is Data Lake Store Gen 1.
Client Id Required if the ADLS Source Type is Data Lake Store Gen 1. Client ID to connect to the Microsoft
Azure Data Lake Store. Use the value listed for the application ID on the Azure portal.
Client Key Required if the ADLS Source Type is Data Lake Store Gen 1. Client key to connect to the Microsoft
Azure Data Lake Store. Use the Azure Active Directory application key value in the Azure portal as
the client key.
ADLS Gen2 Required if the ADLS Source Type is Data Lake Store Gen 2. The authentication method used to
Authentication grant access to files and directories in the data store. Select Shared Key or Azure Active
Method Directory.
Default is Shared Key.
Authentication Required if the ADLS Source Type is Data Lake Store Gen 2. Indicates whether users can access
via Proxy the data store remotely. Select Disabled or Enabled.
Default is Disabled.
If you selected the Scan with Remote Agent check box, do not change the default value.
Directory Name Required. Directory of the Azure Data Lake Store. To scan all root directories, enter a forward
slash /. To scan a specific directory, use the following syntax: /<Directory_Name>
For example, if the directory name is Folder1, enter /Folder1. Do not include a forward slash /
at the end of the directory. The Scan job reads files from the last directory specified. For example,
if you enter /HumanResource/Employees/US, the Scan job only reads files from the US
directory.
If blank, the data store has a status of Not Complete. You cannot scan incomplete data stores.
Tenant ID Required if the ADLS Source Type is Data Lake Store Gen 2 and the ADLS Gen2 Authentication
Method is Azure Active Directory. The ID of the Azure Active Directory.
Auth Endpoint Required if the ADLS Source Type is Data Lake Store Gen 1. The OAuth 2.0 token endpoint URL in
URL the Azure portal.
Storage Account Required if the ADLS Source Type is Data Lake Store Gen 2 and the ADLS Gen2 Authentication
Key Method is Shared Key. The 512-bit primary storage account access key.
Folder Options Required if you enable the Scan with Remote Agent property. Folders that the Scan job reads from
the location configured in the Source Directory property. Select one of the following options:
- All
- Select Specific Folders
- Use Regular Expression
Default is All.
Include Folders Required if you enable the Scan with Remote Agent property and you choose the Select Specific
Folders option in the Folder Options property.
Enter the directory for at least one folder that you want to include in the scan. If the ADLS Source
Type is Data Lake Store Gen 2, the directory must begin with a forward slash (/).
Exclude Folders Optional if you enable the Scan with Remote Agent property and you choose the Select Specific
Folders option in the Folder Options property.
Enter the directory for at least one folder that you want to exclude from the scan. If the ADLS
Source Type is Data Lake Store Gen 2, the directory must begin with a forward slash (/).
Folder Regular Required if you enable the Scan with Remote Agent property and you choose the Use Regular
Expression Expression option in the Folder Options property.
Enter a regular expression for at least one folder that you want to include in the scan.
File Types Required. File types that the Scan job reads from the location configured in the Directory Name
property. Select All to scan all file types. Choose Select to select specific file types to scan.
Default is All.
Selected File Required if you choose Select in the File Types property. Click Select All to choose all file types,
Types or select one or more of the following options:
- Avro
- Compressed Files
- Delimited and Text
- Image
- JSON
- Microsoft Excel
- Microsoft PowerPoint
- Microsoft Word
- PDF
- Parquet
- Webpage Files
- XML
Note: If you enable Optical Character Recognition (OCR) and you select the PDF file type, the PDF
file content is processed as optical characters.
To choose the Image file type, OCR must be enabled for unstructured scans in the System
Settings.
Enter File Optional. Available if you do not enable the Scan with Remote Agent property. Character that
Delimiter separates entries in the file. Specify the file delimiter if the file from which you extract metadata
uses a delimiter other than the following list of delimiters:
- Comma (,)
- Horizontal tab (\t)
- Semicolon (;)
- Colon (:)
- Pipe symbol (|)
Verify that you enclose the delimiter in single quotes. For example, '$'. Use a comma to separate
multiple delimiters. For example, '$','%','&'. Default is a comma (,).
First Level Optional. Available if you do not enable the Scan with Remote Agent property. Name of the folder
Directory that you want to scan. For example: subfolderA.
To specify multiple folders, separate the folder names with a comma (,). For example:
subfolderA, subfolderB
Include Sub Optional. Available if you do not enable the Scan with Remote Agent property. If enabled, the scan
Directory reads data from the directory and all subdirectories of the location configured in the Directory
Name property or the First Level Directory property, if specified.
Default is disabled.
Memory Optional. Available if you do not enable the Scan with Remote Agent property. Memory that the
Enterprise Data Catalog scanner uses when you run a scan on the data store. Select High, Low, or
Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a
large data store, you do not specify the schemas for a data store, or you scan multiple data stores
in parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. Available if you do not enable the Scan with Remote Agent property. JVM parameters
that configure the scanner container. Use the following arguments to configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values
such as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>.
Increases the scanner container memory when pmem is enabled. Default value is 1.
Source Optional. Available if you do not enable the Scan with Remote Agent property. Connection name in
Connection the Informatica domain that the scan job uses to connect to the Microsoft Azure Data Lake
Name source. If you do not enter a name, Data Privacy Management creates the source connection
name.
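For example, for a remote agent scan of a Data Lake Store Gen 2 source, the folder properties might be set
as follows. The folder names and the pattern are hypothetical and assume standard regular expression syntax:
Folder Options: Select Specific Folders
Include Folders: /HumanResource/Employees
Exclude Folders: /HumanResource/Archive
Alternatively, with Folder Options set to Use Regular Expression:
Folder Regular Expression: /HumanResource/.*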
When you test the data store connection, the Data Privacy Management Service tests the combination of the
User, Host, Port, and Database properties. If any of the property values are not valid, the test connection fails.
The following table describes the Microsoft Azure SQL Database connection properties:
Property Description
User Required. User name to connect to the Microsoft Azure SQL Database account. The user must have
the Administrator role or read permission to the table data and views. The user must also have
access to the IP address.
Password Optional. Password to connect to the Microsoft Azure SQL Database account.
Instance Optional. The instance name of the Microsoft Azure SQL Database. You can alternatively specify the
port number of the instance. Informatica recommends that you specify the port number of the
instance.
Connection Required. Native connection string to access data from the database.
String Use the following syntax: <server_name>@<database_name>
Note:
- The server name and database entry must exist in the odbc.ini file.
- You must set the property EncryptionMethod to 1 in the odbc.ini file. For example:
[azuresqldbqe]
Driver=/home/opt/infa/10.2.2/ODBC7.1/lib/DWsqls28.so
Description=SQL Server Connection with encryption
HostName=infaserver.database.windows.net
PortNumber=1433
Database=azuresqldbqe
EncryptionMethod=1
ValidateServerCertificate=0
Secure JDBC Required database parameters to access databases that are secured with the SSL protocol. The
Parameters parameter string cannot exceed 500 characters. When you save the data store, Data Privacy
Management encrypts the parameter string.
Enter the parameters as name=value pairs separated by semicolon characters (;). You can enter the
following secure database parameters:
- EncryptionMethod. Required. Indicates whether data is encrypted when transmitted over the
network. This parameter must be set to SSL.
- HostNameInCertificate. Optional. Host name of the machine that hosts the secure database. If you
specify a host name, Data Privacy Management validates the host name included in the connection
string against the host name in the SSL certificate.
- TrustStore. Required. Path and file name of the truststore file that contains the SSL certificate for
the database.
- TrustStorePassword. Required. Password for the truststore file.
- KeyStore. Directory of the keystore file.
- KeyStorePassword. Password used to access the keystore file.
- KeyPassword. Password used to access individual keys in the keystore file.
- ValidateServerCertificate. Optional. Set to True or False. If True, Data Privacy Management
validates the certificate that the database server sends. If you specify the HostNameInCertificate
parameter, Data Privacy Management also validates the host name in the certificate. If False, Data
Privacy Management does not validate the certificate that the database server sends. Data Privacy
Management ignores any truststore information that you specify.
Data Privacy Management appends the secure JDBC parameters to the connection string.
Schema Optional. Select one of the following options to determine how you add schemas to the data store:
Option - All. Includes all schemas that you can access.
- Select From List. Displays a list of schemas in the Schema property.
- Specify Regex. Enables the Regular Expression for Schema property. Select this option if you want
to enter schemas as a regular expression.
Schema Displays a list of available schemas to include in the scan if you select Select From List in the
Schema Option property.
Regular Displays if you select Specify Regex in the Schema Option property. Enter a regular expression to
Expression specify schemas that you want to include or exclude from the scan. Schema names are case-
for Schema sensitive.
Source Optional. A filter to specify tables that you want to include in or exclude from the scan. Enter a regular
Metadata expression to include multiple filter parameters. Default is All.
Filter For example: Table1, NOT Table2, SUB%, %EMP, %DATA%. The filter includes Table1, tables
with names that start with SUB, names that end with EMP, and tables that include DATA anywhere in
the name. The filter excludes Table2.
Case Optional. Select to indicate that the data store is configured for case sensitivity. Default is cleared.
Sensitive
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
CyberArk Optional. Name of the CyberArk safe that contains the database password. When you test the data
Safe store connection or you scan the data store, Data Privacy Management retrieves the password from
CyberArk.
Include Optional. Determines whether a Scan job evaluates data in views. If enabled, a Scan job identifies
Views in the sensitive data in views during metadata and data profiling. The Scan job evaluates data in all views
Scan for the schemas specified in the Schema property. If the property is blank, the Scan job evaluates
data in all views of all schemas.
If the data store was imported from a parent data store, such as Informatica PowerCenter, the Scan
job for the parent data store can identify the lineage of sensitive data in views. For example, you ran
an Informatica PowerCenter Scan job that imported a data store that included views. You ran a
Database Scan job on the data store that identified sensitive data in the views. When you run an
Informatica PowerCenter Scan job to identify data proliferation, the Scan job identifies lineage for the
sensitive data in the views.
If disabled, the Scan job does not evaluate data in views. If you disable the property after a Scan job
completes, any scans that you run in the future will not evaluate data in views. However, the
completed Scan job results remain.
Source Optional. Connection name in the Informatica domain that the Scan job uses to connect to the
Connection Microsoft Azure SQL Database source.
Name
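For example, a Secure JDBC Parameters value for an SSL-secured Microsoft Azure SQL Database might look like
the following entry. The host name, truststore path, and password are placeholders for illustration only:
EncryptionMethod=SSL;HostNameInCertificate=infaserver.database.windows.net;TrustStore=/opt/infa/security/azuresql_truststore.jks;TrustStorePassword=<truststore_password>;ValidateServerCertificate=True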
When you test the data store connection, the Data Privacy Management Service tests the combination of the
User, Host, Port, and Database properties. If any of the property values are not valid, the test connection fails.
The following table describes the Microsoft Azure SQL Data Warehouse connection properties:
Property Description
User Required. User name to connect to the Microsoft Azure SQL Data Warehouse account. The user must
have read permission to the source files. The user must also have access to the IP address.
Password Optional. Password to connect to the Microsoft Azure SQL Data Warehouse account.
Database Required. Name of the Microsoft Azure SQL Data Warehouse database.
Instance Optional. The instance name of the Microsoft Azure SQL Data Warehouse database. You can
alternatively specify the port number of the instance. Informatica recommends that you specify the
port number of the instance.
Secure JDBC Required database parameters to access databases that are secured with the SSL protocol. The
Parameters parameter string cannot exceed 500 characters. When you save the data store, Data Privacy
Management encrypts the parameter string.
Enter the parameters as name=value pairs separated by semicolon characters (;). You can enter the
following secure database parameters:
- EncryptionMethod. Required. Indicates whether data is encrypted when transmitted over the
network. This parameter must be set to SSL.
- HostNameInCertificate. Optional. Host name of the machine that hosts the secure database. If you
specify a host name, Data Privacy Management validates the host name included in the connection
string against the host name in the SSL certificate.
- TrustStore. Required. Path and file name of the truststore file that contains the SSL certificate for
the database.
- TrustStorePassword. Required. Password for the truststore file.
- KeyStore. Directory of the keystore file.
- KeyStorePassword. Password used to access the keystore file.
- KeyPassword. Password used to access individual keys in the keystore file.
- ValidateServerCertificate. Optional. Set to True or False. If True, Data Privacy Management
validates the certificate that the database server sends. If you specify the HostNameInCertificate
parameter, Data Privacy Management also validates the host name in the certificate. If False, Data
Privacy Management does not validate the certificate that the database server sends. Data Privacy
Management ignores any truststore information that you specify.
Data Privacy Management appends the secure JDBC parameters to the connection string.
Azure Blob Optional. Microsoft Azure Storage access key to stage the files.
Account Key
Schema Schemas from which Data Privacy Management reads when you scan the data store. The schema is
case-sensitive. Use a comma to separate multiple entries. If empty, Data Privacy Management reads
all schemas from the database.
When you test the connection, the Data Privacy Management Service does not test if the schema
exists. If the schema does not exist, the Scan job fails.
Source Optional. A filter to specify tables that you want to include in or exclude from the scan. Enter a regular
Metadata expression to include multiple filter parameters. Default is All.
Filter For example: Table1, NOT Table2, SUB%, %EMP, %DATA%. The filter includes Table1, tables
with names that start with SUB, names that end with EMP, and tables that include DATA anywhere in
the name. The filter excludes Table2.
Azure Blob Required. Name of the container in Microsoft Azure Storage to use for staging before extracting data
Container from Microsoft Azure SQL Data Warehouse.
Name
Case Optional. Select to indicate that the data store is configured for case sensitivity. Default is cleared.
Sensitive
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
CyberArk Optional. Name of the CyberArk safe that contains the database password. When you test the data
Safe store connection or you scan the data store, Data Privacy Management retrieves the password from
CyberArk.
Include Optional. Determines whether a Scan job evaluates data in views. If enabled, a Scan job identifies
Views in the sensitive data in views during metadata and data profiling. The Scan job evaluates data in all views
Scan for the schemas specified in the Schema property. If the property is blank, the Scan job evaluates
data in all views of all schemas.
If the data store was imported from a parent data store, such as Informatica PowerCenter, the Scan
job for the parent data store can identify the lineage of sensitive data in views. For example, you ran
an Informatica PowerCenter Scan job that imported a data store that included views. You ran a
Database Scan job on the data store that identified sensitive data in the views. When you run an
Informatica PowerCenter Scan job to identify data proliferation, the Scan job identifies lineage for the
sensitive data in the views.
If disabled, the Scan job does not evaluate data in views. If you disable the property after a Scan job
completes, any scans that you run in the future will not evaluate data in views. However, the
completed Scan job results remain.
Source Optional. Connection name in the Informatica domain that the Scan job uses to connect to the
Connection Microsoft Azure SQL Data Warehouse source.
Name
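For example, a Microsoft Azure SQL Data Warehouse data store might use the following staging and schema
values. The container and schema names are placeholders for illustration only:
Azure Blob Container Name: dpm-staging
Schema: SALES,FINANCE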
The following table describes the common Microsoft OneDrive connection properties:
Property Description
Scan with Select to associate the data store with a remote agent when you run scan jobs to discover personal
Remote Agent and sensitive data or build a Subject Registry index. Default is selected.
Required if you configure the data store with tenant credentials.
Important: If you select this option, you must associate the data store with a remote agent before
running a scan. After you save the data store, you cannot change this property.
Use Tenant Select to configure the data store to include multiple user accounts in a scan. Available if you
Credentials choose to scan with remote agent. After you save the data store, you cannot change this property.
Folder Options Required if you enable the Scan with Remote Agent property. Folders that the Scan job reads from
the location configured in the Source Directory property. Select one of the following options:
- All
- Select Specific Folders
- Use Regular Expression
Default is All.
Include Folders Required if you enable the Scan with Remote Agent property and choose the Select Specific
Folders option in the Folder Options property. Enter the file paths for at least one folder that you
want to include in the scan.
Exclude Folders Optional if you enable the Scan with Remote Agent property and choose the Select Specific
Folders option in the Folder Options property. Enter the file paths for at least one folder that you
want to exclude from the scan.
Folder Regular Required if you enable the Scan with Remote Agent property and choose the Use Regular
Expression Expression option in the Folder Options property. Enter a regular expression for at least one folder
that you want to include in the scan.
File Types Required. File types that the scan job reads from the location configured in the Source Directory
property. Select All to scan all file types. Choose Select to select specific file types to scan.
Default is All.
Selected File Required if you choose Select in the File Types property. Click Select All to choose all file types, or
Types select one or more of the following options:
- Avro
- Compressed Files
- Delimited and Text
- Image
- JSON
- Microsoft Excel
- Microsoft PowerPoint
- Microsoft Word
- PDF
- Parquet
- Webpage Files
- XML
Note: If you enable Optical Character Recognition (OCR) and you select the PDF file type, the PDF
file content is processed as optical characters.
To choose the Image file type, OCR must be enabled for unstructured scans in the System Settings.
The following table describes the properties that appear if you configure the data store with tenant
credentials:
Property Description
Client ID Client ID of the Azure Active Directory application. Required if you configure the data store with
tenant credentials.
Client Secret Client secret of the Azure Active Directory application. Required if you configure the data store with
tenant credentials.
Tenant ID The unique ID of the Azure tenant. Required if you configure the data store with tenant credentials.
Department A filter to include users based on the departments that they belong to. Enter the names of the
departments that you want to include in scans. Use commas or press Enter to separate multiple
values.
Include Users A filter to include users. Enter the email IDs of users or accounts that you want to scan. Use commas
or press Enter to separate multiple values.
Exclude Users A filter to exclude users. Enter the email IDs of users or accounts to exclude from the scan. Use
commas or press Enter to separate multiple values.
Note: The Exclude Users values take precedence over the Include Users values.
Email groups A filter to include users based on group membership. Enter the group names. Use commas or press
Enter to separate multiple values.
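For example, a tenant-level Microsoft OneDrive data store might limit the scan with the following filter
values. The department names, email addresses, and group name are hypothetical:
Department: Human Resources, Finance
Include Users: [email protected], [email protected]
Exclude Users: [email protected]
Email groups: HR-EMEA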
The following table describes the properties that appear if you configure the data store with user credentials:
Property Description
OneDrive URL Required for single user account configuration. URL to access OneDrive.
User Name Required if you do not use tenant credentials. The user name for the OneDrive account.
Password Required if you do not use tenant credentials. The password for the OneDrive account.
Source Required if you do not use tenant credentials. The source directory that contains the files you want to
Directory scan. You must enter /Documents before the file path. Example: /Documents/
<userFolder_for_Scan>
Enter File Optional. Available if you do not enable the Scan with Remote Agent property. Character that
Delimiter separates entries in the file. Specify the file delimiter if the file from which you extract metadata uses
a delimiter other than the following list of delimiters:
- Comma (,)
- Horizontal tab (\t)
- Semicolon (;)
- Colon (:)
- Pipe symbol (|)
Verify that you enclose the delimiter in single quotes. For example, '$'. Use a comma to separate
multiple delimiters. For example, '$','%','&'. Default is a comma (,).
Extract File Available if you do not enable the Scan with Remote Agent property. Select to scan metadata from
Metadata the file types not specified in the Selected File Types property. Data Privacy Management extracts
from other the metadata for the file types that the Profiling job step does not support.
File Types
First Level Optional. Available if you do not enable the Scan with Remote Agent property. Name of the folder
Directory that you want to scan. For example: subfolderA.
To specify multiple folders, separate the folder names with a comma (,). For example: subfolderA,
subfolderB
Include Sub Optional. Available if you do not enable the Scan with Remote Agent property. If enabled, the scan
Directory reads data from the directory and all subdirectories of the location configured in the Source
Directory property or the First Level Directory property, if specified.
Default is disabled.
Memory Optional. Available if you do not enable the Scan with Remote Agent property. Memory to use when
you scan the data store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. Available if you do not enable the Scan with Remote Agent property. JVM parameters that
configure the scanner container. Use the following arguments to configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Profile Optional. Available if you do not enable the Scan with Remote Agent property. Select one of the
Execution following profiling engine modes:
Engine - Hadoop. Data Privacy Management uses the Blaze engine to run profiles.
- Native. Data Privacy Management uses the Informatica Data Integration Service engine to run
profiles.
Default is Native.
Hadoop Available if you do not enable the Scan with Remote Agent property. If you select Hadoop in the
Connection Profile Execution Engine property, enter the name of the Hadoop connection listed in the
Name Administrator tool, Connections tab.
Source Available if you do not enable the Scan with Remote Agent property. The connection name in the
Connection Informatica domain. Do not enter a value for File Management scans.
Name
The following table describes the common Microsoft SharePoint connection properties:
Property Description
Scan with Select to associate the data store with a remote agent when you run Scan jobs to discover personal
Remote Agent and sensitive data or build a Subject Registry index. Default is selected.
Required if you configure the data store with tenant credentials.
Important: If you select this option, you must associate the data store with a remote agent before
running a scan. After you save the data store, you cannot change this property.
Use Tenant Select to configure the data store to include multiple user accounts in a scan. Available if you
Credentials choose to scan with remote agent.
Enable Subsite Select Yes to scan subsites from the SharePoint site. Default is No.
Scan
Include Nested Appears if you select Yes in the Enable Subsite Scan property. Select this option to include
Subsites nested subsites when scanning subsites from the SharePoint site.
Folder Options Required if you enable the Scan with Remote Agent property. Folders that the Scan job reads from
the location configured in the SharePoint URL property. Select one of the following options:
- All
- Select Specific Folders
- Use Regular Expression
Default is All.
Note: The names that appear in the list of folders might differ from the internal names. For tenant-
level data stores, scan results include the internal names and not the display names. If the scan
results display a different name, check the SharePoint web URL of the folder to identify the folder it
refers to.
Include Folders Required if you enable the Scan with Remote Agent property and you choose the Select Specific
Folders option in the Folder Options property. Enter the file paths for at least one folder that you
want to include in the scan.
If you enabled subsite scans, enter a space as a prefix to the path. Use the following syntax: /
<space><subsite name>/<folder name>
For example: / finance/year2020
Exclude Optional if you enable the Scan with Remote Agent property and you choose the Select Specific
Folders Folders option in the Folder Options property. Enter the file paths for at least one folder that you
want to exclude from the scan.
If you enabled subsite scans, enter a space as a prefix to the path. Use the following syntax: /
<space><subsite name>/<folder name>
For example: / finance/year2020
Folder Regular Required if you enable the Scan with Remote Agent property and you choose the Use Regular
Expression Expression option in the Folder Options property. Enter a regular expression for at least one folder
that you want to include in the scan.
If you enabled subsite scans, enter a space as a prefix to the path. Use the following syntax: /
<space><subsite name>/<RegEx>
For example: / finance/year*
File Types Required. File types that the Scan job reads from the location configured in the SharePoint URL
property. Select All to scan all file types. Choose Select to select specific file types to scan.
Default is All.
Enter File Optional. Available if you do not enable the Scan with Remote Agent property. Character that
Delimiter separates entries in the file. Specify the file delimiter if the file from which you extract metadata
uses a delimiter other than the following list of delimiters:
- Comma (,)
- Horizontal tab (\t)
- Semicolon (;)
- Colon (:)
- Pipe symbol (|)
Verify that you enclose the delimiter in single quotes. For example, '$'. Use a comma to separate
multiple delimiters. For example, '$','%','&'. Default is a comma (,).
Selected File Required if you choose Select in the File Types property. Click Select All to choose all file types, or
Types select one or more of the following options:
- Avro
- Compressed Files
- Delimited and Text
- Image
- JSON
- Microsoft Excel
- Microsoft PowerPoint
- Microsoft Word
- PDF
- Parquet
- Webpage Files
- XML
Note: If you enable Optical Character Recognition (OCR) and you select the PDF file type, the PDF
file content is processed as optical characters.
To choose the Image file type, OCR must be enabled for unstructured scans in the System Settings.
The following table describes the properties that appear if you configure the data store with tenant
credentials:
Property Description
SharePoint URL Required. The root URI for the data source exposed through the Open Data Protocol layer. All
requests are extensions of this URI. For example: https://infasharepoint.abcd.com/
Site/_vti_bin/Data.svc
Client ID Client ID of the Azure Active Directory application. Required if you configure the data store with
tenant credentials.
Client Secret Client secret of the Azure Active Directory application. Required if you configure the data store
with tenant credentials.
Tenant ID The unique ID of the tenant or domain. Required if you configure the data store with tenant
credentials.
Enable Subsite Select the option to include subsites from the SharePoint site in the scan. By default, the option
Scan is selected.
Include Sites Enter the sites to include in the scan. By default, the scan includes all sites. Enter sites to scan
specific sites and exclude all other sites.
Property Description
SharePoint Required. The SharePoint URL from which you extract data. For example: https://
URL infasharepoint.abcd.com/Site/_vti_bin/Data.svc
User Name Required. The user name for the SharePoint account.
SharePoint Required. Specifies the type of SharePoint content. Select one of the following options:
Content Type - All. Scans contents from SharePoint lists and libraries.
- SharePoint List. Scans content from attached files in the SharePoint list, such as calendar events,
wiki pages, announcements, and links.
- SharePoint Library. Scans content from the SharePoint library.
Default is All.
Extract File Select to scan metadata from the file types not specified in the Selected File Types property. Data
Metadata from Privacy Management extracts the metadata for the file types that the Profiling job step does not
other File support.
Types
First Level Optional. Available if you do not enable the Scan with Remote Agent property. Name of the folder
Directory that you want to scan. For example: subfolderA.
To specify multiple folders, separate the folder names with a comma (,). For example: subfolderA,
subfolderB
Include Sub Optional. Available if you do not enable the Scan with Remote Agent property. If enabled, the scan
Directory reads data from the directory and all subdirectories of the location configured in the SharePoint URL
property or the First Level Directory property, if specified.
Default is disabled.
Memory Optional. Memory used when you scan the data store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a
large data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values
such as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Hadoop If you select Hadoop in the Profile Execution Engine property, enter the name of the Hadoop
Connection connection listed in the Administrator tool, Connections tab.
Name
Source The connection name in the Informatica domain. Do not enter a value for File Management scans.
Connection
Name
Property Description
User Required. Name of the user account to log into Salesforce. To import user metadata from Salesforce,
the user account must have the View All Data and View Event Log Files privileges. The Salesforce org
must have the Bulk API feature enabled.
Password Required. Password combined with the security token for the Salesforce user name. Salesforce issues
tokens through email during user registration. Enter the Salesforce password followed by the token,
without a space in between.
For example, if the Salesforce password is Infa2020 and the security token is
vK2VVTty2bw1V2K9zUj0KEzr, enter the password as Infa2020vK2VVTty2bw1V2K9zUj0KEzr.
Service URL Required. By default, this property displays the following URL: https://$YourInstance
$.salesforce.com/services/Soap/u/34.0
Replace $YourInstance$ with the instance name of the Salesforce account that you want to access.
For example, if your account is served by "ap2.salesforce.com," enter the following URL: https://
ap2.salesforce.com/services/Soap/u/34.0
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner container.
The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases the
scanner container memory when pmem is enabled. Default value is 1.
Source Optional. Connection name in the Informatica domain that the scan job uses to connect to the
Connection Salesforce application.
Name
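The JVM Options property in this table, and in the other data store tables that include it, accepts standard
Java system property arguments. As an illustrative sketch that assumes the usual -D prefix for JVM system
properties, a value that raises the scanner log level and the number of cores might look like the following;
adjust the values for your environment:
-Dscannerloglevel=DEBUG -Dscanner.container.core=2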
When you test a connection to SAP, Data Privacy Management tests the combination of the user name and
password, and the system and client numbers. Data Privacy Management uses the application server host
name to determine duplicate data stores.
Note: You can connect to the SAP application server through FTP or non-FTP mode. If you do not use the FTP
mode, you must create an NFS mount between the Staging Directory in the SAP server and the Source
Directory in the Data Privacy Management server.
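For example, on a Linux host that runs the Data Privacy Management server, a mount command along the
following lines can create the NFS mount. The export path and mount point are placeholders, and the SAP
application server must export the staging directory over NFS for the command to succeed:
mount -t nfs sapappserver.domain.com:/usr/sap/<SID>/staging /INFA/shared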
Property Description
User Name Required. Name of the user account to connect to the SAP application server.
Password Required. Password for the user account to connect to the SAP application server. When you save
the data store, Data Privacy Management encrypts the password.
Application Required. Host name or IP address of the SAP application server. Example:
Server Host sapappserver.domain.com
Language Optional. Language code that corresponds to the SAP language. Example: EN
Enable Select to access the source SAP data through streaming rather than through FTP.
Streaming for
Data Access
Encoding Required. Code page compatible with the SAP server. Must also correspond to the language code.
Select UTF-8 encoding of Unicode.
Staging Required for data profiling. Path in the SAP application server where the SAP user can read, write,
Directory and delete the temporary files. Example: C:\usr\sap\sapsid\default
You do not need to set the property for metadata profiling.
Source For data profiling, path to the directory on the Data Privacy Management server where the Data
Directory Integration Service can read, write, and delete files. Example: /INFA/shared
You do not need to set the property for metadata profiling.
Use FTP Select to enable FTP access to SAP. You do not need to enable the property for metadata profiling.
FTP Host If you select Use FTP, enter the host name or IP address of the FTP server. Example:
ftp.domain.com
FTP User If you select Use FTP, enter the FTP user name to connect to the FTP server. Example: ftpuser
FTP Password If you select Use FTP, enter the password for the FTP user.
Retry Period If you select Use FTP, enter the number of seconds the Data Privacy Management Service attempts
to reconnect to the FTP host if the connection fails. To indicate an infinite retry period, enter 0.
Use SFTP Appears if you select Use FTP. Enables secure FTP access to SAP.
Public Key File If you select Use SFTP, enter the public key file path and filename. Required if the SFTP server uses
Name public key authentication.
Private Key If you select Use SFTP, enter the private key file path and filename. Required if the SFTP server uses
File Name private key authentication.
Private Key If you select Use SFTP, enter the private key file password used to decrypt the private key file.
File Password Required if the SFTP server uses private key authentication and the private key is encrypted.
Scan with Select to associate the data store with a remote agent when you run scan jobs to discover personal
Remote Agent and sensitive data or build a Subject Registry index.
Important: If you select this option, you must associate the data store with a remote agent before
running a scan. After you save the data store, you cannot change this property.
Folder Options Required if you enable the Scan with Remote Agent property. Folders that the scan job reads from
the location configured in the Staging Directory or Source Directory property. Select one of the
following options:
- All
- Select Specific Folders
- Use Regular Expression
Include Required if you enable the Scan with Remote Agent property and choose the Select Specific Folders
Folders option in the Folder Options property. Enter the file paths for at least one folder that you want to
include in the scan.
Exclude Optional if you enable the Scan with Remote Agent property and choose the Select Specific Folders
Folders option in the Folder Options property. Enter the file paths for at least one folder that you want to
exclude from the scan.
Folder Regular Required if you enable the Scan with Remote Agent property and choose the Use Regular Expression
Expression option in the Folder Options property. Enter a regular expression for at least one folder that you want
to include in the scan.
Repository Optional. Select one or more SAP repository objects, such as ABAP programs, classes, or CDS
Objects objects from the list. Each repository object is assigned to a package that contains related tables.
Data Privacy Management will scan the tables in the selected repository objects.
If you plan to scan the entire SAP system, do not select repository objects.
Memory Optional. Available if you enable the Scan with Remote Agent property. Memory that the Enterprise
Data Catalog scanner uses when you run a scan on the data store. Options are: High, Low, and
Medium. Select High.
You might want to increase the scanner memory in the following example scenarios: you scan a
large data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. Available if you enable the Scan with Remote Agent property. JVM parameters that
configure the scanner container. Use the following arguments to configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values
such as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Source Optional. Available if you enable the Scan with Remote Agent property. Connection name in the
Connection Informatica domain that the Scan job uses to connect to the SAP source.
Name
Property Description
Case Specifies that the data store is configured for case sensitivity. Select the check box to specify that
Sensitive the data store is case sensitive. Clear the check box to specify that the data store is case insensitive.
Default is selected.
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data
store. Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Include Optional. Use to evaluate data included in views. If enabled, a scan job identifies sensitive data in
Views in the views during metadata and data profiling. The scan job evaluates data in all views for the schemas
Scan specified in the Schema property. If the property is blank, the scan job evaluates data in all views of
all schemas.
If disabled, the scan job does not evaluate data in views. If you disable the property after a scan job
completes, any scans that you run in the future do not evaluate data in views. However, the completed
scan job results remain.
Note: You must apply the Informatica 10.5.2.1 Service Pack to include views in a scan.
Source Optional. Connection name in the Informatica domain that the Cloud job uses to connect to the
Connection Snowflake database.
Name
To work with a Snowflake Advanced Scanner data store, you must apply the Informatica 10.5.2.1 service
pack. For information about how to view domain discovery scan results and run subject registry scans with
the Snowflake advanced scanner in Data Privacy Management, see the How-To article: Using the Snowflake
Advanced Scanner in Data Privacy Management.
The following table describes the Snowflake Advanced Scanner connection properties:
Property Description
Name Required. Short description of the data store. The name is not case sensitive and must be unique
within the Data Privacy Management repository. The name cannot exceed 255 characters or
contain spaces or the following special characters: \~!$%^&*()+
Description Optional. Long description of the data store that does not exceed 255 characters.
Source Connection name in the Informatica domain that the Cloud job uses to connect to the Snowflake
Connection database.
Name
Property Description
Data Store Required. Select a logical grouping of related data stores from the list of groups defined on the
Group Data Store Groups workspace. You can assign a data store to one data store group.
Security Required. Select a security group from the list of all security groups in the Informatica domain,
Groups including native and LDAP groups. Native groups display the group name. For example,
Administrator. LDAP groups include the security domain and the group name. For example,
<security domain>/<LDAP group name>.
Tags Optional. Keywords that add to the description. Create a new tag or select an existing tag from the
list. You can assign multiple tags to a data store.
After you assign a tag to a data store, you can use tags in alert rules to improve the specificity of
the alert. For example, you want to be alerted when the risk score of any production data store
increases by 20%. However, you do not want to be alerted for non-production data stores. You can
assign the tag PROD to the production data stores. Then, configure the alert conditions to alert you
when the risk score increases by 20% for data stores that have the PROD tag.
When you delete a data store, if a tag was only assigned to the deleted data store, the Data Privacy
Management Service deletes the tag from the repository.
Department Optional. Enter the department that is responsible for the maintenance or ownership of the data
store. The department can be a line of business, such as investment banking or wealth
management, or a corporate department, such as Sales or Marketing. The name is case-sensitive
and cannot exceed 100 characters.
After you scan data stores, you can view a summary of scan results for all data stores by
department. You can filter the scan results by location on the Overview workspace.
Data Owner Optional. Select the individual, group, or organization name that is responsible for the accuracy and
integrity of the source data on which you run a classification profiling scan.
Shared with Optional. Indicates that the data in the data store is shared with a third party such as a third-party
Third Party organization or vendor.
Third-Party Optional. Appears if you select the Shared with Third Party check box. Enter a description of the
Description third party. You can enter up to 2000 characters and use the following special characters: , _ @
Location Optional. Select a location that is defined on the Locations workspace. If you do not select a
location, Data Privacy Management assigns the Unknown location to the data store.
Auto Sync Optional. Determines if the Scan job automatically synchronizes data source information in the
Catalog catalog with the corresponding data store information in Data Privacy Management. Default is
disabled.
The following read-only properties appear on the data store details page. The values do not impact the data
store:
Property Description
Auto Assign Select to automatically assign the schemas in Enterprise Data Catalog to the data store. Default is
Connections No.
Case Sensitive Specifies that the data store is configured for case sensitivity. Select the check box to specify that
the data store is case sensitive. Clear the check box to specify that the data store is case
insensitive.
Default is selected.
Memory Memory that the Enterprise Data Catalog scanner uses when you run a scan on the data store.
Select High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a
large data store, you do not specify the schemas for a data store, or you scan multiple data stores
in parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
Custom JVM JVM parameters that configure the scanner container. Configure the following parameters:
Options - Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values
such as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner
container. The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
YARN environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases
the scanner container memory when pmem is enabled. Default value is 1.
Enable Reference Resources Select to extract metadata about resources in Enterprise Data Catalog that the data store references. Default is No.
Retain Unresolved Reference Assets Appears when you select the Enable Reference Resources property. Reference assets are assets from reference resources in Enterprise Data Catalog, such as reference data sources and reference data sets. Select this option to keep unresolved reference assets. Default is Yes.
When you scan an SQL Server Integration Services data store, Data Privacy Management creates a Data
Integration job. Each data store should connect to a different package. The connections in the mappings
should also be unique. Data Privacy Management does not validate if the data store connects to different
packages.
For file-based data stores, when you test the connection, Data Privacy Management tests the Agent URL
property. If the property value is not valid, the test connection fails.
For repository-based data stores, when you test the connection, Data Privacy Management tests the
combination of the Agent URL, SQL Server Version, Package/Repository Name, and Password properties. If
any of the property values are not valid, the test connection fails.
Property Description
Agent URL Required. URL of the Enterprise Data Catalog agent. Use the following syntax to enter the URL:
http://<host name or IP address>:<port>/<directory>
For file-based data stores, if the package file is not password protected, you can provide a URL to an
Enterprise Data Catalog agent that is installed on any machine. If the package file is password
protected, provide a URL to an Enterprise Data Catalog agent that exists on the same machine that
hosts Microsoft Visual Studio or the Microsoft SQL Server Management Studio. The Microsoft Visual
Studio version must support the development of Integration Services packages.
For repository-based data stores, provide a URL to an Enterprise Data Catalog agent that is on the
same machine that hosts the Microsoft SQL Server Management Studio.
File Required if you select FILE in the SSIS Scanner Type property. Full file path and file name for the
package .dtsx or .zip file. To scan multiple package files, add the files to one .zip file. The file must
exist on the same machine that hosts the Data Privacy Management Service.
For example, enter /test/Package.dtsx if the package .dtsx file exists in the test folder.
If the file path is not valid or does not exist, the Scan job fails.
Encoding Required if you select FILE in the SSIS Scanner Type property. Code page for the extracted metadata.
Default is Western European (iso-8859-1).
SQL Server Version Required if you select Repository Server in the SSIS Scanner Type property. Select one of the following options:
- SQL Server 2012
- SQL Server 2014
- SQL Server 2016
If blank, the data store status is Not Complete. You cannot scan incomplete data stores.
Host Name Required if you select Repository Server in the SSIS Scanner Type property. Host name or IP address
for the server. If blank, the data store status is Not Complete. You cannot scan incomplete data
stores.
Password Password for the package. Required if the package is encrypted with a password. When you save the
data store, Data Privacy Management encrypts the value in the repository. If the package is encrypted
and the password is not valid or blank, the Scan job fails.
Package/Repository Name If you select FILE in the SSIS Scanner Type property, enter the Microsoft Visual Studio package name. If you select Repository Server in the SSIS Scanner Type property, enter the directory path and package name in the Microsoft SQL Server Management Studio from which the Data Integration job reads. Specify the directory structure starting after the Stored Packages MSDB directory. The Data Integration job connects only to stored packages in MSDB subdirectories.
For example, to scan a package named TestPackage with the full path <Server>/Stored Packages/MSDB/TestFolder/TestPackage in the Microsoft SQL Server Management Studio, enter /TestFolder/TestPackage.
Use a comma to separate multiple values. If you include multiple packages in a data store, the packages must use the same password. You can also use a Java-based regular expression. If blank, the Data Integration job reads from all packages in the MSDB subdirectories.
Variable Values File Optional. Full directory path and file name of the text file that defines the values for the user-defined variables in the package. The variable values file must be on the same machine where the Data Privacy Management Service runs.
For example, enter /home/test/variables.txt if the variables text file exists in the /home/test directory.
If the data flow includes variables and you do not provide a variable values file, the Data Integration job cannot identify the sensitive data proliferation between the connections in the mapping. If the file path does not exist or is not valid, the Data Integration job fails. The Data Integration job does not validate the content of the text file.
Working Directory Path Optional. File path to the working directory of the SQL Server Integration Services database.
Memory Optional. Memory that the Enterprise Data Catalog scanner uses when you scan the data store. Select
High, Low, or Medium. Default is Low.
You might want to increase the scanner memory in the following example scenarios: you scan a large
data store, you do not specify the schemas for a data store, or you scan multiple data stores in
parallel. If you change the scanner memory, you must change the memory settings in the Yarn
Resource Manager.
JVM Options Optional. JVM parameters that configure the scanner container. Use the following arguments to
configure the parameters:
- Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such
as DEBUG, ERROR, or INFO. Default value is INFO.
- Dscanner.container.core=<Number of cores>. Increases the core for the scanner container.
The value must be a number.
- Dscanner.yarn.app.environment=<key=value>. Key pair value that you need to set in the
Yarn environment. Use a comma to separate the key pair values.
- Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases the
scanner container memory when pmem is enabled. Default value is 1.
Enable Reference Resources Optional. Select this option to extract metadata about resources in Enterprise Data Catalog that the data store references. Default is cleared.
Retain Unresolved Reference Assets Optional. Appears when you select the Enable Reference Resources property. Reference assets are assets from reference resources in Enterprise Data Catalog, such as reference data sources and reference data sets. Select this option to keep unresolved reference assets. Default is disabled.
Agent Options Optional. Enter the SQL Server Agent configuration options to include in the scan.
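The parameters in the Custom JVM Options and JVM Options properties follow the -D<name>=<value> convention for JVM system properties. For example, a value that raises the scanner log level and allocates more cores to the scanner container might look like the following. This is an illustrative sketch only; confirm the exact option names and separator for your environment:
-Dscannerloglevel=DEBUG -Dscanner.container.core=4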
Data Domains
This chapter includes the following topics:
The Data Privacy Management installation includes a set of commonly used data domains that you can use
to identify and protect sensitive data in a data store. You can also create data domains. Data Privacy
Management stores data domains in the Model repository.
When you create a data domain, you specify a rule to identify data by metadata or data values and a
protection extension to protect sensitive data.
For example, you want to find columns that contain U.S. Social Security numbers. A data store might have
Social Security numbers in a column called SSN in one table and a column called Social_Security in a
different table. The data store might also contain Social Security numbers in a column called Comments. To
identify all columns with Social Security numbers, you can create a metadata match condition and a data
match condition. The metadata match condition describes possible column names for Social Security
numbers such as SSN and Social. The data match condition can specify a reference table, rule, or pattern
that the data values must match. In this example, you can use a data match condition that specifies the
pattern used for U.S. Social Security numbers: 999-99-9999
When you specify a data match condition, you also specify a conformance range. During a data scan, Data
Privacy Management calculates the conformance score of a column. The conformance score is the
percentage of values in a column that match the data domain. Based on the conformance range you
specified, Data Privacy Management uses the conformance score to reject or identify a column as sensitive
and to determine when to report a column for validation.
When you add protection extensions to a data domain, you also specify the default protection rules that Data
Privacy Management applies to the sensitive fields in each data store that is included in the data domain.
When you configure a protection task, you can apply the default protection rules from the data domains to
the sensitive fields in the data store for which the protection task is created.
You assign data domains to classification policies. When you define a scan, you assign a classification policy
to the scan. Data Privacy Management uses the information in the data domains to discover sensitive
columns.
The discovery of sensitive columns in a data store forms the basis of determining critical security
information such as policy classification, data risk cost, sensitive data proliferation, and protection status.
Related Topics:
• “Extensions Overview” on page 65
When you first access the Data Domains workspace, you see a list of the predefined data domains
available in the Model repository. From the Actions menu, you can import, create, view, edit, copy, and
delete data domains.
When you select a data domain from the list of data domains, the Data Domain Details page appears.
The Data Domain Details page displays the name, description, match condition, and protection
extensions for a data domain. The match condition specifies the rules that determine whether a column
in the data store matches the data domain. Protection extensions specify the protection rules that Data
Privacy Management applies to the sensitive fields identified in the data store.
To access the Data Domains workspace, click Manage > Data Domains.
The following image shows an example of the Data Domains workspace with a list of data domains:
To view or edit the properties for a data domain, access the Data Domain Details page from the Data
Domains list page in one of the following ways:
• Click a data domain name. To edit the properties, click Edit on the Data Domain Details page.
• Select the check box next to a data domain and then select Open or Edit from the Actions menu.
Property Description
Name Required. Unique name for the data domain. The name is not case sensitive and cannot exceed
255 characters.
The name cannot contain spaces or the following special characters: ~!$%^&*()-+=[]
{}|;:'",./<>?
Description Optional. Long description of the data domain that cannot exceed 255 characters.
Column Matching Logic Required. Determines how Data Privacy Management handles conflicts that might occur between data match conditions and metadata match conditions defined for the data domain. One of the conditions can override the other, or you can specify that both conditions must match to include a data store column in the data domain.
Metadata Match Condition Identifies columns to include in the data domain. Matches column name metadata to a regular expression, a reference table in the Model repository, or a metadata rule in the Model repository.
Data Match Condition Identifies columns to include in the data domain. Matches column values to a regular expression, a reference table in the Model repository, or a data rule in the Model repository.
Conformance Score For data match conditions, specifies the minimum percentage of values in a column that must match the data domain for Data Privacy Management to identify the column as sensitive. Also specifies the lower and upper limit of percentages to reject, validate, or automatically accept a column as sensitive.
The Scan job calculates conformance scores when the Scan job runs data profiling.
Define the default conformance score ranges on the Settings workspace.
Mark columns within the validate range as sensitive For data match conditions, Scan jobs identify fields as sensitive if they meet the validation criteria in the conformance score setting. If you change the default validation range in a data domain, the Scan job identifies sensitive fields based on the data domain setting.
Conformance Row Count For data match conditions, specifies the minimum number of values in a column that must match the data domain to identify the column as sensitive.
Important: The sampling technique for a data store scan must include at least the same number of rows as the conformance row count. If the sampling technique contains fewer rows than the conformance row count, Data Privacy Management will not identify the column as sensitive, and the sensitive field might remain unprotected.
Proximity Match Condition Identifies columns to include in the data domain. Matches column values to other selected data domains that contain nearly identical columns or fields.
Add Protection Extension Optional. Applies one or multiple protection extensions to sensitive fields in the data domain for data stores that the protection extension supports. Protection extensions are defined on the Extensions workspace and must have an Active status. Options are:
- Encryption
- Persistent Data Masking - Big Data
- Persistent Data Masking - Remote Domain
Protection Rules Required if you add a protection extension. Defines the default protection rules for sensitive
fields in the data domain for data stores that the protection extension supports. Protection rule
options are:
- Encryption. Applies one or multiple encryption rules that are defined on the Encryption Rules
workspace.
- Persistent Data Masking - Big Data. Specifies the default masking rule to protect the sensitive
fields.
- Persistent Data Masking - Remote Domain. Specifies the default masking rule to protect the
sensitive fields.
Associated Encryption Key Required if you add an encryption protection extension. Associates a selected encryption key to each encryption rule.
Related Topics:
• “Supported Data Store Types for Protection Extension Plugins” on page 66
• “Encryption Rules Overview” on page 594
• “Settings Workspace Properties” on page 39
To specify a metadata match condition, use one of the following methods to describe a column name:
• Pattern
• Reference table
• Rule
Pattern
To use the pattern method for a metadata match condition, enter a regular expression that describes the
pattern of characters in a column name.
For example, a name for a column with Social Security numbers might include the term SSN or Social. You
can use the following expressions to find columns that have either SSN or Social in the column name:
.*social.*
.*ssn.*
A data domain can contain multiple pattern expressions. Data Privacy Management identifies a column as
sensitive if the column name matches any one expression in the pattern.
For information about creating regular expressions, see tutorials and documentation for regular expressions
on the internet such as http://www.regular-expressions.info/tutorial.html.
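The following sketch, written in Python for illustration only, shows the any-match behavior described above. The column names are hypothetical, and the case-insensitive comparison is an assumption based on the example; the Data Privacy Management matching engine may behave differently.
import re

# Metadata match condition: a column is a candidate if its name matches any one pattern.
metadata_patterns = [r".*social.*", r".*ssn.*"]
column_names = ["SSN", "Social_Security", "Comments", "FirstName"]

for name in column_names:
    matched = any(re.fullmatch(p, name, flags=re.IGNORECASE) for p in metadata_patterns)
    print(f"{name}: {'match' if matched else 'no match'}")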
The following image shows a metadata match condition that is defined by two regular expressions:
Reference Table
For example, a reference table for credit card column names might contain the following values:
• ccn
• credit_card
• credit_card_number
• credit_card_no.
Data Privacy Management discovers columns with a column name that matches any one of the strings in the
reference table.
To use a reference table, select one from the Model repository. You can use Informatica Developer to view
the contents of the reference table. For a list and descriptions of the metadata reference tables that are
available with the Data Privacy Management installation, see the System-Defined Data Domains appendix in
the Informatica Data Privacy Management Administrator Guide.
To create a reference table, use Informatica Analyst or Informatica Developer. For information about creating
reference tables, see the Informatica Data Quality Getting Started Guide.
Rule
A metadata rule defines the logic to discover a column by column name.
For example, to discover columns with birth dates, the metadata rule might specify the following expressions
that describe the column names DOB, Date of Bir*, and Birth Da*:
• ^[dD][oO][bB]$
• ^[dD][aA][tT][eE].*[oO][fF].*[bB][iI][rR].*$
• ^[bB][iI][rR][tT][hH].*[dD][aA].*
Data Privacy Management discovers columns with a column name that matches any one of the expressions
in the rule.
To use a rule, select one from the Model repository. You can use Informatica Developer to view the
expressions in a rule. For a list and description of the metadata rules that are available with the Data Privacy
Management installation, see the System-Defined Data Domains appendix in the Informatica Data Privacy
Management Administrator Guide.
To create a rule, use Informatica Analyst or Informatica Developer. For information about creating rules, see
the Informatica Data Quality Getting Started Guide.
To specify a data match condition, use one of the following methods to describe column values:
• Pattern
• Reference Table
• Rule
Pattern
To use the pattern method for a data match condition, enter a regular expression that describes the values in
a column.
For example, you want to discover columns that contain Social Security numbers. Social Security numbers
have a specific pattern such as the 999-99-9999 pattern used for U.S. Social Security numbers.
You can create the following expression to describe the data pattern for U.S. Social Security numbers:
[0-9]{3}-[0-9]{2}-[0-9]{4}
A data domain can contain multiple data pattern expressions. Data Privacy Management discovers columns
that match any expression in the pattern.
For information about creating regular expressions, see tutorials and documentation for regular expressions
on the internet such as http://www.regular-expressions.info/tutorial.html.
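A quick way to see what this pattern accepts is to test it against sample values. The following Python lines are purely illustrative, and the sample values are hypothetical.
import re

ssn_pattern = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")
sample_values = ["123-45-6789", "987-65-4321", "not provided", "555-12-0000"]

# A value conforms to the data domain if it matches the pattern in full.
matching = [v for v in sample_values if ssn_pattern.fullmatch(v)]
print(f"{len(matching)} of {len(sample_values)} values match")  # 3 of 4 values match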
Reference Table
A data reference table contains a collection of the standard and alternative versions of column data for a
specific type of column.
For example, you want to discover columns that contain country names in your data store. You can use a
reference table that lists the ISO country code and the different versions of names for each country.
The following image shows a data reference table in the Model repository for country names:
Data Privacy Management discovers columns with values that match at least one of the values in the
reference table.
To use a reference table, select one from the Model repository. You can use Informatica Developer to view
the contents of the reference table. For a list and descriptions of the data reference tables that are available with the Data Privacy Management installation, see the System-Defined Data Domains appendix in the Informatica Data Privacy Management Administrator Guide.
To create a reference table, use Informatica Analyst or Informatica Developer. For information about creating
reference tables, see the Informatica Data Quality Getting Started Guide.
Rule
A data rule defines the logic to discover a column by column values.
For example, you want to discover columns with IP addresses. You can use a rule from the Model repository
that defines the patterns for IP addresses.
The following image shows the expressions in a data rule for IP addresses:
Data Privacy Management discovers columns with values that match at least one of the expressions in the
data rule.
To use a rule, select one from the Model repository. You can use Informatica Developer to view the
expressions in a rule. For a list and description of the data rules that are available with the Data Privacy
Management installation, see the System-Defined Data Domains appendix in the Informatica Data Privacy
Management Administrator Guide.
You can also create a rule using Informatica Analyst or Informatica Developer. For information about creating
rules, see the Informatica Data Quality Getting Started Guide.
The following image shows a data domain with a data match condition that is defined by a rule:
During the Profiling job step of a Scan job, Data Privacy Management calculates the percentage of field
values that match each data domain. You can exclude null values from the calculation when you create the
scan.
Note: Data Privacy Management does not calculate conformance scores for File Management data store
types.
Depending on the percentage of field values that match the data domain, Data Privacy Management performs
one of the following actions for each field in the data store:
To set the default conformance ranges, specify the following settings on the Settings workspace. You can
customize the settings when you create or edit a data domain.
Setting Description
Maximum Reject Score The maximum data domain match percentage at or below which Data Privacy Management rejects a field as sensitive.
For example, to reject fields as sensitive when 40% or fewer of the values match the data domain, set the value to 40%.
Minimum Accept Score The minimum data domain match percentage at or above which Data Privacy Management accepts a field as sensitive.
For example, to accept fields as sensitive when 80% or more of the values match the data domain, set the value to 80%.
Mark columns within the validate range as sensitive If selected, Data Privacy Management identifies the fields in the validation range as sensitive. If cleared, Data Privacy Management identifies the fields in the validation range as not sensitive and includes the fields in a scan report. The data store owner reviews the scan report and determines if the fields are sensitive.
Default is cleared.
If a field returns a data match percentage between the Maximum Reject Score and the Minimum Accept
Score, Data Privacy Management flags the field for validation.
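The following Python sketch illustrates this threshold logic. It is not Data Privacy Management code; the 40 and 80 defaults are only the example values used above, and the boundary handling (at or below, at or above) follows the descriptions of the Maximum Reject Score and Minimum Accept Score settings.
def classify_field(conformance_score, max_reject=40, min_accept=80):
    """Classify a field from the percentage of its values that match the data domain."""
    if conformance_score <= max_reject:
        return "rejected"                 # not identified as sensitive
    if conformance_score >= min_accept:
        return "sensitive"                # automatically accepted as sensitive
    return "flagged for validation"       # falls between the reject and accept scores

print(classify_field(35))   # rejected
print(classify_field(60))   # flagged for validation
print(classify_field(92))   # sensitive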
Example
You want to determine the sensitivity status of a field based on the following criteria:
• If at least 80% of the field values match the data domain, identify the field as sensitive.
• If 41-79% of the field values match the data domain, flag the field for the data owner to validate the data
and determine the sensitivity status of the field.
• If 0-40% of the field values match the data domain, reject the field as sensitive.
On the Data Match tab of the New_Data_Domain page, use the slider buttons to set the following values:
Related Topics:
• “Changing the Default Conformance Ranges” on page 45
• “Downloading Scan Job Reports” on page 306
When you create a data domain and specify a data match condition, the Mark columns within the validation
range as sensitive check box is cleared by default. If you keep the default setting, the data store scan results
include only the fields in the auto-accept range in the sensitive field count. If you select the check box, the
data store scan results include fields in the auto-accept and validation ranges.
To validate the sensitivity status of the fields in the scan results, view the list of fields and corresponding
conformance scores on the Sensitive Fields page accessed from the Security Dashboard. You can update
the sensitivity status in one of the following ways:
• Manually. On the Sensitive Fields page, select a row to view the Properties pane for the sensitive field.
Edit the Sensitive property.
• In bulk. Export the list of sensitive fields as a CSV file and change the sensitivity status for the fields.
After you update the sensitivity status, import the file on the Data Stores workspace.
To find and secure sensitive data, you scan a data store with a classification policy that contains data
domains. Create a data domain for each type of sensitive data that you want to discover and, optionally,
include protection extensions to protect the sensitive data. Create the data domain before you create the
classification policy.
The export file contains the same format as the file required to import data domains. You can use the export
file to add data domains or update data domain properties in bulk. After you save the updates to the export
file, import the file. When you import the file, the Import job adds and updates the data domains in the Model
repository.
After the Import job finishes, you can go to the Data Domains workspace to see the data domains that you
imported. Data Privacy Management saves the data domain in the Model repository.
Related Topics:
• “Import Job” on page 296
The following table describes the columns, column headings, and the values in the CSV file:
1 DomainName Required. Enter the name of the data domain. You can use up to 100
characters. Do not use backslashes.
To update a data domain, ensure that the name of the data domain in
the DomainName column matches the name of the data domain in the
Model repository.
2 Description Optional. Enter unique details of the data domain. The description
cannot exceed 254 characters.
9 DataRule Required if you entered RULE for the DataMatchOption property. Enter
the name of a data rule that is located in the Model repository.
14 MinConformanceRowcount Specifies the minimum number of values in a column that must match the data domain to identify the column as sensitive. Default is 1.
15 DomainTrumpRule Specifies how Data Privacy Management handles conflicts that might
occur between data match conditions and metadata match conditions
defined for the data domain. Enter one of the following rules:
- DATA_OVERRIDES_METADATA
- METADATA_OVERRIDES_DATA
- METADATA_AND_DATA
- METADATA_OR_DATA
17 Action Optional. Instructs the import job to create, update, or delete a data
domain in the model repository.
Enter one of the following values:
- U. Creates or updates a data domain in the Data Privacy Management
repository. The action that the job performs depends on if the data
domain exists in the repository and on the Replace Duplicates with
Items Imported option. You configure the Replace Duplicates with
Items Imported option when you import the CSV file.
If the data domain does not exist, the Import job creates the data
domain.
If the data domain exists and the Replace Duplicates with Items
Imported option is enabled, the Import job updates the data domain.
Warning: If a column in the CSV file contains a null value, the Import
job replaces the value in the repository with a null value.
If the data domain exists and the Replace Duplicates with Items
Imported option is not enabled, the Import job skips the row.
- D. Deletes a data domain from the repository. If the data domain does
not exist, the Import job rejects the row.
If blank, default is U. Creates or updates a data domain in the Data
Privacy Management repository.
Consider the following rules and guidelines when you import data domains from a CSV file:
• The first row of the CSV file must contain case-sensitive column headings in the required sequence. If the
CSV file does not contain all of the column headings in the required sequence and case, the Import job
fails.
• Each subsequent row after the column headings defines a data domain.
• If the CSV file contains duplicate entries, the Import job processes the entries in the order in which the
rows appear in the file.
• The column values cannot contain the following keywords or terms:
SELECT FROM
SELECT UNION
DELETE FROM
UPDATE SET
<SCRIPT
</SCRIPT>
ALERT(
"/>"
If a row contains keywords or characters that are not valid, the Import job rejects the row. The Import job
logs an error in the job log and processes the remaining rows.
• The column values cannot exceed 255 characters. If a row contains more than 255 characters in any
column, the Import job rejects the row.
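You can catch many of these problems before you run the Import job. The following Python sketch pre-checks a CSV file against the forbidden keywords and the 255-character limit listed above. The file name and function names are illustrative, and the check is not a substitute for the Import job's own validation.
import csv

FORBIDDEN_TERMS = ["SELECT FROM", "SELECT UNION", "DELETE FROM", "UPDATE SET",
                   "<SCRIPT", "</SCRIPT>", "ALERT(", '"/>"']

def check_row(row):
    """Return a list of problems found in one CSV row."""
    problems = []
    for value in row:
        if any(term in value.upper() for term in FORBIDDEN_TERMS):
            problems.append(f"forbidden keyword or term in value {value!r}")
        if len(value) > 255:
            problems.append(f"value exceeds 255 characters: {value[:30]!r}...")
    return problems

with open("data_domains.csv", newline="") as f:    # hypothetical file name
    for line_number, row in enumerate(csv.reader(f), start=1):
        for problem in check_row(row):
            print(f"row {line_number}: {problem}")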
You must have the Add or Modify Data Domain privilege to disassociate data domains from data stores.
Verify that the data domains are not associated with a scan that is in progress, a protection task, or user
access data.
Note: When you delete data domains, you remove the rules associated with the data domain from the Active
and Suggestions tabs on the Always Sensitive /Never Sensitive workspace.
Classification Policies
This chapter includes the following topics:
You can use a system-defined classification policy or create, copy, or import a classification policy. When you
create a classification policy, you assign data domains to the classification policy and specify a data domain
match condition, the cost for each record with sensitive data, and a data sensitivity level. You assign one or
more classification policies and data stores to include in a scan.
A data domain match condition describes the criteria that classifies the data store as a match for the
classification policy. If a data store matches a classification policy, Data Privacy Management assigns the
sensitivity level to the data store.
Scan jobs analyze data stores to discover sensitive and personal data based on the classification policies
included in the scan.
To access the Classification Policies workspace, click Manage > Classification Policies.
Classification Policies List Page
The Classification Policies list page lists the classification policies in the Data Privacy Management
repository. By default, the list of classification policies appears when you access the Classification Policies
workspace.
The following image shows an example of the Classification Policies workspace with the classification
policies list page:
Access the classification policy details page to view or edit a classification policy. You access the
classification policy details page when you click a classification policy name from the classification policies
list page or when you view a classification policy from the Actions menu.
Property Description
Name Name of the classification policy. The name is not case sensitive and must be unique within the
Data Privacy Management repository. The name cannot exceed 255 characters, contain a space, or
contain the following special characters: ~ ! $ % ^ & * ( ) +
Description Optional. The description cannot exceed 255 characters. Use this property to specify unique details
to differentiate between similarly named classification policies.
Cost for Each Impression The monetary cost that your organization would incur for each occurrence of sensitive data when exposed to an unauthorized user. Data Privacy Management provides a default value. You can set the cost to a maximum value of 999. To estimate the cost for each occurrence of sensitive data, consider the direct and indirect costs that your organization might incur if sensitive data is exposed, lost, or unavailable.
For example, direct costs might include the following expenses:
- Hiring experts and purchasing applications to help resolve a data breach.
- Providing customers with hotline support.
- Providing customers with free credit monitoring subscriptions and discounts for future products and services.
Indirect costs might include the following expenses:
- Investigation and communication.
- Loss of critical data for use in primary applications that serve employees, customers, or partners.
- Potential loss of customers.
The scan results show the total cost of sensitive data in the data stores evaluated in the scan.
Sensitivity Level Degrees of data sensitivity based on industry or organization data security standards. You can select from up to five data sensitivity levels. For example, you can specify the following data sensitivity levels:
- Public. Indicates that data can be shared with anyone. For instance, data that you share on the company website.
- Internal. Indicates that data can be shared with employees only. For instance, a marketing and sales strategy.
- Confidential. Indicates that data can be shared with a select group of users. For instance, details for a patent.
- Restricted. Indicates that data can be shared with key personnel. For instance, personal information.
Related Topics:
• “Residual Risk Indicator” on page 515
• “Changing the Default Risk Cost” on page 45
A data domain identifies the semantics of sensitive data and specifies the rules that Data Privacy
Management must use to match a field or file name or value to the data store.
You can include predefined data domains in classification policies, or you can create and import data
domains. For example, to scan a data store for PII data, you can include data domains for Social Security
number, first name, last name, and birth date in the classification policy.
When you scan a data store, Data Privacy Management discovers the field or file names or values that you
specified in the data domains included in the policy.
• Match Any. The data store must have one or more fields or files that match at least one data domain in
the classification policy.
• Match All. The data store must have one or more fields or files that match all the data domains in the
classification policy.
• Custom. The data store must have fields or files that match a data domain or file tag match condition that
you create.
Match Any
When you select the Match Any data domain match condition, the data store must contain a field or file that
matches at least one data domain in the classification policy for Data Privacy Management to designate the
data store as a match for the classification policy.
For example, you want to evaluate a data store for PCI data. You create a classification policy with the credit
card number and magnetic stripe data domains. You specify the Match Any condition and scan the data store
with the classification policy. Data Privacy Management classifies the data store as a match for PCI if the
data store contains a field for either credit card or magnetic stripe data.
When you create a classification policy with multiple data domains, Match Any is the default data domain
match condition.
For example, you want to evaluate a data store for PII data. You create a classification policy that includes
many data domains such as age, birth date, birth place, drivers license number, first name, last name, gender,
IP address, and Social Security number. You scan the data store with the classification policy. Data Privacy
Management classifies the data store as PII if the data store matches one of the data domains.
For instance, Data Privacy Management would classify the data store as PII even if the data store matched
only the first name data domain. The scan results would not be meaningful or accurate because a first name
by itself is not a unique identifier.
Match All
When you select the Match All data domain match condition, the data store must contain fields or files that match all the data domains in the classification policy for Data Privacy Management to designate the data store as a match for the classification policy.
For example, you want to evaluate a data store for PHI data. You create a classification policy with the
account number, first name, and last name data domains. You specify the Match All condition and scan the
data store with the classification policy. Data Privacy Management classifies the data store as a match for
PHI if the data store contains fields for account number, first name, and last name.
When you create a classification policy with a single data domain, Match All is the default data domain
match condition.
For example, you want to evaluate a data store for PII data. You create a classification policy with seven data
domains that include first name, last name, and Social Security number. You scan the data store with the
classification policy. The scan classifies the data store as PII only if the data store contains fields that match
all seven data domains. However, if a data store matches only the first name, last name, and Social Security
number data domains, Data Privacy Management does not classify the data store as PII.
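The difference between the two conditions comes down to an any-versus-all check over the data domains that the scan finds in the data store. The following Python sketch illustrates that logic only, not the actual evaluation engine; the domain names are hypothetical.
def policy_matches(found_domains, policy_domains, condition):
    """Decide whether a data store matches a classification policy."""
    found = set(found_domains) & set(policy_domains)
    if condition == "MATCH_ANY":
        return len(found) > 0                    # at least one policy domain was found
    if condition == "MATCH_ALL":
        return found == set(policy_domains)      # every policy domain was found
    raise ValueError("custom conditions are evaluated from their condition groups")

pci_domains = ["CreditCardNumber", "MagneticStripe"]
print(policy_matches(["CreditCardNumber"], pci_domains, "MATCH_ANY"))   # True
print(policy_matches(["CreditCardNumber"], pci_domains, "MATCH_ALL"))   # False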
Custom
Create Custom match conditions if the classification policy contains many data domains or you want to
include Google Drive or Microsoft SharePoint file tags.
The condition palette enables you to create a detailed condition set. You can create a suite of conditions,
group and nest the conditions, and use the All are true and Any are true operators to relate the conditions
and groups of conditions.
For every condition group, you can use the default Cost for Each Impression value or you can specify a
different cost.
You can enter one file tag for each condition that uses the following operators:
• Is
• Is not
If you include more than one condition group with file tag match conditions, and the groups have different
costs for each impression, the scan job uses the highest cost per impression value to calculate the risk cost
for the data stores that match the classification policy.
If a file matches a custom match condition that specifies file tags, Data Privacy Management increases the
policy impression value for the file by 1.
For example, you want to evaluate a data store for PCI data. You create a classification policy that includes
the LastName, CreditCardNumber, and SSN data domains. According to your company's standard, a data store
is PCI if the data store contains a minimum of 30 values that match the CreditCardNumber data domain. If
the data store matches all three data domains, the cost of each occurrence of sensitive data triples.
When you scan the data store, Data Privacy Management classifies the data store as PCI if the sensitive
fields or files match either condition group.
• You can export and import classification policies that use custom file tag match conditions.
• If you create a scan that includes a classification policy with custom file tag match conditions, and the
policy matches data stores other than Google Drive and Microsoft SharePoint, the scan job returns
unexpected values for the Policy Impressions property on the Sensitive Fields page.
• Data Privacy Management calculates the Confidence level and displays the value on the Sensitive Fields
page for files that match the classification policy condition.
- For unstructured data stores that do not scan with a remote agent, the Sensitive Fields page does not
list these columns.
If the data store contains sensitive information, Data Privacy Management performs the following tasks:
• Calculates the number of fields or files and impressions with sensitive data. Updates the results on the
Security Dashboard and on the Data Store Details page.
The following image shows some of the dashboard indicators that Data Privacy Management updates after
the sensitive data discovery process:
• Classifies the data store as a match for the classification policy. If the data store matches more than one
classification policy, Data Privacy Management assigns the most severe data-sensitivity level from the
matched classification policies to the data store. Data Privacy Management displays the sensitivity level
on the Data Store Details page.
• Calculates the percentage of scanned data stores for each data-sensitivity level. Data Privacy
Management updates the results on the Security Dashboard.
To identify and classify sensitive data, you scan a data store based on the classification policies included in
the scan. Create, copy, or import a classification policy for each category of sensitive data that you want to
identify in the data store.
Related Topics:
• “Specifying Data Store Sensitivity Levels” on page 42
• “Sensitivity Level Indicator” on page 515
The export file contains the same format as the file required to import classification policies. You can use the
export file to add classification policies or to update policy properties in bulk. After you save the export file,
import the file. When you import the file, the Import job adds and updates the classification policies in the
Data Privacy Management repository.
After the Import job finishes, you can return to the Classification Policies workspace to see the classification
policies that you imported.
The following table describes the order of the columns, column headings, and the format for the values in the
CSV file:
2 Description Optional.
3 DomainNames Enter the names of the data domains that you want to include in the
classification policy. The names of the data domains must match the
names of the data domains in the Data Privacy Management repository.
This field is case sensitive. Use the following format to enter a list of
domains: <data domain1>,<data domain2>,...
4 DomainMatchExpressionJson Enter a data domain match condition. Enter the condition as a JSON expression.
5 Severity Enter a data sensitivity level. The name of the data sensitivity level must
match the name of one of the defined data sensitivity levels. For
example, if you defined Public, Internal, Confidential, or Restricted as
data sensitivity levels, you can only enter one of these levels. You must
use uppercase letters.
6 CostPerRow Optional. Enter a positive value up to 999. If you do not enter a value,
the Data Privacy Management Service assigns the default cost for each
row.
Note: To update an existing classification policy, ensure that the name of the policy in the
ClassificationPolicyName column matches the name of the classification policy in the Data Privacy
Management repository.
Consider the following rules and guidelines when you import classification policies from a CSV file:
• The first row of the CSV file must contain case-sensitive column headings in the required sequence. If the
CSV file does not contain all of the column headings in the required sequence and case, the Import job
fails.
• Each subsequent row after the column headings defines a classification policy.
• If the CSV file contains duplicate entries, the Import job processes the entries in the order in which the
rows appear in the file.
• The column values cannot contain the following keywords or terms:
SELECT FROM
SELECT UNION
DELETE FROM
UPDATE SET
<SCRIPT
</SCRIPT>
ALERT(
"/>"
Review the rules and guidelines before you edit a classification policy.
• Before you edit a classification policy, verify if the policy is included in a scan.
• If a classification policy is associated with a completed scan, the Evaluate Classification Policies job step
starts when you save the policy after one or more of the following edits:
- Change the cost for each row
Column Sensitivity
This chapter includes the following topics:
From the Always Sensitive/Never Sensitive workspace, you can view and manage Always Sensitive and
Never Sensitive rules. Each rule specifies column sensitivity for a specific data domain, or for any data
domain.
Based on your knowledge of column names and column data, you can import the global sensitivity status of
columns to the Data Privacy Management repository. Global sensitivity has the following statuses:
After you import the global sensitivity status, you can edit the scope of the Always Sensitive or Never
Sensitive rules, mark rules as active or inactive, or delete rules.
You can also export the global sensitivity status of columns to a CSV file. After you export the file, you can
update and then import the file to add, delete, or update the global sensitivity status to the Data Privacy
Management repository.
The export file includes a list of column names, the corresponding data domain name, and whether the
column is an Always Sensitive column or a Never Sensitive column. An Always Sensitive column indicates
that the column is always sensitive for a specific data domain. A Never Sensitive column indicates that the
column is never sensitive for a specific data domain or for any data domain.
For more information about sensitivity status, see “How Data Privacy Management Determines Sensitivity
Status” on page 249.
Note: You can only add structured data sources to an Always Sensitive or a Never Sensitive list.
The Always Sensitive/Never Sensitive workspace displays a list of active and suggested rules by data store,
schema, object, field, domain, type, and last updated. You can click a column header to sort the list by the
selected column.
Active
The Active tab lists active rules. From the Actions menu, you can import and export rules. When you
select a rule, you can edit or delete the rule, or move the rule to the Suggestions tab to inactivate the
rule. Default is the Active tab.
Suggestions
The Suggestions tab lists inactive rules that you might want to make active. Data Privacy Management
gets the rules from inactive Always Sensitive and Never Sensitive rules and data that Data Privacy
Management generates when you perform one of the following actions:
From the Actions menu, you can export suggested Always Sensitive and Never Sensitive rules. When you
select a rule, you can edit or delete the rule, or mark the rule as active.
To access the Always Sensitive/Never Sensitive workspace, click Manage > Always Sensitive/Never
Sensitive.
The following image shows an example of the Always Sensitive/Never Sensitive workspace with a list of
active rules:
To display the Data Domain Details page for a data domain, click the data domain name. To display the
Sensitive Fields page for a data store, click the data store name.
After you identify the sensitive columns, you can perform the following tasks:
The following table shows how you can specify a column as Always Sensitive for a data domain:
You can specify: Is Always Sensitive for a data domain and is applicable for:
A column. All tables, schemas, and data stores, or a specified table, all schemas, and all data stores.
A column of one or more tables of a specified schema. Specified tables and schemas and all data stores.
A column of one or more tables of a specified schema of a data store. Specified tables, schemas, and data stores.
The following table shows how you can specify a column as Never Sensitive for a data domain:
You can specify: Is Never Sensitive for a data domain and is applicable for:
A column. All tables, schemas, and data stores, or a specified table, all schemas, and all data stores.
A column of one or more tables of a specified schema. Specified tables and schemas and all data stores.
A column of one or more tables of a specified data store schema. Specified tables, schemas, and data stores.
The following image shows an example of a super rule and eight subset rules. The rule in row 1 is the most
generic rule and it applies to all data stores, schemas, and objects for the DOB field for the data domain
BirthDay. The rules in rows 2 through 9 are subset rules of the rule in row 1.
Super rules and subset rules cannot coexist in the Active tab. Super rules and subset rules can coexist in the
Suggestions tab.
As you create and edit Always Sensitive and Never Sensitive rules, keep the following guidelines in mind:
The guidelines table compares rule behavior for the following cases: Rule 1 and Rule 2 are both listed as Active, Rule 1 and Rule 2 are both listed as Suggestions, and Rule 1 is listed as Active while Rule 2 is listed as a Suggestion.
The following table lists the order of the columns, column headings, and the format for the values in the CSV
file:
1 DataStoreName Name of the data store. Enter the exact name of the data store in the Model
repository. Values are not case sensitive.
2 SchemaName Name of the schema. Enter the exact name of the data schema in the Model
repository. Values are not case sensitive.
3 ObjectName Name of the object. Enter the exact name of the object in the Model repository.
Values are not case sensitive.
4 FieldName Name of the field. Enter the exact name of the field in the Model repository.
Values are not case sensitive.
6 WhiteList_or_BlackList Required.
If the column is Always Sensitive for a data domain, enter W or Whitelist.
Values are not case sensitive.
If the column is Never Sensitive, enter B or Blacklist. Values are not case
sensitive.
The Import job rejects the row if the value is blank.
7 Action Optional. Instructs the Import job to create, update, or delete the global
sensitivity status of a column in the Data Privacy Management repository.
Enter one of the following values:
- U. Adds or updates the global sensitivity status of a column to the Data
Privacy Management repository. The action that the job performs depends
on whether the global sensitivity for a column is already specified in the
repository and whether you select the Replace Duplicates with Items
Imported option when you import the CSV file.
If the global sensitivity of the column is not already specified in the
repository, the Import job adds the column to the list.
If the global sensitivity of the column is already specified in the repository
and you do not enable the Replace Duplicates with Items Imported option,
the Import job skips the row.
If the global sensitivity of the column is already specified in the repository
and you enable the Replace Duplicates with Items Imported option, the
Import job updates the information in the repository.
- D. Deletes the global sensitivity status of a column from the repository.
If blank, default is U. The Import job adds or updates the global sensitivity
status of a column in the Data Privacy Management repository.
To check the status of the Import job, go to the Jobs workspace and view the job log for the Import File job
step. The job log shows how many records the job processed, rejected, skipped, deleted, inserted, or updated.
If the job rejected one or more rows in the CSV file, you can download a list of the rejected rows.
After the Import job finishes, scan the data stores with the Always Sensitive and Never Sensitive columns to
update the dashboard results.
Consider the following rules and guidelines when you import global sensitivity status of columns from a CSV
file:
• The first row of the CSV file must contain case-sensitive column headings in the required sequence. If the
CSV file does not contain all of the column headings in the required sequence and case, the Import job
fails.
• Each subsequent row after the column headings defines the global sensitivity status for a column. For
example, the column with name ssa is always sensitive for the SSN data domain.
• The column values cannot contain the following keywords or combination of characters:
SELECT FROM
SELECT UNION
DELETE FROM
UPDATE SET
<SCRIPT
</SCRIPT>
ALERT(
"/>"
If a row contains keywords or characters that are not valid, the Import job rejects the row. The Import job
logs an error in the job log and processes the remaining rows.
• The column values cannot exceed 255 characters. If a row contains more than 255 characters in any
column, the Import job rejects the row.
• Enclose names that include a dot (.) or comma (,) within quotes for any column value that includes a
name. If a value for a name column includes a dot or comma not enclosed within quotes, the Import job
might not process the column values correctly. For example, the Import job processes a comma as a
delimiter between column values.
• The data domains in the CSV file must exist in the Model repository.
• If you previously imported a Never Sensitive column without a data domain, you must continue to leave
the data domain column blank in subsequent imports.
5. Specify No Change or Apply Globally for each item, such as Data Stores, Schemas, and Tables. Default is
No Change.
6. Click Save.
Scans
This chapter includes the following topics:
Scans Overview
A scan defines how to discover and classify sensitive and subject registry information in data stores. When
you run a scan, the Data Privacy Management Service creates one Scan job for each data store in the scan.
You can create, run, and manage scans on the Scans workspace. You create and run scans after you define
data stores, data domains, and classification policies. When you create a scan, you configure properties and
options for the Scan job. You specify the data store category and the data stores that you want to scan. You
create a scan for one category. You add one or more data stores in the same category for which you want to
identify sensitive or subject data. You assign one or more classification policies to the scan.
Run scans to create and build your subject registry. A Subject Registry scan can include Domain Discovery
scans to map fields to data domains. You also run scans on golden data stores to identify subjects, and on
transaction data stores to link subjects in the data stores with subjects identified in golden data stores. This
ensures that the subject registry includes all subject data.
You schedule when to run the scan. You can run the scan immediately or in the future. You can run the scan
once or on a recurring basis. When you save the scan, the Data Privacy Management Service creates a Scan
job for each data store in the scan.
You can monitor the Scan job status on the Jobs workspace. After the Scan job finishes, you can view
Domain Discovery scan results on the Security Dashboard. You can view Subject Registry scan results on the
Privacy Dashboard.
Compressed Files in Scans
Data Privacy Management can scan compressed files with specific extensions. You can include compressed
files in scans on unstructured sources.
A compressed file contains individual files. An individual file inside a compressed file with a supported
extension might in turn be a compressed file with a supported or unsupported extension.
A Data Privacy Management scan handles compressed files in the following ways based on the extension:
As a single file
Considers the compressed file to be a single file and identifies and lists sensitive data as one file.
As a folder with individual files
Considers a compressed file as a folder and lists each file in it as a separate file. Each compressed file
nested in the main file is also listed as a folder and the files in it as separate files. If the file contains
unstructured file types that Data Privacy Management does not support, the scan skips the file and adds
the file name to the Exception report.
The process can drill down multiple levels until it encounters a compressed file with an extension that is
not supported for drill down. This file is considered a single file with no further drill down.
You can configure the number of levels that a scan drills down to in the Data Privacy Management
Service properties.
The following supported compressed file extensions are handled as single files or as folders with individual files:
- Handled as a single file: xz, bz2, tbz2, gz
- Handled as a folder with individual files: zip, tar, tar.bz2, tar.gz, tar.xz, tgz2, tgz, gtar, 7z
The agent adds entries to the Exception report in the following cases:
• For compressed files with bz2, gz2, or xz extensions without tar archive and with a compression ratio
greater than 100, the agent generates zip bomb exceptions during a domain discovery scan. The agent
writes the exceptions to the Exception report and continues the scan.
• For compressed files with 7z extension that are very large or highly compressed, the agent generates a
heap space exception. The exception also occurs if a very large or highly compressed 7z file is present in
another compressed file. The agent writes the exceptions to the Exception report and continues the scan.
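The drill-down behavior can be pictured as a depth-limited recursion over archive contents. The following Python sketch is illustrative only: the extension groups follow the list above, while the tree representation, function names, and the depth limit of 3 are assumptions.
FOLDER_EXTENSIONS = (".zip", ".tar", ".tar.bz2", ".tar.gz", ".tar.xz",
                     ".tgz2", ".tgz", ".gtar", ".7z")
# Files with xz, bz2, tbz2, or gz extensions are treated as single files.

def list_scanned_files(entry, level=0, max_levels=3):
    """Yield the names that a scan would list, drilling into supported archives."""
    name = entry["name"]
    if name.lower().endswith(FOLDER_EXTENSIONS) and level < max_levels:
        # Treat the archive as a folder and list each file inside it separately.
        for child in entry.get("children", []):
            yield from list_scanned_files(child, level + 1, max_levels)
    else:
        # A single file, or an archive past the drill-down limit.
        yield name

tree = {"name": "backup.zip", "children": [
    {"name": "customers.csv"},
    {"name": "archive.tar.gz", "children": [{"name": "notes.txt"}]},
    {"name": "raw.gz"},
]}
print(list(list_scanned_files(tree)))   # ['customers.csv', 'notes.txt', 'raw.gz']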
You can include specific image file types and files with content in English. Files can include specific font
types and must be within specific dimensions and specifications.
Read the rules and guidelines and verify the prerequisites before you include image files or scanned files in a
scan.
For information about the prerequisites, see the Install the Informatica Discovery Agent chapter in the
Informatica Data Privacy Management Installation and Configuration Guide.
• Font styles can include Times New Roman, Verdana, Arial, and Calibri.
• The font size must be >=10 pt.
• The background color must be white.
• If a border is present, it must be a Solid border.
• Images must be clear and without noise.
• Scans can detect content with font sizes of 8pt and 9pt, but there might be inaccuracies of up to 3% to 8%
deviation in extracted text based on the data.
Scans Workspace
Use the Scans workspace to create, run, and manage scans. The Scans workspace includes a page that
displays a list of all scans and a page that displays the properties for a scan. You can use the scan filters to
navigate to the scans that you want to manage. You can view a summary of the scan results.
1. Scan name
2. Scan properties
3. Scan results
4. Scan summary
5. Scan status. Click to view the jobs for the scan.
The scan summary shows the combined risk score of all the data stores in the scan. The Secure@Source
Service updates the combined risk score when the scan job completes. The scan summary also shows the
number of times the scan job ran.
The scan results include key statistics for all data stores in the scan. The Secure@Source Service updates
the scan results when the scan job completes. Each time you run a scan, the Secure@Source Service
generates a new set of scan results. If you run the scan multiple times, you can view and compare how the
scan results change over time. Use the arrow icons to scroll through all of the results for the scan. For more
detailed scan results, view the dashboard or the data store.
The status is displayed as a link that opens the jobs workspace. A list of jobs appears. The list shows jobs
that are associated with one set of scan results. For example, if you click the scan status for the current scan
results, the list shows the jobs for the current scan only. The list does not show the jobs for previous runs of
the same scan.
Related Topics:
• “Changing Risk Score Factor Weights” on page 42
• “Adding Custom Risk Score Factors” on page 43
The following table describes the general properties for all scans:
Property Description
Name Short description of the scan. The name is not case sensitive and must be unique within the Data
Privacy Management repository. The name cannot exceed 255 characters and cannot contain spaces
or the following special characters: ~ ! $ % ^ & * ( ) +
When you run a scan, Data Privacy Management creates a Scan job. The Scan job name is the same
as the scan name.
Description Long description of the scan. The description cannot exceed 255 characters.
Scan Type The type of scan that you want to run. Select one or more of the following options:
- Domain Discovery: Maps columns in data stores to data domains.
- Discover Subjects: Scans golden data stores to identify data subjects in Subject Registry scans.
Note: If you configure a scan with the Discover Subjects option only, you do not choose a
category, data store type, or data store in the scan configuration.
- Link Subjects: Scans transaction data stores to link subjects in the data stores with subjects
identified in golden data stores.
Subject Type For Discover Subjects or Link Subjects scans, the entity files that you want to include in a subject
registry scan. An entity file contains properties that determine how subject details appear in the
subject registry. Each entity file represents a subject type and can include one or more golden data
stores.
Category A logical group of data store types. Select one of the following options:
- Application
- Big Data
- Cloud
- Data Engineering
- Database Management
- File Management
- NoSQL
Data Store Type Select one of the following options, depending on the value of the Category property:
- Application scans include SAP data stores.
- Big Data scans include Cloudera Navigator, Hadoop Distributed File System, and Hive data stores.
- Cloud scans include Amazon Redshift, Amazon S3, Azure Data Lake, Microsoft Azure Blob Storage, Microsoft Azure SQL Data Warehouse, Microsoft Azure SQL Database, Salesforce, and Snowflake data stores.
- Data Integration scans include Informatica Cloud, Informatica Data Engineering Integration, Informatica PowerCenter on IBM Db2, Informatica PowerCenter on Microsoft SQL Server, Informatica PowerCenter on Oracle, and SQL Server Integration Services data stores.
- Database Management scans include IBM DB2, IBM DB2 for i5/OS, IBM Db2 for z/OS, JDBC, Microsoft SQL Server, Netezza, Oracle, SAP HANA, Sybase, and Teradata data stores.
- File Management scans include File System, Google Drive, Microsoft OneDrive, and Microsoft SharePoint data stores.
- NoSQL scans include Apache Cassandra.
Data Stores Data stores to include in the scan. The list shows data stores for the selected data store type and
that include complete connection properties. The list does not show incomplete data stores.
Scan Options Determines the job steps that the Scan job includes. The scan options that you configure depend on
the data store type. If the scan includes only data stores that scan with remote agent connections,
the scan job performs data profiling, but the options do not appear in the user interface.
Subject Scan Options Determines subject registry scan job options. Includes the following option:
- Full Subject Scan. Enables complete scans on repeat scan runs. Default is selected. Uncheck the option to enable incremental scans on repeat scan runs.
Note: Applicable for structured sources. You cannot configure incremental scans on unstructured sources.
Notification Enable the property to send an email notification when the Scan job status changes. You can enter
multiple recipient email addresses, including distribution lists. Use a comma to separate multiple
entries.
The Data Privacy Management Service validates the format of the email addresses when you
navigate to the next panel. The Data Privacy Management Service uses the email server
configuration defined in the Data Privacy Management Service properties.
Exceptions Determines how a Scan job behaves if it encounters errors. If you want the scan to ignore profiling-
related errors, enable the Report Exceptions and Continue Processing exception option. Default is
selected.
You can download a report from the scan Jobs page. The report contains the warning messages.
Note: The Report Exceptions and Continue Processing option applies only to scans that include the
Run Profiling or Include Row Count scan option, or that contain data stores that run scans with
remote agents.
You cannot resume the job to scan the tables that fail with warnings. To scan the failed tables, run
the scan again or create and run a scan for the failed tables.
Data Privacy Management processes the information in the following order of precedence to determine
column sensitivity for a data domain:
The Profiling job step determines sensitivity based on the following factors:
Sensitive files in unstructured data stores appear on the Sensitive Fields page with a value of Yes in the
Sensitive column. The Properties pane lists the sensitive data domains in the file.
The following table describes how Data Privacy Management assigns sensitivity according to the global
sensitivity status:
Column is always sensitive for a data domain
Data Privacy Management identifies the field as sensitive for the data domain regardless of the results of the Profiling job step. The field appears on the Sensitive Fields page with the Auto (Always Sensitive) classification mechanism.
Column is never sensitive for a data domain
Data Privacy Management identifies the field as not sensitive for the data domain.
If the Profiling job step found that the field is sensitive for the data domain, Data Privacy Management lists the field on the Sensitive Fields page. The sensitivity status is set to Not Sensitive. The classification mechanism displays Auto (Never Sensitive).
If the Profiling job step did not find the field sensitive for the data domain, Data Privacy Management does not list the field on the Sensitive Fields page.
Column is never sensitive for any data domain
Data Privacy Management identifies the field as not sensitive regardless of the data domain in the scan classification policy.
If the Profiling job step found that the field is sensitive for a data domain, Data Privacy Management lists the field on the Sensitive Fields page. The sensitivity status is set to Not Sensitive. The classification mechanism displays Auto (Never Sensitive).
If the Profiling job step did not find the field sensitive for any data domain, Data Privacy Management does not list the field on the Sensitive Fields page.
Data Privacy Management saves the updates you make. If a future scan job includes the same data store and
a classification policy with the same data domain, Data Privacy Management assigns the sensitivity status
you specified.
To specify sensitivity status, you can import a CSV file.
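The following Python sketch shows what a bulk sensitivity update file of the kind mentioned above might look like. The column names and values are hypothetical and are not the documented import format for Data Privacy Management; refer to the import instructions for the supported layout.

import csv

# Hypothetical layout only. The column names below are illustrative and are
# not the documented CSV format for Data Privacy Management imports.
rows = [
    {"data_store": "HR_DB", "field": "EMPLOYEE.SSN", "data_domain": "Social Security Number", "sensitivity": "Sensitive"},
    {"data_store": "HR_DB", "field": "EMPLOYEE.DEPT_NAME", "data_domain": "Name", "sensitivity": "Not Sensitive"},
]

with open("sensitivity_updates.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)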
You can also change the existing sensitivity status of a field or data domain directly on the Sensitive Fields
page. In the Properties pane for a field or file, click Edit to change the sensitivity selection for a field or data
domain.
Scan Management
You can create, run, and manage scans on the Scans workspace.
A scan is a template for a scan job definition. You can create a scan once. Then, you can run the scan
multiple times. When you run a scan, the Data Privacy Management Service creates a scan job.
You can view, copy, edit, or delete scans. You can view and export a list of all scans, a scan summary, and the
scan status. For example, you can see how many data stores and classification policies a scan includes.
You can edit a scan to change the scan options. For example, if you configure all the scan options for a data
store scan, the scan job performance might be impacted. Instead, run the scan multiple times with a different
scan option selected.
Creating a Scan
Create a scan to configure how and when the scan job discovers and classifies sensitive data. When you
create a scan, you specify the type of scan, configure the options, and schedule when the scan runs.
Related Topics:
• “Application Scan Options” on page 257
• “Big Data Scan Options” on page 260
• “Cloud Scan Options” on page 264
• “Data Integration Scan Options” on page 267
• “Database Management Scan Options” on page 270
• “File Management Scan Options” on page 273
Running a Scan
When you run a scan, Data Privacy Management creates a scan job and runs the job according to the scan
schedule.
• You can create a scan once and run the scan multiple times. For example, you create a scan and schedule
the scan to run every month. One week later, you want to run the scan. You do not have to create another
scan.
• You can run scans that have a completed or terminated status. To run scans that have a failed, paused,
scheduled, or stopped status, terminate the job. Then, run the scan.
• You can include a data store in multiple scans. However, you cannot run multiple scans on the same data
store simultaneously.
The scan fails if another job that includes the same data store is in progress. For example, a scan job,
Import job, or Evaluate Classification Policies job might include the same data store.
• You can run a scan multiple times. However, before you run a scan again, ensure that all jobs associated
with the scan are completed or terminated.
For example, you included 10 data stores in a scan and ran the scan. Data Privacy Management created
10 scan jobs. Before you can run the scan again, all 10 jobs must be completed or terminated.
• When you schedule a recurring scan, Data Privacy Management performs the same validation as when
you schedule a scan to run immediately. If the service cannot run a scan, the service writes an error to the
system log. The service tries to run the scan at the next scheduled time.
• In a subject registry scan, you cannot use the same data domain for more than one column in an SQL
query. Map each column in an SQL file to a different data domain.
• Domain discovery scans for subject registry must include schemas and objects that are marked as
sensitive in an SQL file. If the SQL file does not include the schemas and objects that are marked as
sensitive, information in DSAR reports for data subjects might be incorrect.
• When you run a scan on a data store configured to scan with a remote agent, the remote agent associated
with the data store must be up and running.
• If the remote agent associated with the data store is processing another job when you run the scan, the
scan job will run after the other job finishes.
• Data Privacy Management performs domain discovery through the remote agent. You can also create and
use a custom data domain with a pattern data match condition that includes a regular expression. For
more information about data domains in remote agent scans, see the How-To Library Article
Configuring Data Domains for Unstructured Sources in Data Privacy Management, on the Informatica
documentation portal.
• Remote agents do not process data domains configured with a metadata match condition only.
• To run a domain discovery scan that analyzes only the files that are new or changed since the last time
the scan job ran, the data store must meet the following criteria:
- The scan definition and associated data stores, classification policies, and data domains configured in
the initial scan are unchanged since the last scan finished successfully or completed with a warning.
- You can perform a delta scan for files that you create or edit after a previous scan finishes successfully
or the Browse job step ends with a warning in the current scan.
• The first data store scan is a full scan. A delta scan becomes a full scan if any of the following
conditions apply:
- The data store, classification policies, data domains, or scan definition changed at any time.
- You ran a Reset Classification Results job on the data store. The next data store scan is a full scan.
- You changed the Scan with Remote Agent data store property from disabled to enabled. The next scan you run on the data store is a full scan.
• If files added to a data store have a timestamp that is before the timestamp for the previous successful
scan or a previous scan that ended with a warning, the Delta Scan job skips the files.
• To enable incremental scans on structured sources, you must include a field with the
IncrementalScanField property set to true in the entity file. Incremental scans of unstructured sources
do not require the field.
• The field must include a timestamp value in the following format: yyyy-MM-dd HH24:mm:ss. A formatting sketch appears after this list.
• A scan fails with errors if you configure an incremental scan on the scan UI but do not configure the
property in the entity file.
• The scan uses the timestamp values in this field to determine which records were added or updated after the last scan.
• For unstructured sources, if you add files after a domain discovery scan and then run an incremental
Subject Registry scan, the scan skips files added after the last domain discovery scan. To avoid loss of
Subject Registry information, run a domain discovery scan before you run an incremental Subject Registry
scan on unstructured sources. Alternatively, you can include domain discovery in the Subject Registry
scan configuration.
• If you set the IncrementalScanField value to true for more than one field, only the first field is considered.
• The scan considers the last updated time and adds records that have a last updated time greater than the
time of previous scan.
• The scan does not consider the last updated time on records that you delete after the last scan. Therefore, an incremental scan does not remove deleted records from the golden record table. Run a full scan to remove deleted records.
• Because deleted records remain in the golden record table, search results on the UI and DSAR reports that you generate after an incremental scan might include incorrect data for deleted records or might fail.
• If you update a record such that it does not match with a subject, an incremental scan does not delete the
link to the subject. This can result in incorrect data included in search results and in DSAR reports.
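The timestamp format for the incremental scan field, mentioned in the list above, can be produced as follows. This is a minimal sketch that assumes HH24:mm:ss denotes hours, minutes, and seconds on a 24-hour clock; it is not part of the product.

from datetime import datetime

# Format a value in the yyyy-MM-dd HH24:mm:ss pattern described above,
# assuming HH24 means hours on a 24-hour clock.
last_updated = datetime(2022, 4, 15, 17, 30, 5)
print(last_updated.strftime("%Y-%m-%d %H:%M:%S"))  # 2022-04-15 17:30:05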
Copying a Scan
You can copy a scan to create a scan with similar properties. When you copy a scan, you can edit the scan
properties.
Deleting a Scan
You can delete a scan if the scan has a completed, scheduled, or terminated status. You cannot delete a scan
that has a failed, paused, or stopped status.
Exporting Scans
You can export a list of scans to a CSV file. The export file includes a list of all scans and the corresponding
scan summaries. If you filtered the scan list, the export file contains the scans that match the active filter
conditions.
Scan Options
This chapter includes the following topics:
• Application Scan Options, 257
• Big Data Scan Options, 260
• Cloud Scan Options, 264
• Data Integration Scan Options, 267
• Database Management Scan Options, 270
• File Management Scan Options, 273
Option Description
Include Catalog Information Extracts a metadata catalog from the data store. The Data Privacy Management repository must have at least one scan result that includes the catalog information before the Scan job can run data domain discovery or determine the row count.
Run Profiling Determines the type of profiling that the Scan job runs to perform data domain discovery:
- Metadata. Matches data domains with the source metadata.
- Data. Matches data domains with the source data. You can also specify options for running a data profile on the data store.
Include Row Count Calculates the total number of rows for each sensitive field that the Scan job identifies. This option might affect the Scan job performance. To optimize performance, run a scan with this option only.
If you do not include the row count, the Security Dashboard does not display the risk cost for the data stores in the scan.
You can enable one or more of the options for a scan. However, some of the options depend on scan results
from other options. For example, if you enable the Run Profiling option and the Data Privacy Management
repository does not include scan results for the Include Catalog Information option, the Scan job fails.
You can run a scan once with all options enabled, or you can run a scan multiple times with different options
enabled. For example, you can run a scan with the Include Catalog Information option enabled. Then, run a
scan with the Run Profiling option enabled. Then, run a scan with the Include Row Count option enabled.
Related Topics:
• “Creating a Scan” on page 251
The Scan job uses the metadata match conditions in a data domain to determine if a column matches a data
domain based on the column name. If there is a match, the Scan job identifies the column as sensitive. The
Scan job evaluates the data domains in the classification policies that you include in the scan.
Metadata profiling is typically the fastest method of profiling. However, metadata profiling can generate
more false positives than data profiling or metadata profiling with data profiling. A false positive can occur if
a data domain uses a broad regular expression for the data domain metadata match condition.
For example, you configure a data domain metadata match condition to use a general expression that
searches for "name" in column metadata. The Scan job identifies the FirstName, LastName, ProductName,
and DepartmentName columns as a match for the data domain. However, the ProductName and
DepartmentName columns are not sensitive. To reduce the chance of false positives, use the data profiling
scan option or metadata profiling with data profiling scan option.
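To make the false-positive example above concrete, the following sketch contrasts a broad metadata pattern with a narrower one. The regular expressions are illustrative assumptions, not data domain definitions that ship with the product.

import re

columns = ["FirstName", "LastName", "ProductName", "DepartmentName"]

broad = re.compile(r"name", re.IGNORECASE)                   # matches all four columns
narrow = re.compile(r"^(first|last)_?name$", re.IGNORECASE)  # matches only person-name columns

print([c for c in columns if broad.search(c)])   # ['FirstName', 'LastName', 'ProductName', 'DepartmentName']
print([c for c in columns if narrow.search(c)])  # ['FirstName', 'LastName']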
The Scan job uses the data match conditions in a data domain to determine if a column matches the data
domain based on the column data. If there is a match, the Scan job identifies the column as sensitive. The
Scan job evaluates the data domains in the classification policies that you include in the scan.
Data profiling is typically the most accurate and slowest method of profiling. You can choose the following
options:
The option enables a Scan job to run data domain discovery profiles. The profiles match data domains
with the source metadata and data. The Scan job evaluates the data domains in the classification
policies that you include in the scan.
The Scan job first performs metadata profiling, and then runs a data domain discovery profile that
matches data domains with the source metadata. The Scan job uses the metadata match conditions in a
data domain to determine if a column matches a data domain based on the column name. If there is no
metadata match, the Scan job does not identify the column as sensitive. If there is a metadata match,
the Scan job performs data profiling on the column.
The Scan job performs data profiling for the columns that the scan job identified as a metadata match.
The Scan job runs a data domain discovery profile that matches data domains with the source data. The
Scan job uses the data match conditions in a data domain to determine if a column matches a data domain based on the column data. If there is a data match, the Scan job identifies the column as sensitive. If there is no data match, the Scan job does not identify the column as sensitive.
Metadata profiling with data profiling is typically faster than data profiling and slower than metadata
profiling. Compared to metadata profiling, metadata profiling with data profiling reduces the amount of
false positives in the scan results. A false positive occurs when a column is not sensitive, but the Scan
job identifies the column as sensitive.
For example, a DepartmentName column might match a metadata match condition for a name string in a
data domain. However, the data within the DepartmentName column does not match the data match
conditions in the data domain. If you run a scan with the Metadata profiling option, the Scan job
identifies the column as sensitive. If you run a scan with the Data profiling option or the Metadata
profiling with Data profiling scan option, the Scan job does not identify the column as sensitive.
Metadata profiling with data profiling might increase the amount of false negatives in the scan results. A
false negative occurs when a column is sensitive, but the Scan job does not identify the column as
sensitive. A false negative can occur if the metadata contains poorly labeled columns.
Exclude Null
The Scan job ignores rows with null values within the sample set of rows in a column to calculate the
conformance score. When you enable this option, the sensitivity status of a column might be more
accurate.
For example, you enable the Exclude Null property and specify that the Scan job runs a profile on the first
10,000 rows of each column. However, a column has 1,000 null values in the first 10,000 rows. The
Profiling job step bases the conformance score calculation on the 9,000 rows that have values. The
Profiling job step does not increase the size of the sample.
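The effect of the Exclude Null option on the conformance calculation can be shown with simple arithmetic. The match count of 7,200 rows is an assumed figure used only for illustration.

sample_rows = 10_000
null_rows = 1_000
matching_rows = 7_200  # assumed number of rows that match the data domain

with_nulls = matching_rows / sample_rows                   # 0.72 -> 72% conformance
without_nulls = matching_rows / (sample_rows - null_rows)  # 0.80 -> 80% conformance

print(f"{with_nulls:.0%} vs {without_nulls:.0%}")  # 72% vs 80%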
Full Profiling
The Scan job performs data profiling for all data domain matches in the data store, including matches
discovered in previous data store scans.
Sampling Technique
The following table describes the sampling options that determine the number of rows on which the
Scan job runs a profile.
Sampling Option Description
Auto Random Runs the profile on a random sample size. The random sample size is based on the number of rows in the data store.
First <number> Runs the profile on the number of rows that you select in sequential order. The scan starts with the first row of the table and reads the rows in sequential order up to the number of rows you selected. You can select 10,000, 100,000, or 500,000 rows.
No Sampling Runs the profile on all rows of the data store. If the data store contains a large number of rows, the Scan job performance might be impacted.
Random <number> Runs the profile on the number of rows that you select in random order. You can select 10,000, 100,000, or 500,000 rows.
The unit to measure the quantity of rows that must match the data domain for Data Privacy Management
to identify the column as sensitive. Select one of the following options:
• Percentage. Evaluates the percentage of matched rows in the column with the minimum percentage
for sensitivity specified in the Conformance Score property of the data domain. If the percentage of
matched rows is equal to or exceeds the minimum percentage for sensitivity in the conformance
score, Data Privacy Management identifies the column as sensitive.
• Rows. Evaluates the number of matched rows in the column with the number specified in the
Conformance Row Count property of the data domain. If the number of matched rows is equal to or
exceeds the conformance row count, Data Privacy Management identifies the column as sensitive.
Default is Percentage.
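The two conformance criteria can be compared with a small sketch. The thresholds and match count below are assumptions chosen for illustration, not defaults shipped with the product.

matched_rows = 4_500
profiled_rows = 100_000

# Percentage criterion: compare the match percentage with the data domain's
# Conformance Score, assumed here to be 4 percent.
conformance_score_pct = 4.0
sensitive_by_percentage = (matched_rows / profiled_rows) * 100 >= conformance_score_pct

# Rows criterion: compare the match count with the data domain's
# Conformance Row Count, assumed here to be 5,000 rows.
conformance_row_count = 5_000
sensitive_by_rows = matched_rows >= conformance_row_count

print(sensitive_by_percentage, sensitive_by_rows)  # True False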
Option Description
Identify Proliferation and Data Protection For Cloudera Navigator data stores, identifies the lineage of sensitive data between scanned Hive data stores and identifies the protection status of sensitive data based on transformation details. For example, if a sensitive column is included in a Data Masking transformation, the Scan job determines that the column is protected.
Include Catalog Information For HDFS and Hive data stores, extracts a metadata catalog from the data store. The Data Privacy Management repository must have at least one scan result that includes the catalog information before the Scan job can run data domain discovery or determine the row count.
Run Profiling For HDFS and Hive data stores, determines the type of profiling that the Scan job runs to perform data domain discovery:
- Metadata. Matches data domains with the source metadata.
- Data. Matches data domains with the source data. You can also specify options for running a data profile on the data store.
Include Row Count For Hive data stores, calculates the total number of rows for each sensitive field that the Scan job identifies. This option might affect the Scan job performance. To optimize performance, run a scan with this option only.
If you do not include the row count, the Security Dashboard does not display the risk cost for the data stores in the scan.
You can enable one or more of the options for a scan. However, some of the options depend on scan results
from other options. For example, if you enable the Run Profiling option and the Data Privacy Management
repository does not include scan results for the Include Catalog Information option, the Scan job fails.
You can run a scan once with all options enabled, or you can run a scan multiple times with different options
enabled. For example, you can run a scan with the Include Catalog Information option enabled. Then, run a
scan with the Run Profiling option enabled. Then, run a scan with the Include Row Count option enabled.
The Scan job uses the metadata match conditions in a data domain to determine if a column matches a data
domain based on the column name. If there is a match, the Scan job identifies the column as sensitive. The
Scan job evaluates the data domains in the classification policies that you include in the scan.
Metadata profiling is typically the fastest method of profiling. However, metadata profiling can generate
more false positives than data profiling or metadata profiling with data profiling. A false positive can occur if
a data domain uses a broad regular expression for the data domain metadata match condition.
For example, you configure a data domain metadata match condition to use a general expression that
searches for "name" in column metadata. The Scan job identifies the FirstName, LastName, ProductName,
and DepartmentName columns as a match for the data domain. However, the ProductName and
DepartmentName columns are not sensitive. To reduce the chance of false positives, use the data profiling
scan option or metadata profiling with data profiling scan option.
The Scan job uses the data match conditions in a data domain to determine if a column matches the data
domain based on the column data. If there is a match, the Scan job identifies the column as sensitive. The
Scan job evaluates the data domains in the classification policies that you include in the scan.
Data profiling is typically the most accurate and slowest method of profiling. You can choose the following
options:
The option enables a Scan job to run data domain discovery profiles. The profiles match data domains
with the source metadata and data. The Scan job evaluates the data domains in the classification
policies that you include in the scan.
The Scan job first performs metadata profiling, and then runs a data domain discovery profile that
matches data domains with the source metadata. The Scan job uses the metadata match conditions in a
data domain to determine if a column matches a data domain based on the column name. If there is no
metadata match, the Scan job does not identify the column as sensitive. If there is a metadata match,
the Scan job performs data profiling on the column.
The Scan job performs data profiling for the columns that the scan job identified as a metadata match.
The Scan job runs a data domain discovery profile that matches data domains with the source data. The
Scan job uses the data match conditions in a data domain to determine if a column matches a data
domain based on the column data. If there is a data match, the Scan job identifies the column as
sensitive. If there is no data match, the Scan job does not identify the column as sensitive.
Metadata profiling with data profiling is typically faster than data profiling and slower than metadata
profiling. Compared to metadata profiling, metadata profiling with data profiling reduces the amount of
false positives in the scan results. A false positive occurs when a column is not sensitive, but the Scan
job identifies the column as sensitive.
Metadata profiling with data profiling might increase the amount of false negatives in the scan results. A
false negative occurs when a column is sensitive, but the Scan job does not identify the column as
sensitive. A false negative can occur if the metadata contains poorly labeled columns.
Exclude Null
The Scan job ignores rows with null values within the sample set of rows in a column to calculate the
conformance score. When you enable this option, the sensitivity status of a column might be more
accurate.
For example, you enable the Exclude Null property and specify that the Scan job runs a profile on the first
10,000 rows of each column. However, a column has 1,000 null values in the first 10,000 rows. The
Profiling job step bases the conformance score calculation on the 9,000 rows that have values. The
Profiling job step does not increase the size of the sample.
Full Profiling
The Scan job performs data profiling for all data domain matches in the data store, including matches
discovered in previous data store scans.
Sampling Technique
The following table describes the sampling options that determine the number of rows on which the
Scan job runs a profile.
Sampling Option Description
Auto Random Runs the profile on a random sample size. The random sample size is based on the number of rows in the data store.
First <number> Runs the profile on the number of rows that you select in sequential order. The scan starts with the first row of the table and reads the rows in sequential order up to the number of rows you selected. You can select 10,000, 100,000, or 500,000 rows.
No Sampling Runs the profile on all rows of the data store. If the data store contains a large number of rows, the Scan job performance might be impacted.
Random <number> Runs the profile on the number of rows that you select in random order. You can select 10,000, 100,000, or 500,000 rows.
The unit to measure the quantity of rows that must match the data domain for Data Privacy Management
to identify the column as sensitive. Select one of the following options:
• Percentage. Evaluates the percentage of matched rows in the column with the minimum percentage
for sensitivity specified in the Conformance Score property of the data domain. If the percentage of
matched rows is equal to or exceeds the minimum percentage for sensitivity in the conformance
score, Data Privacy Management identifies the column as sensitive.
Default is Percentage.
1. Create Hive data stores. To show complete lineage between each schema, create one Hive data store for each schema and configure connection details.
A Big Data Scan job determines lineage of sensitive data between scanned Hive data stores. If one Hive data store contains multiple schemas and sensitive data moves between the schemas, the Big Data Scan job cannot show sensitive data lineage between schemas in the same data store.
2. Scan the Hive data stores to identify sensitive data. Run a Domain Discovery scan for the Hive data stores.
3. Create a Cloudera Navigator data store. Configure the connection details to the Cloudera Navigator Metadata Server that is on the same host as the Hive data stores.
4. Scan the Cloudera Navigator data store to identify the proliferation of sensitive data. The scan job identifies sensitive data proliferation between Hive data stores that are on the same host as the Cloudera Navigator data store.
Option Description
Include Catalog Information Extracts a metadata catalog from the data store. The Data Privacy Management repository must have at least one scan result that includes the catalog information before the scan job can run data domain discovery or determine the row count.
Run Profiling Determines the type of profiling that the scan job runs to perform data domain discovery:
- Metadata. Matches data domains with the source metadata.
- Data. Matches data domains with the source data. You can also specify options for running a data profile on the data store.
If a Cloud scan includes Amazon S3 data stores that scan with a remote agent, the scan job performs data profiling, but you do not select options when you create the scan.
If a Cloud scan includes data stores that scan with a remote agent and data stores that do not scan with a remote agent, you select profiling options. The scan job applies the data profiling options to the data stores that do not scan with a remote agent when you run the scan.
Include Row Count For Amazon Redshift, Microsoft Azure SQL Data Warehouse, Microsoft Azure SQL Database, Salesforce, and Snowflake data stores, calculates the total number of rows for each sensitive field that the Scan job identifies. This option might affect the Scan job performance. To optimize performance, run a scan with this option only.
If you do not include the row count, the Security Dashboard does not display the risk cost for the data stores in the scan.
You can enable one or more of the options for a scan. However, some of the options depend on scan results
from other options. For example, if you enable the Run Profiling option and the Data Privacy Management
repository does not include scan results for the Include Catalog Information option, the Scan job fails.
You can run a scan once with all options enabled, or you can run a scan multiple times with different options
enabled. For example, you can run a scan with the Include Catalog Information option enabled. Then, run a
scan with the Run Profiling option enabled. Then, run a scan with the Include Row Count option enabled.
Related Topics:
• “Creating a Scan” on page 251
• “Cloud Job” on page 290
The Scan job uses the metadata match conditions in a data domain to determine if a column matches a data
domain based on the column name. If there is a match, the Scan job identifies the column as sensitive. The
Scan job evaluates the data domains in the classification policies that you include in the scan.
Metadata profiling is typically the fastest method of profiling. However, metadata profiling can generate
more false positives than data profiling or metadata profiling with data profiling. A false positive can occur if
a data domain uses a broad regular expression for the data domain metadata match condition.
The scan job uses the data match conditions in a data domain to determine if a column matches the data
domain based on the column data. If there is a match, the scan job identifies the column as sensitive. The
scan job evaluates the data domains in the classification policies that you include in the scan.
Data profiling is typically the most accurate and slowest method of profiling. You can choose the following
options:
The option enables a Scan job to run data domain discovery profiles. The profiles match data domains
with the source metadata and data. The Scan job evaluates the data domains in the classification
policies that you include in the scan.
The Scan job first performs metadata profiling, and then runs a data domain discovery profile that
matches data domains with the source metadata. The Scan job uses the metadata match conditions in a
data domain to determine if a column matches a data domain based on the column name. If there is no
metadata match, the Scan job does not identify the column as sensitive. If there is a metadata match,
the Scan job performs data profiling on the column.
The Scan job performs data profiling for the columns that the scan job identified as a metadata match.
The Scan job runs a data domain discovery profile that matches data domains with the source data. The
Scan job uses the data match conditions in a data domain to determine if a column matches a data
domain based on the column data. If there is a data match, the Scan job identifies the column as
sensitive. If there is no data match, the Scan job does not identify the column as sensitive.
Metadata profiling with data profiling is typically faster than data profiling and slower than metadata
profiling. Compared to metadata profiling, metadata profiling with data profiling reduces the amount of
false positives in the scan results. A false positive occurs when a column is not sensitive, but the Scan
job identifies the column as sensitive.
For example, a DepartmentName column might match a metadata match condition for a name string in a
data domain. However, the data within the DepartmentName column does not match the data match
conditions in the data domain. If you run a scan with the Metadata profiling option, the Scan job
identifies the column as sensitive. If you run a scan with the Data profiling option or the Metadata
profiling with Data profiling scan option, the Scan job does not identify the column as sensitive.
Metadata profiling with data profiling might increase the amount of false negatives in the scan results. A
false negative occurs when a column is sensitive, but the Scan job does not identify the column as
sensitive. A false negative can occur if the metadata contains poorly labeled columns.
Exclude Null
The Scan job ignores rows with null values within the sample set of rows in a column to calculate the
conformance score. When you enable this option, the sensitivity status of a column might be more
accurate.
For example, you enable the Exclude Null property and specify that the Scan job runs a profile on the first
10,000 rows of each column. However, a column has 1,000 null values in the first 10,000 rows. The Profiling job step bases the conformance score calculation on the 9,000 rows that have values. The Profiling job step does not increase the size of the sample.
Full Profiling
The Scan job performs data profiling for all data domain matches in the data store, including matches
discovered in previous data store scans.
Sampling Technique
The following table describes the sampling options that determine the number of rows on which the
scan job runs a profile.
Sampling Option Description
Auto Random Runs the profile on a random sample size. The random sample size is based on the number of rows in the data store.
This option is not available for Amazon S3 data stores.
First <number> Runs the profile on the number of rows that you select in sequential order. The scan starts with the first row of the table and reads the rows in sequential order up to the number of rows you selected. You can select 10,000, 100,000, or 500,000 rows.
No Sampling Runs the profile on all rows of the data store. If the data store contains a large number of rows, the Scan job performance might be impacted.
Random <number> Runs the profile on the number of rows that you select in random order. You can select 10,000, 100,000, or 500,000 rows.
This option is not available for Cloud scans for Amazon S3 data stores.
The unit to measure the quantity of rows that must match the data domain for Data Privacy Management
to identify the column as sensitive. Select one of the following options:
• Percentage. Evaluates the percentage of matched rows in the column with the minimum percentage
for sensitivity specified in the Conformance Score property of the data domain. If the percentage of
matched rows is equal to or exceeds the minimum percentage for sensitivity in the conformance
score, Data Privacy Management identifies the column as sensitive.
• Rows. Evaluates the number of matched rows in the column with the number specified in the
Conformance Row Count property of the data domain. If the number of matched rows is equal to or
exceeds the conformance row count, Data Privacy Management identifies the column as sensitive.
Default is Percentage.
Option Description
Collect Data Store Information For Informatica Cloud and Informatica Data Engineering Integration data stores, imports connections from the data source and creates data stores in the Data Privacy Management repository. The Scan job creates one data store for each connection in Informatica Cloud.
Select one of the following options:
- Retrieve Security Groups. The imported data stores will have the same security group as the parent data store.
- Select Security Groups. Select one or more security groups for the imported data stores.
Include Data Stores and Catalog Information For Informatica PowerCenter data stores, imports connections from the PowerCenter repository and creates data stores in the Data Privacy Management repository. The Scan job extracts a metadata catalog and identifies sensitive data in the PowerCenter repository for each data store. The Scan job identifies connection-level data lineage to show data flow between the data stores. For example, the Scan job identifies that data moves from data store A to data store B. The Scan job evaluates the combination of sensitive data in each data store to determine if a data store matches a classification policy.
Select one of the following options:
- Retrieve Security Groups. The imported data stores will have the same security group as the parent data store.
- Select Security Groups. Select one or more security groups for the imported data stores.
Identify Proliferation and Data Protection Extracts a metadata catalog from the source database, identifies the lineage of sensitive data, and identifies the protection status of sensitive data based on transformation details. For example, if a sensitive column is included in a Data Masking transformation, the Scan job determines that the column is protected.
If you enable the option and the Data Privacy Management repository does not have scan results for the Collect Data Store Information option, the Data Integration Scan job fails.
You can enable one or more of the options for a scan. However, some of the options depend on scan results
from other options. For example, if you enable the Run Profiling option and the Data Privacy Management
repository does not include scan results for the Include Catalog Information option, the Scan job fails.
You can run a scan once with all options enabled, or you can run a scan multiple times with different options
enabled. For example, you can run a scan with the Include Catalog Information option enabled. Then, run a
scan with the Run Profiling option enabled. Then, run a scan with the Include Row Count option enabled.
Related Topics:
• “Creating a Scan” on page 251
• “Data Integration Job” on page 291
Create and run a Cloud scan to import data stores from Informatica Cloud.
After you run the scan, monitor the scan job. Edit the imported data stores to update the properties that
the scan job did not complete. For example, the scan job does not import passwords or connections
strings. The scan job does not import any connection properties for Amazon S3 data stores. Verify if
there are duplicate data stores that you need to merge. Then, configure connectivity to the imported data
stores.
Step 3. Create and run scans to identify sensitive data for the imported data stores.
Scan the data stores that the Cloud Scan job imported. The type of scan you run depends on the
repository type of the imported data store. For example, if the scan job imported an Oracle data store,
run a Database Management scan. The scan job identifies and classifies sensitive data in each data
store.
After you run the scan, monitor the scan job and curate the profile results.
Step 4. Run a Cloud scan to identify the proliferation of sensitive data in the imported data stores.
Run a Cloud scan to identify proliferation of sensitive data in mappings. The Scan job identifies column-
level lineage of sensitive data between the imported data stores. The Scan job identifies if Data Masking
transformations mask the sensitive data.
Run a Data Integration scan to import data stores from Data Engineering Integration. The Scan job
creates one data store in the Data Privacy Management repository for each connection in Informatica
Data Engineering Integration.
After you run the scan, monitor the Scan job and verify the imported data stores. For example, verify if
there are duplicate or incomplete data stores that you need to manage. Configure connectivity to the
imported data stores.
Scan the data stores that the Data Integration Scan job imported. The type of scan you run depends on
the repository type of the imported data store. For example, if the Data Integration Scan job imported a
Hive data store, run a Big Data scan. The Scan job identifies and classifies sensitive data in each data
store.
Step 3. Run a Data Integration scan to identify the proliferation and protection status of sensitive data.
Run a Data Integration scan to identify proliferation of sensitive data in mappings. The Scan job identifies
column-level lineage of sensitive data between sources and targets. The Scan job identifies if Data
Masking transformations mask the sensitive data.
Run a Data Integration scan to import data stores from a PowerCenter repository. The Scan job identifies
and classifies sensitive data in PowerCenter mappings. The Scan job identifies an approximate
connection-level lineage of sensitive data between sources and targets.
After you run the scan, monitor the Scan job and verify the imported data stores. For example, verify if
there are duplicate or incomplete data stores that you need to manage. Configure connectivity to the
imported data stores.
Step 2. Run a Database Management scan to identify sensitive data for the imported data stores.
Run a Database Management scan to scan the data stores that the Data Integration Scan job imported.
The Database Management Scan job identifies and classifies sensitive data in each database. The Scan
job evaluates data that is not included in PowerCenter mappings.
After you run the scan, monitor the Scan job and curate the profile results.
Step 3. Run a Data Integration scan to identify the proliferation of sensitive data.
Run a Data Integration scan to identify proliferation of sensitive data in PowerCenter mappings. The
Scan job identifies column-level lineage of sensitive data between sources and targets. The Scan job
identifies if Data Masking transformations mask the sensitive data.
Step 1. Create a data store for each connection in a SQL Server Integration Services mapping.
Create a data store and configure connection details to each supported connection in a SQL Server
Integration Services mapping. Create a data store for each OLE DB Source and OLE DB Destination that
connects to SQL Server Native Client. When you create the data store, use the category Database
Management and the data store type Microsoft SQL Server.
The data stores will be the child data stores of the SQL Server Integration Services data store that you
create in step 3.
Create and run a scan on the data stores that you created in the previous step. The Scan job identifies
and classifies sensitive data in the data stores.
After you run the scan, monitor the Scan job and curate the profile results.
Step 3. Create a data store that connects to the SQL Server Integration Services package.
Create a data store and configure connection details to the SQL Server Integration Services package that
includes the mapping from step 1. The mapping in the package should include the connections for which
you created data stores.
The type of data store that you create depends on the server that hosts the packages. You can create a
file-based data store to scan packages that you created in Microsoft Visual Studio, but do not exist in
Microsoft SQL Server Management Studio. You can create a repository-based data store to scan
packages that exist in the Microsoft SQL Server Management Studio repository.
Step 4. Create and run a scan on the SQL Server Integration Services data store to identify the proliferation of sensitive
data in the child data stores.
Create and run a scan on the SQL Server Integration Services data store. The Data Integration Scan job
identifies the proliferation of sensitive data between the connections in the mapping.
Important: After you run the first SQL Server Integration Services scan, you must run the scan again whenever you scan a child data store again. For example, you scan a child data store again because you added more columns in the mapping, or you run a scan on a child data store every month to identify sensitive data. After the Scan job identifies sensitive data in the child data store, you must run the SQL Server Integration Services scan again to update the proliferation of the sensitive data.
Option Description
Include Catalog Information Extracts a metadata catalog from the data store. The Data Privacy Management repository must have at least one scan result that includes the catalog information before the Scan job can run data domain discovery or determine the row count.
Run Profiling Determines the type of profiling that the Scan job runs to perform data domain discovery:
- Metadata. Matches data domains with the source metadata.
- Data. Matches data domains with the source data. You can also specify options for running a data profile on the data store.
Include Row Count Calculates the total number of rows for each sensitive field that the Scan job identifies. This option might affect the Scan job performance. To optimize performance, run a scan with this option only.
If you do not include the row count, the Security Dashboard does not display the risk cost for the data stores in the scan.
You can enable one or more of the options for a scan. However, some of the options depend on scan results
from other options. For example, if you enable the Run Profiling option and the Data Privacy Management
repository does not include scan results for the Include Catalog Information option, the Scan job fails.
You can run a scan once with all options enabled, or you can run a scan multiple times with different options
enabled. For example, you can run a scan with the Include Catalog Information option enabled. Then, run a
scan with the Run Profiling option enabled. Then, run a scan with the Include Row Count option enabled.
Related Topics:
• “Creating a Scan” on page 251
• “Database Management Job” on page 293
The Scan job uses the metadata match conditions in a data domain to determine if a column matches a data
domain based on the column name. If there is a match, the Scan job identifies the column as sensitive. The
Scan job evaluates the data domains in the classification policies that you include in the scan.
Metadata profiling is typically the fastest method of profiling. However, metadata profiling can generate
more false positives than data profiling or metadata profiling with data profiling. A false positive can occur if
a data domain uses a broad regular expression for the data domain metadata match condition.
For example, you configure a data domain metadata match condition to use a general expression that
searches for "name" in column metadata. The Scan job identifies the FirstName, LastName, ProductName,
and DepartmentName columns as a match for the data domain. However, the ProductName and
DepartmentName columns are not sensitive. To reduce the chance of false positives, use the data profiling
scan option or metadata profiling with data profiling scan option.
The Scan job uses the data match conditions in a data domain to determine if a column matches the data
domain based on the column data. If there is a match, the Scan job identifies the column as sensitive. The
Scan job evaluates the data domains in the classification policies that you include in the scan.
Data profiling is typically the most accurate and slowest method of profiling. You can choose the following
options:
The option enables a Scan job to run data domain discovery profiles. The profiles match data domains
with the source metadata and data. The Scan job evaluates the data domains in the classification
policies that you include in the scan.
The Scan job first performs metadata profiling, and then runs a data domain discovery profile that
matches data domains with the source metadata. The Scan job uses the metadata match conditions in a
data domain to determine if a column matches a data domain based on the column name. If there is no
metadata match, the Scan job does not identify the column as sensitive. If there is a metadata match,
the Scan job performs data profiling on the column.
The Scan job performs data profiling for the columns that the scan job identified as a metadata match.
The Scan job runs a data domain discovery profile that matches data domains with the source data. The
Scan job uses the data match conditions in a data domain to determine if a column matches a data
domain based on the column data. If there is a data match, the Scan job identifies the column as
sensitive. If there is no data match, the Scan job does not identify the column as sensitive.
Metadata profiling with data profiling is typically faster than data profiling and slower than metadata
profiling. Compared to metadata profiling, metadata profiling with data profiling reduces the amount of
false positives in the scan results. A false positive occurs when a column is not sensitive, but the Scan
job identifies the column as sensitive.
For example, a DepartmentName column might match a metadata match condition for a name string in a
data domain. However, the data within the DepartmentName column does not match the data match
conditions in the data domain. If you run a scan with the Metadata profiling option, the Scan job
identifies the column as sensitive. If you run a scan with the Data profiling option or the Metadata
profiling with Data profiling scan option, the Scan job does not identify the column as sensitive.
Metadata profiling with data profiling might increase the amount of false negatives in the scan results. A
false negative occurs when a column is sensitive, but the Scan job does not identify the column as
sensitive. A false negative can occur if the metadata contains poorly labeled columns.
Exclude Null
The Scan job ignores rows with null values within the sample set of rows in a column to calculate the
conformance score. When you enable this option, the sensitivity status of a column might be more
accurate.
For example, you enable the Exclude Null property and specify that the Scan job runs a profile on the first
10,000 rows of each column. However, a column has 1,000 null values in the first 10,000 rows. The
Profiling job step bases the conformance score calculation on the 9,000 rows that have values. The
Profiling job step does not increase the size of the sample.
Full Profiling
The Scan job performs data profiling for all data domain matches in the data store, including matches
discovered in previous data store scans.
The following table describes the sampling options that determine the number of rows on which the
Scan job runs a profile.
Sampling Option Description
Auto Random Runs the profile on a random sample size. The random sample size is based on the number of rows in the data store.
First <number> Runs the profile on the number of rows that you select in sequential order. The scan starts with the first row of the table and reads the rows in sequential order up to the number of rows you selected. You can select 10,000, 100,000, or 500,000 rows.
No Sampling Runs the profile on all rows of the data store. If the data store contains a large number of rows, the Scan job performance might be impacted.
Random <number> Runs the profile on the number of rows that you select in random order. You can select 10,000, 100,000, or 500,000 rows.
The unit to measure the quantity of rows that must match the data domain for Data Privacy Management
to identify the column as sensitive. Select one of the following options:
• Percentage. Evaluates the percentage of matched rows in the column with the minimum percentage
for sensitivity specified in the Conformance Score property of the data domain. If the percentage of
matched rows is equal to or exceeds the minimum percentage for sensitivity in the conformance
score, Data Privacy Management identifies the column as sensitive.
• Rows. Evaluates the number of matched rows in the column with the number specified in the
Conformance Row Count property of the data domain. If the number of matched rows is equal to or
exceeds the conformance row count, Data Privacy Management identifies the column as sensitive.
Default is Percentage.
For data stores that enable the Scan with Remote Agent option, the scan job performs data profiling but you
do not select profiling options when you create the scan. If you create a scan that includes data stores that
scan with a remote agent and data stores that do not scan with a remote agent, you select profiling options.
The scan job applies the data profiling options to the data stores that do not scan with a remote agent when
you run the scan.
Option Description
Include Catalog Information Extracts a metadata catalog from the data store. The Data Privacy Management repository must have at least one scan result that includes the catalog information before the Scan job can run data domain discovery or determine the row count.
Run Profiling Determines the type of profiling that the Scan job runs to perform data domain discovery:
- Metadata. Matches data domains with the source metadata.
- Data. Matches data domains with the source data. You can also specify options for running a data profile on the data store.
You can enable one or more of the options for a scan. However, some of the options depend on scan results
from other options. For example, if you enable the Run Profiling option and the Data Privacy Management
repository does not include scan results for the Include Catalog Information option, the Scan job fails.
You can run a scan once with all options enabled, or you can run a scan multiple times with different options
enabled. For example, you can run a scan with the Include Catalog Information option enabled. Then, run a
scan with the Run Profiling option enabled.
Related Topics:
• “Creating a Scan” on page 251
• “File Management Job” on page 295
The Scan job uses the metadata match conditions in a data domain to determine if a column matches a data
domain based on the column name. If there is a match, the Scan job identifies the column as sensitive. The
Scan job evaluates the data domains in the classification policies that you include in the scan.
Metadata profiling is typically the fastest method of profiling. However, metadata profiling can generate
more false positives than data profiling or metadata profiling with data profiling. A false positive can occur if
a data domain uses a broad regular expression for the data domain metadata match condition.
For example, you configure a data domain metadata match condition to use a general expression that
searches for "name" in column metadata. The Scan job identifies the FirstName, LastName, ProductName,
and DepartmentName columns as a match for the data domain. However, the ProductName and
DepartmentName columns are not sensitive. To reduce the chance of false positives, use the data profiling
scan option or metadata profiling with data profiling scan option.
The scan job uses the data match conditions in a data domain to determine if a field matches the data domain based on the field data. If there is a match, the scan job identifies the field as sensitive. The scan job evaluates the
data domains in the classification policies that you include in the scan.
The option enables a Scan job to run data domain discovery profiles. The profiles match data domains
with the source metadata and data. The Scan job evaluates the data domains in the classification
policies that you include in the scan.
The Scan job first performs metadata profiling, and then runs a data domain discovery profile that
matches data domains with the source metadata. The Scan job uses the metadata match conditions in a
data domain to determine if a column matches a data domain based on the column name. If there is no
metadata match, the Scan job does not identify the column as sensitive. If there is a metadata match,
the Scan job performs data profiling on the column.
The Scan job performs data profiling for the columns that the scan job identified as a metadata match.
The Scan job runs a data domain discovery profile that matches data domains with the source data. The
Scan job uses the data match conditions in a data domain to determine if a column matches a data
domain based on the column data. If there is a data match, the Scan job identifies the column as
sensitive. If there is no data match, the Scan job does not identify the column as sensitive.
Metadata profiling with data profiling is typically faster than data profiling and slower than metadata
profiling. Compared to metadata profiling, metadata profiling with data profiling reduces the number of
false positives in the scan results. A false positive occurs when a column is not sensitive, but the Scan
job identifies the column as sensitive.
For example, a DepartmentName column might match a metadata match condition for a name string in a
data domain. However, the data within the DepartmentName column does not match the data match
conditions in the data domain. If you run a scan with the Metadata profiling option, the Scan job
identifies the column as sensitive. If you run a scan with the Data profiling option or the Metadata
profiling with Data profiling scan option, the Scan job does not identify the column as sensitive.
Metadata profiling with data profiling might increase the number of false negatives in the scan results. A
false negative occurs when a column is sensitive, but the Scan job does not identify the column as
sensitive. A false negative can occur if the metadata contains poorly labeled columns.
Exclude Null
The Scan job ignores rows with null values within the sample set of rows in a column to calculate the
conformance score. When you enable this option, the sensitivity status of a column might be more
accurate.
For example, you enable the Exclude Null property and specify that the Scan job runs a profile on the first
10,000 rows of each column. However, a column has 1,000 null values in the first 10,000 rows. The
Profiling job step bases the conformance score calculation on the 9,000 rows that have values. The
Profiling job step does not increase the size of the sample.
You can configure sampling options for faster profiling on the source data. The following sampling options determine the number of rows on which the Scan job runs a profile:
First <number>
Runs the profile on the number of rows that you select in sequential order. The scan starts with the first row of the table and reads the rows in sequential order up to the number of rows you selected. You can select 10,000, 100,000, or 500,000 rows.
Not valid if the file system data store is enabled for JSON and XML file types.
No Sampling
Runs the profile on all rows of the data store. If the data store contains a large number of rows, the Scan job performance might be impacted.
Note: If the scan includes unstructured data stores, you must select this option for the Scan job to finish successfully.
The unit that measures the number of rows that must match the data domain for Data Privacy Management to identify the column as sensitive.
You must select Rows. Data Privacy Management compares the number of matched rows in the column with the number specified in the Conformance Row Count property of the data domain. If the number of matched rows equals or exceeds the conformance row count, Data Privacy Management identifies the column as sensitive.
Include Catalog Information
Extracts a metadata catalog from the data store. The Data Privacy Management repository must have at least one scan result that includes the catalog information before the Scan job can run data domain discovery or determine the row count.
Run Profiling
Determines the type of profiling that the Scan job runs to perform data domain discovery:
- Metadata. Matches data domains with the source metadata.
- Data. Matches data domains with the source data. You can also specify options for running a data profile on the data store.
Include Row Count
Calculates the total number of rows for each sensitive field that the Scan job identifies. This option might affect the Scan job performance. To optimize performance, run a scan with this option only.
If you do not include the row count, the Security Dashboard does not display the risk cost for the data stores in the scan.
You can run a scan once with all options enabled, or you can run a scan multiple times with different options
enabled. For example, you can run a scan with the Include Catalog Information option enabled. Then, run a
scan with the Run Profiling option enabled. Then, run a scan with the Include Row Count option enabled.
Note: You cannot include keyspaces, tables, or columns that include uppercase characters in a domain
discovery scan on an Apache Cassandra data store.
The Scan job uses the metadata match conditions in a data domain to determine if a column matches a data
domain based on the column name. If there is a match, the Scan job identifies the column as sensitive. The
Scan job evaluates the data domains in the classification policies that you include in the scan.
Metadata profiling is typically the fastest method of profiling. However, metadata profiling can generate
more false positives than data profiling or metadata profiling with data profiling. A false positive can occur if
a data domain uses a broad regular expression for the data domain metadata match condition.
For example, you configure a data domain metadata match condition to use a general expression that
searches for "name" in column metadata. The Scan job identifies the FirstName, LastName, ProductName,
and DepartmentName columns as a match for the data domain. However, the ProductName and
DepartmentName columns are not sensitive. To reduce the chance of false positives, use the data profiling
scan option or metadata profiling with data profiling scan option.
The Scan job uses the data match conditions in a data domain to determine if a column matches the data
domain based on the column data. If there is a match, the Scan job identifies the column as sensitive. The
Scan job evaluates the data domains in the classification policies that you include in the scan.
Data profiling is typically the most accurate and slowest method of profiling. You can choose the following
options:
The option enables a Scan job to run data domain discovery profiles. The profiles match data domains
with the source metadata and data. The Scan job evaluates the data domains in the classification
policies that you include in the scan.
The Scan job first performs metadata profiling, and then runs a data domain discovery profile that
matches data domains with the source metadata. The Scan job uses the metadata match conditions in a
data domain to determine if a column matches a data domain based on the column name. If there is no
metadata match, the Scan job does not identify the column as sensitive. If there is a metadata match,
the Scan job performs data profiling on the column.
The Scan job performs data profiling for the columns that the scan job identified as a metadata match.
The Scan job runs a data domain discovery profile that matches data domains with the source data. The
Scan job uses the data match conditions in a data domain to determine if a column matches a data domain based on the column data. If there is a data match, the Scan job identifies the column as sensitive. If there is no data match, the Scan job does not identify the column as sensitive.
Metadata profiling with data profiling is typically faster than data profiling and slower than metadata profiling. Compared to metadata profiling, metadata profiling with data profiling reduces the number of
false positives in the scan results. A false positive occurs when a column is not sensitive, but the Scan
job identifies the column as sensitive.
For example, a DepartmentName column might match a metadata match condition for a name string in a
data domain. However, the data within the DepartmentName column does not match the data match
conditions in the data domain. If you run a scan with the Metadata profiling option, the Scan job
identifies the column as sensitive. If you run a scan with the Data profiling option or the Metadata
profiling with Data profiling scan option, the Scan job does not identify the column as sensitive.
Metadata profiling with data profiling might increase the number of false negatives in the scan results. A
false negative occurs when a column is sensitive, but the Scan job does not identify the column as
sensitive. A false negative can occur if the metadata contains poorly labeled columns.
Exclude Null
The Scan job ignores rows with null values within the sample set of rows in a column to calculate the
conformance score. When you enable this option, the sensitivity status of a column might be more
accurate.
For example, you enable the Exclude Null property and specify that the Scan job runs a profile on the first
10,000 rows of each column. However, a column has 1,000 null values in the first 10,000 rows. The
Profiling job step bases the conformance score calculation on the 9,000 rows that have values. The
Profiling job step does not increase the size of the sample.
Full Profiling
The Scan job performs data profiling for all data domain matches in the data store, including matches
discovered in previous data store scans.
Sampling Technique
The following sampling options determine the number of rows on which the Scan job runs a profile:
First <number>
Runs the profile on the number of rows that you select in sequential order. The scan starts with the first row of the table and reads the rows in sequential order up to the number of rows you selected. You can select 10,000, 100,000, or 500,000 rows.
No Sampling
Runs the profile on all rows of the data store. If the data store contains a large number of rows, the Scan job performance might be impacted.
The unit that measures the number of rows that must match the data domain for Data Privacy Management to identify the column as sensitive. Select one of the following options:
• Percentage. Compares the percentage of matched rows in the column with the minimum percentage for sensitivity specified in the Conformance Score property of the data domain. If the percentage of matched rows equals or exceeds the minimum percentage for sensitivity in the conformance score, Data Privacy Management identifies the column as sensitive.
• Rows. Compares the number of matched rows in the column with the number specified in the Conformance Row Count property of the data domain. If the number of matched rows equals or exceeds the conformance row count, Data Privacy Management identifies the column as sensitive.
Default is Percentage.
Jobs
This chapter includes the following topics:
Jobs Overview
Data Privacy Management includes a job framework to manage and automate tasks.
A job is a collection of steps that the Data Privacy Management Service performs to complete a task. For
example, the Data Privacy Management Service runs a job to discover and classify sensitive data in data
stores.
Data Privacy Management includes job types to manage alerts, import data, evaluate classification policies,
run tasks, generate subject data reports, synchronize information, and scan data stores to build the subject
registry and discover and classify sensitive data.
Data import and classification policy evaluation jobs do not include scheduling options and run immediately.
For example, if you edit a classification policy, the evaluate classification policies job runs when you save the
classification policy.
Alert, Protection, Scan, and Subject Data Report Purge jobs run according to scheduling options that you
define. For example, when you create an alert rule, you define the scheduling options for the Alert job. You
can schedule a job to run immediately, in the future, or on a recurring schedule.
The Data Privacy Management Service can run multiple jobs in parallel. You define the maximum number of
parallel jobs when you configure the Data Privacy Management Service.
After you run or schedule a job, you can monitor the job status to verify if the job completed. You can view
the job details and the status of the job steps. You can view and export the job logs to find out more
information about the tasks that the job performs or the errors that the job encountered. You can pause,
resume, stop, or terminate a job.
Jobs Workspace
Use the Jobs workspace to monitor and manage jobs. The Jobs workspace includes a list of jobs and
graphical filters to filter the job list. You can view completed, failed, paused, stopped, or terminated jobs. You
can view jobs that are in progress or jobs that are scheduled to run in the future.
The Jobs workspace displays the job filters, the jobs list, and the Actions menu.
Job Filters
Use the job filters to quickly navigate to jobs. You can filter jobs based on the job status, the job type, and the
job start time.
When you click a filter, the Jobs list page shows the jobs that match the filter condition. To clear the filter
conditions, click the filter again. You can hide and unhide the job filters.
The Jobs list page displays the job filters, the job list, the job properties, and a Refresh icon. For example, you can view the job parameters for a Subject Scan job. Click the arrow next to a job to display the job parameters. Click the Job ID or the Job Sub Steps to view the job steps and job logs.
Access the Job details page to view the job steps or job logs. You access the Job details page when you
click a Job ID from the Jobs list page or when you view a job from the Actions menu.
Job Types
A job type is a template that includes a predefined workflow of one or more job steps. Data Privacy
Management includes job types for tasks related to alerts, data evaluation, data import, risk score
calculations, scans, subject data reports, tasks, and users.
Data Privacy Management creates a job when you perform a task that is associated with a job type, or when
system jobs are configured to run. The job type determines when the job runs. Jobs run according to
scheduling options that you define or immediately after you perform an action. System jobs run at
preconfigured intervals.
For scan jobs that you configure with both the Include Catalog Information and Run Profiling options, the job steps are interdependent. If the Load Catalog job step fails, the catalog information is not available and the Profiling job step fails.
Evaluate Security Policy
Evaluates data store properties and the data store scan results. If the data store properties and scan results meet the conditions in the data store security policy, the job creates a security policy violation.
An Evaluate Security Policy job runs when you create or edit a data store security policy and configure the schedule.
Import Catalog Results
Imports information about data stores from Enterprise Data Catalog to Data Privacy Management.
An Import Catalog Results job runs when you import Catalog resources on the Data Stores workspace.
Incremental Scan
Uses the scan options from an Application, Big Data, Cloud, Data Integration, Database Management, or File Management scan that completed to run the scan again on the data stores, but only performs profiling on the changed fields or files included in the scan job.
An Incremental Scan job runs when you select Scan for Missing Data Domains from the Actions menu of a Data Store Details page.
Recalculate Risk Scores
Uses the risk score setting on the Settings workspace to calculate and update the risk score of scanned data stores. Data Privacy Management creates a Recalculate Risk Scores job when you change risk factor weights. The job runs immediately after you confirm that you want to recalculate the existing risk scores.
Subject Registry Scan
When you run a subject registry scan, the service creates one master Subject Registry Scan job and individual Subject Scan jobs for every golden and transaction data store included in the scan.
The master Subject Registry Scan job includes child jobs that track all scans that you include in the subject registry scan. Based on how you configure the scan, the job can contain the following child jobs:
- Poll Domain Discovery. Tracks the status of all domain discovery scans. Applicable if you include the Domain Discovery scan type.
- Scan Golden Data Stores. Tracks the status of all subject scans on golden data stores. Applicable if you include the Discover Subjects scan type.
- Scan Transaction Data Stores. Tracks the status of all subject scans on transaction data stores. Applicable if you include the Link Subjects scan type.
Data Privacy Management creates an individual Subject Scan job for each golden and transaction data store scan that you include in a subject registry scan.
Sync Catalog Updates
Synchronizes the information for a data store that the Import Catalog Results job found.
A Sync Catalog Updates job runs when you select Import Results from Enterprise Data Catalog from the Actions menu of a Data Store Details page.
Data Privacy Management includes the following system job types that you can view on the System Jobs tab
on the Jobs workspace:
Evaluate Risk Scores
Calculates and updates the risk score of scanned data stores. The Data Privacy Management Service runs the Evaluate Risk Scores job once a day at the scheduled time. The default scheduled time is 1:30 am UTC.
Subject Data Report Purge
Automatically removes subject data reports according to the interval that you specify in the Data Privacy Settings pane on the Settings workspace.
Synchronize Users
Synchronizes user details between the Data Privacy Management repository and the Enterprise Unified Metadata platform. After you create a user in the Administrator tool, you run the Synchronize Users job.
To run the job, select Synchronize > Users from the Actions menu on the Data Stores workspace.
To view and manage new users, you must log out of Data Privacy Management and log in again after you add the users in the Administrator tool and run the Synchronize Users job.
Related Topics:
• “Changing Risk Score Factor Weights” on page 42
• “Adding Custom Risk Score Factors” on page 43
Load and Link
For Cloudera Navigator, extracts a metadata catalog from the source.
Load Catalog
For HDFS and Hive, extracts a metadata catalog from the source if the scan is configured to include catalog metadata.
Update Augmentation
For Cloudera Navigator, determines column-level data lineage for sensitive columns between scanned Hive data stores that are on the same host as the Cloudera Navigator data store.
Profiling
For HDFS and Hive, creates and runs a profile to perform data domain discovery on source metadata, data, or both if the scan is configured for data profiling. The job uses the profile results to determine sensitivity status.
Collect Row Count
For Hive data stores, calculates the row count for each table that contains sensitive data if the scan is configured to include a row count.
Copy Augmentation
Copies scan results from temporary tables to the Data Privacy Management repository. The job stores scan results in temporary tables while running. After the scan results are stored in the history tables, the job commits the scan results to the repository in one database commit.
Evaluate Policies
Evaluates if the data store contains the classification of sensitive data as defined in the policy. Calculates the risk score, residual risk cost, and sensitivity levels. Calculates the total number of sensitive columns that each data store contains.
• The scan job can connect to Hadoop clusters that use Kerberos authentication.
• The auto random and random sampling techniques are not supported. You can use the First N sampling technique for data profiling.
• The Scan job cannot run profiling for columns that include the following datatypes:
ARRAYS
BINARY
MAPS
STRUCT
If the source contains the datatypes, the Scan job skips profiling for columns that contain the
unsupported datatypes.
• The Big Data Scan job identifies and classifies sensitive data in Hive data stores. To identify the
proliferation of sensitive data between Hive data stores and to identify the protection status of sensitive
data, run a Cloudera Navigator scan.
Load Catalog
Extracts a metadata catalog from the source if the scan is configured to include catalog metadata.
Profiling
Creates and runs a profile to perform data domain discovery on source metadata, data, or both if the scan is configured for data profiling. The job uses the profile results to determine sensitivity status.
Collect Row Count
Calculates the row count for each table that contains sensitive data if the scan is configured to include a row count.
Copy Augmentation
Copies scan results from temporary tables to the Data Privacy Management repository. The job stores scan results in temporary tables while running. After the scan results are stored in the history tables, the job commits the scan results to the repository in one database commit.
Evaluate Classification Policies
If the scan is configured to include catalog information, run profiling, or include a row count, the job evaluates if the data store contains the classification of sensitive data as defined in the classification policy. Calculates the risk score, residual risk cost, and sensitivity levels. Calculates the total number of sensitive fields that each data store contains.
Related Topics:
• “Cloud Scan Options” on page 264
• The scan job cannot run profiling or determine row count for the following Salesforce tables:
ActivityHistory
AggregateResult
AttachedContentDocument
CollaborationGroupRecord
CombinedAttachment
ContentDocumentLink
DatacloudCompany
DatacloudContact
DatacloudDandBCompany
DatacloudSocialHandle
EmailStatus
FeedLike
FeedTrackedChange
IdeaComment
Name
NoteAndAttachment
OpenActivity
OwnedContentDocument
ProcessInstanceHistory
UserRecordAccess
Vote
Collect Connections
For Informatica Data Engineering Integration data store scans configured to collect data store information, imports connections and connection properties to the Data Privacy Management repository. The Scan job creates one data store in the repository for each connection.
Collect Connections from Informatica Cloud
For Informatica Cloud data store scans configured to collect data store information, connects to Informatica Cloud using the connection properties in the data store. Imports connections and connection properties from Informatica Cloud to the Data Privacy Management repository for Amazon S3, Microsoft SQL Server, MySQL, Oracle, and Salesforce data store types.
The Scan job creates one data store for each supported Informatica Cloud connection. For MySQL connections, the Scan job creates a data store with data store type JDBC.
The Scan job uses the following naming convention to create a data store: <Data Privacy Management Informatica Cloud Data Store Name>/<Informatica Cloud Org ID>/<Informatica Cloud Connection Name> (see the sketch after these job step descriptions).
The Scan job does not import the password or connection string properties. The Scan job does not import connection properties for Amazon S3. You must update the password and connection string in the imported data stores after the Scan job finishes.
Collect Connections and Catalog
For Informatica PowerCenter data store scans configured to include data stores and catalog information, imports relational database and ODBC connections from the PowerCenter repository to the Data Privacy Management repository.
Imports the metadata from the PowerCenter repository for each connection. Determines an approximate connection-level data lineage for the data stores.
Load and Link
For Informatica Cloud data store scans configured to identify proliferation and data protection, extracts a metadata catalog from the source database.
For Informatica PowerCenter data store scans configured to identify proliferation and data protection, determines column-level data lineage for the sensitive columns and identifies the protection status of sensitive data.
For SQL Server Integration Services data store scans configured to identify proliferation and data protection, extracts a metadata catalog from the source database. Identifies the number of scanned child data stores in the mapping between which sensitive data proliferates. For example, if a mapping has three connections and sensitive data moves between two connections, the job log shows the assignable connection size as two.
Profiling
For Informatica PowerCenter data store scans configured to include data stores and catalog information, runs a metadata profile job to identify sensitive data and runs data domain discovery to match metadata with the data domains in the scan.
The Scan job uses the profile results to determine the metadata sensitivity status. The Scan job updates the Data Privacy Management repository with the results of the sensitive data identification.
Collect Database Configuration Files
For Informatica PowerCenter data store scans configured to include data stores and catalog information, updates the data store connection properties in the Data Privacy Management repository.
The Scan job runs a PowerCenter workflow that reads connection properties from database configuration files and from PowerCenter parameter files. The Scan job updates the repository with the connection properties for the data stores that the Scan job imported from the PowerCenter repository.
Generate Connection Lineage
For Informatica PowerCenter data store scans configured to include data stores and catalog information, determines connection-level data lineage for sensitive columns.
The Scan job analyzes the PowerCenter repository to determine data lineage for the data stores that the Scan job imported. The Scan job determines forward lineage from sources to targets only.
Update Augmentation
If the scan is configured to identify proliferation and data protection, determines column-level data lineage for sensitive columns and identifies the protection status of sensitive data.
Copy Augmentation
If the scan is configured to collect data store information and identify proliferation and data protection, copies scan results from temporary tables to the Data Privacy Management repository.
The Scan job stores scan results in temporary tables while running. After the scan results are stored in the history tables, the Scan job commits the scan results to the Data Privacy Management repository in one database commit.
Evaluate Classification Policies
If the scan is configured to collect data store information and identify proliferation and data protection, evaluates if the data store contains the classification of sensitive data as defined in the scan policies.
Calculates the risk score, residual risk cost, and sensitivity levels. Calculates the total number of sensitive columns that each data store contains.
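The naming convention described in the Collect Connections from Informatica Cloud step can be illustrated with a short sketch. The values below are hypothetical; only the data store name/org ID/connection name pattern comes from the documentation.

```python
def imported_data_store_name(dpm_data_store_name, org_id, connection_name):
    """Compose the name that the Scan job gives a data store imported from an
    Informatica Cloud connection:
    <Data Privacy Management Informatica Cloud Data Store Name>/<Informatica Cloud Org ID>/<Informatica Cloud Connection Name>
    """
    return f"{dpm_data_store_name}/{org_id}/{connection_name}"


# Hypothetical values for the three parts of the name.
print(imported_data_store_name("MyCloudDataStore", "0abc123", "Oracle_HR"))
# MyCloudDataStore/0abc123/Oracle_HR
```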
Related Topics:
• “Data Integration Scan Options” on page 267
• The Scan job uses the connection details in the associated Data Integration Service of the Data Privacy Management Service to connect to Informatica Data Engineering Integration. Because the Scan job does not connect to the database directly, the Scan job does not require user privileges.
• The Scan job cannot connect to Informatica Data Engineering Integration if the Model repository is on an
Informatica domain that uses Kerberos authentication.
• The Scan job does not import connections from the SQL Server Integration Services package.
• The Scan job does not determine if sensitive data is protected. All sensitive columns show as unprotected
unless you manually mark the columns as protected.
Load Catalog
Extracts a metadata catalog from the source. The job includes the job step if the scan is configured to include catalog information.
Profiling
Creates and runs a profile to perform data domain discovery. The scan options determine if the data domain discovery runs on the source metadata, data, or both. The job uses the profile results to determine the sensitivity status. The job includes the job step if the scan is configured to run profiling.
Collect Row Count
Calculates the row count for each table that contains sensitive data. The job includes the job step if the scan is configured to include a row count.
Copy Augmentation
Copies scan results from temporary tables to the Data Privacy Management repository. The job stores scan results in temporary tables while running. After the scan results are stored in the history tables, the job commits the scan results to the repository. The job includes the job step if the scan is configured to include catalog information, to run profiling, or to include a row count.
Evaluate Classification Policies
Evaluates if the data stores contain the classification of sensitive data as defined in the classification policy. Calculates the risk score, residual risk cost, and sensitivity levels. Calculates the total number of sensitive columns that each data store contains. The job includes the job step if the scan is configured to include catalog information, to run profiling, or to include a row count.
Related Topics:
• “Database Management Scan Options” on page 270
• The Scan job runs most job steps in sequential order. A job step must complete before the next job step
starts.
• If you configure a scan to include catalog information and to run data profiling, the Scan job runs the Load
Catalog and Profiling job steps in parallel. If you configure a scan to include catalog information and to
run metadata profiling or metadata with data profiling, the job runs the job steps sequentially.
• You can pause, stop, terminate, and resume a Scan job that runs the Load Catalog and Profiling job steps
in parallel. The action affects the job steps that are running. The behavior is the same as when the job
steps run in sequential order. For example, if you pause the job after the Load Catalog job step completes
and while the Profiling job step is running, the job only pauses the Profiling job step. When you resume the
job, the job resumes the Profiling job step.
When a Scan job runs the Load Catalog and Profiling job steps in parallel and the Load Catalog job step
fails, the Profiling job step also fails. When you resume the job, the job resumes both job steps.
The Data Privacy Management Service creates the job when you edit a classification policy that is associated
with a completed scan. The job runs immediately after you save the classification policy or trigger the
Evaluate Classification Policies action from the Classification Policies workspace.
The evaluate classification policies job includes the following job step:
Evaluate classification policies
Evaluates if data stores contain the classification of sensitive data as defined in the classification policy.
Calculates the risk score, residual risk cost, and sensitivity level. Calculates the total number of sensitive
columns that each data store contains.
The Data Privacy Management Service runs the evaluate risk scores job once a day at the scheduled time. By
default, the scheduled time is 1:30 am UTC.
The evaluate risk scores job includes the following job step:
Evaluate risk scores
The Data Privacy Management Service creates the job when you create or edit a data store security policy.
You schedule when the job runs when you configure the data store security policy.
The evaluate security policy job includes the following job step:
Evaluate security policy
Evaluates the properties and scan results of data stores that the user can access. If the properties or scan results meet the conditions in the data store security policy, the job creates a security policy violation.
Load Catalog
For data stores that scan with Enterprise Data Catalog, extracts a metadata catalog from the source. The job includes the job step if the scan is configured to include catalog information.
Browse
For data stores that scan with a remote agent, loads the metadata of files in the data store based on the value in the Agent Scan Settings pane on the Settings workspace and the folders and file types included in the data store configuration.
If the Browse job step discovers zero files, the File Management job has a Warning status, the remaining job steps have a Warning message, and the job ends.
Profiling
For data stores that scan with Enterprise Data Catalog, creates and runs a profile to perform data domain discovery. The scan options determine if the data domain discovery runs on the source metadata, data, or both. The job uses the profile results to determine the sensitivity status. The job includes the job step if the scan is configured to run profiling.
AgentProfiling
For data stores that scan with a remote agent, creates and runs a profile to perform data domain discovery through a remote agent on the list of files that the Browse job step returns.
Evaluate Rules
For data stores that scan with Enterprise Data Catalog, evaluates the sensitivity status of fields. The job step is included if the scan is configured to run profiling.
Copy Augmentation
Copies scan results from temporary tables to the Data Privacy Management repository. The job stores scan results in temporary tables while running. After the scan results are stored in the history tables, the job commits the scan results to the repository. The job includes the job step if the scan is configured to include catalog information or run profiling.
Evaluate Classification Policies
Evaluates if the data stores contain the classification of sensitive data as defined in the classification policy. Calculates the risk score, residual risk cost, and sensitivity levels. Calculates the total number of sensitive fields or files that each data store contains. The job includes the job step if the scan is configured to include catalog information or run profiling.
Related Topics:
• “File Management Scan Options” on page 273
• The File Management job does not determine an actual or estimated row count. The job ignores the row
count factor when calculating the risk score.
• The File Management job sets the residual risk cost to zero.
• For unstructured file management data stores that scan with a remote agent, if the Browse job step
encounters files that exceed the size specified in the Agent Scan Settings pane on the Settings
workspace, the job skips the files and continues processing.
You can download a report as a CSV file from the Download Report column for the Browse job step on the
Job Details page. The report contains information about the current maximum file size setting, the folders
and file types that the data store includes and excludes, and a list of files that returned errors.
You can download reports as CSV files from the Download Report column for the Profiling job step on the
Job Details page. The reports contain information about the files that the Scan job profiled successfully
and the files that returned errors in the Profiling job step.
• If you scan a file management data store with Enterprise Data Catalog and later edit the data store to
enable the Scan with Remote Agent setting, you must scan the data store again after you associate the
data store with a remote agent. The scan skips files larger than the setting specified on the Settings
workspace.
Import Job
The Import job uses data from CSV files to create, update, or delete information about classification policies,
data stores, data store aliases, data store protection status, data store owners, data domains, encryption
The Data Privacy Management Service creates an Import job that runs immediately when you import data
from CSV files.
Import File Processes data from the CSV file and stores data in temporary tables in the Data Privacy
Management repository.
Save to Repository Moves data from the temporary tables to the main tables in the Data Privacy Management
repository. Then, the job step deletes data in the temporary tables.
• If the job fails during the Import File job step, download the rejected records from the job log. Fix the
issues and import the file again.
• If you terminate the job during the Import File job step, the job deletes all data from the temporary tables.
The job does not save any data in the Data Privacy Management repository. To continue, import the CSV
file again.
• If you pause or stop the job during the Import File job step, the job pauses or stops at the end of the job
step. When you resume the job, the job starts at the Save to Repository job step.
• If the job fails or you terminate the job during the Save to Repository job step, the job might not save all of
the data to the main tables in the Data Privacy Management repository. You can export data from the
repository to view what information the job imported. Then, import a file that contains the data that the
job did not process.
• If you pause or stop the job during the Save to Repository job step, the job pauses or stops at the end of
the job step. The job finishes at the end of the Save to Repository job step.
The Import Catalog Results job includes the following job steps:
Import Catalog Resources
Imports classification policies from Enterprise Data Catalog that you select on the Data Stores workspace when configuring the resources to import.
Import Profiling Results
Imports the data store profile from Enterprise Data Catalog.
Copy Augmentation
Copies profiling results from temporary tables to the Data Privacy Management repository.
Evaluate Classification Policies
Evaluates if the data stores contain the classification of sensitive data as defined in the classification policies imported from Enterprise Data Catalog. Calculates the risk score, residual risk cost, and sensitivity levels. Calculates the total number of sensitive fields or files that each data store contains.
Profiling
For data stores that scan with Enterprise Data Catalog, creates and runs a profile to perform data domain discovery. The scan options determine if the data domain discovery runs on the source metadata, data, or both. The job uses the profile results to determine the sensitivity status. The job includes the job step if the scan is configured to run profiling.
Collect Row Count
For Application, Hive, Cloud, Data Integration, and Database Management data stores, calculates the row count for each table that contains sensitive data if the scan is configured to include a row count.
Copy Augmentation
Copies scan results from temporary tables to the Data Privacy Management repository. The job stores scan results in temporary tables while running. After the scan results are stored in the history tables, the job commits the scan results to the repository. The job includes the job step if the scan is configured to include catalog information or run profiling.
Evaluate Classification Policies
Evaluates if the data stores contain the classification of sensitive data as defined in the classification policy. Calculates the risk score, residual risk cost, and sensitivity levels. Calculates the total number of sensitive fields or files that each data store contains. The job includes the job step if the scan is configured to include catalog information or run profiling.
Orchestration Job
The Orchestration job performs the actions that are configured in custom, email, service management, and
system log tasks.
You run the job when you create a manual action or on demand from the Tasks workspace. If a custom,
email, service management, or system log action is associated with a security policy, the orchestration job
runs automatically when Data Privacy Management detects a violation of the security policy.
The Orchestration job includes the following job step:
OrchestrationJobStep
Performs the actions that are specified in the custom, email, service management, or system log task
and the extensions that are configured for each task.
Protection Job
The Protection job applies the rules to protect sensitive fields in a data store that are configured in a
protection task. The Protection job steps depend on the protection extension that is specified in the task.
You schedule when the job runs after you configure a protection task on the Tasks workspace. If an action to
protect data is associated with a data store or user activity security policy, Data Privacy Management creates
the protection task when a violation of the security policy occurs. If an action to protect data is associated
with a decryption security policy, Data Privacy Management creates the protection task when the policy
conditions are met.
Export Connection XML
For tasks configured with an encryption or Persistent Data Masking protection extension, creates a connection from Data Privacy Management to the Test Data Management service. The Protection job builds a connection XML file, exports the file from Data Privacy Management, and then imports the file into Persistent Data Masking.
Export Encryption Rule
For tasks configured with an encryption protection extension, builds an XML file that contains the encryption rules and associated encryption keys for the data domains included in the task. The Protection job then runs the Export Project XML job step.
Export Project XML
For tasks configured with an encryption or Persistent Data Masking protection extension, creates a project in the Test Data Management service. The Protection job builds a project XML file that includes metadata from the data sources and assignments in the task. The job then exports the project XML file from Data Privacy Management and imports the file into Persistent Data Masking.
Generate Workflow
For tasks configured with an encryption or Persistent Data Masking protection extension, creates the workflow that masks the sensitive columns according to the rules specified in the task.
Execute Workflow
For tasks configured with an encryption or Persistent Data Masking protection extension, runs the workflow that masks the sensitive columns according to the rules specified in the task.
Update Encryption Configuration
For tasks configured with an encryption protection extension, encrypts the sensitive columns with the associated encryption rules and keys specified for each data domain in the data store.
Update Protection Status
For tasks configured with a protection extension that enables the Mark column as protected after execution option, changes the status of sensitive columns from Unprotected to Protected after the job completes successfully.
The job runs immediately after you confirm that you want to reset the results.
The Reset Classification Results job includes the following job steps:
Reset Protection and Sensitivity Results
Resets the results of a protection job and resets the Protection status and Sensitivity results.
Reset Catalog Results
Resets catalog results to remove domain association information. For unstructured data stores associated with a subject registry remote agent, the job skips this step.
Update Classification Status
Updates the classification status of the data store to Not Analyzed.
In the connection, you define the type of information the job imports and set up a schedule to periodically
synchronize Data Privacy Management with Salesforce. You can schedule the job to run immediately, in the
future, or set up a job that runs on a regular basis.
The Import job includes multiple job steps. The job steps depend on the options configured in the Salesforce
server connection. The connection includes an option to import users, user access, and user activity.
Connects to the Salesforce application using the connection properties in the Salesforce data store. The
job sends a query through the Bulk API to collect users from Salesforce. The first time you run the job for
the connection, the job collects all users from Salesforce. For subsequent runs, the Import job queries
changes since the time the last Import job completed.
The Bulk API returns a .csv file that includes a list of users. The job processes the records in the CSV file
and imports the data into a temporary table in the Data Privacy Management repository. The job log
shows how many records the job processed, rejected, skipped, deleted, inserted, or updated. If the job
rejected one or more rows in the CSV file, you can download a list of the rejected rows.
When you run a synchronization, the Import job always includes the job step to import users.
Connects to the Salesforce application using the connection properties in the Salesforce data store. The
job sends a query through the Bulk API to collect user groups, user group memberships, and user access
from Salesforce. The Import job retrieves user groups from Salesforce permission sets and retrieves
user group memberships from the Salesforce permission set assignments. The Import job queries object
permissions for sensitive tables and field permissions for sensitive columns. Every time the job runs, the
job gets user access details that are valid at the time the job runs.
The Bulk API returns four CSV files that include a list of permission sets, permission set assignments,
object permissions, and field permissions. The job processes the records in the CSV files and imports
the data into a temporary table in the Data Privacy Management repository. The job log shows how many
records the job processed, rejected, skipped, deleted, inserted, or updated for each CSV file. If the job
rejected one or more rows in a CSV file, you can download a list of the rejected rows. If a table in the
object permissions CSV file is not identified as sensitive in the repository, the Import job rejects the row.
If the job fails or if you terminate the job during the job step, the job deletes data from the temporary
tables. The job does not save data to the repository.
The Import job includes the job step if the connection is configured to import user access.
Save to Repository
Copies user and user access data from the temporary tables to the main tables and deletes data from
the temporary tables. If the job fails or if you terminate the job during the Import File job step, the job
deletes data from the temporary tables. The job does not save data to the Data Privacy Management
repository.
The Import job includes the job step if the connection is configured to import users and user access.
Evaluates the user access count in the risk score for the Salesforce data store.
The Import job includes the job step if the connection is configured to import user access.
Connects to the Salesforce application using the connection properties in the Salesforce data store. The
job sends a query through the Bulk API to collect event logs from Salesforce. The Import job retrieves
user activity from the EventLogFile object for the API, REST API, and Bulk API event types. The job uses
the scheduling options in the connection to determine the time range in which to retrieve the user
activity. The job saves user activity directly to the user activity store. The retention period configured in
the Data Privacy Management Service determines how long the events remain in the user activity store.
If the Import job fails during the job step, any of the events that the job saves in the user activity store
remain. You can terminate the job and run the job again. The job will not save duplicate events in the
user activity store.
The Import job includes the job step if the connection is configured to import user activity.
The Data Privacy Management Service runs the job at the interval specified for purging DSAR reports in the
Subject Data Report Purge Interval property, if the Subject Data Report Retention Period value is greater
than 0. A 0 value indicates that you always retain DSAR reports.
The Subject Details page retains the history of reports created for each data subject on the Requests pane.
The Download Options column displays one of the following messages for each DSAR report created for a
data subject:
• Available for download. You can download the DSAR report as a CSV file from the associated DSAR Task
Properties page, or by clicking the Download DSAR Report icon on the Subject Details page and selecting
a CSV or PDF report format.
• Available for download until mm/dd/yyyy. You can download the DSAR report until the specified date. The
Subject Data Report Purge job is scheduled to run on the date listed in the column.
• Not available for download. The DSAR report was purged when the Subject Data Report Purge Job ran,
and you cannot download the DSAR report.
The Subject Data Report Purge job includes the following job step:
Purge DSAR
Automatically removes DSAR reports according to the Data Privacy Settings on the Settings workspace. For example, the Subject Data Report Retention Period value is 2 and the Subject Data Report Purge Interval value is 7. The Subject Data Report Purge job runs every seven days, and the job step purges DSAR reports from Data Privacy Management that users created at least two days before the job runs. If the job runs today, Data Privacy Management retains DSAR reports that users created yesterday and purges the reports when the job runs again in seven days.
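The retention and purge-interval behavior in the example above can be expressed as a small sketch. This is illustrative only; the date handling and function names are assumptions, not product logic.

```python
from datetime import date, timedelta

def reports_to_purge(report_dates, run_date, retention_days):
    """Return the report creation dates that a purge run on run_date removes.

    A report is removed when it is at least retention_days old on the day the
    Subject Data Report Purge job runs; newer reports survive until a later run.
    """
    cutoff = run_date - timedelta(days=retention_days)
    return [created for created in report_dates if created <= cutoff]


# Retention period of 2 days, purge interval of 7 days: a report created
# yesterday survives today's run and is purged by the run seven days from now.
today = date(2022, 4, 15)  # hypothetical run date
reports = [today - timedelta(days=10), today - timedelta(days=1)]
print(reports_to_purge(reports, today, retention_days=2))
# [datetime.date(2022, 4, 5)]
```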
When you run a Subject Registry scan, the service creates a master Subject Registry scan job. The job
includes multiple child jobs based on how you configure the scan. For example, you can choose to run golden
and transaction data store scans in the same scan or different scans. You can choose to include the required
domain discovery scan in the subject registry scan or run the domain discovery scan as a separate scan.
Poll Domain Discovery
Tracks all the domain discovery jobs that run in the scan. Created if you include the Domain Discovery scan type in the scan. If you have not run a domain discovery scan on data stores in the subject registry, you can configure domain discovery properties in the subject registry scan. All domain discovery scans must finish before other jobs can run.
Scan Golden Data Stores
Tracks subject scans of all golden data stores that you include in the scan. Created if you include the Discover Subjects scan type in the scan.
Scan Transaction Data Stores
Tracks subject scans of all transaction data stores that you include in the scan. Created if you include the Link Subjects scan type in the scan.
Note: A child job is triggered after the previous child job finishes successfully.
When the service creates a master Subject Registry job, it also triggers individual Subject Scan jobs. Subject
scans are scan jobs that run on each golden and transaction data store included in the scan. A separate scan
job runs for each data store.
Deleting a master job deletes all its child jobs and the subject scans that it creates. If you delete the master
job, a message appears to confirm that you want to delete the job and all its child jobs and subject scans.
On the Jobs workspace, you can filter and view jobs by the Subject Registry and Subject Scan job types.
The Data Privacy dashboard job fetches the updated data from the Subject Registry tables and refreshes the
data counts of the Privacy dashboard.
The job must complete successfully for the updated data to appear when you refresh the dashboard.
You cannot start, stop, or pause a Privacy Dashboard job. You can resume a job that pauses or fails.
The job is listed in the Jobs workspace as: Privacy_Dashboard_<subject registry scan name>.
The Sync Catalog Updates job includes the following job steps:
Sync Catalog Updates
Synchronizes the information from Enterprise Data Catalog that the Import Catalog Updates job imported with the information about the data store in Data Privacy Management.
Evaluate Classification Policies
Evaluates if the data stores contain the classification of sensitive data as defined in the classification policies imported from Enterprise Data Catalog. Calculates the risk score, residual risk cost, and sensitivity levels. Calculates the total number of sensitive fields or files that each data store contains.
The job runs when you select Synchronize > Users from the Actions menu on the Data Stores workspace.
Sync Users
The Sync Users job step queries the list of domain users in the Enterprise Unified Metadata platform and
copies the user details to the repository.
To view and manage new users in Data Privacy Management, you must log out of Data Privacy Management
and log in again after you add the users in the Administrator tool and run the Synchronize Users job.
Job Management
On the Jobs workspace, you can monitor and manage jobs. You can also download reports for scan and
Import jobs.
Pausing a Job
You can pause a job that is in progress. When you pause a job, the job pauses at the next logical point within
the current job step or at the end of the current job step. For example, if you pause an Informatica
Resuming a Job
You can resume a job that has a failed, paused, or stopped status.
Stopping a Job
You can stop a job that is in progress. When you stop a job, the Data Privacy Management Service aborts the
job immediately. If you stop an alert or import job, the job stops after the current job step completes. To stop
the job at the next logical step, pause the job instead.
Exporting Jobs
You can export a list of jobs to a CSV file. The export file includes a list of all jobs and the corresponding job
properties. If you filtered the jobs list, the export file contains the jobs that match the active filter conditions.
After you scan data stores, you can view the scan results. You download the reports in a .zip file from the Job
Details page.
• Manually update individual fields/files on the Sensitive Fields or Sensitive Files page accessed from the
Security Dashboard.
• Update fields and then import the CSV file on the Data Stores workspace. From the Actions menu, select
Import > Protection Status and import a CSV file.
Related Topics:
• “Changing the Default Conformance Ranges” on page 45
• “Conformance Score” on page 209
After you scan unstructured data stores with a remote agent, you can view the scan results. You download
the Browse job step report in a CSV file from the Job Details page. You download the AgentProfiling job step
report in a ZIP file from the Job Details page.
You can use the report to process the rejected records. The report is a CSV file in the same format that the Import job requires. Correct the records in the file and remove the column that contains the
rejection reason. Then, import the file on the workspace that corresponds to the type of information that the
job imported.
Chapter 15
Users
A user is an account that accesses data in a data store. A user can be an application user, a database
user, or an operating system user. You manage users in external systems or LDAP directory services.
You import metadata related to the user account such as full name, department, and location. You can
import users from an LDAP directory service, Salesforce, and from a CSV file.
Groups
A group is a set of related users or groups that have the same data store access levels. You can import
groups from an LDAP directory service, Salesforce, and from a CSV file.
Group memberships
A group membership assigns users and groups to a parent group. You can import group memberships from an LDAP directory service, Salesforce, and from a CSV file.
User access
User access specifies users or groups that have access to data stores and the type of access. You can
import user access from Salesforce and from a CSV file.
User activity
User activity is a set of actions that a user performs on sensitive data. You can import user activity from
Salesforce.
User aliases
A user alias is an alternative account name for a user. If a user has multiple accounts, you can import a
list of user aliases to resolve multiple accounts to one user. You import user aliases from a CSV file.
You can import data from a CSV file. You can set up a recurring job that synchronizes Data Privacy
Management with user metadata from an LDAP directory service or Salesforce. When you import user
metadata, you schedule an import job. The import job saves users, groups, group memberships, user access,
and user aliases metadata in the Data Privacy Management repository. The import job saves user activity
metadata in the user activity store.
If user information changes in the source system, you must update Data Privacy Management to reflect the
changes. Import changes from a .csv file or run a synchronization job. Use the same method to create and
update user metadata. For example, if you imported users from an LDAP directory service, run an LDAP
synchronization job. If you imported users from a .csv file, import changes in a .csv file.
You can export information about users, user groups, user group membership, user access, and user aliases
to a CSV file. You can use the exported information for reporting or analysis. You can also use the exported
CSV file to update user metadata in the Data Privacy Management repository.
Create a folder on the domain machine. Enter the folder location when you configure the Data Privacy Management Service during installation. You can also update the information in the Data Privacy Management Service properties from the Administrator tool after installation.
Mount the folder that you create on the domain machine to a location on the Spark node. In a multinode
setup, mount the location to all Spark nodes. The path to the mounted location must be identical on the
domain machine and all Spark nodes.
Create the folder on the domain machine and then create identical folder paths on all Spark nodes. Create the
mount on each node after you create the folder paths.
You can import users, groups, and group memberships from an LDAP directory service. You can import users, groups, group memberships, user access, and user activity from Salesforce. You can import users, groups, group memberships, user access, and user aliases from CSV files.
Related Topics:
• “Import and Synchronization from an LDAP Directory Service” on page 314
• “Import and Synchronization from Salesforce” on page 319
• “User Import from a CSV File” on page 323
• “User Group Import from a CSV File” on page 326
• “User Group Membership Import from a CSV File” on page 329
• “User Access Import from a CSV File” on page 332
• “User Aliases Import from a CSV File” on page 337
Use the following process to import metadata about users and user access:
1. Create a connection to an LDAP directory service and schedule when the synchronization occurs.
The synchronization imports users, groups, and group memberships from the LDAP directory service
based on the criteria that you specify in the connection.
2. Import user access metadata from a CSV file.
Import the level of data store access for a user or group. When you import user access metadata for a
group, all users that are members of the group and the nested groups inherit the same user access
levels.
3. Optionally, import user aliases from a CSV file.
If a user has multiple user accounts, import a list of user aliases to resolve multiple user accounts to one
user.
You can import from Microsoft Active Directory and IBM Tivoli Directory Service. If you have users in other
directory services, you must import users, groups, and group memberships from CSV files.
To import users, groups, and memberships from an LDAP directory service, you create a connection to an
LDAP server. You configure the connection to the LDAP server that contains the directory service from which
you want to import. You specify a base distinguished name and a search filter to define the set of users and
groups that the import job imports into the Data Privacy Management repository. The base distinguished
name is the point from which the LDAP server starts the search with the search filter.
You define when the import occurs and set up a schedule to periodically synchronize the Data Privacy
Management repository with the LDAP directory service. When you save the connection, the Data Privacy
Management Service creates an import job. You can schedule the job to run immediately, in the future, or set
up a job that runs on a regular basis.
During the synchronization, the import job uses the base distinguished name and search filter to read users,
groups, and group memberships from the LDAP directory service. The import job updates the Data Privacy
Management repository with additions, changes, and deletions for objects that are included in the LDAP
search results. The import job deletes an object from the Data Privacy Management repository if the object is
no longer available in the search results for an import job with the same base distinguished name and search
filter.
The connection properties cannot exceed 255 characters and are not case sensitive.
Name
Required. Name of the LDAP connection that is stored in the Data Privacy Management repository. The name must be unique.
Description
Optional. Description of the connection.
Server Name
Required. Host name or IP address of the machine that hosts the LDAP directory service.
Port
Required. Listening port for the LDAP server. This is the port number to communicate with the LDAP
directory service.
Required. Type of LDAP directory service. Select from the following directory services:
- Microsoft Active Directory
- IBM Tivoli Directory Service
User Name
Optional. Name of the user account that logs in to the LDAP directory service. Leave blank for anonymous login.
Password
Required if you specify the user name. Password for the LDAP user name. Leave blank for anonymous login.
Base DN
Required. Distinguished name (DN) of the entry that serves as the starting point to search for users and
groups in the LDAP directory service. The LDAP directory service finds an object in the directory
according to the path in the distinguished name of the object. The import job uses the search base and
filters to define the set of users and groups to import from the LDAP directory service.
For example, in Microsoft Active Directory, the distinguished name of a user object might be
cn=UserName,ou=OrganizationalUnit,dc=DomainName, where the series of relative distinguished names
denoted by dc=DomainName identifies the DNS domain of the object.
For more information about search filters, see the documentation for the LDAP directory service.
Search Filter
Optional. An LDAP query string that specifies the criteria to search for users and groups in the LDAP
directory service. The filter can specify attribute types, assertion values, and matching criteria. The
import job uses the combination of the base distinguished name and search filter to define the set of
users and groups to import from the LDAP directory service.
If blank, the import job uses the filter (objectclass=*) to import all users and groups in the base
distinguished name.
For more information about search filters, see the documentation for the LDAP directory service.
Determines whether the LDAP server uses the Secure Sockets Layer (SSL) protocol. Required to connect to an SSL-enabled directory service.
If enabled, the LDAP server uses the Secure Sockets Layer (SSL) protocol.
If not enabled, the LDAP server does not use the Secure Sockets Layer (SSL) protocol. Use anonymous login or simple authentication instead.
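Before you save the connection, you can verify that the base distinguished name and search filter return the users and groups that you expect. The following sketch uses the third-party ldap3 Python library, which is not part of Data Privacy Management; the host, credentials, base DN, and filter are placeholder values that you would replace with the ones you plan to enter in the connection properties.

    from ldap3 import ALL, Connection, Server

    # Placeholder server and credentials.
    server = Server("ldap.example.com", port=389, get_info=ALL)
    conn = Connection(
        server,
        user="cn=svc_dpm,ou=Service Accounts,dc=example,dc=com",
        password="********",
        auto_bind=True,
    )

    # Preview the entries that the base DN and search filter would match.
    conn.search(
        search_base="dc=example,dc=com",
        search_filter="(&(objectClass=user)(department=Finance))",
        attributes=["cn"],
    )
    for entry in conn.entries:
        print(entry.entry_dn)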
Create an LDAP connection for each set of users and groups to import from an LDAP directory service. Set up search bases and filters to define the set of users and groups that the import job imports into the Data Privacy Management repository. You can create multiple connections to the same directory service if each connection uses a different search filter.
For example, you defined a connection to an LDAP directory service and scheduled the synchronization to run
every Monday. You had major changes to users on a Wednesday and do not want to wait for the next
synchronization to occur the following Monday. You run the synchronization today. The next synchronization
occurs as defined in the LDAP connection.
You can synchronize one LDAP connection at a time. To synchronize multiple LDAP connections, repeat the
procedure for each LDAP connection.
When an import job starts, the job uses the LDAP connection properties that exist at the time the job starts. If
you edit an LDAP connection that is included in a running import job, the job does not include the changes.
You can create multiple connections to the same LDAP directory service as long as the search base and
filters are unique for each connection.
To import user metadata from Salesforce, you create a connection to a Salesforce instance. You add the
Salesforce data store from which to import user metadata. You define when the import occurs and set up a
schedule to periodically synchronize Data Privacy Management with Salesforce. When you save the
connection, the Data Privacy Management Service creates an Import job. You can schedule the job to run
immediately, in the future, or set up a job that runs on a regular basis.
During the synchronization, the Import job uses the Salesforce Bulk REST API to query the following
information from Salesforce:
Users
The Import job retrieves active and inactive users. The job retrieves the user ID, department, email, first
and last name, active status, manager ID, title, and user name.
User groups
The Import job retrieves user groups from Salesforce permission sets when you import user access.
User group memberships
The Import job retrieves user group memberships from the Salesforce permission set assignments when you import user access.
User access
The Import job queries the Salesforce ObjectPermission and FieldPermission objects to retrieve user
access information for user groups. All users that are members of the group and the nested groups
inherit the same user access. The job imports user access only for data that is identified as sensitive in
the Data Privacy Management repository.
User activity
The Import job retrieves user activity from the Salesforce EventLogFile object for the API, REST API, and
Bulk API event types. The job stores the events in the user activity store. The job imports user activity
from a predefined time range. The time range that the job uses depends on the import synchronization
schedule. The time frame does not include the current day when the job runs. The time frame starts the
day before the job runs.
For example, if you run the job immediately or schedule the job to run monthly, the job imports user
activity from the last 30 days. If you schedule the job to run daily, the job imports user activity from the
last day. If you schedule the job to run weekly, the job imports user activity from the last week.
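The schedule-to-window mapping described above can be summarized in a short sketch. This is for reference only; the exact window boundaries that the Import job uses are internal to Data Privacy Management, and the day-based arithmetic here is an assumption.

    from datetime import date, timedelta

    # Approximate lookback window implied by the schedule. The window ends the
    # day before the job runs; the current day is excluded.
    LOOKBACK_DAYS = {"immediate": 30, "monthly": 30, "weekly": 7, "daily": 1}

    def activity_window(schedule, run_date):
        end = run_date - timedelta(days=1)
        start = end - timedelta(days=LOOKBACK_DAYS[schedule] - 1)
        return start, end

    print(activity_window("weekly", date(2022, 4, 25)))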
The Users workspace counts and the dashboard include all imported active and inactive users. The risk score
only considers active users in the user access count factor. The risk score considers active and inactive
users in the user activity count factor.
You can create multiple connections that import metadata from the same Salesforce data store. For
example, you create one connection to import users and user access on one synchronization schedule, such
as on a daily basis. You create another connection to import users and user activity on a different
synchronization schedule, such as on a weekly basis.
The import job only imports user access for data that is identified as sensitive in the Data Privacy
Management repository. Before you create a Salesforce connection, scan the Salesforce data store to
identify sensitive data.
Name
Required. Name of the Salesforce synchronization connection to create in the Data Privacy Management repository. The name must be unique.
Connection
Required. Name of the Salesforce data store from which you want to import data. You can choose
an existing data store or create a data store that connects to Salesforce. The list shows the
Salesforce data stores that exist in the Data Privacy Management repository and that include
complete connection properties. The list does not show data stores with incomplete connection
properties as the job cannot connect to an incomplete data store.
URL
Read-only. When you select a connection, the Data Privacy Management Service populates the value
with the Salesforce Service URL configured in the data store.
5. Specify the objects to synchronize.
You can select one or more options in the same synchronization job. By default, the job always
synchronizes users.
6. Click Next.
The Schedule page appears.
7. Schedule when the import job runs.
You can schedule the job to run immediately, in the future, or on a recurring schedule.
The scheduling options determine from which time range the job imports user activity. The time range
does not include the current day when the job runs. If you run the job immediately or schedule the job to
run monthly, the job imports user activity information from the last 30 days. If you schedule the job to
run daily, the job imports user activity from the last day. If you schedule the job to run weekly, the job
imports user activity from the last week.
8. Click Save.
The Data Privacy Management Service saves the connection in the Data Privacy Management repository
and creates an import job. The job runs according to the schedule that you configured. Use the Jobs
workspace to monitor the status of the job.
For example, you defined a connection and scheduled the synchronization to run every Monday. You had
major changes to users on a Wednesday and do not want to wait for the next synchronization to occur the
following Monday. You run the synchronization today. The next synchronization occurs as defined in the
connection.
You can synchronize one connection at a time. To synchronize multiple connections, repeat the procedure for
each connection.
When an import job starts, the job uses the connection properties that exist at the time the job starts. You
cannot edit a connection that is included in a running import job.
The following table lists the order of the columns, column headings, and the format for the values in the CSV
file:
1 UserName Required. Name of a user account that accesses information in a data store.
The combination of the UserName and UserDN columns identifies a unique user to
create, update, or delete. If the UserDN is blank, the UserName must be unique.
If blank, the import job rejects the row.
8 Department Optional. Department of the user. This department is not related to the data store
departments in the Data Privacy Management repository.
If blank, default is UNKNOWN.
9 Location Optional. Location of the user. This location is not related to the data store
locations in the Data Privacy Management repository.
If blank, default is UNKNOWN.
13 Action Optional. Instructs the import job to create, update, or delete a user in the Data
Privacy Management repository.
Enter one of the following values:
- U. Creates or updates a user in the Data Privacy Management repository. The
action that the job performs depends on if the user exists in the Data Privacy
Management repository and on the Replace Duplicates option. You configure the
Replace Duplicates option when you import the .csv file.
If the user does not exist, the import job creates the user.
If the user exists and the Replace Duplicates option is enabled, the import job
updates the user.
Warning: If a column in the .csv file contains a null value, the import job replaces
the value in the Data Privacy Management repository with a null value.
If the user exists and the Replace Duplicates option is not enabled, the import job
skips the row.
- D. Deletes a user from the Data Privacy Management repository. The user activity
data related to the user remains.
If the user does not exist, the import job rejects the row.
If blank, default is U. Creates or updates a user in the Data Privacy Management
repository.
Consider the following rules and guidelines when you import users from a .csv file:
• The first row of the CSV file must contain case-sensitive column headings in the required sequence. If the
CSV file does not contain all of the column headings in the required sequence and case, the Import job
fails.
• Each subsequent row after the column headings defines a user.
The combination of the UserName and UserDN columns defines a unique user. If the UserName and
UserDN in one row is the same as the UserName and UserDN in a second row, the import job treats the
second row as a duplicate of the first row.
• If the .csv file contains duplicate entries, the import job processes the entries in the order in which the
rows appear in the file.
• The column values cannot contain the following keywords or terms:
SELECT FROM
SELECT UNION
DELETE FROM
UPDATE SET
<SCRIPT
</SCRIPT>
ALERT(
"/>"
If a row contains keywords or characters that are not valid, the Import job rejects the row. The Import job
logs an error in the job log and processes the remaining rows.
• The column values cannot exceed 255 characters. If a row contains more than 255 characters in any
column, the Import job rejects the row.
• Enclose names that include a dot (.) or comma (,) within quotes for any column value that includes a
name. If a value for a name column includes a dot or comma not enclosed within quotes, the Import job
might not process the column values correctly. For example, the Import job processes a comma as a
delimiter between column values.
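The value rules above can be checked before you import the file. The following Python sketch applies the documented constraints to each value; it assumes the file already contains the required header row and does not verify the column headings themselves.

    import csv

    # Terms that the Import job rejects, as listed above, plus the 255-character
    # limit on column values.
    FORBIDDEN = ["SELECT FROM", "SELECT UNION", "DELETE FROM", "UPDATE SET",
                 "<SCRIPT", "</SCRIPT>", "ALERT(", '"/>"']

    def check_values(path):
        with open(path, newline="") as f:
            reader = csv.reader(f)
            next(reader)  # skip the header row
            for line_no, row in enumerate(reader, start=2):
                for value in row:
                    if any(term in value.upper() for term in FORBIDDEN):
                        print(f"line {line_no}: forbidden term in {value!r}")
                    if len(value) > 255:
                        print(f"line {line_no}: value exceeds 255 characters")

    check_values("users.csv")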
After the import job completes, you can view a list of the imported users in the User Access page of the
dashboard.
Related Topics:
• “Import Job” on page 296
• “Downloading Rejected Records for an Import Job” on page 307
• “Top Users Indicator” on page 526
The following table lists the order of the columns, column headings, and the format for the values in the CSV
file:
7 Action Optional. Instructs the import job to create, update, or delete a group in the Data
Privacy Management repository.
Enter one of the following values:
- U. Creates or updates a group in the Data Privacy Management repository. The
action that the job performs depends on if the group exists in the repository and
on the Replace Duplicates option. You configure the Replace Duplicates option
when you import the CSV file.
If the group does not exist, the import job creates the group.
If the group exists and the Replace Duplicates option is enabled, the import job
updates the group.
Warning: If a column in the CSV file contains a null value, the import job replaces
the value in the Data Privacy Management repository with a null value.
If the group exists and the Replace Duplicates option is not enabled, the import
job skips the row.
- D. Deletes a group and associated group memberships from the repository.
If the group does not exist, the import job rejects the row.
If blank, default is U. Creates or updates a group in the repository.
Consider the following rules and guidelines when you import user groups from a .csv file:
• The first row of the CSV file must contain case-sensitive column headings in the required sequence. If the
CSV file does not contain all of the column headings in the required sequence and case, the Import job
fails.
• Each subsequent row after the column headings defines a group.
The combination of the GroupName and GroupDN columns defines a unique group. If the GroupName and
GroupDN in one row is the same as the GroupName and GroupDN in a second row, the import job treats
the second row as a duplicate of the first row.
• If the CSV file contains duplicate entries, the Import job processes the entries in the order in which the
rows appear in the file.
• The column values cannot contain the following keywords or terms:
SELECT FROM
SELECT UNION
DELETE FROM
UPDATE SET
<SCRIPT
</SCRIPT>
ALERT(
"/>"
If a row contains keywords or characters that are not valid, the Import job rejects the row. The Import job
logs an error in the job log and processes the remaining rows.
• The column values cannot exceed 255 characters. If a row contains more than 255 characters in any
column, the Import job rejects the row.
• Enclose names that include a dot (.) or comma (,) within quotes for any column value that includes a
name. If a value for a name column includes a dot or comma not enclosed within quotes, the Import job
might not process the column values correctly. For example, the Import job processes a comma as a
delimiter between column values.
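Because duplicate rows are processed in file order, it can help to flag duplicates before the import decides how to treat them. A minimal sketch follows; the positions of the GroupName and GroupDN columns are an assumption and must match your header row.

    import csv
    from collections import Counter

    # Count duplicate rows by the pair of columns that defines a unique group.
    GROUP_NAME_COL, GROUP_DN_COL = 0, 1

    with open("groups.csv", newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        pairs = Counter((row[GROUP_NAME_COL], row[GROUP_DN_COL]) for row in reader)

    for pair, count in pairs.items():
        if count > 1:
            print(f"duplicate rows for group {pair}: {count} occurrences")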
After the import job completes, you can view a list of the imported groups in the User Access page of the
dashboard.
Related Topics:
• “Import Job” on page 296
• “Downloading Rejected Records for an Import Job” on page 307
When you import group membership, you specify the users and groups that are assigned to a parent group.
The users and groups must exist in the repository before you begin the import.
The following table lists the order of the columns, column headings, and the format for the values in the CSV
file:
6 Action Optional. Instructs the import job to create, update, or delete a group membership in
the repository.
Enter one of the following values:
- U. Creates or updates a group membership in the repository. The action that the
job performs depends on if the group membership exists in the repository and on
the Replace Duplicates option. You configure the Replace Duplicates option when
you import the .csv file.
If the group membership does not exist, the import job creates the group
membership.
If the group membership exists and the Replace Duplicates option is enabled, the
import job updates the group membership.
If the group membership exists and the Replace Duplicates option is not enabled,
the import job skips the row.
- D. Deletes a group membership from the repository.
If the group membership does not exist, the import job rejects the row.
If blank, default is U. Creates or updates a group membership in the repository.
The following image shows the text view of a sample .csv file where the first row contains the column
headers, and each subsequent row contains the details for a group membership:
Consider the following rules and guidelines when you import group memberships from a .csv file:
• The first row of the CSV file must contain case-sensitive column headings in the required sequence. If the
CSV file does not contain all of the column headings in the required sequence and case, the Import job
fails.
• Each subsequent row after the column headings defines a group membership.
The combination of the MemberName and MemberDN defines a unique member for which to create or
update memberships. A member can be a user or a group. If the MemberName and MemberDN in one row
is the same as the MemberName and MemberDN in a second row, the import job treats the second row as
a duplicate of the first row.
• If the CSV file contains duplicate entries, the Import job processes the entries in the order in which the
rows appear in the file.
• The column values cannot contain the following keywords or terms:
SELECT FROM
SELECT UNION
DELETE FROM
UPDATE SET
<SCRIPT
</SCRIPT>
ALERT(
"/>"
If a row contains keywords or characters that are not valid, the Import job rejects the row. The Import job
logs an error in the job log and processes the remaining rows.
• The column values cannot exceed 255 characters. If a row contains more than 255 characters in any
column, the Import job rejects the row.
• Enclose names that include a dot (.) or comma (,) within quotes for any column value that includes a
name. If a value for a name column includes a dot or comma not enclosed within quotes, the Import job
might not process the column values correctly. For example, the Import job processes a comma as a
delimiter between column values.
Related Topics:
• “Import Job” on page 296
• “Downloading Rejected Records for an Import Job” on page 307
You import user access details into the Data Privacy Management repository from a CSV file if you manage
user access in an access management system such as SailPoint. You import the level of access that a user
or group has to a data store. You can specify access at the data store, schema, table, or column levels.
You also import the type of permissions the user or group has to the data store, such as read, write, or delete
permissions. For example, user jdoe has read and write permissions on the employee table in the payroll
schema. When you import user access information for a group, all users that are members of the group and
the nested groups inherit the same user access levels.
When you import user access, you specify users or groups that have access to data stores. The users,
groups, and data stores must exist in the repository before you begin the import. To import user access to a
schema, table, or column in a data store, the repository must include scan results for the data store.
The following table lists the order of the columns, column headings, and the format for the values in the CSV
file:
1 MemberName Required. User or group name for which you want to import access levels.
The combination of the MemberName and MemberDN columns identifies a unique user
to create, update, or delete.
If blank, the import job rejects the row.
3 MemberType Optional. Determines the type of member for the member name.
Enter one of the following values:
- U. The member name is a user.
- G. The member name is a group.
If blank, default is U (User).
4 AccessType Optional. Determines the level of access that the user or group has to a data store.
Enter one of the following values:
- R. Read access.
- W. Write access.
- D. Delete access.
- RWD. Read, write, and delete access.
- All. Includes all administrator privileges, such as modify schema and tables.
If blank, default is All.
5 DataStore Required. Name of the data store to which the user has access.
You can specify one data store in a row. To specify access to multiple data stores for a
member, create one row for each data store.
If the data store name is blank or does not exist in the repository, the import job rejects
the row.
If you do not enter a SchemaName, TableName, or ColumnName, the import job sets
access to all schemas and tables for the data store.
6 SchemaName Optional. Name of the schema that identifies the table to which the user or group has
access. The name is case sensitive.
If you specify a SchemaName, you must specify a TableName.
If blank, the import job sets access to all schemas that Data Privacy Management
identified as sensitive for the data store.
If the table is not identified as sensitive in the repository or if the case does not match,
the import job rejects the row.
7 TableName Required if you specified a SchemaName or a ColumnName. Name of the table in the
data store to which the user or group has access. The name is case sensitive.
If blank and you did not specify a SchemaName, the import job sets access to all tables
that Data Privacy Management identified as sensitive for the data store.
If blank and you did specify a SchemaName, the import job rejects the row.
If the table is not identified as sensitive in the repository or if the case does not match,
the import job rejects the row.
8 ColumnName Optional. Name of the column to which the user or group has access. The name is case
sensitive.
If you specify a ColumnName, you must specify a TableName.
If blank, the import job sets access to all columns that Data Privacy Management
identified as sensitive for the table in the data store.
If the column is not identified as sensitive in the repository or if the case does not
match, the import job rejects the row.
9 Action Optional. Instructs the import job to create, update, or delete user access in the
repository.
Enter one of the following values:
- U. Creates or updates user access in the repository. The action that the job performs
depends on if the user access exists in the repository and on the Replace Duplicates
option. You configure the Replace Duplicates option when you import the CSV file.
If the user access does not exist, the import job creates the user access.
If the user access exists and the Replace Duplicates option is enabled, the import job
updates the user access.
Warning: If a column in the .csv file contains a null value, the import job replaces the
value in the repository with a null value.
If the user access exists and the Replace Duplicates option is not enabled, the import
job skips the row.
- D. Deletes user access from the repository.
If the user access does not exist, the import job rejects the row.
If blank, default is U. Creates or updates user access in the repository.
The following image shows the text view of a sample .csv file where the first row contains the column
headers and each subsequent row contains the details for user access:
Consider the following rules and guidelines when you import user access from a .csv file:
• The first row of the CSV file must contain case-sensitive column headings in the required sequence. If the
CSV file does not contain all of the column headings in the required sequence and case, the Import job
fails.
• Each subsequent row after the column headings defines user access.
The combination of the MemberName and MemberDN columns defines one unique member for which to
create or update access. A member can be a user or a group. If the MemberName and MemberDN in one
row is the same as the MemberName and MemberDN in a second row, the import job treats the second
row as a duplicate of the first row.
• If the .csv file contains duplicate entries, the import job processes the entries in the order in which the
rows appear in the file.
• The column values cannot contain the following keywords or terms:
SELECT FROM
SELECT UNION
DELETE FROM
UPDATE SET
<SCRIPT
</SCRIPT>
ALERT(
"/>"
If a row contains keywords or characters that are not valid, the Import job rejects the row. The Import job
logs an error in the job log and processes the remaining rows.
• The column values cannot exceed 255 characters. If a row contains more than 255 characters in any
column, the Import job rejects the row.
• Enclose names that include a dot (.) or comma (,) within quotes for any column value that includes a
name. If a value for a name column includes a dot or comma not enclosed within quotes, the Import job
might not process the column values correctly. For example, the Import job processes a comma as a
delimiter between column values.
After the import job completes, you can view how many users have access to sensitive fields in the User
Access indicator of the dashboard. You can also view the access levels for a user in the User Access page.
Related Topics:
• “Import Job” on page 296
• “Downloading Rejected Records for an Import Job” on page 307
• “User Access Indicator” on page 528
For example, John Doe has a user account jdoe in an LDAP directory service and a user account johnd in an
external user management system. You can import johnd as a user alias for jdoe. When you view user access
and user activity information in the dashboard, data for both user accounts is consolidated into one account.
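Conceptually, the consolidation is a lookup from alias account to primary account. The following toy sketch illustrates the idea only; it is not how Data Privacy Management stores aliases, and the account names come from the example above.

    # Toy illustration only.
    aliases = {"johnd": "jdoe"}            # alias account -> primary account

    activity = {"jdoe": 12, "johnd": 5}    # raw event counts by account name

    consolidated = {}
    for account, count in activity.items():
        primary = aliases.get(account, account)
        consolidated[primary] = consolidated.get(primary, 0) + count

    print(consolidated)  # {'jdoe': 17}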
When you import user aliases, you specify alias names for users. The users must exist in the Data Privacy
Management repository before you begin the import.
The following table lists the order of the columns, column headings, and the format for the values in the CSV
file:
1 UserName Required. Name of a user account that accesses information in a data store.
The combination of the UserName and UserDN columns identifies a unique user for
which to create or delete an alias.
If blank, the import job rejects the row.
4 Action Optional. Instructs the import job to create or delete an alias in the repository.
Enter one of the following values:
- U. Creates an alias in the repository.
- D. Deletes an alias from the repository.
If blank, default is U. Creates an alias in the repository.
The following image shows the text view of a sample .csv file where the first row contains the column
headers and each subsequent row contains the details for an alias:
Consider the following rules and guidelines when you import user aliases from a .csv file:
• The first row of the CSV file must contain case-sensitive column headings in the required sequence. If the
CSV file does not contain all of the column headings in the required sequence and case, the Import job
fails.
• Each subsequent row after the column headings defines one alias for one user.
The combination of the UserName and UserDN columns defines a unique user. If the UserName and
UserDN in one row is the same as the UserName and UserDN in a second row, the import job creates two
aliases for the user.
• The column values cannot contain the following keywords or terms:
SELECT FROM
SELECT UNION
DELETE FROM
UPDATE SET
<SCRIPT
</SCRIPT>
ALERT(
"/>"
If a row contains keywords or characters that are not valid, the Import job rejects the row. The Import job
logs an error in the job log and processes the remaining rows.
• The column values cannot exceed 255 characters. If a row contains more than 255 characters in any
column, the Import job rejects the row.
• Enclose names that include a dot (.) or comma (,) within quotes for any column value that includes a
name. If a value for a name column includes a dot or comma not enclosed within quotes, the Import job
might not process the column values correctly. For example, the Import job processes a comma as a
delimiter between column values.
Related Topics:
• “Import Job” on page 296
• “Downloading Rejected Records for an Import Job” on page 307
You cannot create, edit, or delete user and user access information from a workspace. To add, edit, or delete
user information, you must import details from a file. You can export the type of user information that you
want to edit. The export file contains the same format that is required for the import. Add changes to the file.
Then, import the file. When you import the file, the import job updates user information in the repository.
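The export-edit-import round trip can also be scripted. The sketch below reads an exported file, applies a caller-supplied edit to every row, and writes a file for import. The file names and the edit itself are placeholders, and the column layout is whatever the export produced.

    import csv

    def edit_export(src_path, dst_path, transform):
        """Copy an exported CSV, applying a row-level edit, for re-import."""
        with open(src_path, newline="") as src, \
                open(dst_path, "w", newline="") as dst:
            reader = csv.reader(src)
            writer = csv.writer(dst)
            writer.writerow(next(reader))  # keep the header row unchanged
            for row in reader:
                writer.writerow(transform(row))

    # Example call with a pass-through edit; replace the lambda with your change.
    edit_export("users_export.csv", "users_import.csv", lambda row: row)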
Anomaly Detection
Data Privacy Management analyzes user activity events to determine baseline behavior for a user and for the
user's peer group. To determine a baseline, it tracks trends in behavior over evaluation intervals for a set of
system-defined factors. When user activity behavior significantly deviates from the baseline, Data Privacy
Management detects an anomaly.
An anomaly is a set of one or more user activity events that deviate from the baseline behavior. For example,
Data Privacy Management detects an anomaly when a user downloads a high amount of sensitive data
outside of the user's normal working day and time. The anomaly contains the observed and expected values
for the anomalous factors used to detect the anomaly.
An anomalous factor is a parameter that Data Privacy Management uses to determine different types of
irregular behavior. For example, the time of day anomalous factor identifies irregular behavior based on the
time of the day that a user performs activity. A user normally performs activity from 8AM through 6PM. If the
user performs activity at 3AM, Data Privacy Management detects an anomaly.
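As an illustration of the idea only, the sketch below flags activity that falls outside a user's observed working hours. Data Privacy Management uses its own internal algorithm; this toy example simply restates the 8 AM through 6 PM scenario above.

    # Toy illustration of baseline versus observed behavior.
    baseline_hours = set(range(8, 19))  # user normally active 8 AM through 6 PM

    def is_time_of_day_anomaly(event_hour):
        return event_hour not in baseline_hours

    print(is_time_of_day_anomaly(3))    # True: activity at 3 AM deviates
    print(is_time_of_day_anomaly(10))   # False: within normal hours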
Use the anomaly detection workspace to view the anomalies detected. The workspace includes predefined
views that display anomaly information in a variety of ways. For example, you can view a list of users that
had the most anomalies. You can view the data stores and data domains that were involved with the most
anomalies. You can mark an anomaly as read or flag an anomaly to review at a later time. You can also
delete anomalies or suppress certain types of anomalies from appearing on the UI.
The anomaly detection workspace is a dashboard that displays all anomalies. To generate security policy
violations, email notifications, or custom actions when Data Privacy Management detects an anomaly, you
must configure a security policy.
Anomaly Detection Prerequisites
Data Privacy Management uses several components to detect anomalous behavior on sensitive data and to
display information related to the anomaly.
Import user activity events into the user activity store.
Required to detect anomalous user behavior. Data Privacy Management analyzes user activity events to
determine baseline behavior for a user and for the user's peer group. When user activity behavior
deviates from the baseline, Data Privacy Management detects an anomaly. You can import user activity
events from system logs in CEF, LEEF, or JSON formats. You can also import user activity events from
Salesforce sources.
Set the default retention period for how long you want to store user activity events in the user activity store.
The anomaly detection workspace displays anomalies that exist in the user activity store. The Data
Privacy Management Service purges events from the user activity store based on the retention period.
Determine the period of time in which you want users to analyze anomalies. For example, if you want
users to view anomalies detected up to one month ago, set the retention period to 31 days. To set the
retention period, configure the Event Details Retention Period property in the Data Privacy Management
Service.
Set up data store access for Data Privacy Management user accounts.
Required for users to view anomalies in the anomaly detection workspace. Users can see anomalies that are associated to the data stores to which they have access. To set up access to data stores, assign the user account to the security group that is assigned to the data store.
Note: The user accounts are for users that log in to Data Privacy Management and not for users that
perform activity on data stores.
Import metadata about users from external user management systems into the Data Privacy Management repository.
Required for the anomaly details to show information related to the user that performed the anomalous
behavior. For example, the user full name, department, title, and office location. Required to determine
the baseline behavior of the user peer group. Data Privacy Management uses the user department to
determine user peer group behavior.
Note: A user is an account that accesses sensitive data in a data store, and not a user account that logs
in to Data Privacy Management.
If you do not import user metadata, then the anomaly details only show the user name that performed
the anomalous user activity event.
Create a location for each geographic area from which user activity originates.
Required for Data Privacy Management to detect anomalies and security policy violations based on the
location of the user activity.
When you create a location, you specify an IP address subnet to identify a geographic origin of user
activity. When Data Privacy Management evaluates user activity events for anomalous behavior, it tries
to identify the event location. To identify the event location, it matches the IP address in the event to the
IP address range in the location configuration. You can specify if the IP address range is from a VPN
provider. If enabled, Data Privacy Management does not detect anomalies based on distance for the
Relocation anomalous factor.
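The matching is conceptually a subnet membership test. The following minimal illustration uses Python's ipaddress module with placeholder addresses; it is not the product's implementation.

    import ipaddress

    # Placeholder location: a name and the IP subnet that identifies it.
    locations = {"London office": ipaddress.ip_network("10.20.0.0/16")}

    def event_location(event_ip):
        """Return the configured location whose subnet contains the event IP."""
        addr = ipaddress.ip_address(event_ip)
        for name, subnet in locations.items():
            if addr in subnet:
                return name
        return None

    print(event_location("10.20.5.17"))  # London office
    print(event_location("192.0.2.1"))   # None: no configured location matches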
Scan data stores to identify sensitive data.
Required to detect anomalous behavior on sensitive data. Data Privacy Management identifies
anomalous behavior for activity on data that is identified as sensitive in the repository.
The workspace displays anomalies based on the event retention period for the user activity store and based
on data store access for the user. The event retention period that you configure in the Data Privacy
Management Service properties determines how long anomalies remain in the user activity store. The data
store security groups determine the users that have access to the anomalies. You can see anomalies that are
associated with at least one data store to which you have access. You have access to a data store if your
user account belongs to the same security group that is assigned to the data store.
To access the Anomaly Detection workspace, click the Anomalies icon in the header.
Anomaly Summary
Use the anomaly summary to quickly assess the number and severity of anomalies across your enterprise.
You can view the number of users and data stores associated to anomalies and the number of anomalies in
each severity level. The anomaly summary shows information for all users based on the selected time period
and filter conditions in the workspace.
Users
Aggregated number of users associated to the anomalies based on the selected time period and filter
conditions. The user count from the anomaly summary is not necessarily the same as the anomaly count
from the anomalies list. For example, you can have 14 users and 100 anomalies. An anomaly can only
include one user. However, one user can have multiple anomalies.
To view a list of all users that are associated to anomalies, click the Top Users indicator title, or select
View By > User.
Data Stores
Aggregated number of data stores associated to the anomalies based on the selected time period and
filter conditions. A data store is associated with an anomaly when at least one user activity event in the
anomaly included the data store. The count only includes the data stores to which you have access. An
anomaly can include multiple data stores. A data store can be included in multiple anomalies. The
relocation speed anomalous factor is not associated to data stores.
Note: The data store filter does not affect the data store count. For example, if you filter the workspace
by one data store, the data store count might show more than one data store. The anomaly that includes
the filtered data store can also include other data stores. The count includes all data stores for the
anomaly regardless of the data store filter.
Severity
Total number of anomalies in each severity level based on the selected time period and filter conditions.
Data Privacy Management uses an internal algorithm to calculate the severity level for an anomaly. The
severity level is mainly based on the number of anomalous factors used to detect the anomaly. An
anomaly can only have one severity level.
To view a list of all anomalies for each severity level, view the anomalies list panel, or select View By >
None. Then, filter the workspace by the severity level.
Related Topics:
• “Changing the Default Anomaly Severity Ranges” on page 46
By default, the list is sorted by an internal anomaly score in descending order, then by anomalous factor
name in ascending order.
The following figure shows an example of the Top Anomalous Factors indicator:
Click the title to open a list of all anomalous factors that are associated to at least one anomaly. You can
then drill down on the users, data stores, or anomalies for each anomalous factor.
2. Anomalous factors
Anomalous factors that are associated with anomalies. Each row contains a bar that represents the
number of anomalies that are associated to the anomalous factor.
Click an anomalous factor row to filter the workspace to show information related to the selected
anomalous factor. A filter condition with the selected anomalous factor name appears at the top of the
workspace. The anomaly summary, anomalies graph, anomaly indicators, and anomalies list display
information only for the selected anomalous factor.
3. Anomaly count
Displays the number of anomalies that are associated to the anomalous factors.
4. Page icons
Click the back and forward arrows to scroll through the top 25 anomalous factors.
By default, the list is sorted by an internal anomaly score in descending order, then by data store name in
ascending order.
Note: When you apply a filter to the workspace, the workspace includes information for all anomalies that
meet the filter conditions. If you filter the workspace by specific data stores, the Top Data Stores indicator
might show additional data stores than the ones you specified in the filter condition. The anomalies that
include the filtered data stores can also include other data stores. For example, you filter the workspace by
ORA1 data store. The anomalies that are associated with ORA1 are also associated with data stores
PCRS_Linux and ORA2. The Top Data Stores indicator includes data stores ORA1, PCRS_Linux, and ORA2.
The following figure shows an example of the Top Data Stores indicator:
Click the title to open a list of all data stores that are associated to at least one anomaly. You can then
drill down on the users or anomalies for each data store.
2. Data stores
Data stores that are associated with anomalies. A data store is associated with an anomaly when at
least one user activity event in the anomaly included the data store.
Each row contains a bar that represents the number of anomalies that are associated to the data store.
The bar contains different colors based on the anomaly severity level.
Click a data store row to filter the workspace to show information related to the selected data store. A filter condition with the selected data store name appears at the top of the workspace. The anomaly summary, anomalies graph, anomaly indicators, and anomalies list display information only for the selected data store.
3. Anomaly count
Displays the number of anomalies that are associated to the data stores.
4. Page icons
Click the back and forward arrows to scroll through the top 25 data stores.
By default, the list is sorted by an internal anomaly score in descending order, then by user name in
ascending order.
Click the title to open a list of all users that are associated to at least one anomaly. You can then drill
down on the data stores or anomalies for each user.
2. Users
Name of the user that performed the user activity events that are associated with the anomaly. If the
user account is associated with a user in the Data Privacy Management repository, the indicator displays
the user's full name. If the user account is not associated with a user, the indicator displays the user
account name from the source of the user activity information.
Each row contains a bar that represents the number of anomalies that are associated to the user. The
bar contains different colors based on the anomaly severity level.
Click a user row to filter the workspace to show information related to the selected user. A filter
condition with the selected user name appears at the top of the workspace. The anomaly summary,
anomalies graph, anomaly indicators, and anomalies list display information related to the selected user.
3. Anomaly count
Displays the number of anomalies that are associated with the users.
4. Page icons
Click the back and forward arrows to scroll through the top 25 users.
To open the anomalous factors view from the Anomaly Detection workspace, select View By > Anomalous
Factors, or click the Top Anomalous Factors indicator title.
You can drill down on the counts in the Users, Data Stores, or Anomalies columns for each anomalous factor.
You can click a column header to sort the list by the selected column.
By default, the list is sorted by an internal anomaly score in descending order, then by anomalous factor
name in ascending order.
Anomalous factors that are associated to anomalies within the selected time period and filter
conditions.
Users
Aggregated number of users that are associated to the anomalous factor. Click the user count to view a
list of users for the selected anomalous factor. The users view opens with a filter condition for the
selected anomalous factor. The view includes the same time period that you selected in the anomalous
factor view.
You can click a user name to view more details about the user's activity on the User Profile page.
Data Stores
Aggregated number of data stores that are associated to the anomalous factor. Click the data store
count to view a list of data stores for the selected anomalous factor. The data stores view opens with a
filter condition for the selected anomalous factor. The view includes the same time period that you
selected in the anomalous factor view.
Note: The count includes all of the data stores that are associated with the anomaly, regardless of the
data stores to which you have access. When you click the count to open the data stores view, you only
see the data stores to which you have access.
Anomalies
Total number of anomalies that Data Privacy Management detected for the anomalous
anomaly count to view information related to the selected anomalous factor. The default workspace
view opens with a filter condition for the selected anomalous factor. The view includes the same time
period that you selected in the anomalous factor view.
Related Topics:
• “User Profile Page” on page 564
To open the data stores view from the Anomaly Detection workspace, select View By > Data Store, or click
the Top Data Stores indicator title.
You can drill down on the counts in the Users, Anomalies, or Severity columns. You can click a column header
to sort the list by the selected column.
By default, the list is sorted by an internal anomaly score in descending order, then by data store name in
ascending order.
List of data stores that are associated to anomalies within the selected time period and filter conditions.
When you click a count in a data store row, a view appears with the selected data store as the active
filter condition.
Users
Aggregated number of users that have anomalies for the data store. You can click the user count to view
the users that had anomalies for the data store. When you click the user count, the users view opens.
The workspace includes a filter condition for the selected data store and includes the same time period
that you selected in the data stores view.
You can click a user name to view more details about the user's activity on the User Profile page.
Anomalies
Total number of anomalies in all severity levels detected for the data store. You can click the anomaly
count to view anomaly information related to the selected data store. When you click the anomaly count,
the default workspace view opens. The workspace includes a filter condition for the selected data store
and includes the same time period that you selected in the data stores view.
Severity
Total number of anomalies in the selected severity level detected for the data store. Severity level is
based on the anomaly score. From the Settings page, you can specify the range of anomaly scores for
each severity level.
You can click the anomaly count to view anomaly information related to the selected data store and
anomaly severity level. When you click the anomaly count, the default workspace view opens. The
workspace includes filter conditions for the selected data store and anomaly severity level. The view
includes the same time period that you selected in the data stores view.
Users View
The users view displays a list of all users that have at least one or more anomalies within the selected time
period and filter conditions.
To open the users view from the Anomaly Detection workspace, select View By > User, or click the Top Users
indicator title.
You can drill down on each count in the Data Stores, Anomalies, or Severity columns. You can click a column
header to sort the list by the selected column.
By default, the list is sorted by an internal anomaly score in descending order, then by user name in
ascending order.
Name of the user account that performed the user activity events that are associated with the anomaly.
Data Privacy Management shows the user account name from the source of the user activity
information, such as user activity logs.
To retrieve the rest of the user information in the anomaly details, Data Privacy Management matches
the user account name to user names in the Data Privacy Management repository. If there is a match,
the anomaly details include values for the full name, title, group, and user location.
Full Name
Full name that is associated with the user account name in the Data Privacy Management repository. If
the full name is null in the Data Privacy Management repository, the value is a concatenation of the first
name and last name. If the first name and last name is null, the value is the user account name. If the
user account is not in the repository, the value is the user account name.
Title
Title that is associated with the user account name in the repository. If the user account is not in the
repository, the title is blank.
User Department
User department that is associated with the user account name in the repository. When Data Privacy
Management analyzes user activity events for anomalous behavior, it compares behavior of the user and
the user's peer group to detect an anomaly. Data Privacy Management uses the department for the user
peer group. Any user that has the same department is included in the user peer group. If the user
account is not in the repository, the user department is blank.
Data Stores
Total number of data stores that are associated to anomalies for the user. Click the data store count to
view a list of data stores for the selected user. The data stores view opens with a filter condition for the
selected user. The view includes the same time period that you selected in the users view.
Note: The count includes all of the data stores that are associated with the anomaly, regardless of the
data stores to which you have access. When you click the count to open the data stores view, you only
see the data stores to which you have access.
Anomalies
Total number of anomalies in all severity levels detected for the user. Click the anomaly count to view
anomaly information related to the selected user. The default workspace view opens with a filter
condition for the selected user. The view includes the same time period that you selected in the users
view.
Severity
Total number of anomalies in the selected severity level detected for the user. Severity level is based on
the anomaly score. From the Settings page, you can specify the range of anomaly scores for each
severity level.
You can click the anomaly count to view anomaly information related to the selected user and anomaly
severity level. When you click the anomaly count, the default workspace view opens. The workspace
includes filter conditions for the selected user and anomaly severity level. The view includes the same time period that you selected in the users view.
Anomalies List
The anomalies list panel includes a list of all anomalies and the corresponding anomaly details. The list of anomalies is dynamic and can change based on the time period and filter conditions that you select in the workspace.
To open the Anomaly Detection workspace, click the Anomalies icon in the header. When you select one of
the views in the Anomaly Detection workspace, the anomalies list disappears. Select View By > None to
display the anomalies list again.
By default, the list of anomalies is sorted by unread status, then by the highest severity level, then by the
anomaly ID. You can click a column header to sort the list of anomalies by the column.
Tip: Click the filter icon to filter the list of anomalies. For example, you can filter the list to show only high
severity anomalies.
2. Anomaly count
Total number of anomalies based on the selected time period and filter conditions.
3. Selection
You can perform actions on one or more anomalies. Select one, multiple, or all anomalies. Then, click the
Actions menu and select the action. For example, you can mark all anomalies as read or unread.
4. Read/Unread
Read or unread status of an anomaly. You can mark an anomaly as read or unread. Changes to the status affect the sort order of the anomalies. The list of anomalies is first sorted by unread anomalies.
You can click the column header to sort by the status. You can filter the workspace by the status to show
only the unread or read anomalies.
By default, an anomaly is marked as unread until you mark the anomaly as read. For more information,
see “Marking an Anomaly as Read or Unread” on page 360.
5. Flag/Unflag
Flagged or unflagged status of the anomaly. Click the flag to change the status. A clear flag indicates an
anomaly that is not flagged. A filled flag indicates an anomaly that is flagged. You might want to flag the
anomalies that you need to review further or to flag for follow-up. You can click the column header to
sort by flag status. You can use the flag status as a filter condition for the workspace.
By default, an anomaly is not flagged until you mark the anomaly as flagged. The flag status does not
affect the default anomaly sort order. For more information, see “Flagging or Unflagging an Anomaly” on page 360.
6. Anomaly ID
Anomaly identification. Data Privacy Management uses the following syntax to assign an ID to an
anomaly:
AN_<Node ID>_<User hash code>_<Anomaly timestamp in Epoch format>
7. User Name
Name of the user account that performed the user activity events that are associated with the anomaly.
Data Privacy Management shows the user account name from the source of the user activity
information, such as user activity logs.
You can click the user name to view more details about the user's activity on the User Profile page.
8. Full Name
Full name that is associated with the user account name in the Data Privacy Management repository. If
the full name is null in the Data Privacy Management repository, the value is a concatenation of the first
name and last name. If the first name and last name is null, the value is the user account name. If the
user account is not in the repository, the value is the user account name.
9. Anomaly Score
A calculated value based on the weight of each anomalous factor specified on the Settings page. You
can use the score to sort and filter anomalies. You can also use the score to create a security policy. For
example, you can create a security policy to send an email alert when an anomaly with a score greater
than 75 is triggered. A minimal sketch of how a score might map to a severity level appears after this list.
10. Data Stores
Total number of data stores that the user accessed during the evaluation interval up until the anomalous
behavior. A data store is included in the count when at least one user activity event accessed sensitive
data in the data store during the evaluation interval. Expand the anomaly details to view a list of the data
store names.
Note:
• The count includes all of the data stores that the user accessed regardless of the data stores to
which you have access. When you view the anomaly details, you only see the data stores names to
which you have access.
• The number of data stores might be different than the number of data stores in the observed value
for the anomalous factor. The observed value shows the number of data stores that are associated
with the anomalous behavior. The expected value shows the number of data stores that the user
typically accesses.
11. Severity
Severity level of the anomaly. Data Privacy Management determines the severity level for each anomaly
based on the anomaly score. An anomaly can have a high, medium, or low severity level. From the
Settings page, you can specify the range of anomaly scores for each severity level.
12. Date/Time
Date and time of the anomalous behavior that is associated with the anomaly. This is not the date and
time when Data Privacy Management detected the anomaly. All time periods are in the UTC time zone.
13. Anomalous Factors
Number of system-defined anomalous factors that are associated with the anomaly. Expand the anomaly
details to view a list of the anomalous factors that are associated with the anomaly. For more information,
see “Anomalous Factors” on page 358.
14. Refresh
Click to refresh the list of anomalies based on the selected time period and filter conditions.
Anomaly Details
The anomaly details panel shows information related to the user that performed the anomalous activity, the
data stores and data domains that are associated with the anomaly, and the anomalous factors that detected
the anomaly. The user information, such as the full name, title, and group, originates from data that was
imported into the Data Privacy Management repository.
You can expand each entry to show all details related to the anomaly.
User Name
Name of the user account that performed the user activity events that are associated with the anomaly.
Data Privacy Management shows the user account name from the source of the user activity
information, such as user activity logs.
To retrieve the rest of the user information in the anomaly details, Data Privacy Management matches
the user account name to user names in the Data Privacy Management repository. If there is a match,
the anomaly details include values for the full name, title, group, and user location.
You can click the user name to view more details about the user's activity on the User Profile page.
Full Name
Full name that is associated with the user account name in the Data Privacy Management repository. If
the full name is null in the Data Privacy Management repository, the value is a concatenation of the first
name and last name. If the first name and last name is null, the value is the user account name. If the
user account is not in the repository, the value is the user account name.
Click the View Associated User Activity link to open a view that lists all user activity events that occurred
in close proximity to the anomalous behavior. You can increase or decrease the time range to
further analyze the user activity events.
Title
Title that is associated with the user account name in the repository. If the user account is not in the
repository, the title is blank.
Group
Peer group that Data Privacy Management uses to compare user behavior. When Data Privacy
Management analyzes user activity events for anomalous behavior, it compares behavior of the user and
the user's peer group to detect an anomaly. It uses the department that is associated with the user
account name in the Data Privacy Management repository for the peer group. Any user that has the same
department is included in the peer group.
Location
Location that is associated with the user account in the repository. This is the location from which the
user works. For example, the corporate office that the user is assigned to. Note that this is not
necessarily the same location from which the user performs the user activity events.
Data Stores
List of data stores that the user accessed during the evaluation interval up until the anomalous behavior.
A data store appears in the list when at least one user activity event accessed sensitive data in the data
store during the evaluation interval. If you do not have authorization for a data store in the list, Data
Privacy Management masks the data store name with a question mark. By default, the list is sorted by
data store name in ascending order.
Note: The number of data stores in the list might be different than the number of data stores in the
observed value for the anomalous factor. The observed value shows the number of data stores that are
associated with the anomalous behavior. The expected value shows the number of data stores that the
user typically accesses.
When Data Privacy Management detects an anomaly based on the relocation speed anomalous factor, it
does not associate data stores to the anomalous factor. The data store list is blank if the anomaly only
contains the relocation speed anomalous factor.
Data Domains
List of data domains that the user accessed during the evaluation interval up until the anomalous
behavior. A data domain appears in the list when at least one user activity event accessed sensitive data
that matches the data domain. By default, the list is sorted by data domain name in ascending order.
Note: The number of data domains in the list might be different than the number of data domains in the
observed value for the anomalous factor. The observed value in the anomalous factor shows the number
of data domains that are associated with the anomalous behavior. The expected value in the anomalous
factor shows the number of data domains that the user typically accesses.
Anomalous Factors
List of system-defined anomalous factors used to detect the anomaly. Most anomalous factors include
a degree, an observed value, and an expected value. To view the observed values for the relocation
speed factor, click the information icon next to the factor. For more information about each anomalous
factor, see “Anomalous Factors” on page 358.
Degree
System-defined degree that Data Privacy Management detects for the anomaly. Data Privacy
Management evaluates the observed and expected values and assigns a degree based on the deviation
from the expected value. An anomaly can include either an unusual degree or a highly unusual degree.
Suppressed
Indicates whether a suppression rule suppresses the anomalous factor.
Observed
Count or value from the user activity events for the anomalous factor. This is the observed value during
the anomalous behavior. The exact value depends on the anomalous factor. For example, the observed
value for the sensitive fields anomalous factor is the number of sensitive fields that the user accessed in
one or more events. Data Privacy Management uses the observed and expected values to determine if
an anomaly is triggered. If Data Privacy Management triggers an anomaly, it compares the observed and
expected values to determine the anomaly degree.
Expected
Count or value that Data Privacy Management determines is the normal baseline for the anomalous
factor. The exact value depends on the anomalous factor. For example, the expected value for the
sensitive fields anomalous factor is the number of sensitive fields that a user normally accesses. Data
Privacy Management determines the expected number based on a pattern of behavior from the user over
a period of time. It uses the observed and expected values to determine if an anomaly is triggered. If it
triggers an anomaly, it compares the observed and expected values to determine the anomaly degree.
The relocation speed anomalous factor does not include an expected value. To view details related to
the relocation speed anomalous factor, click the information icon next to the observed value.
Related Topics:
• “User Profile Page” on page 564
Anomalous Factors
An anomalous factor is a parameter that Data Privacy Management uses to identify unusual behavior and to
detect an anomaly. Data Privacy Management can use one or more anomalous factors to detect an anomaly.
An anomaly can include multiple factors.
You can view a list of anomalies that are associated with each anomalous factor in the Top Anomalous
Factors indicator. When you view the details for an anomaly in the anomaly list panel, you can view a list of
the anomalous factors that detected the anomaly. You can also view the observed and expected values for
each anomalous factor.
Data Privacy Management detects anomalies based on the following system-defined anomalous factors:
Sensitive Records
Detects anomalies based on the number of sensitive records a user accessed that match a classification
policy. For example, a user typically accesses 10,000 rows of sensitive data. If the user accesses
1,000,000 rows of sensitive data, then Data Privacy Management detects an anomaly.
Data Domains
Detects anomalies based on the number of data domains that a user accesses. For example, a user
typically retrieves sensitive data that matches 10 data domains. If the user retrieves sensitive data that
matches 1000 data domains, then Data Privacy Management detects an anomaly.
Note that the factor detects the number of matched data domains, regardless of how many columns
matched the data domain. The Sensitive Field factor detects the actual number of columns that matched
a data domain. Multiple sensitive columns can match the same data domain. For example, a user can
access 10 sensitive columns that matched one credit card data domain.
Sensitive Events
Detects anomalies based on the number of user activity events that a user performs. For example, a user
typically performs 5 events that access sensitive data per day. If the user performs 1000 events in one day,
then Data Privacy Management detects an anomaly.
Time of Day
Detects anomalies based on the time of day that a user performs activity. For example, a user typically
performs activity from 8 AM to 6 PM. If the user performs activity at 3 AM, then Data Privacy
Management detects an anomaly.
Day of Week
Detects anomalies based on the day of the week that a user performs activity. For example, a user
typically performs activity from Monday through Friday. If the user performs activity on Saturday or
Sunday, then Data Privacy Management detects an anomaly.
When it detects an anomaly based on the Day of Week anomalous factor, the anomaly can only be
associated with one user activity event. The observed value for the anomalous factor shows the day of
the week on which the anomalous user activity event occurred.
Data Stores
Detects anomalies based on the number of data stores that a user accesses. For example, a user
typically retrieves sensitive data from 10 data stores. If the user retrieves sensitive data from 1000 data
stores, then Data Privacy Management detects an anomaly.
Unexpected Data Store
Detects anomalies based on the data stores that a user typically accesses. For example, a user typically
accesses the PC1, PC2, and PC3 data stores on a regular basis. If a user accesses an ORACLE1 data
store, then Data Privacy Management detects an anomaly.
Relocation Speed
Detects anomalies based on the practicality of traveling between two locations according to geographic
distance and land speed. For example, a user performs activity from a location in the United States. An
hour later, the user performs activity from a location in India. The user cannot possibly travel the
distance between the two locations within an hour. This factor helps to identify compromised or shared
user credentials.
When Data Privacy Management detects an anomaly based on the relocation speed anomalous factor, it
does not associate data stores to the anomalous factor. The anomaly details only include the observed
value for the anomalous factor. Data Privacy Management does not provide an expected value.
Related Topics:
• “Changing Anomaly Factor Weights” on page 44
Anomaly Management
Use the Anomaly Detection workspace to view all of the anomalies that Data Privacy Management detected.
You can perform the following tasks in the Anomaly Detection workspace:
• Flag or unflag anomalies.
• Mark anomalies as read or unread.
• Filter and export anomaly information.
• Suppress anomalies.
• Delete anomalies.
Flagging or Unflagging an Anomaly
After you flag or unflag an anomaly, you can sort on the flag status. You can use the status to filter the
workspace. For example, you can filter the workspace to only show information for flagged anomalies.
Changes to the flag status do not affect the anomaly sort order in the anomalies list panel.
Note: When you flag or unflag an anomaly, Data Privacy Management updates the anomaly in the user
activity store. The change impacts all users that can see the anomaly, not just the logged-in user.
1. Click the Anomalies icon in the header to open the Anomaly Detection workspace.
2. Navigate to the anomalies list panel and perform one of the following steps:
• To flag or unflag one anomaly, click the icon in the flag status column for the associated anomaly.
The following image shows an example of the flag status column:
• To flag or unflag multiple anomalies, select the check box next to the anomalies. From the Actions
menu, select Flag or Unflag.
The following image shows an example of selected anomalies:
Marking an Anomaly as Read or Unread
After you mark an anomaly as read or unread, you can use the status to filter the workspace. For example,
you can filter the workspace to only show information for unread anomalies.
Note: When you mark an anomaly as read or unread, Data Privacy Management updates the anomaly in the
user activity store. The change impacts all users that can see the anomaly, not just the logged-in user.
1. Click the Anomalies icon in the header to open the Anomaly Detection workspace.
2. Navigate to the anomalies list panel and perform one of the following steps:
• To mark one anomaly as read or unread, click the icon in the read/unread status column for the
associated anomaly.
• To mark multiple anomalies as read or unread, select the check box next to the anomalies. From the
Actions menu, select Mark as Read or Mark as Unread.
The following image shows an example of selected anomalies:
Data Privacy Management marks the anomaly as read or unread and refreshes the sort order based on the
status.
1. Click the Anomalies icon in the header to open the Anomaly Detection workspace and then click the
Filter icon.
The filter panel appears in the workspace. The following image shows the filter panel:
The following table lists the CSV files in the Data.zip file that you export from the Anomaly Detection
workspace:
UbaTopUsers.csv Contains information from the Top Users indicator. The file includes a list
of up to 25 users that have the highest number of anomalies.
UbaTopDataStores.csv Contains information from the Top Data Stores indicator. The file includes
a list of up to 25 data stores that have the highest number of anomalies.
UbaTopAnomalyFactors.csv Contains information from the Top Anomalous Factors indicator. The file
includes a list of up to 25 anomalous factors that have the highest number
of anomalies.
UbaSummary.csv Contains information from the anomaly summary. The file includes the total
number of users and data stores associated with anomalies and the
number of anomalies in each severity level.
UbaAnomalyGraph.csv Contains information from the anomaly graph. The file includes a count of
anomalies for each time period.
UbaAnomalyDetailsDataStores.csv Contains information from the Data Stores view. The file contains a list of
all data stores that have at least one anomaly.
UbaAnomalyDetailsDataDomains.csv Contains a list of all data domains that are associated with at least one
anomaly.
UbaAnomalyDetailsAnomalyFactors.csv Contains information from the Anomalous Factors view. The file contains a
list of all anomalous factors that have at least one anomaly and the corresponding observed and expected
values.
1. Click the Anomalies icon in the header to open the Anomaly Detection workspace.
2. Optionally, set the filter conditions, time period, and view.
Deleting an Anomaly
You can delete anomalies in the Anomaly Detection workspace.
Warning: When you delete an anomaly, no other users can view the anomaly. You cannot restore deleted
anomalies.
1. Click the Anomalies icon in the header to open the Anomaly Detection workspace.
2. Select the check box next to the anomaly that you want to delete.
You can select one, multiple, or all anomalies.
3. From the Actions menu, select Delete.
A message appears to confirm that you want to delete the anomaly.
The Data Privacy Management Service deletes the anomaly from the user activity store.
Suppressing an Anomaly
To minimize the number of unnecessary anomalies created, create a suppression rule that Data Privacy
Management can use to ignore scenarios that would normally trigger an anomaly. In the rule, you must
specify the anomalous factors that you want Data Privacy Management to ignore. You can also specify the
period of time that you want the suppression rule to be in effect.
1. Click the Anomalies icon in the header to open the Anomaly Detection workspace.
2. Select the check box next to an anomaly.
3. Click Actions > Suppress.
Suppression Rules
To suppress anomalies, you must create a suppression rule. In the rule, you specify the anomalous factors
that you do not want to view anomalies for. After anomaly detection, Data Privacy Management does not
display anomalies that are solely based on the suppressed factors.
For example, an employee from an office in the U.S. is visiting a branch office in Australia for a week.
Typically, Data Privacy Management will create an anomaly when users access data outside of their normal
business hours. To prevent such an anomaly, create a rule to suppress the Time of Day anomalous factor for
the duration that the user is working in Australia.
You create suppression rules on the Anomaly Detection workspace. After you create the suppression rule,
you can view, filter, sort, edit, export, and delete the suppression rule on the Suppression Rules workspace.
The workspace displays suppression rules for anomalies. When the rule is in effect, Data Privacy
Management uses the rule to ignore unusual behavior associated with the anomalous factors specified in the
rule.
To access the Suppression Rules workspace, click Manage > Suppression Rules.
The following image shows a sample Suppression Rules workspace:
1. Suppression rules
2. Clear filter icon
3. Filter conditions
4. Clear filter condition icon
5. Filter
6. Time period
7. Actions menu
8. Refresh icon
By default, the list of rules is sorted by status. You can click a column header to sort the list of rules by the
column.
Tip: Click the filter icon to filter the list of suppression rules. For example, you can filter the list to show only
suppression rules associated with the Unexpected Data Store factor.
Property Description
1. Expand/collapse Click to show or hide the rule details. For more information, see “Suppression Rules Details” on page 368.
2. Suppression rules count Total number of rules based on the selected time period and filter conditions.
3. Selection You can perform actions on one or more rules. To edit or delete a suppression
rule, select the check box next to the rule. Click the Actions menu and select
the action. You can select one, multiple, or all rules to delete.
5. User Name Name of the user account that created the suppression rule. Click the user
name to view more details about the user's activity on the User Profile page.
6. Anomalous Factors The anomalous factors associated with the suppression rule.
7. Display Time Period Filters the list to show suppression rules with either a From or To date that is within the
time period. Time periods are in the UTC time zone.
Note: For time periods in days, the workspace displays information for full days.
A full day is from 0:00 through 23:59 UTC. The time period you select might not
include information from the current day. For example, you select 30 days. The
workspace shows suppression rules that will be active in the next 30 days. The
count begins from the last completed day in the UTC time zone, which can be
yesterday.
8. From and To Time Period The time period that the rule is in effect. Time is in the UTC time zone.
9. Status Indicates if the rule is active or inactive. Data Privacy Management only uses
Active rules in anomaly detection.
10. Created By User name of the user that created the suppression rule.
11. Creation Time Date and time that the rule was created. Time is in the UTC time zone.
12. Refresh Click to refresh the list of rules based on the selected time period and filter
conditions.
Related Topics:
• “User Profile Page” on page 564
You can expand each entry to show the details related to the rule.
Property Description
From Time The start date and time for the rule to be enforced if the status is also set to Active.
To Time The end date and time for the rule to be enforced.
Anomalous Factors List of system-defined anomalous factors that you do not want Data Privacy Management to use
to detect anomalies.
The filter panel appears on the workspace. The following image shows the filter panel:
Editing a Suppression Rule
1. Click Manage > Suppression Rules.
A list of all the suppression rules appears. You can use the filter options to narrow the list of rules.
2. Click the check box next to the suppression rule you want to edit.
3. From the Actions menu, select Edit.
Edit the suppression rule properties that you want to change.
4. Click Save.
The following table lists the CSV files in the Data.zip file that you export from the Suppression Rules
workspace:
Actions
This chapter includes the following topics:
Actions Overview
You can configure reusable actions and include them in security policies or in tasks that you run on demand.
You can create the following action types on the Actions workspace:
• Custom. Custom actions run a custom script when the conditions specified in the action are met.
• Email. Email actions send an email to designated recipients if a security policy violation occurs, to
automate data subject requests, or when you discover information in Data Privacy Management and you
want to notify other users.
• Service Management. Service management actions create a ServiceNow ticket in the event of a security
policy violation, to satisfy a data subject request, or when you identify another issue in Data Privacy
Management that requires creating a ticket.
• System Log. System log actions create a log message that Data Privacy Management can send to a
remote system log server in the event of a security policy violation.
You can include multiple actions in a security policy, and you can include the same action in multiple security
policies.
When you specify actions manually to run tasks, you can create an action or specify an existing action. You
create manual actions from the Actions > Take Action menu option on the following views:
• Top Data Stores grid page
You can use placeholders in actions as a variable for a specific detail. When an action runs an associated
task, Data Privacy Management substitutes each placeholder with the value for the variable.
Actions Workspace
On the Actions workspace, you can view and manage custom, email, service management, and system log
actions. The workspace includes a page that displays a list of actions and a detail page for each action.
The workspace lists actions by name, extension category, extension name, and extension status. To access
the Actions workspace, click Manage > Actions.
1. Actions
2. Action count
3. Clear filter icon
4. Filter conditions
5. Clear filter condition icon
6. Filter icon
7. Actions menu
8. Refresh icon
You can access action properties from this page by clicking an action name.
Property Description
Name Required. The name is not case sensitive and must be unique within the Data Privacy Management
repository. The name cannot exceed 255 characters, contain spaces, or include the following
special characters: \~!$%^&*()+
Description Optional. Long description of the action that does not exceed 255 characters.
Property Description
Parameter Provides a text field. To add a placeholder, enter a dollar sign ($) and select a placeholder
from the list that appears. The script uses the values to produce the desired output.
Action Type Optional. For actions configured with extensions that use the DSR Custom Plugin, select a
subject request type from the list.
Report Optional. For actions configured with extensions that use the DSR Custom Plugin, select one of
the following options for the subject information the task provides:
- Attributes. The task produces results that contain only the types of information that you
hold about data subjects.
- Attributes and Values. The task produces results that contain both the types of information
that you hold about data subjects and the values of the personal data.
Manually close task after job completion Optional. Select to indicate that the user assigned to the task marks the task as closed on the Tasks workspace after the task job is successful. If you do not select this check box, Data Privacy Management automatically updates the task status to Closed after the job completes successfully.
Context File Type Required. Select CSV or JSON as the file type in the Attachments field on the Task Properties page after the associated task runs.
Execute separate action for each data store For actions configured with extensions that use Custom Plugin V1, select to create a separate task for each data store that is included in the action when the custom script runs.
Add Context to Task Details as Attachment Optional. Select to attach the details of the violation, data subject request, or incident that prompted the custom action. If selected, the context details appear in the Attachments field on the Task Properties page.
Notify assignee via email Optional. Select to send an email to notify the assignee when tasks resulting from the action are assigned to them.
Tags Optional. Select one or more tags that identify custom action characteristics.
Notes Optional. A text field that contains descriptive notes about the custom action. Notes cannot
exceed 255 characters.
Due in Days Optional. The number of days from the date the custom action runs that the task must
complete.
Property Description
To The email addresses of the recipients of the email. Use a comma to separate email addresses.
Cc Optional. The email addresses of the recipients copied on the email. Use a comma to separate
email addresses.
Bcc Optional. The email addresses of the recipients blind copied on the email. Recipients in the Bcc
field are not visible to the recipients in the To and Cc fields. Use a comma to separate email
addresses.
Subject Optional. A short description that highlights the message. To add a placeholder, enter a dollar
sign ($) and select a placeholder from the list that appears. Before sending the email, Data
Privacy Management replaces the placeholders with specific values.
Message Optional. A message regarding the reason for sending the email. To add a placeholder, enter a
dollar sign ($) and select a placeholder from the list that appears. Before sending the email,
Data Privacy Management replaces the placeholders with specific values.
For example, if you add the following text:
A violation that included $ViolationDataStoreCount data store(s)
occurred.
Violation detected: $ViolationDateTime
User's login: $LoginUserName
Data store names: $DataStoreName
Data Privacy Management replaces the placeholders with values and sends the email with a
message such as the following example:
A violation that included 1 data store occurred.
Violation detected: 11/3/2019, 10:25:21 AM
User's login: JaneJohn
Data store names: DEI_US_DailyTrans
Export selected items as email attachment Select to attach a CSV file to the email that contains the details of the violation, subject request, or other incident that prompted the email action.
Attachments Optional. Add files that you want to attach to emails that Data Privacy Management sends when
the email action runs.
Action Type Optional. For actions configured with extensions that use the DSR Email Plugin, select a subject
request type from the list.
Report Optional. For actions configured with extensions that use the DSR Email Plugin, select one of the
following options for the subject information the task provides:
- Attributes. The task produces results that contain only the types of information you hold
about data subjects.
- Attributes and Values. The task produces results that contain both the types of information
you hold about data subjects and the values of the personal data.
Attach DSAR Report if Available Optional. For actions configured with extensions that use the DSR Email Plugin, select to attach a DSAR report, if available, to the email that Data Privacy Management sends when the task runs.
Manually close task after job completion Optional. Select to indicate that the user assigned to the task marks the task as closed on the Tasks workspace after the task job is successful. If you do not select this check box, Data Privacy Management automatically updates the task status to Closed after the job completes successfully.
Context File Type Required. Select CSV or JSON as the file type that appears in the Attachments field of the Task
Properties page after the associated task runs.
Execute separate action for each data store For actions configured with extensions that use Email Plugin V1, select to indicate that Data Privacy Management creates separate tasks for each data store that is included in the action when the email task runs.
Add Context to Task Details as Attachment Optional. Select to attach the details of the violation, data subject request, or incident that prompted the email action. If selected, the context details appear in the Attachments field of the Task Properties page.
Notify assignee via email Optional. Select to send an email to notify the assignee when tasks resulting from the action are assigned to them.
Tags Optional. Select one or more tags that identify email action characteristics.
Notes Optional. A text field that contains descriptive notes about the email action. Notes cannot
exceed 255 characters.
Due in Days Optional. The number of days from the date the email action runs that the task must complete.
Property Description
Short Description A short description of the action. You can add placeholders to include specific details in the
description. To add a placeholder, enter a dollar sign ($) and select a placeholder from the
list that appears.
Description Optional. A more detailed description of the action. You can add placeholders to add more
details in the description. To add a placeholder, enter $ and select a placeholder from the list
that appears.
Urgency Optional. Select one of the following urgency levels for the ServiceNow ticket that tasks
based on the action create: High, Low, or Medium.
Impact Optional. Select one of the following options to indicate the level of impact for the
ServiceNow ticket that tasks based on the action create: High, Low, or Medium.
Category Optional. Select one of the following options to specify the product category with a reported
issue or the nature of the subject request that prompted the creation of a ServiceNow ticket:
- Database
- Hardware
- Inquiry
- Network
- Software
Attachments Optional. Add files that you want to attach to ServiceNow tickets that Data Privacy
Management creates when the service management action runs.
Wait for ServiceNow Ticket Status Change Optional. Select to indicate that the ServiceNow ticket status must change to In Progress before the service management task can close.
Action Type Optional. For actions configured with extensions that use the DSR ServiceNow Plugin, select
a subject request type from the list.
Attach DSAR Report if Available Optional. For actions configured with extensions that use the DSR ServiceNow Plugin, select to attach a DSAR report, if available, to the ServiceNow ticket.
Manually close task after job completion Optional. Select to indicate that the user assigned to the task marks the task as closed on the Tasks workspace after the task job is successful. If unselected, Data Privacy Management automatically updates the task status to Closed after the job completes successfully.
Context File Type Required. Select CSV or JSON as the file type in the Attachments field on the Task Properties
page after the associated task runs.
Execute separate action for each data store For actions configured with extensions that use ServiceNow Plugin V1, select to indicate that Data Privacy Management creates separate tasks for each data store included in the action when the service management task runs.
Add Context to Task Details as Attachment Optional. Select to attach the details of the violation, data subject request, or incident that prompted the service management action. If selected, the context details appear in the Attachments field on the Task Properties page.
Notify assignee via email Optional. Select to send an email to notify the assignee when tasks resulting from the action are assigned to them.
Tags Optional. Select one or more tags that identify service management action characteristics.
Notes Optional. A text field that contains descriptive notes about the service management action.
Notes cannot exceed 255 characters.
Due in Days Optional. The number of days from the date the service management action runs that the task
must complete.
Property Description
Message Required. The contents of the system log message that Data Privacy Management sends to
the server when a security policy violation triggers the action. To add a placeholder, enter a
dollar sign ($) and select a placeholder from the list that appears. In the event of a security
policy violation, Data Privacy Management replaces the placeholders with specific values.
Add the message contents in a text format, such as Common Event Format (CEF) or Log Event
Extended Format (LEEF).
For example, if you add the following CEF-formatted text:
CEF:0|security|threatmanager|1.5|100|Discovered high-risk data stores
with CreditCard_Visa domain |10|CS1Label=ApplicationName CS1=S@S
CS2Label=DataStoreNames CS2=$DataStoreName
Data Privacy Management replaces the placeholders with values and sends a system log
message such as the following example:
CEF:0|security|threatmanager|1.5|100|Discovered high-risk data stores
with CreditCard_Visa domain |10|CS1Label=ApplicationName CS1=S@S
CS2Label=DataStoreNames CS2=PCRS_Linux/ORA05
Log Level Optional. Select one of the following log levels for the message:
- Alert (1)
- Critical (2)
- Debug (7)
- Emergency (0)
- Error (3)
- Informational (6)
- Notice (5)
- Warning (4)
Manually close task after job completion Optional. Select to indicate that the user assigned to the task marks the task as closed on the Tasks workspace after the task job is successful. If unselected, Data Privacy Management automatically updates the task status to Closed after the job completes successfully.
Execute separate action for each data store Optional. Select to indicate that Data Privacy Management creates separate tasks for each data store included in the action when the system log task runs.
Add Context to Task Details as Attachment Optional. Select to attach the details of the violation or incident that prompted the system log action. If selected, the context details appear in the Attachments field on the Task Properties page.
Notify assignee via email Optional. Select to send an email to notify the assignee when tasks resulting from the action are assigned to them.
Tags Optional. Select one or more tags that identify system log action characteristics.
Notes Optional. A text field that contains descriptive notes about the system log action. Notes
cannot exceed 255 characters.
Due in Days Optional. The number of days from the date the system log action runs that the task must
complete.
A list of placeholders appears when you enter a dollar sign ($) in the Parameter text box of a custom action,
in the Subject or Message field of an email action, in the Short Description or Description field of a service
management action, or in the Message field of a system log action.
The following table lists the placeholders that you can use when you create or edit an action on the Actions
workspace, create a new manual action, or when you add a custom, email, service management, or system
log action to a security policy. The table also indicates the type of security policy for which the placeholder is
valid, if applicable.
The table includes the following columns: Placeholder, Description, Anomaly Security Policy, Data Store Security Policy, and User Activity Security Policy.
Action Management
You can create and manage actions on the Actions workspace.
When you edit an action on the Actions workspace, the changes apply immediately to security policies and
active tasks that include the action.
Creating an Action
Create an action to configure details that Data Privacy Management requires to run associated tasks.
To create a system log action that uses TCP/SSL protocol, first import the SSL certificate for the system log
server to the Informatica domain truststore, and then restart the Informatica domain. For more information,
see the H2L article Updating Keystore and Truststore Certificates to Maintain Secure Communication.
Editing an Action
When you edit an action, the changes apply immediately to security policies and active tasks that include the
action.
Copying an Action
You can copy an action to create a new action with similar properties. After you copy the action, edit the
action properties.
Deleting an Action
You can delete actions that are not included in a security policy.
Manual Actions
This chapter includes the following topics:
You create manual actions from the Actions > Take Action menu option on the following views in Data
Privacy Management:
The Actions > Take Action menu provides the following options:
Manual actions to protect data create a protection task with a status of New and a prefix of PRT- on the
Tasks workspace.
The remaining manual actions allow you to create a new action or use an existing action as a template.
These manual actions create system tasks with a prefix of SYS-. You can run a system task immediately, or
you can save the manual action as a New system task on the Tasks workspace and manually run the task
later.
In Data Privacy Management, you can add system actions to anomaly, data store, decryption, and user
activity security policy types. You can add Persistent Data Masking protection actions to data store and user
activity security policies only. You can add encryption protection actions to decryption security policies only.
Related Topics:
• “Data Domains List Page” on page 531
• “Data Stores Grid Page” on page 533
• “Data Stores List Page” on page 535
• “Proliferation Page” on page 539
• “Sensitive Fields Page” on page 546
Create a new action or select an existing service management action to use as a template for creating the
ticket. When you create a new action, specify the service management extension and all required properties.
When you select an existing service management action, the extension and other properties such as
placeholders appear automatically. Include a description of the action, which can contain placeholders. For
more information, see “Placeholders in Custom, Email, Service Management, and System Log Actions” on
page 380.
Optionally, specify the urgency level, impact, and category of the issue requiring a service management
ticket. You can also attach the full details of the violation to the ticket, add tags and other notes, and specify
the number of days from now that the task is due.
Important: Data Privacy Management can only perform actions or create tasks for anomalies that are
associated with at least one data store. Verify that each anomaly lists at least one data store in the Data
Store column before creating a ServiceNow ticket for the anomaly.
1. On the Overview workspace, click a data store name from the Top Data Stores indicator.
The Proliferation page appears.
2. Select the check box next to one or more data stores on the Proliferation page.
3. From the Actions menu, select Take Action > Create a Service Management Ticket.
Note: The Create a Service Management Ticket option might be disabled for one of the following
reasons: you do not have privileges to create a service management task or a compatible service
management extension does not exist for the selected data stores.
The Create a Service Management Ticket page displays on the Tasks workspace.
4. Select whether to create a new service management action or to use an existing service management
action that is listed on the Actions workspace.
If you select a new action, you first choose the service management extension for the action and then
enter properties for the action. If you select an existing action, all properties for the action are populated.
You can edit the existing action properties except for the associated extension.
The following image shows the Create a Service Management Ticket page after selecting three data
stores and an existing service management action:
Related Topics:
• “Security Dashboard Overview” on page 510
• “Proliferation Page” on page 539
• “Top Data Stores Indicator” on page 522
1. Click the Security Policy Violations icon in the Data Privacy Management header. The following image
shows the Security Policy Violations icon:
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
6. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
7. Enter or edit a short description of the action. You can add placeholders to include specific details in the
description. For more information, see “Placeholders in Custom, Email, Service Management, and
System Log Actions” on page 380.
8. In the Description field, optionally add or edit a more detailed description of the action. You can add
placeholders to include specific details in the description.
9. Optionally, select or edit one of the following urgency levels for the ServiceNow ticket that tasks based
on the action will create:
• High
• Low
• Medium
10. Optionally, select or edit one of the following options to indicate the level of impact for the ServiceNow
ticket that tasks based on the action will create:
Related Topics:
• “Security Policy Violations Workspace” on page 462
1. On the Overview workspace, click the Top Data Stores indicator label.
The Data Stores page appears.
2. Click Grid in the left column of the page.
6. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
7. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a Data Privacy Management user from the list.
1. On the Overview workspace, click the Top Data Domains indicator label.
The Data Domains list page appears.
2. Select the check box next to one or more data domains.
3. From the Actions menu, select Take Action > Create a Service Management Ticket.
Note: The Create a Service Management Ticket option might be disabled if you do not have privileges to
create a service management task or you do not have access to the data stores in the selected data
domains.
The Create a Service Management Ticket page displays on the Tasks workspace.
4. Select whether to create a new service management action or to use an existing service management
action that is listed on the Actions workspace.
If you select a new action, you first choose the service management extension for the action and then
enter properties for the action. If you select an existing action, all properties for the action are populated.
You can edit the existing action properties except for the associated extension.
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
6. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
7. Enter or edit a short description of the action. You can add placeholders to include specific details in the
description. For more information, see “Placeholders in Custom, Email, Service Management, and
System Log Actions” on page 380.
8. In the Description field, optionally add or edit a more detailed description of the action. You can add
placeholders to include specific details in the description.
9. Optionally, select or edit one of the following urgency levels for the ServiceNow ticket that tasks based
on the action will create:
• High
• Low
• Medium
10. Optionally, select or edit one of the following options to indicate the level of impact for the ServiceNow
ticket that tasks based on the action will create:
1. On the Overview workspace, click the Top Data Stores indicator label.
The Data Stores page appears.
2. Click Grid in the left column of the page.
The Data Stores page displays the list of data stores.
3. Select the check box next to one or more data stores.
6. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
7. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
8. Enter or edit a short description of the action. You can add placeholders to include specific details in the
description. For more information, see “Placeholders in Custom, Email, Service Management, and
System Log Actions” on page 380.
Protecting Data
You can create a protection task on demand when you identify data stores or sensitive fields in Data Privacy
Management that need to be protected.
The new protection task appears on the Tasks workspace, where you configure the protection extension and
protection rules for the task. For more information, see “Configuring Protection Tasks” on page 499.
Important: Data Privacy Management can only perform actions or create tasks for anomalies that are
associated with at least one data store. Verify that each anomaly lists at least one data store in the Data
Store column before creating a protection task for the anomaly.
1. Click the Anomalies icon in the Data Privacy Management header. The following image shows the
Anomalies icon:
4. In the Data Stores field, deselect the check box next to any associated data stores in the anomaly or
anomalies for which you do not want to create a protection task.
5. In the Data Domains field, select the data domains in the data stores to protect.
6. Optionally, add a description for the task.
7. Select one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
8. To attach the anomaly details to the protection task, select the Add Context to Task Details as
Attachment check box.
9. To indicate that Data Privacy Management will notify the user assigned to configure the protection task
once you create it, select the Notify assignee via email check box.
10. Optionally, select tags for the protection task and add any notes.
11. Optionally, select a task due date.
12. Click Save as Task.
Data Privacy Management creates a separate protection task for each selected data store. The tasks
appear on the Tasks list page with a status of New and the extension is Unconfigured. A message
indicates the number of tasks the Data Privacy Management Service saved successfully and the number
of tasks that encountered an error. The service saves successful protection tasks in the Data Privacy
Management repository.
The user assigned to the tasks can now configure protection from the Tasks workspace.
1. On the Overview workspace, click a data store name from the Top Data Stores indicator.
The Proliferation page appears.
2. Select the check box next to one or more data stores on the Proliferation page.
3. From the Actions menu, select Take Action > Protect Data.
Note: The Protect Data option might be disabled for one of the following reasons: you do not have
privileges to create a protection task, a compatible protection extension does not exist for the selected
data stores, or the data stores do not contain any sensitive fields.
The Run a Protection Task page displays on the Tasks workspace. The Scope pane lists the selected
data stores and associated data domains, which are all selected by default. The Scope Preview pane
provides options for grouping the list of data stores and data domains so that you can quickly view the
current protection status and scope of the data stores and data domains that will be included in the
protection task.
The following image shows this view after selecting three data stores on the Proliferation page:
4. Optionally, in the Data Stores field, deselect the check box next to any associated data stores for which
you do not want to create a protection task.
5. Optionally, in the Data Domains field, deselect the check box next to any associated data domains you
do not want to protect.
6. Optionally, add a description for the task.
7. Select one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
Related Topics:
• “Security Dashboard Overview” on page 510
• “Proliferation Page” on page 539
• “Top Data Stores Indicator” on page 522
1. Click the Security Policy Violations icon in the Data Privacy Management header. The following image
shows the Security Policy Violations icon:
4. Optionally, in the Data Stores field, deselect the check box next to any associated data stores for which
you do not want to create a protection task.
5. Optionally, in the Data Domains field, deselect the check box next to any associated data domains you
do not want to protect.
6. Optionally, add a description for the task.
7. Select one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
8. To attach the security policy violation details to the protection task, select the Add Context to Task
Details as Attachment check box.
9. To indicate that Data Privacy Management will notify the user assigned to configure the protection task
once you create it, select the Notify assignee via email check box.
10. Optionally, select tags for the protection task and add any notes.
11. Optionally, select a task due date.
12. Click Save as Task.
Data Privacy Management creates a separate protection task for each selected data store. The tasks
appear on the Tasks list page with a status of New and the extension is Unconfigured. A message
indicates the number of tasks the Data Privacy Management Service saved successfully and the number
of tasks that encountered an error. The service saves successful protection tasks in the Data Privacy
Management repository.
The user assigned to the tasks can now configure protection from the Tasks workspace.
1. On the Overview workspace, click the Top Data Stores indicator label.
The Data Stores page appears.
2. Click Grid in the left column of the page.
The Data Stores page displays the list of data stores.
3. Click a value in the Sensitive Fields column.
The Sensitive Fields page appears.
4. From the Actions menu, select Take Action > Protect Data.
Note: The Protect Data option might be disabled for one of the following reasons: you do not have
privileges to create a protection task, a compatible protection extension does not exist for the data
stores that contain the sensitive fields, or the data store has not been scanned and has a value of Not
Analyzed in the Protection Status column.
The Run a Protection Task page displays on the Tasks workspace. The Scope pane lists the data stores
and data domains associated with the sensitive fields, which are all selected by default. The Scope
Preview pane provides options for grouping the list of sensitive fields so that you can quickly view the
current protection status of the data stores and data domains that will be included in the protection task,
as well as the number of sensitive fields in each data store and data domain.
The following image shows this view after selecting a sensitive field number on the Sensitive Fields
page:
5. Optionally, in the Data Stores field, deselect the check box next to any associated data stores for which
you do not want to create a protection task.
6. Optionally, in the Data Domains field, deselect the check box next to any associated data domains you
do not want to protect.
7. Optionally, add a description for the task.
8. Select one of the following task assignments:
Related Topics:
• “Security Dashboard Overview” on page 510
• “Sensitive Fields Page” on page 546
• “Top Data Stores Indicator” on page 522
1. On the Overview workspace, click the Top Data Domains indicator label.
The Data Domains list page appears.
2. Select the check box next to one or more data domains.
3. From the Actions menu, select Take Action > Protect Data.
Note: The Protect Data option might be disabled for one of the following reasons: you do not have
privileges to create a protection task, a compatible protection extension does not exist for the data
stores in the selected data domains, or the data store has not been scanned and has a value of Not
Analyzed in the Protection Status column.
The Run a Protection Task page displays on the Tasks workspace. The Scope pane lists the data stores
and data domains associated with the sensitive fields, which are all selected by default. The Scope
Preview pane provides options for grouping the list of sensitive fields so that you can quickly view the
current protection status of the data stores and data domains that will be included in the protection task,
as well as the number of sensitive fields in each data store and data domain.
4. The Data Domains field lists the data domain or domains with sensitive fields that you selected to
protect. Optionally, if you selected more than one data domain, deselect any data domains you do not
want to include in the protection task.
5. Optionally, in the Data Stores field, deselect the check box next to any associated data stores for which
you do not want to create a protection task.
6. Optionally, add a description for the task.
7. Select one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
8. To attach the data domain details to the protection task, select the Add Context to Task Details as
Attachment check box.
9. To indicate that Data Privacy Management will notify the user assigned to configure the protection task
once you create it, select the Notify assignee via email check box.
10. Optionally, select tags for the protection task and add any notes.
11. Optionally, select a task due date.
12. Click Save as Task.
Data Privacy Management creates a separate protection task for each data store. The tasks appear on
the Tasks list page with a status of New and the extension is Unconfigured. A message indicates the
number of tasks the Data Privacy Management Service saved successfully and the number of tasks that
encountered an error. The service saves successful protection tasks in the Data Privacy Management
repository.
The user assigned to the tasks can now configure protection from the Tasks workspace.
1. On the Overview workspace, click the Top Data Stores indicator label.
The Data Stores page appears.
2. Click Grid in the left column of the page.
The Data Stores page displays the list of data stores.
3. Select the check box next to one or more data stores.
4. From the Actions menu, select Take Action > Protect Data.
Note: The Protect Data option might be disabled for one of the following reasons: you do not have
privileges to create a protection task, a compatible protection extension does not exist or is not
supported for the selected data stores, or the data store has not been scanned and has a value of Not
Analyzed in the Protection Status column.
The Run a Protection Task page displays on the Tasks workspace. The Scope pane lists the data stores
and data domains associated with the sensitive fields, which are all selected by default. The Scope
Preview pane provides options for grouping the list of sensitive fields so that you can quickly view the
current protection status of the data stores and data domains that will be included in the protection task,
as well as the number of sensitive fields in each data store and data domain.
The following image shows this view after selecting five data stores that include two data domains on
the Top Data Stores grid page:
5. Optionally, in the Data Stores field, deselect the check box next to any data stores for which you do not
want to create a protection task.
6. Optionally, in the Data Domains field, deselect the check box next to any associated data domains you
do not want to protect.
7. Optionally, add a description for the task.
8. Select one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
Related Topics:
• “Security Dashboard Overview” on page 510
• “Top Data Stores Indicator” on page 522
Create a new action or select an existing custom action to use as a template for running the script. When you
create a new action, specify the custom extension and all required properties. When you select an existing
custom action, the extension and other properties such as placeholders appear automatically. Include a
description of the action, which can contain placeholders. For more information, see “Placeholders in
Custom, Email, Service Management, and System Log Actions” on page 380.
You can edit the parameter and specify how Data Privacy Management will run and close the task. You
can also add tags and other notes, and edit the due date.
Important: Data Privacy Management can only perform actions or create tasks for anomalies that are
associated with at least one data store. Verify that each anomaly lists at least one data store in the Data
Store column before running a custom script for the anomaly.
1. Click the Anomalies icon in the Data Privacy Management header. The following image shows the
Anomalies icon:
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
6. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
7. Optionally, enter or edit a parameter for the action. You can add placeholders. For more information, see
“Placeholders in Custom, Email, Service Management, and System Log Actions” on page 380.
8. Optionally, select the Manually close task after job completion check box to indicate that the user
assigned to the task will mark the task as closed on the Tasks workspace after the task job is
successful. If you do not select this check box, Data Privacy Management will automatically update the
task status to Closed after the job completes successfully.
9. Optionally, select the Execute separate action for each data store check box to indicate that Data
Privacy Management will create separate tasks for each data store that is included in the action when
the script runs.
10. Optionally, select the Add Context to Task Details as Attachment check box to attach the details of the
anomaly to the tasks associated with the action.
Related Topics:
• “Anomaly Detection Workspace” on page 345
1. On the Overview workspace, click a data store name from the Top Data Stores indicator.
The Proliferation page appears.
2. Select the check box next to one or more data stores on the Proliferation page.
3. From the Actions menu, select Take Action > Run a Custom Script.
Note: The Run a Custom Script option might be disabled if you do not have privileges to create a custom
task or you do not have access to the selected data stores.
The Run a Custom Script page displays on the Tasks workspace.
4. Select whether to create a new custom action or to use an existing custom action that is listed on the
Actions workspace.
If you select a new action, you first choose the custom extension for the action and then enter properties
for the action. If you select an existing action, all properties for the action are populated. You can edit
the existing action properties except for the associated extension.
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
6. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
7. Optionally, enter or edit a parameter for the action. You can add placeholders. For more information, see
“Placeholders in Custom, Email, Service Management, and System Log Actions” on page 380.
8. Optionally, select the Manually close task after job completion check box to indicate that the user
assigned to the task will mark the task as closed on the Tasks workspace after the task job is
successful. If you do not select this check box, Data Privacy Management will automatically update the
task status to Closed after the job completes successfully.
9. Optionally, select the Execute separate action for each data store check box to indicate that Data
Privacy Management will create separate tasks for each data store that is included in the action when
the script runs.
10. Optionally, select the Add Context to Task Details as Attachment check box to attach the data store
details to the tasks associated with the action.
11. To indicate that Data Privacy Management will notify the user assigned to the custom task once you
create it, select the Notify assignee via email check box.
12. Optionally, select tags for the custom task and add any notes.
13. Optionally, select a task due date.
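The custom action configured in these steps ultimately runs an executable script file that Data Privacy Management validates when you save the action. The following is a minimal sketch of such a script, assuming that the action parameter, with any placeholders already expanded, is passed to the script as command-line arguments; the log file location and argument handling are illustrative assumptions, not product behavior.

import datetime
import sys

# Hypothetical handler for a Data Privacy Management custom action.
# Assumption: the expanded action parameter arrives as command-line arguments.
def main() -> None:
    parameter = " ".join(sys.argv[1:]) or "<no parameter supplied>"
    timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
    line = f"{timestamp} custom action triggered with parameter: {parameter}\n"
    # Record the run locally so the task assignee can confirm that the script executed.
    with open("/tmp/dpm_custom_action.log", "a", encoding="utf-8") as log:
        log.write(line)
    print(line.strip())

if __name__ == "__main__":
    main()

Keep the script idempotent where possible, because the Execute separate action for each data store option can cause it to run once for every data store included in the action.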
Related Topics:
• “Security Dashboard Overview” on page 510
• “Proliferation Page” on page 539
• “Top Data Stores Indicator” on page 522
1. Click the Security Policy Violations icon in the Data Privacy Management header. The following image
shows the Security Policy Violations icon:
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
6. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
7. Optionally, enter or edit a parameter for the action. You can add placeholders. For more information, see
“Placeholders in Custom, Email, Service Management, and System Log Actions” on page 380.
8. Optionally, select the Manually close task after job completion check box to indicate that the user
assigned to the task will mark the task as closed on the Tasks workspace after the task job is
successful. If you do not select this check box, Data Privacy Management will automatically update the
task status to Closed after the job completes successfully.
9. Optionally, select the Execute separate action for each data store check box to indicate that Data
Privacy Management will create separate tasks for each data store that is included in the action when
the script runs.
10. Optionally, select the Add Context to Task Details as Attachment check box to attach the violation
details to the tasks associated with the action.
11. To indicate that Data Privacy Management will notify the user assigned to the custom task once you
create it, select the Notify assignee via email check box.
12. Optionally, select tags for the custom task and add any notes.
13. Optionally, select a task due date.
Related Topics:
• “Security Policy Violations Workspace” on page 462
1. On the Overview workspace, click the Top Data Stores indicator label.
The Data Stores page appears.
2. Click Grid in the left column of the page.
The Data Stores page displays the list of data stores.
3. Click a value in the Sensitive Fields column.
The Sensitive Fields page appears.
4. From the Actions menu, select Take Action > Run a Custom Script.
Note: The Run a Custom Script option might be disabled if you do not have privileges to create a custom
task or you do not have access to the data store that contains the sensitive fields.
The Run a Custom Script page displays on the Tasks workspace.
5. Select whether to create a new custom action or to use an existing custom action that is listed on the
Actions workspace.
If you select a new action, you first choose the custom extension for the action and then enter properties
for the action. If you select an existing action, all properties for the action are populated. You can edit
the existing action properties except for the associated extension.
6. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
7. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
8. Optionally, enter or edit a parameter for the action. You can add placeholders. For more information, see
“Placeholders in Custom, Email, Service Management, and System Log Actions” on page 380.
9. Optionally, select the Manually close task after job completion check box to indicate that the user
assigned to the task will mark the task as closed on the Tasks workspace after the task job is
successful. If you do not select this check box, Data Privacy Management will automatically update the
task status to Closed after the job completes successfully.
10. Optionally, select the Execute separate action for each data store check box to indicate that Data
Privacy Management will create separate tasks for each data store that is included in the action when
the script runs.
11. Optionally, select the Add Context to Task Details as Attachment check box to attach the sensitive field
details to the tasks associated with the action.
12. To indicate that Data Privacy Management will notify the user assigned to the custom task once you
create it, select the Notify assignee via email check box.
13. Optionally, select tags for the custom task and add any notes.
14. Optionally, select a task due date.
Related Topics:
• “Security Dashboard Overview” on page 510
• “Sensitive Fields Page” on page 546
• “Top Data Stores Indicator” on page 522
Running a Custom Script from the Top Data Domains List Page
You can run a custom script from the Top Data Domains list page.
1. On the Overview workspace, click the Top Data Domains indicator label.
The Data Domains list page appears.
2. Select the check box next to one or more data domains.
3. From the Actions menu, select Take Action > Run a Custom Script.
Note: The Run a Custom Script option might be disabled if you do not have privileges to create a custom
task or you do not have access to the data stores included in the selected data domains.
The Run a Custom Script page displays on the Tasks workspace.
4. Select whether to create a new custom action or to use an existing custom action that is listed on the
Actions workspace.
If you select a new action, you first choose the custom extension for the action and then enter properties
for the action. If you select an existing action, all properties for the action are populated. You can edit
the existing action properties except for the associated extension.
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
6. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
7. Optionally, enter or edit a parameter for the action. You can add placeholders. For more information, see
“Placeholders in Custom, Email, Service Management, and System Log Actions” on page 380.
8. Optionally, select the Manually close task after job completion check box to indicate that the user
assigned to the task will mark the task as closed on the Tasks workspace after the task job is
successful. If you do not select this check box, Data Privacy Management will automatically update the
task status to Closed after the job completes successfully.
9. Optionally, select the Execute separate action for each data store check box to indicate that Data
Privacy Management will create separate tasks for each data store that is included in the action when
the script runs.
10. Optionally, select the Add Context to Task Details as Attachment check box to attach the data domain
details to the tasks associated with the action.
11. To indicate that Data Privacy Management will notify the user assigned to the custom task once you
create it, select the Notify assignee via email check box.
12. Optionally, select tags for the custom task and add any notes.
13. Optionally, select a task due date.
Running a Custom Script from the Top Data Stores Grid Page
You can run a custom script from the Top Data Stores grid page.
1. On the Overview workspace, click the Top Data Stores indicator label.
The Data Stores page appears.
2. Click Grid in the left column of the page.
The Data Stores page displays the list of data stores.
3. Select the check box next to one or more data stores.
4. From the Actions menu, select Take Action > Run a Custom Script.
Note: The Run a Custom Script option might be disabled if you do not have privileges to create a custom
task or you do not have access to the selected data stores.
The Run a Custom Script page displays on the Tasks workspace.
5. Select whether to create a new custom action or to use an existing custom action that is listed on the
Actions workspace.
If you select a new action, you first choose the custom extension for the action and then enter properties
for the action. If you select an existing action, all properties for the action are populated. You can edit
the existing action properties except for the associated extension.
The following image shows the Run a Custom Script page after selecting five data stores that each have
a residual risk value greater than $30k:
Related Topics:
• “Security Dashboard Overview” on page 510
• “Top Data Stores Indicator” on page 522
Sending an Email
You can create an email task on demand when you identify an issue in Data Privacy Management that
requires a notification email to be sent.
Create a new action or select an existing email action to use as a template for sending the email. When you
create a new action, specify the email extension and all required properties. When you select an existing
email action, the extension and other properties such as placeholders appear automatically. Include a
description of the action, which can contain placeholders. For more information, see “Placeholders in
Custom, Email, Service Management, and System Log Actions” on page 380.
Important: Data Privacy Management can only perform actions or create tasks for anomalies that are
associated with at least one data store. Verify that each anomaly lists at least one data store in the Data
Store column before sending an email for the anomaly.
1. Click the Anomalies icon in the Data Privacy Management header. The following image shows the
Anomalies icon:
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
6. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
7. In the From field, enter or edit the sender email address.
8. In the To field, enter or edit the recipient email address. You can enter multiple addresses separated by a
comma.
9. Optionally, enter email addresses for recipients to be copied on the email in the Cc and Bcc fields.
10. In the Subject field, enter or edit a subject for the email.
11. Enter or edit the email message contents. You can add placeholders. For more information, see
“Placeholders in Custom, Email, Service Management, and System Log Actions” on page 380.
12. Optionally, select the Export selected items as email attachment check box to attach a CSV file to the
email that contains the anomaly details.
13. Optionally, add attachments to the email.
14. Optionally, select the Manually close task after job completion check box to indicate that the user
assigned to the task will mark the task as closed on the Tasks workspace after the task job is
successful. If you do not select this check box, Data Privacy Management will automatically update the
task status to Closed after the job completes successfully.
Related Topics:
• “Anomaly Detection Workspace” on page 345
1. On the Overview workspace, click a data store name from the Top Data Stores indicator.
The Proliferation page appears.
2. Select the check box next to one or more data stores on the Proliferation page.
3. From the Actions menu, select Take Action > Send an Email.
Note: The Send an Email option might be disabled if you do not have privileges to create an email task or
you do not have access to the selected data stores.
The Send an Email page displays on the Tasks workspace.
4. Select whether to create a new email action or to use an existing email action that is listed on the
Actions workspace.
If you select a new action, you first choose the email extension for the action and then enter properties
for the action. If you select an existing action, all properties for the action are populated. You can edit
the existing action properties except for the associated extension.
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
6. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
7. In the From field, enter or edit the sender email address.
8. In the To field, enter or edit the recipient email address. You can enter multiple addresses separated by a
comma.
9. Optionally, enter email addresses for recipients to be copied on the email in the Cc and Bcc fields.
10. In the Subject field, enter or edit a subject for the email.
11. Enter or edit the email message contents. You can add placeholders. For more information, see
“Placeholders in Custom, Email, Service Management, and System Log Actions” on page 380.
12. Optionally, select the Export selected items as email attachment check box to attach a CSV file to the
email that contains the data store details.
13. Optionally, add attachments to the email.
Related Topics:
• “Security Dashboard Overview” on page 510
• “Proliferation Page” on page 539
• “Top Data Stores Indicator” on page 522
1. Click the Security Policy Violations icon in the Data Privacy Management header. The following image
shows the Security Policy Violations icon:
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
6. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
7. In the From field, enter or edit the sender email address.
8. In the To field, enter or edit the recipient email address. You can enter multiple addresses separated by a
comma.
9. Optionally, enter email addresses for recipients to be copied on the email in the Cc and Bcc fields.
10. In the Subject field, enter or edit a subject for the email.
11. Enter or edit the email message contents. You can add placeholders. For more information, see
“Placeholders in Custom, Email, Service Management, and System Log Actions” on page 380.
Related Topics:
• “Security Policy Violations Workspace” on page 462
1. On the Overview workspace, click the Top Data Stores indicator label.
The Data Stores page appears.
2. Click Grid in the left column of the page.
The Data Stores page displays the list of data stores.
3. Click a value in the Sensitive Fields column.
The Sensitive Fields page appears.
4. From the Actions menu, select Take Action > Send an Email.
Note: The Send an Email option might be disabled if you do not have privileges to create an email task or
you do not have access to the data store that contains the sensitive fields.
The Send an Email page displays on the Tasks workspace.
5. Select whether to create a new email action or to use an existing email action that is listed on the
Actions workspace.
6. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
7. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
8. In the From field, enter or edit the sender email address.
9. In the To field, enter or edit the recipient email address. You can enter multiple addresses separated by a
comma.
10. Optionally, enter email addresses for recipients to be copied on the email in the Cc and Bcc fields.
11. In the Subject field, enter or edit a subject for the email.
12. Enter or edit the email message contents. You can add placeholders. For more information, see
“Placeholders in Custom, Email, Service Management, and System Log Actions” on page 380.
13. Optionally, select the Export selected items as email attachment check box to attach a CSV file to the
email that contains the sensitive field details.
Related Topics:
• “Security Dashboard Overview” on page 510
• “Sensitive Fields Page” on page 546
• “Top Data Stores Indicator” on page 522
1. On the Overview workspace, click the Top Data Domains indicator label.
The Data Domains list page appears.
2. Select the check box next to one or more data domains.
3. From the Actions menu, select Take Action > Send an Email.
Note: The Send an Email option might be disabled if you do not have privileges to create an email task or
you do not have access to the data stores included in the selected data domains.
The Send an Email page displays on the Tasks workspace.
4. Select whether to create a new email action or to use an existing email action that is listed on the
Actions workspace.
If you select a new action, you first choose the email extension for the action and then enter properties
for the action. If you select an existing action, all properties for the action are populated. You can edit
the existing action properties except for the associated extension.
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
6. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
7. In the From field, enter or edit the sender email address.
8. In the To field, enter or edit the recipient email address. You can enter multiple addresses separated by a
comma.
9. Optionally, enter email addresses for recipients to be copied on the email in the Cc and Bcc fields.
10. In the Subject field, enter or edit a subject for the email.
11. Enter or edit the email message contents. You can add placeholders. For more information, see
“Placeholders in Custom, Email, Service Management, and System Log Actions” on page 380.
12. Optionally, select the Export selected items as email attachment check box to attach a CSV file to the
email that contains the data domain details.
13. Optionally, add attachments to the email.
1. On the Overview workspace, click the Top Data Stores indicator label.
The Data Stores page appears.
2. Click Grid in the left column of the page.
The Data Stores page displays the list of data stores.
3. Select the check box next to one or more data stores.
4. From the Actions menu, select Take Action > Send an Email.
Note: The Send an Email option might be disabled if you do not have privileges to create an email task or
you do not have access to the selected data stores.
The Send an Email page displays on the Tasks workspace.
5. Select whether to create a new email action or to use an existing email action that is listed on the
Actions workspace.
If you select a new action, you first choose the email extension for the action and then enter properties
for the action. If you select an existing action, all properties for the action are populated. You can edit
the existing action properties except for the associated extension.
6. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
7. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
8. In the From field, enter or edit the sender email address.
9. In the To field, enter or edit the recipient email address. You can enter multiple addresses separated by a
comma.
10. Optionally, enter email addresses for recipients to be copied on the email in the Cc and Bcc fields.
11. In the Subject field, enter or edit a subject for the email.
12. Enter or edit the email message contents. You can add placeholders. For more information, see
“Placeholders in Custom, Email, Service Management, and System Log Actions” on page 380.
13. Optionally, select the Export selected items as email attachment check box to attach a CSV file to the
email that contains the data store details.
14. Optionally, add attachments to the email.
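Data Privacy Management sends the message through the email extension when the task runs; the following sketch only illustrates the equivalent operation of building a message with a From address, comma-separated To recipients, Cc recipients, and a CSV attachment. The SMTP host, addresses, and file name are assumptions for the example.

import smtplib
from email.message import EmailMessage
from pathlib import Path

msg = EmailMessage()
msg["From"] = "dpm-alerts@example.com"
msg["To"] = "owner1@example.com, owner2@example.com"  # multiple addresses separated by a comma
msg["Cc"] = "security-team@example.com"
msg["Subject"] = "Sensitive data stores require review"
msg.set_content("Please review the attached data store details.")

# Equivalent of the Export selected items as email attachment option: attach a CSV file.
csv_path = Path("datastore_details.csv")
msg.add_attachment(csv_path.read_text(encoding="utf-8"), subtype="csv", filename=csv_path.name)

with smtplib.SMTP("smtp.example.com") as server:
    server.send_message(msg)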
Related Topics:
• “Security Dashboard Overview” on page 510
• “Top Data Stores Indicator” on page 522
Create a new action or select an existing system log action to use as a template for writing the message.
When you create a new action, specify the system log extension and all required properties. When you select
an existing system log action, the extension and other properties such as placeholders appear automatically.
Include a description of the action, which can contain placeholders. For more information, see “Placeholders
in Custom, Email, Service Management, and System Log Actions” on page 380.
Optionally, edit the log level and specify how Data Privacy Management will run and close the task. You can
also add tags and other notes, and edit the due date.
Important: Data Privacy Management can only perform actions or create tasks for anomalies that are
associated with at least one data store. Verify that each anomaly lists at least one data store in the Data
Store column before writing a system log message for the anomaly.
1. Click the Anomalies icon in the Data Privacy Management header. The following image shows the
Anomalies icon:
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
1. On the Overview workspace, click a data store name from the Top Data Stores indicator.
The Proliferation page appears.
2. Select the check box next to one or more data stores on the Proliferation page.
3. From the Actions menu, select Take Action > Write a System Log Message.
Note: The Write a System Log Message option might be disabled if you do not have privileges to create a
system log task or you do not have access to the selected data stores.
The Write a System Log Message page displays on the Tasks workspace.
4. Select whether to create a new system log action or to use an existing system log action that is listed on
the Actions workspace.
If you select a new action, you first choose the system log extension for the action and then enter
properties for the action. If you select an existing action, all properties for the action are populated. You
can edit the existing action properties except for the associated extension.
The following image shows the Write a System Log Message page after selecting three data stores and
an existing system log action:
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
6. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
Related Topics:
• “Security Dashboard Overview” on page 510
• “Proliferation Page” on page 539
• “Top Data Stores Indicator” on page 522
1. Click the Security Policy Violations icon in the Data Privacy Management header. The following image
shows the Security Policy Violations icon:
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
1. On the Overview workspace, click the Top Data Stores indicator label.
The Data Stores page appears.
2. Click Grid in the left column of the page.
The Data Stores page displays the list of data stores.
3. Click a value in the Sensitive Fields column.
The Sensitive Fields page appears.
4. From the Actions menu, select Take Action > Write a System Log Message.
Note: The Write a System Log Message option might be disabled if you do not have privileges to create a
system log task or you do not have access to the selected data stores.
The Write a System Log Message page displays on the Tasks workspace.
5. Select whether to create a new system log action or to use an existing system log action that is listed on
the Actions workspace.
If you select a new action, you first choose the system log extension for the action and then enter
properties for the action. If you select an existing action, all properties for the action are populated. You
can edit the existing action properties except for the associated extension.
The following image shows the Write a System Log Message page after selecting a data store that has
32 sensitive fields, 30 of which are unprotected, and a risk score of 1:
6. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
7. Select or change one of the following task assignments:
Related Topics:
• “Security Dashboard Overview” on page 510
• “Sensitive Fields Page” on page 546
• “Top Data Stores Indicator” on page 522
1. On the Overview workspace, click the Top Data Domains indicator label.
The Data Domains list page appears.
2. Select the check box next to one or more data domains.
3. From the Actions menu, select Take Action > Write a System Log Message.
Note: The Write a System Log Message option might be disabled if you do not have privileges to create a
system log task or you do not have access to the data stores in the selected data domains.
The Write a System Log Message page displays on the Tasks workspace.
4. Select whether to create a new system log action or to use an existing system log action that is listed on
the Actions workspace.
If you select a new action, you first choose the system log extension for the action and then enter
properties for the action. If you select an existing action, all properties for the action are populated. You
can edit the existing action properties except for the associated extension.
The following image shows the Write a System Log Message page after selecting 12 data domains that
include 26 data stores:
5. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
6. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
7. Optionally, enter or edit the contents of the system log message that Data Privacy Management will send
to the remote system log server configured in the extension. You can add placeholders. For more
information, see “Placeholders in Custom, Email, Service Management, and System Log Actions” on page 380.
Writing a System Log Message from the Top Data Stores Grid Page
You can write a system log message to a remote server from the Top Data Stores grid page.
1. On the Overview workspace, click the Top Data Stores indicator label.
The Data Stores page appears.
2. Click Grid in the left column of the page.
The Data Stores page displays the list of data stores.
3. Select the check box next to one or more data stores.
4. From the Actions menu, select Take Action > Write a System Log Message.
6. Optionally, add a description. This description will display under the task number on the Tasks
workspace to identify the task.
7. Select or change one of the following task assignments:
• Assign to Self. Assign the task to yourself.
• System Assigned. Data Privacy Management assigns the task to the data store owner. If a data store
owner is not defined, Data Privacy Management assigns the task to you.
• Selected. Select a user from the list.
8. Optionally, enter or edit the contents of the system log message that Data Privacy Management will send
to the remote system log server configured in the extension. You can add placeholders. For more
information, see “Placeholders in Custom, Email, Service Management, and System Log Actions” on page
380.
9. Optionally, select one of the following log levels for the message:
• Alert (1)
• Critical (2)
• Debug (7)
• Emergency (0)
• Error (3)
• Informational (6)
• Notice (5)
• Warning (4)
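The numeric values shown with each log level follow the standard syslog severity scale, where Emergency (0) is the most severe and Debug (7) is the least. Data Privacy Management sends the message through the system log extension; the following sketch only illustrates the equivalent operation with Python's standard syslog handler, and the server host and port are assumptions.

import logging
from logging.handlers import SysLogHandler

logger = logging.getLogger("dpm-example")
logger.setLevel(logging.DEBUG)
# Assumed remote syslog server; UDP port 514 is the conventional default.
logger.addHandler(SysLogHandler(address=("syslog.example.com", 514)))

# logging.CRITICAL maps to the syslog "critical" (2) severity; choose the call
# that corresponds to the log level you selected for the task.
logger.critical("Protection status changed for data store HR_PROD")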
Related Topics:
• “Security Dashboard Overview” on page 510
• “Top Data Stores Indicator” on page 522
Security Policies
This chapter includes the following topics:
For example, you can create a security policy to detect one of the following situations:
When you create an anomaly, data store, or user activity security policy, you set the severity level of the
violation and specify a rule that defines the violation. Optionally, you can specify an action for Data Privacy
Management to perform in the event of a violation. Data Privacy Management can send emails and run an
executable script that performs a custom action.
Based on the type of security policy you create, Data Privacy Management evaluates anomalies, data stores,
encrypted data, and user activity to determine if the rule in a security policy is met. When the rule criteria for
an anomaly, data store, or user activity policy is met, Data Privacy Management creates a security policy
violation. You can view violations on the Security Policy Violations workspace. When the rule criteria for a
decryption policy is met, Data Privacy Management decrypts the encrypted data.
You view and manage security policies from the Security Policies workspace. To access the workspace, click
Manage > Security Policies from the header.
Security Policy Types
When you create a new security policy on the Security Policies workspace, specify the security policy type
based on the information you want Data Privacy Management to evaluate.
• Anomaly policy. Data Privacy Management evaluates anomalies to determine if an anomaly matches the
conditions in the anomaly policy.
• Data store policy. Data Privacy Management runs the Evaluate Security Policy job to evaluate data stores
that you can access. The job determines if a data store matches the conditions in the data store policy.
• Decryption policy. Data Privacy Management evaluates all data that meets the criteria specified in the
conditions for the policy.
• User activity policy. Data Privacy Management evaluates events that stream to the user activity store to
determine if an event matches the conditions in the user activity policy. It evaluates only the new events
since the previous evaluation.
If an anomaly, data store, or user activity event matches the conditions in the respective security policy, Data
Privacy Management creates a security policy violation. If the conditions specified in a decryption policy are
met, Data Privacy Management decrypts the data that meets the conditions.
1. Edit each production data store and assign a tag such as ProductionDB.
You tag the production data stores so you can prevent the non-production data stores from triggering a
violation.
2. Create a security policy with the following conditions:
• Data store tag equals ProductionDB.
• Data store country equals Canada.
• Risk score percentage is equal to or greater than 20%.
To decrypt data based on conditions such as a user group, you create a new decryption security policy. On
the Define Rule page, add the conditions for decrypting data. For example, you might add some of the
following conditions in the All are true tab:
• One or more data domains or data domain groups. For example, a data domain group named "Employee
Data" includes sensitive data for employees in data domains such as First Name, Last Name, Birthday,
Gender, Social Security Numbers, Address, City, Zip Code, and Salary.
• A user group with authorization to view decrypted data. For example, a user group named "Human
Resources Managers" that needs to view the "Employee Data" in clear text.
Add actions for how Data Privacy Management will decrypt the data. Select Protect Data and specify an
encryption extension. You can send an email to notify users when a policy condition to decrypt data is met.
You can write a system log message to the host defined in the system log extension from protection remote
agent servers.
Decryption security policies do not generate jobs in Data Privacy Management when the actions to decrypt
data run. Instead, Data Privacy Management notifies all protection remote agents of the decryption actions,
unless the security policy includes a condition for data stores associated with protection remote agents. In
this case, Data Privacy Management notifies only the protection remote agents associated with the data
stores specified in the condition.
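Conceptually, a decryption policy amounts to a conditional check that the protection remote agent applies before returning data in the clear. The following sketch only illustrates that decision for the example conditions above; the group and data domain names come from the example, and the decrypt stub stands in for the configured encryption extension rather than any real product API.

def decrypt(value: str) -> str:
    # Stand-in for the encryption extension; real decryption happens in the
    # protection remote agent, not in user code.
    return value.removeprefix("enc:")

def read_field(stored_value: str, data_domain_group: str, user_groups: set[str]) -> str:
    authorized = (data_domain_group == "Employee Data"
                  and "Human Resources Managers" in user_groups)
    return decrypt(stored_value) if authorized else stored_value

print(read_field("enc:74321", "Employee Data", {"Human Resources Managers"}))  # returned in the clear
print(read_field("enc:74321", "Employee Data", {"Finance"}))                   # stays protected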
Related Topics:
• “Security Policy Properties” on page 450
• “Publishing Protection Remote Agents” on page 84
The default workspace view displays a list of security policies. From the list, you can select a security policy
to view and edit details. You can use filters to quickly find the security policies that you want to manage.
To access the Security Policies workspace, click Manage > Security Policies.
The following image shows an example of the Security Policies list page:
1. Security policies
2. Security policy count
3. Clear Filter icon
4. Filter conditions
5. Clear Filter Condition icon
6. Security policy properties
7. Filter icon
8. Actions menu
• Click a security policy name. To edit the properties, click Edit on the Security Policy Details page.
• Select the check box next to a security policy and then select Open or Edit from the Actions menu.
When you create a new security policy, select one of the following options from the Actions > New menu:
• Anomaly Policy
• Data Store Policy
• Decryption Policy
• User Activity Policy
Name
The name is not case sensitive and cannot exceed 255 characters. Security policy names cannot contain
a space or the following special characters: ~!$%^&*()+
Description
Optional. A security policy description that differentiates between similarly named security policies. The
description cannot exceed 255 characters.
Security Policy Group
The name of the security policy group. The default group is the Default security policy group.
For more information, see Chapter 21, “Security Policy Groups” on page 456.
Severity
For anomaly, data store, and user activity policies, indicates the severity level of the security policy in the
event of a violation. Options are:
• Informational
• Low
• Medium
• High
• Critical
Owner
Optional. Individual, group, or organization name that is responsible for the accuracy and integrity of the
security policy. The name is case-sensitive and cannot exceed 100 characters.
Status
• Active
• Inactive. Default.
Last Modified by
Displays the author of the last update to the security policy.
Last Modified on
Rule
Specifies the conditions, condition groups, and actions defined for a security policy.
For example, you want to be alerted when data stores with a sensitivity status of Critical or Restricted
are accessed over the weekend. You can specify the following conditions:
All are true
Data Store Sensitivity - Any of - Critical, Restricted.
Day of week - Any of - Saturday, Sunday.
The following image shows the example rule:
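In code terms, the example rule behaves like the following check. This is a minimal Python sketch of the All are true logic, not the Data Privacy Management rule engine, and the field names are illustrative.

from datetime import datetime

def rule_matches(data_store_sensitivity: str, accessed_at: datetime) -> bool:
    # Both conditions must hold because they sit in the All are true group.
    sensitive = data_store_sensitivity in {"Critical", "Restricted"}
    weekend = accessed_at.strftime("%A") in {"Saturday", "Sunday"}
    return sensitive and weekend

# Access to a Restricted data store on a Saturday would trigger a violation.
print(rule_matches("Restricted", datetime(2022, 4, 2)))  # True; 2022-04-02 is a Saturday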
Actions
Optional. For decryption policies, specifies the decryption method and optionally, email or system log
actions that Data Privacy Management must perform if the conditions in the policy are met. For all other
policy types, specifies the action Data Privacy Management must perform in the event of a security
policy violation.
• Create a Service Management Ticket. For anomaly, data store, or user activity security policies only.
Create a new action or use an existing service management action. Specify the urgency, impact, and
service category. Optionally add file attachments to a ServiceNow ticket, specify the number of days
from now that the action is due, and add tags, notes, and a description, which optionally includes
placeholders. Service management actions also include details about how Data Privacy Management
will run and close associated tasks, and whether to send an email to notify users when associated
tasks are assigned to them. In the event of a security policy violation, the Data Privacy Management
Service runs the service management task and creates the service management ticket for one or
more data stores.
For more information, see “Service Management Action Properties” on page 377.
• Protect Data. For data store, decryption, or user activity security policies only. Optionally, select tags,
specify the number of days until the action is due, add attachments, and enter notes.
For data store or user activity security policies, in the event of a security policy violation, Data Privacy
Management creates a protection task for each data store included in the violation. The new
protection tasks are displayed on the Tasks workspace, where you configure and run the tasks.
When the conditions of a decryption security policy are met, Data Privacy Management dynamically
decrypts data in all protection remote agent data stores by default. To decrypt data in specific data
stores associated with protection remote agents, add a Data Store Name condition to the policy.
Note: You cannot protect data in an anomaly security policy because the violation for this type of
policy might be associated with diverse data stores that cannot be protected by the same protection
extension.
For more information, see “Protecting Data” on page 400.
• Run a Custom Script. For anomaly, data store, or user activity security policies only. Create a new
action or use an existing custom action. Specify parameters, tags, notes, the number of days from
now that the action is due, details about how Data Privacy Management will run and close associated
tasks, and whether to send an email to notify users when associated tasks are assigned to them.
When you save the security policy, Data Privacy Management validates the path and determines
whether the file is executable. In the event of a security policy violation, Data Privacy Management
runs the script file.
For more information, see “Custom Action Properties” on page 374.
• Send an Email. Create a new action or use an existing email action. Assign the task to yourself or to a
different user. Provide an email address in the From field. Provide one or more email addresses in the
To, Cc, and Bcc fields. Separate email addresses with a comma. Optionally, provide a subject and
message, attachments, tags, notes, details about how Data Privacy Management will run and close
associated tasks, and the number of days from now that the action is due. If a security policy
violation occurs, it sends an email to the task assignee and recipient email addresses.
Note: Decryption security policies must use the default email extension.
For more information, see “Email Action Properties” on page 375.
• Write a System Log Message. Create a new action or use an existing system log action. Specify the
server connection details, tags, notes, the number of days from now that the action is due, details
about how Data Privacy Management will run and close associated tasks, and whether to send an
email to notify users when associated tasks are assigned to them. System log actions also include
the log message impact level and the contents of the system log message in a text format, which
optionally includes placeholders.
Note: After you add a system log action to a decryption security policy, you must publish to the
protection remote agents associated with the data stores in the policy.
For more information, see “System Log Action Properties” on page 379.
Note: For anomaly security policies, Data Privacy Management can only perform actions or create tasks
in the event of a violation if the security policy is associated with at least one data store.
For data store security policies only. Specifies when Data Privacy Management runs the Evaluate
Security Policies job, and indicates whether or not it shows only violations from data stores scanned
since the last time the job ran.
Related Topics:
• “Decryption Policy Example” on page 448
• “Publishing Protection Remote Agents” on page 84
Organize security policies in logical groups. For example, you can create security policy groups based on the
following security policy attributes:
You can also create a security policy group by copying an existing security policy group.
Data Privacy Management includes a default security policy group that has all the security policies. When you
create a new security policy, the Data Privacy Management Service adds the security policy to the default
security policy group. To remove a security policy from the default group, edit the security policy.
When you review security policy violations, you can filter the violations by one or more security policy groups.
To access the Security Policy Groups workspace, click Manage > Security Policy Groups.
Security Policy Group List Panel
The security policy group list panel shows all of the security policy groups that match the active filter. By
default, you view the security policy group list panel when you access the security policy groups workspace.
The following image shows an example of the security policy groups workspace with the security policy
group list panel:
The following image shows an example of a security policy group details panel:
Name
The name is not case sensitive and must be unique within the Data Privacy Management repository. The
name cannot exceed 255 characters. The name cannot contain a space or special characters.
Description
Optional. Long description of the security policy group. The description cannot exceed 254 characters.
Use this property to specify unique details to differentiate between similarly named security policy
groups.
Owner
Optional. Individual, group, or organization name that is responsible for the accuracy and integrity of the
security policy group. The name is case-sensitive and cannot exceed 100 characters.
Security Policies
• Security Policies Selected. The button shows the number of security policies in the security policy
group. When you click the button, the list of security policies in the security policy group appears. This
list appears by default when you edit a security policy group.
• Assign Security Policies. The button shows the number of security policies available in the Data
Privacy Management repository. When you click the button, the list of all security policies appears.
This list appears by default when you create a security policy group.
You can add one or more security policies to the security policy group. To add a security policy, select
the check box next to its name.
To quickly find specific security policies from the list, click the security policy filter icon and specify
the filter conditions.
The following image shows a sample Security Policies section with a filter condition enabled:
First, export the security policy groups, security policies, and actions from the instance you want to copy
from. The Data Privacy Management Service downloads a JSON file to your machine. The export file is
named SecurityPolicyGroups.json.
Import the SecurityPolicyGroups.json file to the instance you want to copy the security policy groups,
security policies, and actions to.
A success or error message appears after Data Privacy Management processes the JSON file. To view
details about the import, read the Data Privacy Management log file.
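Because the export is a JSON document, you can inspect it before importing it into the target instance. The schema is not documented here, so the following sketch only confirms that the file parses as JSON and reports its top-level structure; treat it as an illustration rather than a validation of the import format.

import json
from pathlib import Path

export_file = Path("SecurityPolicyGroups.json")
data = json.loads(export_file.read_text(encoding="utf-8"))

if isinstance(data, dict):
    for key, value in data.items():
        size = len(value) if isinstance(value, (list, dict)) else 1
        print(f"{key}: {type(value).__name__} ({size} item(s))")
elif isinstance(data, list):
    print(f"Top-level list with {len(data)} entries")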
For example, you want to detect user activity on fields that contain US social security numbers. You set up a
user activity policy that specifies that user activity must impact sensitive data that match the data domain for
US social security. Data Privacy Management evaluates the stream of user activity events. When a user
activity event impacts fields that contain US social security numbers, Data Privacy Management creates a
violation.
Data Privacy Management lists violations on the Security Policy Violations workspace. In addition, if you
specified an action in the security policy, such as email notification or custom script execution, then Data
Privacy Management performs the action.
You can manage violations. For example, you can mark a violation as read or flag a violation to review at a
later time. You can add and remove tags to a violation. You can delete violations. You can also export
violation details for further analysis.
Click the Security Policy Violations icon on the Overview page to access the Security Policy Violations
workspace.
The information in the workspace changes based on the selected view, time period, and filter conditions. You
can show or hide the violation indicators and violations graph on the workspace.
The violations you see on the Security Policy Violations workspace are violations that involve at least one
data store that you have access to. The workspace includes predefined views that display violation
information in a variety of ways. You can view the data stores, users, and anomalies that triggered the most
number of violations.
To access the Security Policy Violations workspace, click the Security Policy Violations icon in the header.
Section Description
Violations Displays a list of violations based on the selected view, time period, and filter conditions.
For more information, see “Violations List” on page 465 .
Violation Displays the top 25 users, data stores, and security policies that are associated with the most
indicators violations based on the selected time period and filter conditions. Click a row in an indicator to filter
the workspace by the selected item. Click an indicator title to open a view that shows all information
for the indicator category.
For more information, see “Security Policy Violation Indicators” on page 465 .
Violations Displays a trend of violations based on the selected time period and filter conditions. Click an entry in
graph the graph to filter the workspace by the selected item. All time periods are in the UTC time zone.
Filter Filters the information on the workspace to match the filter conditions. When you click the Filter icon,
the filter pane appears with a list of properties. Set the properties to filter information based on
violation ID, flag status, read status, severity level, users, data stores and data store tags, violation
tags, security policies, security policy groups, security policy types, data domains, data store
department, data store country, or data store location. The filter conditions apply to all sections on the
workspace.
When you apply a filter, the workspace displays the active filter conditions. You can clear all filters or
clear specific filter conditions.
Violations Displays key results to help you quickly assess the number and severity of violations across your
summary enterprise. You can view the number of users associated to violations, the aggregated number of data
stores associated to the violations, and the violation count based on severity level.
For more information see, “Security Policy Violations Summary” on page 464 .
Actions From the Actions menu, you can mark a violation as read or unread. You can flag, unflag, or delete a
menu violation. You can add or update tags. You can export information related to violations.
For more information, see “Security Policy Violations Management” on page 476
Display The Views field provides different options to view and display the aggregated violation information.
options You can view violations by users, data stores, or by the security policies.
For more information, see “Security Policy Violations Workspace Views” on page 474 .
Use the Display and time Period fields to Filter information related to violations detected within the
selected time period. All time periods are in the UTC time zone.
Note: For time periods in days, the workspace displays information for full days. A full day is from 0:00 through 23:59 UTC. The time period you select might not include information from the current day. For example, if you select 30 days, the workspace shows violations that were detected within the last 30 days, counting back from the last completed day in the UTC time zone, which is typically yesterday.
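To make the full-day behavior concrete, the following sketch, which assumes Python and one reasonable interpretation of the note above, computes the UTC window that a "last 30 days" selection covers: the window ends at the start of the current UTC day, so the current, incomplete day is excluded.

from datetime import datetime, timedelta, timezone

def full_day_window(days, now=None):
    """Return the (start, end) of a window of N full UTC days (illustrative only)."""
    now = now or datetime.now(timezone.utc)
    # A full day runs from 0:00 through 23:59 UTC, so the window ends at the
    # start of the current UTC day and the current day is excluded.
    end_of_window = datetime(now.year, now.month, now.day, tzinfo=timezone.utc)
    start_of_window = end_of_window - timedelta(days=days)
    return start_of_window, end_of_window

start, end = full_day_window(30)
print(f"Window covers {start:%Y-%m-%d %H:%M} UTC up to, but not including, {end:%Y-%m-%d %H:%M} UTC")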
Users
Aggregated number of users associated with the violations. A user can be associated with multiple violations.
Data Stores
Aggregated number of data stores associated with the violations. A violation can include multiple data stores. A data store can be included in multiple violations.
Note: The data store filter does not affect the data store count. For example, if you filter the workspace by one data store, the data store count might show more than one data store, because a violation that includes the filtered data store can also include other data stores.
Severity
Total number of violations in each severity level. Data Privacy Management uses the severity level of the
security policy as the severity level for the corresponding violation. A violation can only have one severity
level.
Click the corresponding title to open a list of all data stores, users, or security policies that are
associated with at least one violation. For each indicator, the list shows the number of associated users,
security policies, data stores, and violations.
Violation count
Page icons
Click the back and forward arrows to scroll through the entries.
The following image shows an example of the Security Policy Violations workspace indicators:
The Top Users indicator displays the name of the user account that performed the user activity event that is
associated with the violation. The user account name appears based on the source of the user activity
information, such as user activity logs.
Click a user name to filter the workspace data for information related to the selected user.
Note: If you disable User Activity in the Data Privacy Management Service properties, you cannot view
information in the Top Users indicator.
Click a data store row on the Top Data Stores indicator to filter the workspace data for information related to
the selected data store.
Click a security policy row on the Top Security Policies indicator to filter the workspace data for information
related to the selected security policy.
Violations List
The violations list panel includes a list of all violations and the corresponding violation details. The list of violations is dynamic and can change based on the time period and filter conditions that you select in the workspace.
When you select one of the views in the Security Policy Violations workspace, the violations list disappears. Select View By > None to display the violations list again.
By default, the list of violations is sorted by unread status, then by the highest severity level, and then by the
violation timestamp. You can click a column header to sort the list of violations by that column.
Tip: Click the filter icon to filter the list of violations. For example, you can filter the list to show only critical
severity violations.
Property Description
Selection box You can perform actions on one or more violations. Select one, multiple, or all
violations. Then, click the Actions menu and select the action. For example, you can
mark all violations as read or unread.
Violations count Number of violations that involve at least one data store to which you have access.
This number depends on the selected time period and filter conditions.
Read/Unread Read or unread status of a violation. You can mark a violation as read or unread.
Changes in the status affect the sort order of the violations. The list of violations is
first sorted by unread violations. Click the column header to sort by the status. You
can filter the workspace by the status to show only the unread or read violations.
By default, a violation is marked as unread until you mark it as read. For more
information, see “Marking a Violation as Read or Unread” on page 477.
Flag/Unflag Flagged or unflagged status of the violations. Click the flag to change the status. A
clear flag indicates a violation that is not flagged. A filled flag indicates a violation
that is flagged. You might want to flag violations that need further review or
follow-up. Click the column header to sort by flag status. You can use
the flag status as a filter condition for the workspace.
By default, a violation is not flagged until you flag it. The
flag status does not affect the default violations sort order. For more information,
see “Flagging or Unflagging A Security Policy Violation” on page 477.
Violation ID Violation identification number. When Data Privacy Management detects a violation,
it assigns a unique ID to the violation in the following format:
<Abbreviation for violation type>_<Unique number for node
ID(s)>_<Violation date and time in Epoch format>
Violation types have the following abbreviations:
- Anomaly violation: ANV
- Data store violation: DSV
- User activity violation: UAV
An example of a user activity violation ID is: UAV_10078_1480379025542
Violation Date/Time Date and time that Data Privacy Management detected the violation. The date and
time are in the UTC time zone.
Security Policy Name Name of the security policy that triggered the violation. Click on the violation ID to
see details such as the security policy condition that triggered the violation and the
anomaly, data store, or user activity that matched the security policy condition.
User Name Name of the user account that performed the user activity event associated with the
violation. Data Privacy Management shows the user account name from the source
of the user activity information, such as user activity logs. You can click a user name
to view more details about the user's activity on the User Profile page.
Full Name Full name that is associated with the user account name in the Data Privacy
Management repository. If the full name is null in the Data Privacy Management
repository, the value is a concatenation of the first name and last name. If the first
name and last name are null, the value is the user account name. If the user account is
not in the repository, the value is the user account name.
Data Stores Total number of data stores that match the rule in the security policy that triggered
the violation. Click the violation ID to see the data stores associated with the
violation.
Note: The count includes all of the data stores that are associated with the
violation, regardless of the data stores to which you have access. When you view
the violation details, you see only the names of the data stores to which you have access.
Security Policy Type Specifies whether the violation is based on an anomaly, data store, or user activity
security policy.
Severity Severity level of the violation. Data Privacy Management uses the severity level of
the security policy as the severity level for the corresponding violation. A violation
can have a critical, high, medium, low, or informational severity level.
Refresh Click to refresh the list of violations based on the selected duration and filter
conditions.
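Because the violation ID encodes the violation type, node, and detection time, it lends itself to simple parsing outside the product, for example when you post-process exported violation data. The following sketch, assuming Python and the ID format documented above, splits an ID such as UAV_10078_1480379025542 into its components; the example value suggests the epoch timestamp is in milliseconds, which is treated here as an assumption.

from datetime import datetime, timezone

# Violation-type abbreviations documented in the Violation ID property.
VIOLATION_TYPES = {"ANV": "Anomaly violation",
                   "DSV": "Data store violation",
                   "UAV": "User activity violation"}

def parse_violation_id(violation_id):
    """Split a violation ID such as UAV_10078_1480379025542 into its parts."""
    abbreviation, node_id, epoch = violation_id.split("_")
    # Assumption: the epoch value is in milliseconds, as the sample ID suggests.
    detected_at = datetime.fromtimestamp(int(epoch) / 1000, tz=timezone.utc)
    return {
        "type": VIOLATION_TYPES.get(abbreviation, abbreviation),
        "node_id": node_id,
        "detected_at_utc": detected_at,
    }

parsed = parse_violation_id("UAV_10078_1480379025542")
print(parsed["type"], parsed["node_id"], parsed["detected_at_utc"])
# User activity violation 10078 2016-11-29 00:23:45.542000+00:00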
The following image shows a sample Security Policy Violation details page:
The Security Policy Violations details page includes the following sections:
Property Description
Detail tabs The detail tabs allow you to switch between the following views:
- Violation Details
- Security Policy Details
For more information, see “Violation Details” on page 469 and “Security Policy Details” on page 474 .
Violation ID Violation identification number that Data Privacy Management assigns when a violation is
detected. A higher number indicates a more recent violation.
Actions menu From the Actions menu, you can mark the violation as read or unread. You can flag or unflag the
violation. You can also export details about the violation.
Property Description
Violation ID Violation identification number. When Data Privacy Management detects a violation, it
assigns a unique ID to the violation in the following format:
<Abbreviation for violation type>_<Unique number for node
ID(s)>_<Violation date and time in Epoch format>
Violation types have the following abbreviations:
- Anomaly violation: ANV
- Data store violation: DSV
- User activity violation: UAV
An example of a user activity violation ID is: UAV_10078_1480379025542
Violation Date and Time Date and time that Data Privacy Management detected the violation. The date and time are in the UTC time zone.
Security Policy Name Name of the security policy that triggered the violation. Click on the Security Policy Details
tab to see the conditions that triggered the violation.
User Name Name of the user account that performed the user activity event associated with the violation.
Data Privacy Management shows the user account name from the source of the user activity
information, such as user activity logs. To learn more about a user, click the user name. The
User Profile page appears.
For data store security policies, the user name is not applicable and appears as Not
Applicable.
Data Stores Total number of data stores that are associated with the violation.
Note: The count includes all of the data stores that are associated with the violation,
regardless of the data stores to which you have access. When you view the list of data stores,
you see only the names of the data stores to which you have access.
Security Policy Type Specifies whether the violation is based on an anomaly, data store, or user activity security
policy.
Severity Severity level of the violation. Data Privacy Management uses the severity level of the
security policy as the severity level for the corresponding violation. A violation can have a
critical, high, medium, low, or informational severity level.
Property Description
Anomaly ID Anomaly identification number. Data Privacy Management uses the following syntax to
assign an ID to the anomaly:
AN_<Node ID>_<User hash code>_<Anomaly timestamp in Epoch format>
Severity Severity level of the anomaly. Data Privacy Management automatically determines the
severity level for each anomaly based on an internal algorithm. An anomaly can have a
high, medium, or low severity level.
Anomaly Date and Time Date and time of the anomalous behavior that is associated with the anomaly. This is not
the date and time when Data Privacy Management detected the anomaly. The date and time
are in the UTC time zone.
Anomalous Factors Number of system-defined anomalous factors that are associated with the anomaly. For more
information, see “Anomalous Factors” on page 358.
Anomaly Score A calculated value based on the weight of each anomalous factor specified on the Settings
page. Indicates the degree of risk of the anomaly.
Full Name Full name that is associated with the user account name in the Data Privacy Management
repository. If the full name is null in the Data Privacy Management repository, the value is a
concatenation of the first name and last name. If the first name and last name are null, the
value is the user account name. If the user account is not in the repository, the value is the
user account name.
Title Title that is associated with the user account name in the Data Privacy Management
repository. If the user account is not in the repository, the title appears as -
User Department User department that is associated with the user account name in the Data Privacy
Management repository.
User Group Peer group that Data Privacy Management uses to compare user behavior. When Data
Privacy Management analyzes user activity events for anomalous behavior, it compares the
behavior of the user and the user's peer group to detect an anomaly. It uses the department
associated with the user account name in the repository for the peer group, and adds users
from the same department to the peer group. If the user account is not in the repository, the
group is blank.
User Location Location that is associated with the user account in the Data Privacy Management
repository. This is the location from which the user works, for example, the corporate
office that the user is assigned to.
Note: This location is not necessarily the same location from which the user performs the
user activity events.
Data Stores List of all data stores that are associated with the anomaly. A data store is associated with
an anomaly when at least one user activity event in the anomaly included the data store.
If you do not have authorization for a data store in the list, Data Privacy Management
masks the data store name and displays ?. By default, the list is sorted by data store name
in ascending order.
Data Domains List of all data domains that are associated with an anomaly. A data domain is associated
with an anomaly when the user's activity event in the anomaly accessed sensitive data that
matches the data domain. By default, the list is sorted by data domain name in ascending
order.
Anomalous Factor Details List of anomalous factors associated with the anomaly that triggered the security policy
violation. For each anomalous factor, the list shows the factor name, the qualitative degree of
difference between the observed and expected behaviors, whether the anomalous factor
was suppressed at the time the anomaly was triggered, and the observed and expected
values. An anomaly can include either an unusual or a highly unusual degree of difference.
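The Anomaly Score property described above is calculated from the weight of each anomalous factor configured on the Settings page; the exact algorithm is internal to Data Privacy Management. The following sketch only illustrates the general idea of a weighted score over anomalous factors. The factor names and weights are hypothetical.

# Hypothetical factor weights an administrator might assign on the Settings page.
# These names and numbers are illustrative only; the product's scoring
# algorithm is internal and may differ.
factor_weights = {
    "Unusual time of access": 0.30,
    "Unusual data store": 0.25,
    "Unusual number of records": 0.45,
}

# Factors observed in a particular anomaly (hypothetical).
observed_factors = ["Unusual time of access", "Unusual number of records"]

# One simple way to combine weights into a single risk indicator: sum the
# weights of the factors present and normalize against the total configured weight.
score = sum(factor_weights[f] for f in observed_factors) / sum(factor_weights.values())
print(f"Illustrative anomaly score: {score:.2f}")  # 0.75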
Property Description
Data Store List of data stores associated with violations within the selected duration and filter conditions.
The list includes only the data stores to which you have access. Click a data store to see
proliferation details.
Risk Score The number indicates the risk level of a data store. Data Privacy Management calculates the risk
score using different risk factors. The higher the number, the greater the risk to sensitive data.
Click the value to see the risk factors and their respective weights that Data Privacy Management
used to calculate the risk score.
Protection Status The percentages of sensitive fields in the data store that are protected and unprotected. Green
represents the percentage of protected fields. Red represents the percentage of unprotected
fields.
Targets The number of target data stores for the source data store. Click the value to see the proliferation
path from the source data store to the target data stores.
Sensitive Fields Number of sensitive fields in the data store. Click the value to see the list of sensitive fields.
Risk Cost The cost that the business will incur in the event sensitive data is exposed.
Owner Individual, group, or organization that is responsible for the accuracy and integrity of the data
store.
• User details
• Data store details
• Violation details
User Details
The user details section includes the following properties:
Property Description
User Name Name of the user account that performed the user activity event. Data Privacy Management
shows the user account name from the source of the user activity information, such as user
activity logs.
Full Name Full name that is associated with the user account name in the Data Privacy Management
repository. If the full name is null in the Data Privacy Management repository, the value is a
concatenation of the first name and last name. If the first name and last name are null, the value
is the user account name. If the user account is not in the repository, the value is the user
account name.
Title Title that is associated with the user account name in the Data Privacy Management repository.
If the user account is not in the repository, the title appears as -
Department User department that is associated with the user account name in the repository.
Location The location associated with the user account in the repository. This is the location from which
the user works, for example, the corporate office that the user is assigned to.
Note: The location is not necessarily the same as the location from which the user performs the
user activity event.
User IP The IP address of the device from which the user activity event originated.
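The Full Name fallback rules described in the table above are easy to express in code. The helper below is a hypothetical sketch, not product behavior: it uses the full name from the repository if it exists, otherwise the first and last name, and otherwise the user account name.

def resolve_full_name(account_name, full_name=None, first_name=None, last_name=None):
    """Mirror the Full Name fallback described in the guide (illustrative only)."""
    if full_name:                    # full name stored in the repository
        return full_name
    if first_name or last_name:      # fall back to a concatenation of first and last name
        return " ".join(part for part in (first_name, last_name) if part)
    return account_name              # names are null, or the user is not in the repository

print(resolve_full_name("jdoe", first_name="Jane", last_name="Doe"))  # Jane Doe
print(resolve_full_name("jdoe"))                                      # jdoe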
Property Description
Data Store The name of the data store that the user accessed.
Owner Individual, group, or organization that is responsible for the accuracy and integrity of the data
store associated with the violation.
Department Department that is responsible for the maintenance or ownership of the data store.
Data Store Group Logical grouping of related data stores to which the data store associated with the violation
belongs.
Data Store Type Type of database repository for the data store associated with the violation.
Sensitivity Level A qualitative degree of data sensitivity assigned to the sensitive data associated with the
violation.
Risk Score The number indicates the risk level of a data store. Data Privacy Management calculates the risk
score using different risk factors. The higher the number, the greater the risk to sensitive data.
Risk Cost The cost that the business will incur in the event sensitive data in the data store is exposed.
Data store host or IP address The host or IP address for the data store associated with the violation.
Violation Details
The violation details section includes the following properties:
Property Description
User Activity Date and Time Date and time of the user activity event. The date and time are in the UTC time zone.
Object The names of the objects, such as tables in a relational database, associated with the violation.
Operation Specifies the nature of the request from the user. Values can be Select, Insert, Update, or Delete.
Data Domain Count The number of data domains that match the sensitive fields that were impacted by the user request.
Data Domains The list of data domains that match the sensitive fields that were impacted by the user request. By
default, the list is sorted by the data domain name in ascending order.
Results The number of records or rows of sensitive data that were impacted by the user request.
Error Status Specifies whether the user received an error after submitting the query. Data Privacy Management
determines the error status from the source of the user activity information, such as user activity
logs.
If the user did not receive an error after submitting the query, the value is None. If the user
received an error, the value is the value in the corresponding field in the user activity logs.
Error Details If the user received an error message after submitting the query, displays the error message.
Otherwise, displays None.
Data Privacy Management determines error details from the source of the user activity
information, such as user activity logs.
Event Source The name of the application that streams user activity information to Data Privacy Management.
Property Description
Name Name of the security policy that the violation is based on.
Description Description of the security policy that the violation is based on.
Rules Conditions that the anomaly, data store, or user activity event associated with the violation matched.
The rules property also shows what action Data Privacy Management took after detecting the violation.
Data Stores view
Displays a list of all data stores that have one or more violations. The view only includes the data stores
to which you have access.
Security Policies view
Displays a list of all security policies that have one or more violations.
Users view
Related Topics:
• “User Profile Page” on page 564
You can click a column header on the page to sort the list by the selected column. By default, the list is
sorted by the number of violations in descending order.
Property Description
Data Stores List of data stores that are associated with violations.
Users Aggregated number of users that have violations for the data store.
Security Policies Number of security policies that a violation was based on.
Violations Total number of violations in all severity levels detected for the data store.
The following table lists the properties that appear on the Security Policies view:
Property Description
Data Stores Total number of data stores associated with the security policy.
Users Aggregated number of users that are associated with the security policy.
Security Policy Type Specifies whether the violation is based on an anomaly, data store, or user activity security
policy.
The following table lists the properties that appear on the Users view:
Property Description
Data Stores Total number of data stores associated with violations for the user.
Note: The count includes all of the data stores that are associated with the violation, regardless of
the data stores to which you have access. When you click the count to open the data stores view,
you only see the data stores to which you have access.
Security Policies Total number of security policies that were the basis of violations triggered by the user.
Violations Total number of violations in all severity levels detected for the user.
Severity Total number of violations in the selected severity level detected for the user.
User Name Name of the user account that performed the user activity event that is associated with the
violation.
Full Name Full name that is associated with the user account name in the Data Privacy Management
repository.
Title The title associated with the user account name in the Data Privacy Management repository.
User Department User department that is associated with the user account name in the Data Privacy Management
repository.
• Data and counts that appear correspond to the selected period and filter conditions.
• Data Stores. On the Security Policies and Users views, you can click the count to view a list of data stores
for a selected policy or user.
• Users. On the Data Stores and Security Policies views, you can click the user count to view users related
to violations for a data store or security policy.
• Violations. You can click the violations count to view violation information for a data store, security policy,
or user.
• Severity. On the Data Stores and Users views, you can click the violations count to view severity
information for a data store or user. Data Privacy Management uses the severity level of the security
policy as the severity level for the corresponding violation. A violation can have a critical, high, medium,
low, or informational severity level.
You can perform the following tasks in the Security Policy Violations workspace:
The following image shows the Security Policy Violations icon in the header:
After you flag or unflag a violation, you can sort on the flag status. You can use the status to filter the
workspace. For example, you can filter the workspace to only show information for flagged violations.
Changes to the flag status do not affect the violation sort order in the violations list panel.
Note: When you flag or unflag a violation, Data Privacy Management updates the violation. The change
impacts all users that can see the violation, not just the logged-in user.
1. Click the Security Policy Violations icon in the header to open the Security Policy Violations workspace.
2. Navigate to the violations list panel and perform one of the following steps:
• To flag or unflag one violation, click the icon in the flag status column for the associated violation.
The following image shows an example of the flag status column:
• To flag or unflag multiple violations, select the check box next to the violations. From the Actions
menu, select Flag or Unflag.
After you mark a violation as read or unread, you can use the status to filter the workspace. For example, you
can filter the workspace to only show information for unread violations.
Note: When you mark a violation as read or unread, Data Privacy Management updates the violation. The
change impacts all users that can see the violation, not just the logged-in user.
1. Click the Security Policy Violations icon in the header to open the Security Policy Violations workspace.
2. Navigate to the violations list panel and perform one of the following steps:
• To mark one violation as read or unread, click the icon in the read/unread status column for the
associated violation.
The following image shows an example of the read/unread status column:
To add or remove tags, select the violations. You can use the workspace filter to filter violations based on
tags. If you select multiple violations and then open the Manage Tags page, tags that are common to all
selected violations appear selected in the Tags field. Changes that you make apply to all selected violations.
1. From the list of violations on the Security Policy Violations workspace, select the required violations.
To add and manage tags of a single violation, you can perform the task from the Security Policy
Violation details page.
2. Click Actions > Manage Tags.
The Manage Tags page appears.
3. Click the Tags field to view a list of tags with a text field to create tags.
Tags that are common to selected violations appear selected in the list.
4. Perform one of the following steps:
• To create and apply a tag, enter the tag in the text field and click Add.
The tag is created and applied to the violations.
• To apply a tag from the list of tags, select the tags.
• To remove a tag assigned to a violation, clear the tag selection.
You can remove a tag assignment but cannot delete a tag.
Exporting Violations
You can export information about violations and use the exported file for reporting and analysis. You can export
information from the Security Policy Violations workspace and from each view in the workspace.
When you export information, Data Privacy Management saves a compressed file to the default download
directory on your machine. The compressed file includes CSV files that contain information related to the
page you exported from. The information in the CSV files matches the selected filter conditions and time
periods at the time of the export. You cannot export information for violations that you manually select in the
violation list.
1. Click the Security Policy Violations icon in the header to open the Security Policy Violations workspace.
2. Optionally, set the filter conditions, time period, and view.
3. From the Actions menu, select Export.
A dialog box appears with the exported file. The export file is named Data.zip. Data Privacy Management
saves the file to the default download directory on your machine. If you export the file again, Data
Privacy Management downloads a new file. The file name includes an incremental number such as Data
(1).zip.
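Because the export is a compressed archive of CSV files, you can post-process it with standard tools. The following sketch, assuming Python and a file named Data.zip in your download directory, lists the CSV files in the archive and counts the records in each one. The internal file names depend on the page you exported from, so none are assumed here.

import csv
import io
import zipfile

# Path to the exported archive; adjust to your default download directory.
with zipfile.ZipFile("Data.zip") as archive:
    for name in archive.namelist():
        if not name.lower().endswith(".csv"):
            continue
        with archive.open(name) as member:
            rows = list(csv.reader(io.TextIOWrapper(member, encoding="utf-8")))
        # Assumption: the first row of each CSV file is a header row.
        print(f"{name}: {max(len(rows) - 1, 0)} records")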
Deleting Violations
Warning: The deletion impacts all Data Privacy Management users, not just the logged-in user. After you
delete a violation, it is not possible to restore the violation.
1. Click the Security Policy Violations icon in the header to open the Security Policy Violations workspace.
2. Select the check box next to the violation that you want to delete.
You can select one, multiple, or all violations.
3. From the Actions menu, select Delete.
A message appears to confirm that you want to delete the violation.
Tasks
This chapter includes the following topics:
Tasks Overview
A task performs a specific function depending on the task type and how you configure the task. When a task
runs, Data Privacy Management runs an associated job and updates the task status as the job progresses
and completes successfully or fails.
DSAR tasks create an automated Data Subject Access Request report as a CSV file.
Protection tasks use one of the following protection extensions to protect sensitive and privacy data:
• Encryption
• Persistent Data Masking - Big Data
• Persistent Data Masking - Remote Domain
System tasks use an extension to perform a corresponding action based on how you configure the action
properties. The following task types are system tasks:
Data Privacy Management automatically creates some tasks, such as when actions in a security policy
include tasks and a violation of the security policy occurs. You can also create protection, system, and data
subject tasks manually when you perform manual actions.
Each task type follows a typical workflow as it progresses through status changes and as the associated job
for each task runs.
Tasks Workspace
From the Tasks workspace, you can view and manage custom, data subject, DSAR, email, protection, service
management, and system log tasks.
The Tasks workspace displays a list of tasks and a panel of tabs that displays the number of tasks for the
current user, the number of completed tasks, and the total number of tasks. You can also view details and
properties pages for each task in the list.
To access the Tasks workspace, click the Tasks icon in the header:
To sort the list of tasks by a column other than the task name, click a column header. To filter the list by
specific conditions, click the Filter icon. To see the list of all tasks that are associated with data stores that you can access, click
All Tasks.
1. Task list
2. All Tasks view
3. Tasks icon
4. Filter icon
5. Display Closed Tasks box
6. Actions menu
7. Refresh icon
Task Name Enter a full or partial task name. The workspace displays a list of tasks that match the name or
name fragment.
Task Status Select a task status to view the list of tasks in that status.
Data Store Enter a full or partial data store name. The workspace displays the list of tasks associated with
the data stores that match the name or name fragment.
Data Store Categories Select one or more of the following data store categories: Application, Big Data, Cloud,
Database Management, Data Integration, File Management. To view tasks associated with all
data store categories, select All.
Data Store Types Select one or more of the data store types that Data Privacy Management lists on the Data
Stores workspace. To view tasks associated with all data store types, select All.
Extension Select an extension to view the list of tasks associated with the extension.
Data Domains Select one or more data domains to view the list of tasks associated with the selected
domains. To view tasks associated with all data domains, select All.
Current Task Owner Select a user name to view the list of tasks the user currently owns.
Users Associated with Task Any Time Select one or more user names to view the list of tasks assigned to the users at any point
during the task workflow. To view tasks associated with all users at any time, select All.
Task Tags Select one or more tags to view the list of tasks associated with the tags.
Due Date Click in the field to view the Select Date Range dialog box. From the calendars in the From and
To panes, select a due date range and then click Apply. The workspace displays a list of tasks
with due dates in the range you selected.
To access the Properties page for a DSAR task, click a task name on the Tasks list page.
Property Description
Task Name The name that Data Privacy Management automatically generates when the task is created, in the
following format: <DSAR>-<10-digit number>.
Data Subject The name of the data subject selected on the Subject Registry workspace for creating a DSAR
report.
Assigner Shows whether the Data Privacy Management system or a user assigned the task to the current
owner.
Execution Output Contains a DSAR report as a CSV file attachment. If the task encountered errors, the property also
contains a DSAR error report as a CSV file attachment. Click each file name to download one CSV
file at a time.
Attachments Optional. Includes any attachments that users added to the task.
Notes Includes any notes that the task creator added to the task. To add other notes, type in the text
field and then click the + icon. Data Privacy Management lists the most recent notes at the top of
the display box. The first line contains the name of the user who added the note and the date and
timestamp the user entered the note. The second line contains the note text.
To access the Properties page for a protection, system, or data subject task, click a task name on the Tasks
list page.
The <task name> Properties page appears. The following image shows an example Properties page for the
email task SYS-1547177832:
The protection, system, and data subject task Properties pages include the following properties:
Property Description
Task Name The name that is automatically generated when the task is created, in the following format:
<prefix>-<10-digit number>.
- For tasks configured to run with a custom, email, service management, or system log extension,
the task name prefix is SYS.
- For custom, email, and service management tasks configured for a data subject, the task name
prefix is DSR.
- For tasks configured to run with an encryption or other protection extension, the task name
prefix is PRT.
Description Optional. A description of the task that uses a maximum of 255 characters. Task descriptions help
you find tasks on the Tasks list page. You can edit this property when you edit the task.
Data Stores For protection and system tasks only, lists the data stores included in the task. Each protection
task is associated with only one data store. System tasks can include more than one data store.
Data Subject For data subject tasks only, lists the name of the data subject that displays on the Subject
Registry workspace.
Violations If a protection or system task runs because a violation of a security policy that contains an action
to run the task occurred, this property appears on the task Properties page. The property lists the
security policy violation IDs associated with the task. You can view the violation details on the
Security Policy Violations workspace.
Extension Category Lists one of the following categories for the extension that runs the task: Custom, Email, Protection, Service Management, or System Log.
Extension Lists the name of the extension plugin configured on the Extensions workspace.
Assigner Shows whether the Data Privacy Management system or a user assigned the task to the current
owner.
Notify Assignee Via Email Specifies whether or not Data Privacy Management sends an email to the user assigned to the task
to notify the user that the task is created. Default value is Yes. You can edit this property when you
edit the task.
Due Date The date the task is due. The default value is the current date. You can edit this property when you
edit the task.
Tags Optional. Lists any tags associated with the task. You can edit this property when you edit the
task.
Jobs For tasks that Data Privacy Management has not run, the property is empty. For tasks that have
run, lists the job ID for each Orchestration or Protection job that ran the task. You can click the job
ID to view the Job Details page on the Jobs workspace.
Execution Output For custom, email, protection, and system log tasks, the property is always empty.
For service management tasks that ran successfully, provides a link to the ServiceNow ticket. To
view the ticket that the task created, click the URL.
Attachments Optional. Includes any attachments users added to the task. Also includes a .txt file if the task
creator selected the Add Context to Task Details as Attachment option.
Notes For protection tasks that have run and for all system tasks, contains information about the task
creation. The first line contains the name of the user who created the task and the date and
timestamp the user created the task.
On a second line, protection tasks display the text in the Description property. System tasks that
are not created for a data subject display the following syntax:
CreatedSystemTaskFor<extension category>Action
For protection, system, and data subject tasks, includes any notes that the task creator added.
You can add other notes. Type in the text field and then click the + icon. Data Privacy Management
lists the most recent notes at the top of the display box. The first line contains the name of the
user who added the note and the date and timestamp the user entered the note. The second line
contains the note text.
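As a quick reference, the naming convention described in the Task Name property above can be summarized in a few lines of code. The helper below is a hypothetical sketch that maps an extension category and a data-subject flag to the documented prefix and appends a 10-digit number; it is not a product API.

def task_name(extension_category, number, for_data_subject=False):
    """Build a task name of the form <prefix>-<10-digit number> (illustrative only)."""
    if extension_category == "Protection":
        prefix = "PRT"        # encryption or other protection extensions
    elif for_data_subject:
        prefix = "DSR"        # custom, email, or service management task for a data subject
    else:
        prefix = "SYS"        # custom, email, service management, or system log extension
    return f"{prefix}-{number:010d}"

print(task_name("Email", 1547177832))        # SYS-1547177832 (email task shown above)
print(task_name("Protection", 1547181944))   # PRT-1547181944
# Hypothetical data subject task number:
print(task_name("Email", 1568000000, for_data_subject=True))  # DSR-1568000000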
To access the custom Task Details page, click a custom task name on the Tasks list page. On the page that
appears with the task name at the top, click the Task Details tab.
The following image shows the custom Task Details page for the task SYS-1553680314:
Property Description
Parameter Shows placeholders defined in the custom action associated with the task.
Action Type For data subject custom tasks only. Displays the type of subject request that the task fulfills.
Report For data subject custom tasks only. Indicates whether the task provides attributes only or
attributes and values.
Manually close task after job completion If true, indicates that a user must manually close the task after the associated Orchestration
job completes successfully. If false, Data Privacy Management closes the task after the task
reaches the Job Successful status.
Execute separate action for each data store For system custom tasks only. If true, Data Privacy Management creates a separate task for
each data store included in the action when the custom script runs. If false, Data Privacy
Management creates a single task for all affected data stores when the custom script runs.
Edit Click to edit the task details or properties if the task has a status of New or Job Successful.
Run Click to run the task if the task has a status of New and the task is assigned to you.
To access the DSAR Task Details page, click a DSAR task name on the Tasks list page. On the page that
appears with the task name at the top, click the Task Details tab.
The following image shows the DSAR Task Details page for the completed task DSAR-1568841683:
Property Description
Data Store Lists the data stores that contain privacy data about the data subject in the DSAR report.
Contains a list of the data stores that completed successfully when the task ran and the data
stores that failed when the task ran.
Data Store Admin The user name of the administrator for each data store.
Retry Selected Items For DSAR tasks that are not in progress, click after selecting one or more completed or failed
data stores to run the DSAR task again for the selected data stores and the data subject.
Retry All Failed Items For DSAR tasks that are not in progress, click to run the DSAR task again for the failed data
stores and the data subject.
To access the email Task Details page, click an email task name on the Tasks list page. On the page that
appears with the task name at the top, click the Task Details tab.
Property Description
Bcc The email addresses of blind copied recipients that are not displayed when the task runs.
Subject The subject of the email message that Data Privacy Management sends to the specified
recipients when the task runs.
Message The contents of the email message that Data Privacy Management sends to the specified
recipients when the task runs. The message can contain placeholders.
Export selected items as email attachment If true, indicates that Data Privacy Management will attach a CSV file with the task details to
the email the task sends to the specified recipients.
Attachments Contains any attachments that the task creator added to the task.
Action Type For data subject email tasks only. Displays the type of subject request that the task fulfills.
Report For data subject email tasks only. Indicates whether the task provides attributes only or
attributes and values.
Attach DSAR Report if Available For data subject email tasks only. If true, indicates that Data Privacy Management attaches a
DSAR report, if available, to the email message.
Manually close task after job completion If true, indicates that a user must manually close the task after the associated Orchestration
job completes successfully. If false, Data Privacy Management closes the task after the task
reaches the Job Successful status.
Execute separate action for each data store For system email tasks only. If true, Data Privacy Management creates a separate task for
each data store included in the action when the custom script runs. If false, Data Privacy
Management creates a single task for all affected data stores when the custom script runs.
Edit Click to edit the task details or properties if the task has a status of New or Job Successful.
Run Click to run the task if the task has a status of New and the task is assigned to you.
To access the protection Task Details page, click a protection task name on the Tasks list page. On the page
that appears with the task name at the top, click the Task Details tab.
The following image shows the protection task details for the task PRT-1547181944, which uses a Persistent
Data Masking - Big Data Management protection extension and has a status of New:
Data Store The data store that contains sensitive data that Data Privacy Management protects when
the task runs.
Data Domains The data domains included in the data store that contain sensitive fields that you will
protect when the task runs.
Configure Protection Click to configure the protection rules for sensitive fields if the task has a status of New.
Edit Click to edit task details if the task has a status of Job In Progress, Job Failed, or Closed.
Schedule Protection Job Click to schedule when the task will run if the task has a status of Configured.
View Configuration Click to view the configuration for a protection task in any status except for New.
To access the service management Task Details page, click a service management task name on the Tasks
list page. On the page that appears with the task name at the top, click the Task Details tab.
The following image shows the service management task details for the task SYS-1547179976:
The service management Task Details page includes the following properties:
Property Description
Description A description of the task that appears under the task name on the Tasks list page. The
description can contain placeholders.
Category One of the following categories: Database, Hardware, Inquiry, Network, Software.
Attach context to ServiceNow Ticket For system service management tasks only. If true, indicates that Data Privacy Management
will attach a CSV file with the task details to the ServiceNow ticket that the task creates.
Attachments Contains any attachments that the task creator added to the task.
Wait for ServiceNow Ticket Status Change For data subject service management tasks only. If true, indicates that the ServiceNow
ticket status must change to In Progress before the service management task in Data
Privacy Management can close.
Action Type For data subject service management tasks only. Displays the type of subject request that
the task fulfills.
Report For data subject service management tasks only. Indicates whether the task provides
attributes only or attributes and values.
Attach DSAR Report if Available For data subject service management tasks only. If true, indicates that the task attaches a
DSAR report, if available, to the ServiceNow ticket.
Manually close task after job completion If true, indicates that a user must manually close the task after the associated
Orchestration job completes successfully. If false, Data Privacy Management closes the
task after the task reaches the Job Successful status.
Execute separate action for each data store For system service management tasks only. If true, Data Privacy Management creates a
separate task for each data store included in the action when the custom script runs. If
false, Data Privacy Management creates a single task for all affected data stores when the
custom script runs.
Edit Click to edit the task details or properties if the task has a status of New or Job Successful.
Run Click to run the task if the task has a status of New and the task is assigned to you.
To access the system log Task Details page, click a system log task name on the Tasks list page. On the
page that appears with the task name at the top, click the Task Details tab.
The following image shows the system log task details for the task SYS-1547181540:
Message The contents of the message that the system log task writes to the remote server. The message
can contain placeholders.
Manually close task after job completion If true, indicates that a user must manually close the task after the associated Orchestration
job completes successfully. If false, Data Privacy Management closes the task after the task
reaches the Job Successful status.
Execute separate action for each data store If true, Data Privacy Management creates a separate task for each data store included in the
action when the custom script runs. If false, Data Privacy Management creates a single task
for all affected data stores when the custom script runs.
Edit Click to edit the task details or properties if the task has a status of New or Job Successful.
Run Click to run the task if the task has a status of New and the task is assigned to you.
Task Status
Data Privacy Management tasks change status as they progress through the task workflow. The Tasks list
page shows the status for each task in the Status column.
When protection, system, and data subject tasks run, Data Privacy Management runs an associated job that
you can view and track on the Jobs workspace.
New (Protection, System, and Data Subject tasks). The status of tasks when they first appear on the Tasks workspace.
- For protection tasks, assign a protection extension to the task and configure the task.
- For system and data subject tasks, run the task.
You can also edit the task details and properties before configuring or running the task.
Configured (Protection tasks). The status after you finish configuring a protection task. Schedule a protection job for the task. You can also select one of the following options from the Actions menu:
- Reassign Task
- Upload Attachment
- Mark Task as Completed
- Mark Task as Closed
- Reject Task
- Reconfigure Task
- Edit Task
- Delete Task
- Export
In Progress (DSAR tasks). The status of DSAR tasks when they first appear on the Tasks workspace and are running automatically. Wait for the task to finish with a status of Completed. You can then view the task Properties page and download the DSAR report as a CSV file, or download a DSAR report from the Subject Details page.
Job Scheduled (Protection tasks). The Protection job is scheduled to run at a specified time in the future. Wait until the job runs. You can also select one of the following options from the Actions menu:
- Reassign Task
- Upload Attachment
- Reconfigure Task
- Edit Task
- Export
Job In Progress (Protection, System, and Data Subject tasks). The Protection or Orchestration job for the task is running. Wait until the job finishes. If required, you can pause, stop, or resume a running Protection or Orchestration job on the Jobs workspace.
Job Failed (Protection, System, and Data Subject tasks). The Protection or Orchestration job is terminated and you can no longer run or resume the job on the Jobs workspace. To delete the task, delete the terminated job on the Jobs workspace. Return to the Tasks workspace, select the task, and then delete the task.
Job Successful (Protection, System, and Data Subject tasks). The job associated with running the task completed successfully.
- For system tasks that enable the Manually close task after job completion option, close the task.
- For protection tasks, complete the task.
Completed (DSAR and Protection tasks). For DSAR tasks, Data Privacy Management finished generating the requested DSAR report for a subject. For protection tasks, the Protection job was successful and the task is marked as complete. Close the task. For protection tasks, you can also select one of the following options from the Actions menu:
- Reassign Task
- Reconfigure Task
- Edit Task
- Delete Task
Closed (DSAR, Protection, System, and Data Subject tasks). For system and data subject tasks that do not enable the Manually close task after job completion option, Data Privacy Management automatically closes the task after the job completes successfully. For DSAR and protection tasks, indicates that the task was marked as complete and then closed. From the Actions menu, you can select one of the following options:
- Delete Task
- Rollback (encryption protection tasks only)
When a protection task is closed, Data Privacy Management copies the protection status for the columns in the task to the columns on the data store.
Rejected (Protection tasks). A user rejected the configuration of the protection task. You can select one of the following options from the Actions menu:
- Configure Task
- Reassign Task
- Mark Task as Configured
- Mark Task as Closed
- Mark Task as Completed
- Edit Task
- Delete Task
- Export
Related Topics:
• “Job Management” on page 304
• “Job Types” on page 284
Task Management
You can manage custom, data subject, DSAR, email, protection, service management, and system log tasks
on the Tasks workspace.
You create new protection and system tasks when you take manual action from the following Data Privacy
Management views:
You create data subject tasks that create a service management ticket, run a custom script, or send an email
when you select an option in the Actions > Take Action menu on the Subject Details page.
Data Privacy Management creates new tasks when actions are included in an anomaly, data store, or user
activity security policy and a violation of that policy occurs.
The topics in this section provide instructions for managing data subject, DSAR, protection, and system
tasks.
1. On the Tasks workspace, select the check box next to a task name.
2. Click Actions > Upload Attachment.
3. Navigate to the directory on your machine that contains the file. Select the file and click Open.
The Task Details and Properties page appears and the file is added in the Attachments field. The
attachment remains in the task properties.
4. Optionally, click Add to attach more files.
You can remove an attachment from a task and add a new one. For example, you can remove an existing proof of protection file from a protection task and replace it by
adding a new proof of protection file to the task.
1. Click the Tasks icon in the header to open the Tasks workspace.
The Tasks workspace displays a list of tasks.
2. Click the name of the task for which you want to remove an attachment.
The Task Details and Properties page appears.
3. In the Attachments field, click the X at the top right of the attachment file icon to delete the file.
4. Optionally, click + to attach a new file.
5. Optionally, you can edit, mark the task as closed, or reassign any task type. You can run custom, data
subject, email, service management, or system log tasks. If the task is a protection task, you can view
the task configuration, schedule a protection job for the task, mark the task as completed, or reject the
task.
Before performing these steps, terminate any running jobs for the task.
1. On the Tasks workspace, select the check box next to one or more tasks that you want to delete.
2. Click Actions > Delete Task.
The Confirm Delete prompt asks if you are sure you want to delete the selected tasks.
3. Click Yes, I'm Sure.
The task must be assigned to you and you must have privileges to edit tasks. You cannot edit a task that has
a status of Job In Progress, Job Failed, or Closed.
1. Click the Tasks icon in the header to open the Tasks workspace.
The Tasks workspace displays a list of tasks.
2. To edit a custom task, click the task name to display the Task Details and Properties page. Click Edit.
3. On the Properties view, you can edit the description, the Notify Assignee Via Email check box, the due
date, tags, and add any notes.
4. On the Task Details view, you can edit the parameter and the check boxes that specify how Data Privacy
Management will run and close the task.
For data subject custom tasks, you can also edit the Action Type and Report properties.
5. Click Save.
The task must be assigned to you and you must have privileges to edit tasks. You cannot edit a task that has
a status of Job In Progress, Job Failed, or Closed.
1. Click the Tasks icon in the header to open the Tasks workspace.
The Tasks workspace displays a list of tasks.
2. To edit an email task, click the task name to display the Task Details and Properties page. Click Edit.
3. On the Properties view, you can edit the description, the Notify Assignee Via Email check box, the due
date, tags, and notes.
4. On the Task Details view, you can edit the sender and recipient email addresses, email subject and
message, and the check boxes that specify how Data Privacy Management will run and close the task.
You can also add attachments.
For data subject email tasks, you can also edit the request type, Report property, and the option to attach
a DSAR report.
5. Click Save.
The task must be assigned to you and you must have privileges to edit tasks. You cannot edit a task that has
a status of Job In Progress, Job Failed, or Closed.
1. Click the Tasks icon in the header to open the Tasks workspace.
The Tasks workspace displays a list of tasks.
2. To edit a service management task, click the task name to display the Task Details and Properties page.
Click Edit.
3. On the Properties view, you can edit the description, the Notify Assignee Via Email check box, the due
date, tags, and notes.
4. On the Task Details view, you can edit the short description, description, and the urgency, impact and
category selections. You can also add attachments and edit the check boxes that specify how Data
Privacy Management will run and close the task.
For data subject service management tasks, you can edit the request type, Report property, the option to
wait for a ServiceNow ticket status change, and the option to attach a DSAR report.
5. Click Save.
The task must be assigned to you and you must have privileges to edit tasks. You cannot edit a task that has
a status of Job In Progress, Job Failed, or Closed.
1. Click the Tasks icon in the header to open the Tasks workspace.
The Tasks workspace displays a list of tasks.
2. To edit a system log task, click the task name to display the Task Details and Properties page.
Click Edit.
3. On the Properties view, you can edit the description, the Notify Assignee Via Email check box, the due
date, tags, and notes.
4. On the Task Details view, you can edit the message, log level selection, and the check boxes that specify
how Data Privacy Management will run and close the task. Optionally, you can use placeholders in the
message.
5. Click Save.
Exporting a Task
You can export task information on the Tasks workspace to a CSV file. You can use the information from the
exported file for reporting and analysis.
1. Click the Tasks icon in the header to open the Tasks workspace.
The Tasks workspace displays a list of tasks.
2. Optionally, sort the tasks by extension, number of data stores, due date, or status.
3. From the Actions menu, select Export.
A prompt at the bottom of the Tasks workspace asks if you want to open or save the CSV file. The
default file name is Tasks.csv.
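Once exported, the task list is a plain CSV file that you can process with standard tools. The following sketch assumes Python, a file named Tasks.csv in the working directory, and hypothetical column names ("Task Name", "Status") that may differ in your export.

import csv

# Read the exported task list. The column names used here are assumptions.
with open("Tasks.csv", newline="", encoding="utf-8") as handle:
    tasks = list(csv.DictReader(handle))

# Example: count tasks that are not yet closed.
open_tasks = [task for task in tasks if task.get("Status") != "Closed"]
print(f"{len(open_tasks)} of {len(tasks)} exported tasks are not closed")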
Note: For custom, email, service management, and system log actions, if the Manually close task after job
completion check box is not selected in the action properties, Data Privacy Management will automatically
change the status of the associated task to Closed after the job completes successfully. In this case, you do
not need to perform these steps to close a task.
On the Tasks workspace, mark a task as closed in one of the following ways:
1. Select the check box next to one or more system or data subject tasks that have a status of Job
Successful, or one or more protection tasks that have a status of Completed. From the Actions menu,
select Mark Task as Closed.
2. Click a task name to display the Task Details and Properties page. From the Actions menu, select Mark
Task as Closed.
If no errors occurred, Data Privacy Management updates the task status to Closed. For protection tasks,
Data Privacy Management also copies the protection status for the columns in the task to the columns
on the data store.
By default, the Tasks workspace does not list closed tasks. To view closed tasks, select the Display
Closed Tasks check box.
Reassigning a Task
At any stage of the workflow before a protection, system, or data subject task is closed, you can reassign a
task to another Data Privacy Management user.
1. On the Tasks workspace, select the check box next to one or more tasks you want to reassign.
2. From the Actions menu, select Reassign Task.
3. To assign the task back to the original assigner or the Administrator, select Assigner | Administrator. To
assign the task to another user, select Assign To and then select a user from the list.
4. Optionally, enter notes.
5. Click OK.
System tasks appear with a prefix of SYS. Data subject tasks appear with a prefix of DSR.
The user assigned to the new task runs the task manually.
1. On the Tasks workspace, select the check box next to one or more new custom, data subject, email,
service management, or system log tasks that you want to run.
2. Click Actions > Run Task.
If the task ran successfully, the status changes to Job Successful. You can run the task again or mark
the task as closed.
1. On the Tasks list page, select the check box next to the protection task you want to configure.
2. Select Actions > Configure Task.
3. Select the check box next to one or more sensitive fields to configure for protection. You can select
fields that have the same data type.
Important: To configure protection for sensitive fields, the field data type must be defined. You cannot
apply protection to sensitive fields that do not have a value in the Data Type column. To configure
encryption rules for sensitive fields in Cloudera Hive data stores, the Data Type column must have a
value of string.
The Protection Properties dialog box displays configuration options based on the protection extension
defined in the Apply Protection Extension field.
4. Optionally, to apply the default protection rules to all sensitive fields in the data store based on the
protection extension and the rules specified in data domains, select the top left check box next to the
Schema column and then select Apply default rules from the Actions menu.
A confirmation prompt asks you to confirm that you want to apply default protection rules for all data
domains in the data store. To confirm, click Yes, I'm Sure. Data Privacy Management displays a
message indicating the number of columns for which the default data domain protection rules are
applied.
5. To mark one or more selected sensitive fields as protected, select Mark as protected from the Actions
menu.
6. To mark one or more selected sensitive fields as not protected, select Mark as unprotected from the
Actions menu.
7. To erase the configuration settings you applied before completing the configuration, select Clear configuration from the Actions menu.
8. To manually configure other sensitive fields in the data store, or to configure protection rules that differ
from the default rules defined in data domains, perform the steps that correspond to the protection
extension for the task:
• “Configuring Encryption Rules and Encryption Keys” on page 501
• “Editing Persistent Data Masking Rules” on page 502
When finished, change the protection task status to Configured in one of the following ways:
1. To designate the sensitive field as protected, in the Protection Properties box, toggle the Protected Field
indicator to display Yes.
Related Topics:
• “Decrypting a Closed Encryption Task” on page 503
• “Adding or Updating a Protection Extension on the Data Domain Details Page” on page 74
1. In the Protection Properties dialog box, click the Protected Field value to designate the sensitive field as protected.
The value for the Protected Field changes to Yes.
2. Select the masking rule to apply to the protected field.
Note: If the data domain associated with the field is configured with protection rules, the Masking Rules
list displays only the configured rules, and the Show All Rules check box is cleared by default. If you
select a masking rule from the list and then select the Show All Rules check box, Data Privacy
Management clears the assigned rule and you must select the rule again from the Masking Rules list. If
the data domain associated with the field is not configured with protection rules, the Show All Rules
check box is selected by default.
The following image shows the Protection Properties dialog box for the sensitive field
PASSPORT_NUMBER, which uses the Passport Rule masking rule:
3. Depending on the masking rule you selected in step 2, the Unique Substitution Column field might appear. If it appears, the field is required; select a substitution column from the list.
The following image shows the Protection Properties dialog box for the sensitive field FIRSTNAME,
which uses the First Name Rule masking rule and the FIRSTNAME unique substitution column:
4. Click Save.
1. On the Tasks list page, select the check box next to a protection task that uses an encryption extension
and has a status of Closed.
2. Click Actions > Rollback.
The task must be assigned to you and you must have privileges to edit tasks. You cannot edit a task that has
a status of Job In Progress, Job Failed, or Closed.
1. Click the Tasks icon in the header to open the Tasks workspace.
The Tasks workspace displays a list of tasks.
2. To edit a protection task, click the task name to display the Task Details and Properties page. Click Edit.
3. On the Properties view, you can edit the description, the Notify Assignee Via Email check box, the due
date, tags, and notes.
4. On the Task Details view, you can edit the selections for the data domains to include in the protection
task.
5. Click Save.
1. Click the Tasks icon in the header to open the Tasks workspace.
The Tasks workspace displays a list of tasks.
2. Optionally, click the Refresh icon to display the current task statuses.
3. Mark a protection task as completed in one of the following ways:
• Select the check box next to one or more protection tasks that have a status of Configured, Job
Successful, or Rejected. From the Actions menu, select Mark Task as Completed.
• Click a protection task name to display the task details page. From the Actions menu, select Mark
Task as Completed.
The Confirm Complete Action prompt asks if you are sure you want to mark the selected tasks as complete.
4. Confirm the action at the prompt.
5. Reassign the task to the Data Privacy Management user who will close the task. The system can assign
the task or you can select a user from the list.
6. Optionally, enter any notes.
7. Click OK.
A message indicates if the Data Privacy Management Service marked the task as complete or if an error
occurred. If no errors occurred, Data Privacy Management updates the task status to Completed.
You can add attachments and mark the task as closed.
1. Click the Tasks icon in the header to open the Tasks workspace.
The Tasks workspace displays a list of tasks.
2. Mark a task as configured in one of the following ways:
• Select the check box next to one or more protection tasks. From the Actions menu, select Mark Task
as Configured.
• Click a task name to display the protection Task Details page. Click Mark Configuration as Complete.
Data Privacy Management updates the task status to Configured.
You can now schedule a protection job for the configured task.
Before performing these steps, terminate any running protection jobs for the task. For more information, see
Terminating a Job in the Jobs chapter.
If a protection job fails, you might need to change some task configuration settings. Perform the steps in
“Configuring Protection Tasks” on page 499 to update the protection properties and then perform the
following steps to reconfigure the task:
1. Click the Tasks icon in the header to open the Tasks workspace.
The Tasks workspace displays a list of tasks.
2. Select the check box to the left of the protection task that you want to reconfigure and click Actions >
Reconfigure Task.
You receive a message that indicates the success or failure of the reconfigure action.
Before performing these steps, terminate any running protection jobs for the task. For more information, see
Terminating a Job in the Jobs chapter.
1. Click the Tasks icon in the header to open the Tasks workspace.
The Tasks workspace displays a list of tasks.
2. Reject a protection task in one of the following ways:
• Select the check box next to one or more protection tasks that you want to reject. Click Actions >
Reject Task.
• Click a protection task name to display the Task Details and Properties page. Click Actions > Reject
Task.
The Reject Task page appears.
3. Optionally, reassign the task to another Data Privacy Management user.
4. In the Notes field, enter a reason for rejecting the task.
5. Click OK.
You receive a message that indicates the success or failure of the rejection. If no errors occurred, Data
Privacy Management updates the status of the protection task to Rejected.
Related Topics:
• “Configuring Protection Tasks” on page 499
• “Protection Job” on page 299
1. On the Tasks workspace, schedule a Persistent Data Masking - Big Data protection job in one of the
following ways:
• Select the check box next to a configured protection task that is associated with the Persistent Data
Masking - Big Data (PDM BDE) protection extension. From the Actions menu, select Schedule
Protection Job.
• Click a task name to display the protection Task Details and Properties page. Click Schedule
Protection Job.
1. Click the Tasks icon in the header to open the Tasks workspace.
The Tasks workspace displays a list of tasks.
2. Schedule a Persistent Data Masking - Remote Domain protection job in one of the following ways:
• Select the check box next to a protection task that is associated with the Persistent Data Masking -
Remote Domain protection extension. From the Actions menu, select Schedule Protection Job.
• Click a task name to display the protection Task Details and Properties page. Click Schedule
Protection Job.
The Schedule Protection Job: PRT-<task_number> page displays the settings defined for the protection
extension in the Runtime Configuration pane. You can edit these settings for the protection job.
3. In the Batch Update field, select Yes or No.
If you select Yes, specify a batch size in the Batch Size field.
4. Optionally, edit the selections in the check boxes that indicate whether to disable indexes, constraints,
and triggers.
5. Optionally, select the arrow next to the Advanced Settings label to display the advanced settings.
1. Click the Tasks icon in the header to open the Tasks workspace.
The Tasks workspace displays a list of tasks.
2. Click the protection task name that has a completed configuration you want to view.
The Task Details and Properties page appears.
3. Click View Configuration.
The View Configuration: PRT-<task_number> page appears. You can view the sensitive fields in the data
store and whether or not each column is protected and configured.
Chapter 24
Security Dashboard
Each indicator on the Overview workspace displays high-level information about a key risk factor. Most
dashboard indicators contain links for accessing and managing the metric details from other pages.
For example, click the Top Data Stores indicator title to open the Data Stores page that lists data stores from
highest to lowest risk score. To open the Proliferation page for a data store listed in the indicator, click the
data store name. To open the Sensitive Fields page for a data store listed in the indicator, click the value in
the Fields/Files column.
The following image shows a sample Overview workspace:
You can view the percentage of scanned data stores, the average risk score for sensitive data, the number of
data stores in each protection and sensitivity level, and the estimated cost of unprotected sensitive data.
Click an indicator to view the details of the metric.
• The solid blue section represents the percentage of fully scanned data stores.
• The dotted blue section represents the percentage of partially scanned data stores.
• The gray section represents the percentage of data stores that Data Privacy Management has not
scanned.
• The percentage number to the right of the bar represents the total percentage of data stores that are fully
or partially scanned.
To view the exact percentages of fully or partially scanned data stores, hover over the solid and dotted
sections of the blue bar. To view the list of fully scanned, partially scanned, or unscanned data stores, click
the respective section of the Discovery Bar indicator. The Data Stores workspace opens in a new tab, and
lists the data stores for the scanned status you selected.
To view more details about a data store from the Data Stores list page, click the data store name. The Data
Store Details page appears.
The following image shows the Risk Score indicator on the Overview workspace:
1. Average risk score for the current month. Indicates the average risk score of data stores that Data Privacy Management scanned in the current month. Data Privacy Management calculates risk scores based on the following metrics for sensitive data in each scanned data store:
- Sensitivity level
- Protection status
- Number of sensitive fields
- Number of sensitive records
- Number of targets
- Residual risk cost
- Number of active users
- Number of active and inactive users with activity in Data Privacy Management
2. Risk score range. The minimum and maximum values for the risk score. Defaults are 0 and 100.
3. Trend icon. Indicates the direction of change in the average risk score from the previous month to the current month:
- An up arrow indicates an increase.
- A down arrow indicates a decrease.
- A straight line indicates no change.
4. Change from previous month. Indicates the difference in the average risk score value between the previous month and the current month.
5. Color categories. Indicates the level of the current risk score with 10 squares that are each a shade of green, yellow, orange, and red. Each color square represents 10 units of the risk score range moving left to right. For example, low risk scores between 1-30 are represented by a shade of green, medium risk scores between 40-60 are represented by a shade of yellow, and higher risk scores are represented by a shade of orange or red.
6. Current month color category. Indicates the color category of the risk score for the current month.
7. Line graph. Plots the average risk score for each month over a maximum 12-month period to show the trend change. If there are no scans in the current month, Data Privacy Management reuses the average risk score from the previous month.
You can change the weight of each risk score factor from the Settings workspace. Data Privacy Management
recalculates the risk score of the scanned data stores and refreshes the values on the Risk Score indicator.
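The exact formula that combines these factors into a risk score is not published in this guide, but the configurable factor weights imply a weighted combination. The following minimal Python sketch assumes that each factor has already been normalized to a 0-100 score and that the weights sum to 1; the factor names, scores, and weights shown are illustrative assumptions only, not product values.

# Minimal sketch of a weighted risk score, assuming normalized 0-100 factor scores
# and weights configured on the Settings workspace. All names and numbers are
# illustrative assumptions; the product's actual formula is not documented here.
factor_scores = {            # hypothetical normalized scores for one data store
    "sensitivity_level": 80,
    "protection_status": 60,
    "sensitive_fields": 45,
    "sensitive_records": 70,
    "targets": 30,
    "residual_risk_cost": 90,
    "active_users": 50,
}
weights = {name: 1 / len(factor_scores) for name in factor_scores}  # equal weights as a placeholder

risk_score = sum(factor_scores[name] * weights[name] for name in factor_scores)
print(round(risk_score, 1))  # a value in the 0-100 risk score range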
Related Topics:
• “Adding Custom Risk Score Factors” on page 43
• “Changing Risk Score Factor Weights” on page 42
Data Privacy Management determines the protection status of sensitive fields and files when you scan data
stores and when you import protection status for a data store on the Data Stores workspace.
The following image shows the Protection Status indicator on the Overview workspace.
1. The percentage of discovered sensitive fields and files that are not protected. To view the exact
percentage and number of unprotected sensitive fields and files, hover over the red bar.
2. The percentage of discovered sensitive fields and files that are protected. To view the exact percentage
and number of protected sensitive fields and files, hover over the green bar.
You can define a maximum of five sensitivity levels on the Settings workspace. When you create a
classification policy, you specify a sensitivity level for the policy. If a data store meets the classification
policy criteria during a scan, Data Privacy Management associates the sensitivity level of the classification
policy with the data store.
The following image shows the Sensitivity Level indicator on the Overview workspace:
1. The names of the data sensitivity level classifications that are defined on the Settings workspace.
2. The bar is a visual representation of the percentage of data stores for each sensitivity level, with higher
sensitivity levels in red or orange and lower sensitivity levels in yellow or green. To view the percentage
and number of data stores that matched a classification policy for each sensitivity level, hover over the
bar.
Related Topics:
• “Creating a Classification Policy” on page 226
• “Specifying Data Store Sensitivity Levels” on page 42
The following image shows the Residual Risk indicator on the Overview workspace:
1. Number of sensitive fields/files. The total number of sensitive fields and files for all data stores at the current time.
2. Number of classification policy impressions. The total number of sensitive data impressions for all classification policies at the current time.
Data Privacy Management calculates the number of policy impressions with a formula for each user query that returns sensitive data records. Policy impressions that a user accesses equal the number of sensitive fields a query returns multiplied by the number of rows returned for each sensitive field in the query.
For example, if a user searches a data store for the three sensitive fields SSN, FIRST_NAME, and LAST_NAME, and the search returns 100 rows, Data Privacy Management calculates 300 policy impressions.
3. Total risk. Sum of the cost of unprotected sensitive data in all the data stores in the enterprise. The residual risk cost for each data store is the product of the following factors:
- The cost for one impression of sensitive data that you specified in the classification policy.
- The number of unprotected sensitive impressions in the data store.
The sketch after this list illustrates both calculations.
4. Total risk for previous month. Sum of the cost of unprotected sensitive data in all the data stores in the enterprise for the previous month.
5. Currency and unit of cost. Shows the type of currency and unit of the residual risk cost. For example, the $ sign indicates the United States dollar and M represents units in millions.
6. Trend icon. Indicates the direction of the residual risk cost from the previous month to the current month. The up arrow indicates an increase and the down arrow indicates a decrease in the risk cost. A horizontal line indicates no change from the previous month to the current month.
7. Change from previous month. Indicates the difference in the residual risk between the previous month and the current month.
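The impression and residual risk arithmetic described above can be illustrated with a short Python sketch. Only the multiplication rules come from the descriptions above; the function names and the sample cost figures are assumptions for illustration.

# Illustrative sketch of the impression and residual risk arithmetic described above.
# Function names and sample figures are assumptions, not product values.
def policy_impressions(sensitive_fields_in_query: int, rows_returned: int) -> int:
    # Impressions = sensitive fields returned by a query x rows returned for each field.
    return sensitive_fields_in_query * rows_returned

def residual_risk_cost(cost_per_impression: float, unprotected_impressions: int) -> float:
    # Residual risk for a data store = cost of one impression x unprotected sensitive impressions.
    return cost_per_impression * unprotected_impressions

# Example from the description above: 3 sensitive fields (SSN, FIRST_NAME, LAST_NAME) x 100 rows.
print(policy_impressions(3, 100))          # 300 impressions
# Hypothetical figures: $2 per impression and 500,000 unprotected impressions.
print(residual_risk_cost(2.0, 500_000))    # 1,000,000 (shown on the indicator as $1M)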
Related Topics:
• “Classification Policy Properties” on page 220
• “Changing the Default Risk Cost” on page 45
For information about customizing the indicator display, see “Customizing the Security Dashboard” on page
569.
The following image shows the expanded view of the Sensitive Data by Data Store Groups indicator:
The Sensitive Data by Data Store Groups indicator lists the names of the classification policies and data
store groups with sensitive fields/files that match each classification policy. To view the full name of an
incomplete data store group or classification policy, hover over the name.
Each tile color represents the level of the average risk score for the data stores that match a classification
policy in a data store group. The lower risk scores are shades of green and yellow, and higher risk scores are
shades of orange and red.
To expand or contract the indicator view, click the toggle icon in the top right corner of the indicator.
To view more details on the Sensitive Data by Data Store Group page, click the indicator label.
The indicator shows the number, protection status, risk score, residual risk cost, and sensitivity level of data
stores in each region. By default, the indicator displays the region with the largest number of sensitive data stores.
The following image shows the Sensitive Data for Location indicator on the Overview workspace:
The following table describes the information displayed on the Sensitive Data for Location indicator:
1. Region of the world. The color of the circle represents the risk score level.
4. The aggregate risk score of all the data stores in the region.
5. The number of data stores with at least one sensitive field in the selected region. Click the label to view the Data Stores list page filtered by the data stores included in the selected region.
6. The total residual risk cost of the data stores in the selected region.
8. Risk score graph. Plots the aggregate risk score for each month over a 12-month period. If there are no scans in the current month, Data Privacy Management reuses the aggregate risk score from the previous month.
To view the Sensitive Data Locations page, click the indicator label.
The indicator lists classification policies in descending order according to the number of target data stores.
For each classification policy, the indicator shows the number of target data stores that contain sensitive data that matches the policy.
You can perform the following actions from the Sensitive Data Proliferation indicator:
• To view more details about the classification policies and the proliferation of sensitive data, click the
indicator label.
• To view the exact number of source or target data stores represented by a bar graph, hover over the
source or target bar.
• To view more classification policies with proliferation of sensitive data, click and drag the scroll bar on
the right of the indicator.
The list view of the Top Data Domains indicator lists data domains from highest to lowest number of
sensitive fields.
To view the number of impressions for the past seven days in the UTC time zone, click the Week tab.
To view the number of impressions for the most recent 24-hour period in the UTC time zone, click the Day
tab.
The values in the Data Stores and Fields columns do not change when you change the time period.
You can perform the following actions from the list view:
• To view the Data Domains list page, click the Top Data Domains indicator label.
• To view details about a data domain, click a data domain name or the value in the Fields column.
• To view the list of data stores included in a data domain, click the Data Stores value.
• To view information about users who accessed sensitive records that match classification policies in a
data domain, click the Impressions value.
• To view more top data domains, click and drag the scroll bar to the right of the list.
• To display the Top Data Domains indicator trend view, click the Toggle icon in the top right corner.
The trend view of the Top Data Domains indicator shows the following details:
• The number of classification policy impressions that users accessed for the selected data domain and
time period.
• Below the number of impressions, the percentage and direction of change in impressions between the
previous two time periods. An up arrow indicates an increase, a down arrow indicates a decrease, and a
horizontal line indicates no change in impressions accessed between the two time periods.
• A graphical representation of the number of impressions and the direction of change in impressions over
a specified time period.
• Below the impression trend graph, the protection status of sensitive fields in the data domain as a
percentage from 0-100. The bar graph represents protected fields in green and unprotected fields in red.
To return to the Top Data Domains indicator list view, click the Toggle icon in the top right corner.
The list view of the Top Data Stores indicator lists data stores from highest to lowest risk score. You can
perform the following actions from this view:
• To view the Data Stores Grid page, click the Top Data Stores indicator label.
By default, the protection status view shows details for the first data store in the list view, and each data
store name shows the number of sensitive fields. To view protection status details for another data store,
click a data store name. To view more data stores in the list, click and drag the scroll bar.
The protection status view of the Top Data Stores indicator shows the following details:
• The risk score for each data store as a value from 0-100. If there were no scans of the data store in the
current month, Data Privacy Management reuses the risk score from the previous month.
• Below the risk score, the risk score delta from the previous month as a percentage.
• The risk score graph that plots the monthly risk score for the selected data store for a maximum of 12
months.
• The protection status of sensitive fields in the data store as a percentage from 0-100. The protection
status percentage appears in a bar graph above the data store list. The bar graph represents protected
fields in green and unprotected fields in red.
To return to the Top Data Stores indicator list view, click the Toggle icon in the top right corner.
The list view of the Top Departments indicator lists departments from highest to lowest risk score. For each
department, the indicator lists the name, the number of associated data stores, the average risk score, and
the total risk cost.
You can perform the following actions from the list view:
• To view more details about the top departments, click the indicator label.
• To view details about the data stores associated with a department, click the Data Stores value.
• To display the trend view, click the Toggle icon in the top right corner.
By default, the trend view shows details for the first department in the list view, and each department name
shows the associated risk score. To view details for another department, click a department name.
The trend view of the Top Departments indicator shows the following details:
The indicator also shows the department to which each user belongs, the number of data stores with at least
one sensitive field that each user accessed, and the number of records across each sensitive field that each
user accessed in the time period selected.
The following image shows the list view of the Top Users indicator:
You can perform the following actions on the Top Users indicator list view:
The following image shows the trend view of the Top Users indicator:
You can perform the following actions on the Top Users indicator trend view:
Information appears on the User Access indicator after you run data store scans or import data store details
to identify sensitive data and after you import user information.
By default, the user access metrics show the total number of users with access to sensitive fields or files, as
represented by the circle. The red arc represents the unprotected sensitive fields or files, and the green arc
represents the protected sensitive fields or files. The length of each arc is proportional to the number of
sensitive fields or files in that category. The metrics at the bottom of the indicator show the number of
protected and unprotected sensitive fields and files.
You can perform the following actions on the User Access indicator:
Some Security Dashboard indicators provide links to two or more pages in Data Privacy Management. The
links display as blue values in indicator columns. You can access some pages from more than one indicator.
Data Privacy Management filters the page results according to the type of information the indicator displays,
such as a data store or data domain name.
You can access the following pages from indicators on the Security Dashboard:
You can access the Data Domain Details page in one of the following ways:
Data Store The data store that contains each sensitive field. By default, the page lists the sensitive fields
by data store in alphabetical order.
Object The object or table name that contains each sensitive field.
Protection Extension For protected fields, lists the name of the protection extension.
Tags Lists any tag names associated with the data store.
The following image shows the Data Domain Details page for the Address data domain:
Actions
To view the Proliferation page for a data store, click the data store name.
To organize the list of sensitive fields by a property other than data store name, select one of the following
options from the Group By list:
DataDomainDetails.csv For each data domain, lists the data store name, schema name, object name,
field name, protection status, protection techniques, number of sensitive
fields, and tags.
DataDomainDetailsSummary.csv Lists the information in the summary metrics area of the page: the total
number of data stores, sensitive fields, and protected fields.
The summary metrics in the top right of the page list the total number of data domains and the total number
of data stores, sensitive fields, and sensitive files in the data domains.
The page shows information about each data domain in the following columns:
Protection Status The protection status of sensitive fields in each data domain, represented as a numeric
percentage of protected fields and a bar graph showing the protected percentage in green and
the unprotected percentage in red.
Data Stores The number of source data stores that contain fields that match the data domain.
Targets The number of target data stores that contain fields that match the data domain and that
proliferated from another data store.
Sensitive Files The number of sensitive files across all the fields that match each data domain.
Domain Impressions The total number of classification policy impressions for each data domain.
Activity Stores The number of data stores that contain fields that match the data domain and impressions that
users accessed.
By default, the page lists the data domains by the number of sensitive fields in descending order and displays
information from the past 30 days.
Actions
To view details about a data domain, click a data domain name or click the value in the Sensitive Fields or
Sensitive Files column.
To perform a manual action, select one or more data domains and then select an option from the Actions >
Take Action menu.
To view the list of source or target data stores in a data domain, click the value in the Data Stores or Targets
column. The Data Stores list page appears, filtered by the data domain name.
To view the list of activity stores in a data domain, click the value in the Activity Stores column. The Data
Stores list page appears, filtered by the user activity data domain name.
To view details about user activity events, click the value in the Impressions column. The User Activity page
appears.
To organize the list of data domains by a property other than the number of sensitive fields, click a column
name to sort by that property or select one of the following options from the Group By list:
• Today
• Day (24 hours)
• Week (7 days)
• 60 days
• 90 days
• 1 year
To export the page information, select Export from the Actions menu. Download a ZIP file named Data.zip
to your local machine. Extract the ZIP file to view the following CSV files:
DataDomainDrillDown.csv For each data domain, lists the data domain name and ID. The file also lists
the number of protected and unprotected fields, data stores, targets,
sensitive fields, sensitive files, domain impressions, activity stores, and
impressions.
DataDomainDrillDownSummary.csv Lists the information in the summary metrics area of the page: the total
number of data domains in the list, data stores, sensitive fields, and
sensitive files.
You can access the Data Stores grid page in the following ways:
• On the Overview workspace, click the Top Data Stores indicator label. The page displays the list of top
data stores.
• On the Top Users indicator, click a value in the Data Stores column. The page displays the list of data
stores the user can access.
• On the Sensitive Data Proliferation by Classification Policy page, click a value in the Sources or Targets
column. The page displays the list of source or target data stores that contain sensitive data that matches
the classification policy.
The page shows information about each data store in the following columns:
Data Store The data store names. The icon to the left of the name indicates whether the data store is an
application, big data, cloud, database integration, database management, file management, or
unknown data source.
Data Store Type The type of data store, such as application, big data, cloud, database integration, database
management, file management, or unknown data source. For example, the data stores in the
example image have a data store type of Oracle.
Risk Score The average risk score of the sensitive fields and files in the data store. By default, the page
lists the data stores by the risk score in descending order.
Max Risk Score The maximum risk score of the sensitive fields and files in the data store. Appears if you
configure the dashboard settings.
Protection Status The protection status of sensitive fields in each data store, represented as a numeric
percentage of protected fields and a bar graph showing the protected percentage in green and
the unprotected percentage in red.
Targets The number of target data stores that proliferated from the data store.
Policy Impressions The total number of classification policy impressions for the data store.
Residual Risk The residual risk cost of the unprotected sensitive fields/files in the data store.
The following image shows the data store icon list and the data source type each icon represents:
To create a new risk simulation plan, select one or more data stores and then select Actions > Simulate Risk.
To view proliferation details about a data store, click a data store name or the value in the Targets column.
To view the Risk Score Details window, click the value in the Risk Score column.
To view the list of sensitive fields or sensitive files in a data store, click the value in the Sensitive Fields/Files
column. The Sensitive Fields page appears.
To organize the list of data stores by a property other than the risk score, click a column name to sort by that
property or select one of the following options from the Group By list:
• Department
• Data Owner
• Data Store Group
• Sensitivity Level
• Location
To filter the list, click the Filter icon in the Data Privacy Management header.
To export the page information, select Export from the Actions menu. Download a ZIP file named Data.zip
to your local machine. Extract the ZIP file to view the following CSV files:
DataStores.csv For each data store, lists the data store name, protection status, number of unprotected
fields, number of sensitive fields and records, number of targets, the risk score, residual
risk cost, total risk cost, risk cost unit of measure, severity level, location, department,
data owner, and the data store group.
If you configure the dashboard Risk Score settings to display the maximum risk score,
the column is included in the exported file.
DataStoresSummary.csv Lists the information in the summary metrics area of the page: the total number of data
stores and locations.
You can access the Data Stores list page in one of the following ways:
• On the Top Data Domains indicator, click a value in the Data Stores column. The page displays the list of
data stores included in the data domain.
• On the Top Departments indicator or the Departments page, click a value in the Data Stores column. The
page displays the list of data stores included in the department.
• On the Sensitive Data for Location indicator, click the number of data stores. The page displays the list of
data stores included in the region selected on the location indicator.
• On the Data Domains list page, click a value in the Data Stores or Targets column. The page displays the
list of data stores included in the data domain.
Data Store The data store names. The icon to the left of the name indicates whether the data store is an
application, big data, cloud, database integration, database management, file management, or
unknown data source.
Data Store Type The name of the application, big data, cloud, database integration, database management, file
management, or unknown data store type. For example, the data store applications in the
example image have a data store type of Oracle.
Risk Score The aggregate risk score of the sensitive fields and files in the data store. By default, the page
lists the data stores by the risk score in descending order.
If you configure the dashboard Risk Score settings to display the maximum risk score, the
column is included in the exported file.
Max Risk Score The maximum risk score of the sensitive fields and files in the data store. Appears if you
configure the dashboard settings.
Protection Status The protection status of sensitive fields in each data store, represented as a numeric
percentage of protected fields and a bar graph showing the protected percentage in green and
the unprotected percentage in red.
Targets The number of target data stores that proliferated from the data store.
Policy Impressions The total number of classification policy impressions for the data store.
Residual Risk The residual risk cost of the unprotected sensitive fields/files in the data store.
The following image shows the Data Stores list page filtered by the Salary data domain:
Actions
To list the data stores by a property other than the risk score, you can click a column name to sort by that
property, or you can select one of the following options from the Group By menu:
• Department
• Data Owner
• Data Store Group
• Sensitivity Level
• Location
To perform a manual action, select one or more data stores and then select an option from the Actions >
Take Action menu.
To view proliferation details about a data store, click a data store name or the value in the Targets column.
To view the Risk Score Details window, click the value in the Risk Score column.
To view the list of sensitive fields or sensitive files in a data store, click the value in the Sensitive Fields/Files
column. The Sensitive Fields page appears.
To export the page information, select Export from the Actions menu. Download a ZIP file named Data.zip
to your local machine. Extract the ZIP file to view the following CSV files:
TargetDataStores.csv For each data store shown on the page, lists the protection status, number of
protected sensitive fields/files, number of sensitive fields/files, number of
targets, risk score, residual risk cost, and total risk cost.
If you configure the dashboard Risk Score settings to display the maximum risk
score, the column is included in the exported file.
TargetDataStoresSummary.csv Shows the total number of data stores and locations as shown in the summary
metrics.
Departments Page
To access the Departments page, click the Top Departments indicator label on the Overview workspace. The
summary metrics in the top right of the page show the total number of protected and unprotected fields in
the department list, the number of departments, and the total number of data stores in the departments.
The page shows information about each department in the following columns:
Department The name of a department that includes one or more data stores. By default, the page lists the
departments by name in alphabetical order.
Risk Score The average risk score of the sensitive fields and files in the data stores included in the
department.
Max Risk Score The maximum risk score of the sensitive fields and files in the data stores included in the
department. Appears if you configure the dashboard settings.
Protection Status The protection status of sensitive fields in the data stores in each department, shown as a
numeric percentage of protected fields and a bar graph that represents protected fields in green
and unprotected fields in red.
Sensitive Fields The number of sensitive fields in the data stores included in the department.
Sensitive Files The number of sensitive files in the data stores included in the department.
Policy Impressions The total number of classification policy impressions for the data stores included in the
department.
Residual Risk The residual risk cost of the unprotected sensitive fields/files in the data stores included in the
department.
Actions
To view the list of data stores in a department, click the value in the Data Stores column.
To view the list of departments by a property other than the name, you can click a column name to sort by
that property, or you can select Sensitivity Level from the Group By menu. When you sort the list by
sensitivity level, Data Privacy Management groups the departments with data stores from highest to lowest
sensitivity level based on the criteria defined on the Settings workspace.
To filter the list, click the Filter icon in the Data Privacy Management header.
To export the page information, select Export from the Actions menu. Download a ZIP file named Data.zip
to your local machine. Extract the ZIP file to view the following CSV files:
Departments.csv For each department, lists the department name, number of unprotected and protected
fields, number of data stores, number of sensitive fields and records, number of
targets, the residual risk cost, and the total risk cost.
If you configure the dashboard Risk Score settings to display the maximum risk score,
the column is included in the exported file.
DepartmentsSummary.csv Lists the total number of departments and locations. The data appears in the summary
metrics area of the page.
Proliferation Page
The Proliferation page shows the movement of sensitive data from source to target data stores.
• On the Top Data Stores indicator, click the name of a data store in the Data Stores column. The page
shows the proliferation details for the selected data store.
• On the Proliferation by Classification Policy page, click a classification policy name. The page shows the
proliferation of sensitive data for all data stores included in the classification policy.
• On the Data Domain Details page, click the name of a data store. The page shows the proliferation details
for the selected data store.
• On the Data Stores grid page and the Data Stores list page, click a data store name or a value in the
Targets column. The page shows the proliferation details for the selected data store.
• On the User Access page, select a data store in the Data Stores pane and then click View. The page
shows the proliferation details for the selected data store.
Actions
To perform a manual action for the selected data store, select an option from the Actions > Take Action
menu.
To remove the data store icon legend, clear the Data Store Types box.
• Residual Risk
• Sensitivity Level
• Region
• Location
To change the default color code for data store hexagons, select one of the following options from the Color
menu: Sensitivity Level or Risk Score.
To view the Scan Results pane for different data stores, click the data store hexagon in the Proliferation
pane.
To view details about the sensitive fields/files in the data store shown in the Scan Results pane, click the
number of sensitive fields/files in blue.
To view more details about the data store shown in the Scan Results pane, click the data store name in blue
at the top of the pane.
To view information on Axon Data Governance processes linked to the data store, click the arrow to expand
the Scan Results pane. The number of linked processes is indicated next to the data store name. Select the
Processes tab. The names of the processes are links that open the process page in Axon Data Governance.
The following image shows the Processes tab on the Scan Results pane with one process listed:
To filter the data stores shown on the page, click the Filter icon in the Data Privacy Management header.
To export the page information, select Export from the Actions menu. Download a ZIP file named Data.zip
to your local machine. Extract the ZIP file to view the following CSV files:
DataStoreProliferation.csv For the selected data stores on the page, lists the data store name, ID, parent
data store ID, risk score, residual risk cost, total risk cost, risk cost unit of
measure, region, location, severity level, number of protected fields, and number
of sensitive fields.
findConnection.csv Lists data stores and associated risk score, protection status, the number of
sensitive domains, the number of sensitive unprotected domains, and sensitivity
level.
The following image shows the Sensitive Data by Data Store Group page:
1. Scan Results pane. Includes key aggregate scan results for the data stores in a selected row or column.
2. Tile pane. Displays data stores that match a classification policy in a tile grid, with rows by data store group and
columns by classification policy. To view the Scan Results pane, click a classification policy name or a data store
group name.
3. Data store group summary metrics. Shows the total number of data store groups and matching classification
policies, the number of data stores in the data store groups, and the total risk cost of the data stores.
4. Actions menu
Actions
On the Scan Results pane, click the value for the Data Stores or Targets count. The Data Stores list page
shows details about the data stores in the data store group that matches the classification policy or the
target data stores to which sensitive fields proliferated.
To filter the page with a condition common to all pages accessed from the Overview workspace, click the
Filter icon in the Data Privacy Management header.
SensitiveDataByDataStoreGroupSummary.csv Lists the information in the summary metrics area of the page:
the total number of data store groups, classification policies,
data stores, and residual risk cost.
The Sensitive Data Proliferation by Classification Policy page lists the classification policies in alphabetical
order by default. For each policy, the page lists the number of data store groups that match the policy, the
number of source data stores, the number of target data stores, and a horizontal bar graph that shows the
ratio of sources to targets. The source data store bar in dark blue is located above the target data store bar in
light blue.
The following image shows the Sensitive Data Proliferation by Classification Policy page:
Actions
To sort the list of classification policies by the number of data store groups, sources, or targets, click the
column label.
To view details about the data store groups that match a classification policy, click the value in the Data
Store Groups column. The Sensitive Data Proliferation by Data Store Group page appears.
To view details about the data store sources included in the groups or the target data stores to which
sensitive fields proliferated, click the value in the Data Stores or Targets column. The Data Stores list page
appears.
To filter the page with a condition common to all pages accessed from the Overview workspace, click the
Filter icon in the Data Privacy Management header.
SensitiveDataProliferationByPolicy.csv Lists each classification policy name and the number of data
stores, sources, and targets that match the policy.
SensitiveDataProliferationSummary.csv Lists the information in the summary metrics area of the page:
the total number of classification policies, sources, targets, and
data store groups.
The Sensitive Data Proliferation by Data Store Group page lists the data store groups in alphabetical order
by default. For each data store group, the page lists the number of classification policies, the number of
source data stores, the number of target data stores, and a horizontal bar graph that shows the ratio of
sources to targets. The source data store bar in dark blue is located above the target data store bar in light
blue.
The following image shows the Sensitive Data Proliferation by Data Store Group page for the ALL
classification policy:
Actions
To sort the list of data store groups by the number of sources or targets, click the column label.
To view details about the classification policy that the data store group matches, click the value in the
Classification Policies column. The Sensitive Data Proliferation by Classification Policy page appears.
To view details about the data store sources included in the groups or the target data stores to which
sensitive fields proliferated, click the value in the Sources or Targets column. The Data Stores list page
appears.
To filter the page with a condition common to all pages accessed from the Overview workspace, click the
Filter icon in the Data Privacy Management header.
SensitiveDataProliferationByAppGroup.csv Lists each data store group name and the number of
classification policies, sources, and targets included in the
group.
SensitiveDataProliferationSummary.csv Lists the information in the summary metrics area of the page:
the total number of classification policies, sources, targets,
and data store groups.
The following image shows the Sensitive Data Locations page for the North America region:
1. Region tabs.
2. Sensitive data summary.
3. Actions menu.
4. Information shown on the workspace indicator.
Actions
To filter the page with a condition common to all pages accessed from the Overview workspace, click the
Filter icon in the Data Privacy Management header.
The summary metrics in the top right of the page show the number of sensitive and protected fields
associated with the data store, the risk score, the number of scans for the data store, and the number of
master categories associated with the fields in the list.
You can access the Sensitive Fields page in the following ways:
• On the Top Data Stores indicator, click a value in the Fields/Files column. The page displays the list of
sensitive fields identified for the data store.
• On the Data Stores grid page and the Data Stores list page, click a value in the Sensitive Fields/Files
column. The page displays the list of sensitive fields identified for the data store.
• On the Proliferation page, click the number of sensitive fields/files in the data store Scan Results pane.
The Sensitive Fields page shows information about each sensitive field in the following columns:
Column Description
Schema/Folder The schema or folder name that contains the sensitive field.
For Snowflake databases, the name appears in the following format: DatabaseName.SchemaName
This allows you to distinguish between schemas with the same name across different databases.
Object The object or table name that contains the sensitive field.
Field The name of the sensitive field. By default, the page lists the sensitive fields by name in
alphabetical order.
Protected Indicates whether the sensitive field is protected. Possible values are Yes and No.
Sensitive Indicates whether Data Privacy Management has identified the field as sensitive. Possible values
are Yes and No.
Conformance The conformance score that represents the percentage of values in the field that meet the data
match condition for the data domain. If the Run Profiling option in the scan job specifies a data
profile, Data Privacy Management evaluates each value in the sample set to determine the
conformance score.
Verified Indicates whether or not the data store owner verified the sensitivity and protection status of the
field. Possible values are Yes and No.
Category The master category associated with the data domain assigned to the sensitive field. If you override
the master category at the field level, the list displays the category that you assign to the field.
Purpose The master purpose associated with the data store. If you override the master purpose at the field
level, the list displays the purpose that you assign to the field.
The following image shows the Sensitive Fields page for the data store SR_ORCL_HDFC_CC_HR:
1. You imported data store details from a CSV file that identified the field as sensitive.
2. You specified a global sensitivity status for the field as always sensitive for a specific data domain. If a
classification policy includes the data domain, the field appears on the Sensitive Fields page even if the
Profiling job step in a scan did not identify the field as sensitive.
3. You specified a global sensitivity status for the field as Never Sensitive for one or more data domains,
but the Profiling job step identified the field as sensitive.
4. The conformance score is in the Auto Accept range, and the metadata and data profile results do not
conflict.
5. The conformance score is in the Validate range, but you enabled the Mark columns within the validate
range as sensitive property for the data domain, and the metadata and data profile results do not
conflict.
6. The conformance score is in the Validate range, but the scan job identified the field as protected.
Actions
By default, the page lists sensitive fields imported to Data Privacy Management and sensitive fields identified
in a scan job. To view only the sensitive fields imported on the Data Stores workspace, select Imported
Fields in the View By menu located below the summary metrics.
To change the sensitivity or protection status for a sensitive field, select the check box in the field row and
then click Edit in the Properties pane.
To manage the primary keys and unique keys for the tables that contain the identified sensitive columns,
select Actions > Manage Keys.
To perform a manual action, select one or more fields and then select an option from the Actions > Take
Action menu.
To filter the list, click the Filter icon in the Data Privacy Management header.
To export information about the sensitive fields, select Export > Sensitive Fields from the Actions menu. To
export data store details, select Export > Data Store Details from the Actions menu. You can update the
protection status information in the exported DataStoreDetails.csv file and then import the edited information
on the Data Stores workspace.
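As a hedged illustration of that export, edit, and import workflow, the following Python sketch reads an exported DataStoreDetails.csv, marks a hypothetical field as protected, and writes an edited copy for import. The column headers used here (Field Name, Protection Status) and the field name PASSPORT_NUMBER are assumptions for illustration; check the headers in your own export before editing.

import csv

# Sketch only: edit the protection status in an exported DataStoreDetails.csv
# before importing it back on the Data Stores workspace. Column header names
# are assumptions and may differ from your export.
with open("DataStoreDetails.csv", newline="", encoding="utf-8") as src:
    reader = csv.DictReader(src)
    rows = list(reader)
    fieldnames = reader.fieldnames

for row in rows:
    if row.get("Field Name") == "PASSPORT_NUMBER":   # hypothetical field to update
        row["Protection Status"] = "Protected"

with open("DataStoreDetails_edited.csv", "w", newline="", encoding="utf-8") as dst:
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)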
To override the master category and purpose association for a sensitive field, select Export > Categories and
Purpose from the Actions menu. You can then edit the exported file and import the updated file on the
Sensitive Fields page.
To update the sensitivity and protection status for the field, click Edit.
Property Description
Notes Optional. You can enter notes when you edit the protection or sensitivity status of a sensitive
field.
Metadata Match Indicates if the Metadata Profiling job step of the data store scan identified the field as
sensitive for the data domain. Values are Yes, No, or a hyphen (-). The hyphen indicates that
the Scan job did not include a metadata profile.
Data Match Indicates if the Data Profiling job step of the data store scan identified the field as sensitive
for the data domain. Values are Yes, No, or a hyphen (-). The hyphen indicates that the Scan
job did not include a data profile.
Proximal Domains A list of any data domains that are related to the data domain that includes the sensitive field,
if a Proximity Match condition is specified for the data domain.
Classification Mechanism The method that determined the current sensitivity status of the selected field. Possible values are:
- Auto. The sensitivity status is based on the results of the Profiling job step.
- Auto (Always Sensitive). The field is globally sensitive for the data domain.
- Auto (Never Sensitive). The field is never sensitive for the data domain. However, the Profiling job step identified that the field is sensitive for the data domain.
- Import. The user imported the sensitivity status from a CSV file.
- Manual. The user updated the sensitivity status by editing the Sensitive property in the Properties pane.
Related Classification Policies The number of classification policies that include the field. To view or manage the policies, click the property value. The Classification Policies workspace opens.
Domain Impressions The number of data domain impressions for the sensitive field and related classification
policies.
Profile Matches The number of data profiles that match the data domain for the sensitive field in the
classification policy match conditions.
Data Domain Inferred Time The current time in the data domain location.
Last Modified By The user who most recently edited the data domain that contains the sensitive field.
Last Modified Time The time the data domain that contains the sensitive field was most recently edited in the
time zone of the user who edited the data domain.
To edit fields that Data Privacy Management identified as protected or sensitive in a data store scan, perform the following steps:
1. On the Sensitive Fields page, select the check box next to the field you want to update.
The Properties pane for the selected field appears on the right of the page.
2. Click Edit.
The following image shows an example of this view for the sensitive field CITY in the
EMPLOYEE_ADDRESSES object:
3. To change the sensitivity status, click Yes or No for the Sensitive property.
If you update the sensitivity information in Data Privacy Management, the change is updated in
Enterprise Data Catalog.
4. To change the protection status, click Yes or No for the Protected property. If you change the protection
status to Yes, you must select a configured protection extension to specify how the field is protected.
5. Optionally, enter any notes.
6. Click Save.
7. To update the values for sensitive and protected fields on the Overview workspace, you must first run
the Evaluate Classification Policies job. Click Manage > Classification Policies.
The Classification Policies workspace opens.
8. From the Actions menu, select Evaluate Classification Policies.
The Evaluate Classification Policies job updates the results on the Overview workspace.
You can export details for one or more selected fields in the list. If you do not select any fields, the exported
files provide details for all fields in the list.
1. On the Sensitive Fields page, select the check boxes next to the fields that you want to export if you do
not want to export information about the entire list.
2. From the Actions menu, select Export > Sensitive Fields.
Note: If you selected sensitive fields, the Export menu option appears below the solid line.
The Save As dialog box appears with a .zip file named Data.zip.
DataStoreDetails.csv For each sensitive field, lists the repository name, schema/folder name,
object name, field name/file type, whether the field is sensitive, the data
domain name, the conformance score, protection status, protection
technique, whether the field is verified, the impression count, the number
of matched rows, the data domain inferred time, notes, the user who last
updated the field, and the time of the last update.
Note: The following points apply to unstructured sources:
- The Rows Matched column is not applicable to scans run through the
remote agent.
- The Impression Count includes the total number of instances that the
data domain was identified in the data and not the unique instances.
DataStoreDetailsSummary.csv Lists the information in the summary metrics area of the page: the total
number of sensitive fields, protected sensitive fields, aggregate risk score,
and the number of data store scans.
Related Topics:
• “Sensitive Files Page” on page 555
1. From the Actions menu, select Export > Data Store Details.
The Save As dialog box appears with a .zip file named Data.zip.
2. Optionally, rename the .zip file. Save the file to a directory on your local machine.
3. Extract the file.
Seven CSV files appear in the directory.
4. Open the CSV files to view or update the details.
DataStoreDetails.csv For each sensitive field, lists the repository name, schema/folder name,
object name, field name/file type, whether the field is sensitive, the data
domain name, the conformance score, protection status, protection
technique, whether the field is verified, the impression count, the number
of matched rows, the data domain inferred time, notes, the user who last
updated the field, and the time of the last update.
Note: The following points apply to unstructured sources:
- The Rows Matched column is not applicable to scans run through the
remote agent.
- The Impression Count includes the total number of instances that the
data domain was identified in the data and not the unique instances.
DataStoreDetailsSummary.csv Lists the information in the summary metrics area of the page: the total
number of sensitive fields, protected sensitive fields, aggregate risk score,
and the number of data store scans.
PolicyViolations.csv For security policy violations that included sensitive fields, lists the
security policy ID, flag and read status, the date and time the violation
occurred, the security policy name, the name of the user associated with
the violation, the data stores included in the violation, the security policy
type, and the severity level.
Tasks.csv For tasks that include sensitive fields, lists the task name, data store
owner, extension, extension category, data stores, the task due date, and
the task status.
UbaAnomaly.csv For anomalous user behavior incidents that include sensitive fields, lists
user details such as the user name, location, title, department, country,
and user behavior group name. Lists anomaly details such as the ID,
severity level and description, anomaly score, date and time the anomaly
occurred, whether the anomaly is unread or flagged, and the total data
stores and anomaly factors.
UserActivityDrillDown.csv For user activity security policy violations that include sensitive fields, lists
the event ID, the duration in milliseconds, the name and IP address of the
user involved in the violation, the operation that caused the violation, the
affected data stores and data store IP addresses, and the total number of
data domains included in the violation.
Users.csv For users associated with sensitive fields, provides details about the user
names, title, email address, manager name, location, department,
organization, and user status.
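The exported archive is a standard ZIP file, so you can also inspect it outside the user interface. The following is a minimal sketch, assuming the archive was saved with the default name Data.zip in the current directory, that lists each CSV file it contains along with the row count and column headers:

```python
import csv
import io
import zipfile

# Open the exported archive and preview each CSV file that it contains.
with zipfile.ZipFile("Data.zip") as archive:   # default export name; adjust the path if you renamed it
    for name in archive.namelist():
        if not name.lower().endswith(".csv"):
            continue
        with archive.open(name) as raw:
            rows = list(csv.reader(io.TextIOWrapper(raw, encoding="utf-8", newline="")))
        if not rows:
            continue
        header, data = rows[0], rows[1:]
        print(f"{name}: {len(data)} rows, columns: {header}")
```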
You can create an import CSV file, or you can update and import one of the following files that you download
from Data Privacy Management:
• Scan report
• DataStoreDetails.csv file
The scan report includes fields that the Scan job could not automatically reject or identify as sensitive.
DataStoreDetails.csv File
The DataStoreDetails.csv file includes fields that the Scan job identified as sensitive or protected.
1. On the Sensitive Fields page, select Export > Data Store Details from the Actions menu.
2. Extract the DataStoreDetails.csv file from the .zip file that downloads.
3. Open the DataStoreDetails.csv file.
4. Update the values in the IsSensitive, ProtectionStatus, and Verified columns.
5. Save the file.
6. Select Manage > Data Stores to open the Data Stores workspace.
7. On the Actions menu, click Import > Protection Status.
8. On the Import Protection Status page, select the updated file you saved.
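When many fields need the same change, you can script the column updates instead of editing the file by hand. The following is a minimal sketch; the IsSensitive and Verified column names come from the procedure above, while the "Data Domain" header, the PII value, and the update rule are placeholders for your own criteria:

```python
import csv

# Read the exported details and apply a bulk update before re-importing the file
# through Manage > Data Stores > Actions menu > Import > Protection Status.
with open("DataStoreDetails.csv", newline="", encoding="utf-8") as src:
    reader = csv.DictReader(src)
    fieldnames = reader.fieldnames
    rows = list(reader)

for row in rows:
    # Hypothetical rule: treat every field in a "PII" data domain as sensitive and verified.
    # "Data Domain" is an assumed header; check the exported file for the exact column name.
    if row.get("Data Domain") == "PII":
        row["IsSensitive"] = "Yes"
        row["Verified"] = "Yes"

with open("DataStoreDetails_updated.csv", "w", newline="", encoding="utf-8") as dst:
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
```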
Related Topics:
• “Downloading Scan Job Reports” on page 306
• “Importing Protection Statuses” on page 109
You cannot restore the master category or purpose association to a field after you override it. If you want to
update the association to the master category or purpose, you must repeat the steps to update the
association of individual fields.
1. On the Sensitive Fields page, select Export > Categories and Purpose from the Actions menu.
Save the DataPrivacyFieldLinks_Summary.csv file that downloads.
2. Open the DataPrivacyFieldLinks_Summary.csv file.
3. Update the required values in the DataCategory and Purpose columns.
4. Save the file.
5. Select Import > Categories and Purpose from the Actions menu.
6. On the Import Categories and Purpose page, select the updated file.
If you choose to import a file that you create, the file must include all columns and headers that appear
in the downloaded file.
7. Optionally, select Replace Duplicates with Items Imported.
8. Click Import.
The import ignores rows that are unchanged and does not consider the values as overrides.
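Because an import file that you build yourself must carry every column and header from the downloaded file, it can be worth checking the headers before you import. A minimal sketch, assuming the downloaded file keeps its default name and that my_field_links.csv is a hypothetical hand-edited file:

```python
import csv

def headers(path):
    # Return the header row of a CSV file.
    with open(path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f))

exported = headers("DataPrivacyFieldLinks_Summary.csv")   # downloaded from Export > Categories and Purpose
candidate = headers("my_field_links.csv")                  # hypothetical file prepared for import

missing = [column for column in exported if column not in candidate]
if missing:
    raise SystemExit(f"Import file is missing required columns: {missing}")
print("All exported columns are present in the import file.")
```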
The summary metrics in the top right of the page show the number of sensitive files and data domains
associated with the data store, the number of domain impressions for the data store, and the number of classification policies and policy impressions associated with the files in the list.
You can access the Sensitive Files page in the following ways:
• On the Top Data Stores indicator, click a value in the Fields/Files column. The page displays the list of
sensitive fields identified for the data store.
• On the Data Stores grid page and the Data Stores list page, click a value in the Sensitive Fields/Files
column. The page displays the list of sensitive fields identified for the data store.
• On the Proliferation page, click the number of sensitive fields/files in the data store Scan Results pane.
The Sensitive Files page shows information about each sensitive file in the following columns in the default
Flat View:
Column Description
Location The file path of the directory that contains the file.
File Type The type of unstructured file, such as email, text, JSON, PDF, XML, or Compressed.
Domains The number of data domains that match data in the sensitive file.
Policies The number of classification policies that the scan job matched with the data in the sensitive
file.
Policy Impressions The number of classification policy impressions for the file.
Sensitive Indicates whether Data Privacy Management has identified the file as sensitive. Possible
values are Yes and No.
Protected Indicates whether the sensitive file is protected. Possible values are Yes and No.
Confidence The confidence level that the data domain is an exact match for the sensitive data in the file.
Possible values are High, Medium, and Low.
Verified Indicates whether or not the Confidence Level is verified. Possible values are Yes and No.
The following image shows the Sensitive Files page for the data store DR_Local_FS_10_4:
How Data Privacy Management Calculates Confidence Levels and Verified Status
The value in the Confidence column is calculated in the following ways for files that have classification policy
matches with data domains:
1. At the data store scan level, the Confidence Level is the maximum value or highest level for each
classification policy match.
2. At the classification policy level, the Confidence Level is the minimum value or lowest level for each
domain that matched the criteria of the classification policy.
3. At the data domain level, the Confidence Level is the maximum value or highest level of all Confidence
Levels for each data domain match for the file.
The value of Yes or No in the Verified column is determined by the Confidence Level for the file as follows:
• If the Confidence Level is High and the file does not match a classification policy condition, the value is Yes.
• If the Confidence Level is Medium or Low, the value is No.
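The three aggregation levels can be expressed compactly in code. The following is an illustrative sketch, not product code, that assumes confidence levels are ordered Low < Medium < High and that the per-match levels are already known:

```python
# Illustrative sketch of the confidence aggregation described above.
ORDER = {"Low": 0, "Medium": 1, "High": 2}

def domain_level(match_levels):
    """Data domain level: the highest confidence among the matches for one domain."""
    return max(match_levels, key=ORDER.get)

def policy_level(domain_levels):
    """Classification policy level: the lowest confidence among the policy's domains."""
    return min(domain_levels, key=ORDER.get)

def scan_level(policy_levels):
    """Data store scan level: the highest confidence among the policy matches."""
    return max(policy_levels, key=ORDER.get)

# Example: a file matched one policy whose two domains resolved to High and Medium.
policy = policy_level([domain_level(["High", "High"]), domain_level(["Medium", "Low"])])
print(scan_level([policy]))  # Medium
```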
Actions
By default, the page lists sensitive files in a single list in Flat View. To view the sensitive files in a hierarchy
beginning with the first level folder in the directory, select Tree View in the View menu located below the
summary metrics.
To change the sensitivity of data domains in the file or the protection status for a sensitive file, select the
check box in the file row and then click Edit in the Properties pane.
To perform a manual action, select a field and then select an option from the Actions > Take Action menu.
To filter the list, click the Filter icon in the Data Privacy Management header.
To export information about the sensitive fields, select Export from the Actions menu and download the
Data.zip file to a local directory. Extract the file to view the following .csv files:
FileSystemDataStoreSummary.csv
Lists the total number of files scanned, the number of data domains, number of classification policies
used, the policy impression count, and the domain impression count.
FileSystemDataStoreDetails.csv
Lists the domain names and the corresponding impression count for each field.
FileSystemClassificationPolicyDetails.csv
Related Topics:
• “Exporting Sensitive Field Information” on page 551
To update the sensitivity of data domains or the protection status for the file, click Edit.
Property Description
Data Domains Lists the total number of data domains and domain impressions that match the file, and includes
the list of each data domain, impressions and sensitivity status for each data domain. You can
edit the sensitivity status for each data domain.
Policies Lists the number of classification policies and their names that match the sensitive file and the
number of total policy impressions for the file and the number of impressions for each policy.
Created On The date and time the sensitive file was created in the time zone of the internet browser used to
access Data Privacy Management.
Last Modified By The user who most recently edited the sensitive file.
Last Modified On The date and time the sensitive file was most recently edited in the time zone of the internet
browser used to access Data Privacy Management.
Last Accessed By The user who most recently accessed the sensitive file.
Last Accessed On The date and time the sensitive file was most recently accessed in the time zone of the internet
browser used to access Data Privacy Management.
The Sensitive Files page for Email Server data stores shows information about each sensitive file in the
following columns:
Column Description
Resource Type The type or level of the file in the email server. The following resource types apply to email
server file types:
- Email ID. The user email ID scanned. For example, [email protected]
- Email Folder. The email folder or sub folder. For example, Inbox, or Drafts.
- Email. The emails in the folder or subfolder.
- Content Body. The content of an email.
- <File type>. The type of file attached to an email. For example, PDF, Avro, or Email for email
attachments.
Domains The number of data domains that match data in the sensitive file.
Policies The number of classification policies that the scan job matched with the data in the sensitive
file.
Sensitive Indicates whether Data Privacy Management has identified the file as sensitive. Possible values
are Yes and No.
Protected Indicates whether the sensitive file is protected. Possible values are Yes and No.
You can click a specific row to open the Properties pane. The Properties pane includes file properties that
are not listed in the Sensitive Files page columns. For example, you can check the Confidence Level and
Verified status information.
Based on the resource type, you can view some or all of the following details on the Properties pane:
Property Description
Confidence Level The confidence level that the data domain is an exact match for the sensitive data in the file. Possible values are High, Medium, and Low.
Verified Indicates whether or not the Confidence Level is verified. Possible values are Yes and No.
Data Domains Lists the total number of data domains and domain impressions that match the file, and includes
the list of each data domain, impressions and sensitivity status for each data domain. You can edit
the sensitivity status for each data domain.
Policies Lists the number of classification policies and their names that match the sensitive file and the
number of total policy impressions for the file and the number of impressions for each policy.
Content Type The type of content body. For example, HTML or text.
Last Modified On The date and time the sensitive file was most recently edited in the time zone of the internet browser used to access Data Privacy Management.
Note: The Last Modified Date values on the Sensitive Files page differ based on how you attach
the file to a mail. If you attach a file directly, the date appears the same as the email receipt date.
If you attach a compressed file, it displays the date that the file was last modified on the disk.
Actions
To change the sensitivity of data domains in the file or the protection status for a sensitive file, click the
name in the file row and then click Edit in the Properties pane.
To filter the list, click the Filter icon in the Data Privacy Management header.
To export information about the sensitive fields, select Export from the Actions menu.
You can access the User Access page in the following ways:
The following image shows an example of the User Access page filtered by a classification policy:
1. Users, Data Stores, and User Groups Panels. By default, the lists are displayed in alphabetical order.
2. Clear Filter icon
3. Filter conditions
4. Clear Filter Condition icon
5. Filter
6. User access summary
7. Actions menu
To quickly find a user, data store, or user group, click the panel-specific Filter icon. Enter a full or partial name
in the text box and press ENTER.
Hover over a user, data store, or user group name to view more details.
To view additional details about a user, click the user name and then click View. The User Profile page
appears. For more information, see “User Profile Page” on page 564.
To view the Proliferation page for a data store, select a data store in the Data Stores panel and then click
View. For more information, see “Proliferation Page” on page 539.
To filter the page with a condition common to all pages accessed from the Overview workspace, click the
Filter icon in the Data Privacy Management header. For more information, see “Filtering Information on the
Security Dashboard” on page 570.
To export the page information, select Export from the Actions menu. Download a ZIP file named Data.zip
to your local machine. Extract the ZIP file to view the following CSV files:
PivotViewUserSummary.csv Contains the number of users, data stores, and user groups in Data Privacy
Management.
UserAccessPivotView.csv Contains the names of the users, data stores, and user groups in Data Privacy
Management.
You can access the User Activity page in the following ways:
• On the Data Domains list page, click the value in the Impressions column.
• On the User Profile page, click the number of impressions, the Top Data Stores Accessed indicator label,
or the Top Data Domains indicator label.
• On the Users page, click the value in the Impressions column.
1. Events. Displays a log of events triggered by the user you selected on the Users page. You can cancel the filter
condition to see events for all users. By default, the page lists events by most recent to least recent date.
2. Expanded event view
3. Event count
4. Clear Filter icon
5. Filter conditions
6. Clear Filter Condition icon
7. Time period. By default, the time period is the same time period on the Top Users indicator or the Users page,
measured by the UTC time standard.
8. Filter
9. Group By menu
10. Summary metric
11. Actions menu
To sort the list by a column other than the event date, click the column name.
To learn more about a user, click the user name. The User Profile page appears. For more information, see
“User Profile Page” on page 564.
To view the proliferation status of the data store involved in the event, click the Data Store name. The
Proliferation page appears. For more information, see “Proliferation Page” on page 539.
To change the time period, select one of the following options from the Display menu:
• Today
• Day (24 hours)
• Week (7 days)
• Month (30 days)
• 60 days
• 90 days
• 1 year
• Custom Dates. Select dates in the From and To fields.
To group the list of events, select one of the following options from the Group By menu:
• Data Domain
• Data Store Group
• Data Store Location
• User Department
• Classification Policy
• Protection Status
• User Name
• User Group
• User Location
To filter the list, click the Filter icon in the Data Privacy Management header. For more information, see “User
Activity Page Filter Conditions” on page 575.
To export the page information, select Export from the Actions menu. Download a ZIP file named Data.zip
to your local machine. Extract the ZIP file to view the following CSV files:
UserActivityDrillDownExport.csv Shows the details of each event that the user triggered. For
example, the CSV file shows the time of the event, the type
of operation, and the affected data store and column.
UserActivityDrillDownSummary.csv Shows the total number of events that the user triggered.
You can access the User Profile page in the following ways:
• On the User Access page, click a user name and then click View.
• On the User Activity page, click a user name.
• On the Users page, click a user name.
• On the Suppression Rules list page, click the name of the user who created a suppression rule.
1. User name
2. User detail summary
3. Top Data Stores Accessed
4. Top Data Domains Accessed
5. Time period. By default, the time period is Month (30 days).
6. Summary metrics
7. Actions menu
8. Security Policy Violations. Lists the security policy violations associated with the user in reverse chronological
order.
9. Anomalies. Lists the anomalies associated with the user in reverse chronological order.
To view a complete list of the data stores the user has accessed, click the Data Stores Accessed count in the
user detail summary. The Data Stores list page appears. For more information, see “Data Stores List
Page” on page 535.
To view a log of events the user triggered, click the Impressions count in the user detail summary, the Top
Data Stores Accessed indicator label, or the Top Data Domains Accessed indicator label. The User Activity
page appears. For more information, see “User Activity Page” on page 562.
To change the time period, select one of the following options from the Display menu:
• Today
• Day (24 hours)
• Week (7 days)
• Month (30 days)
• 60 days
• 90 days
• 1 year
• Custom Dates. Select dates in the From and To fields.
To view more details about a security policy violation, click the Violation ID in the Security Policy Violations
indicator.
To view more details about an anomaly, click the ID in the Anomalies indicator.
To filter the page with a condition common to all pages accessed from the Overview workspace, click the
Filter icon in the Data Privacy Management header. For more information, see “Filtering Information on the
Security Dashboard” on page 570.
To export the page information, select Export from the Actions menu. Download a ZIP file named Data.zip
to your local machine. Extract the ZIP file to view the following CSV files:
PolicyViolations.csv Shows the details of each security policy that the user
violated. For example, the CSV file shows the security policy
violation ID, date and time of the violation, the type and
severity level of the violation, and the affected security
policy and data store.
UserActivitySummaryByDataStoreAndTrend.csv Contains a list of the data stores ordered from high to low
number of impressions in the current period. Also includes
the total number of impressions and the trend change as a
percentage for each data store.
UserActivitySummaryByLocation.csv Contains the data store and user location. Also contains the
number of events that the user triggered.
UserActivitySummaryProperties.csv Contains the aggregate risk score for the user, total number
of protected fields the user accessed, the user's total risk
cost, the number of data stores and targets to which the
user has access, the number of sensitive fields and
sensitive records the user has accessed, the number of
events the user has triggered, and the risk cost per unit.
UsersExport.csv Shows details for the user such as the user location and the
number of data stores and sensitive fields that the user
accessed.
Related Topics:
• “Suppression Rules List” on page 367
Users Page
To access the Users page, click the Top Users indicator label on the Overview workspace. The page lists
users who have accessed sensitive data in data stores that you can access.
For each user, the page lists the user name, full name, department, location, the number of data stores the
user can access, the number of sensitive fields in the data stores, and the number of impressions the user
accessed for the time period specified on the page.
The summary metrics in the top right of the page show the total number of users in the current list, the
number of departments and locations for the users, the total number of data stores the users can access, the
number of sensitive fields in the data stores, and the total number of impressions that all users on the page
accessed.
1. Users. By default, the page lists users by the number of impressions in descending order.
2. Expanded view of user. Shows the user's job title, email address, manager name, the number of user groups that
include the user, the number of data stores the user can access, and a graph that shows the number of impressions
the user accessed over time. You can hover over a section of the graph to see the number of impressions the user
accessed on that day or month.
3. User count
4. Clear Filter icon
5. Filter conditions
6. Clear Filter Condition icon
7. Time period. By default, the page displays user activity over the last 30 days. Data Privacy Management uses the
UTC time standard for all time periods.
8. Filter
9. Summary metrics
10. Actions menu
11. Group By. The default grouping is None.
To sort the list by a column other than the number of impressions, click the column name.
To view the data stores that a user accessed, click the value in the Data Stores column. The Data Stores list
page appears. For more information, see “Data Stores List Page” on page 535.
To view the log of a user's activity, click the value in the Impressions column. The User Activity page
appears. For more information, see “User Activity Page” on page 562.
To learn more about a user, click the user name. The User Profile page appears. For more information, see
“User Profile Page” on page 564.
To view the names of the user groups and data stores the user can access, click the User Groups or Data
Store Access counts in the expanded view. The User Access page appears. For more information, see “User
Access Page” on page 561.
To change the time period, select one of the following options from the Display menu:
• Today
• Day (24 hours)
To group the list of users, select one of the following options from the Group By menu:
• User Department
• User Location
• Data Domain
• Data Store Group
• Data Store Location
• Classification Policy
To filter the list, click the Filter icon in the Data Privacy Management header. For more information, see
“Users Page Filter Conditions” on page 575.
To export the page information, select Export from the Actions menu. Download a ZIP file named Data.zip
to your local machine. Extract the ZIP file to view the following CSV files:
UsersExport.csv Lists the users. Shows details for each user such as the user location and the number of data
stores and sensitive fields that the user accessed.
UsersSummary.csv Shows totals such as the number of users, user locations, and sensitive fields that users
accessed.
Toggle between dashboards from the Actions menu. To switch to the Privacy Dashboard, click Switch to
Privacy Dashboard.
You can select which indicators display on the Security Dashboard and customize the order of the indicators.
The changes are visible only to the user who is logged in to Data Privacy Management.
On the Filter panes for the drill-down pages, a dashed line separates the global conditions from the page-
specific filter conditions. The page-specific filter conditions appear above the dashed line.
Note: The Filter pane on the Sensitive Fields page displays only two of the global filter conditions:
Classification Policy and Data Domain.
Risk Score Two sliders represent risk score ranges from 0 through 100. To view data stores with risk
scores that fall within a range, use one slider to specify the minimum value and the other slider
to specify the maximum value of the range. To view data stores with a specific risk score, drag
both sliders to the same value on the scale.
Classification Policy Select one or more classification policies. The selections also filter the number of sensitive
fields that display in the Top Data Domains and Top Data Stores indicators.
Process Select one or more processes. Filters based on data stores and risk associated with Axon Data
Governance processes linked to a data store.
Region Filters based on one or more geographic regions. When you select a country or data store
location without specifying a region, a square appears in the region check box to indicate that
the region contains one or more countries or data store locations that you did not select.
Country When you select a region first, Data Privacy Management automatically selects the countries in
the region. You can clear any of the selections. You can also include other countries outside
the region.
Data Store Location When you select a region first, Data Privacy Management automatically selects the data store locations within the region.
Data Store Enter a partial or complete data store name. The page displays the data stores that match the
name or name fragment.
Department The department associated with one or more data stores. Select a department from the list.
Residual Risk The total residual risk cost of the data stores in one or
more departments. To specify minimum and maximum
values, drag the left and right sliders. To view
departments with a specific residual risk cost, drag both
sliders to the same value.
The conditions above the dashed line on the Filter pane are specific to the Sensitive Fields page. The
conditions below the dashed line apply to all pages in the Overview workspace.
Filter Description
Condition
Schema/Folder Enter the full or partial name of a schema or folder. The page displays a list of sensitive fields in
schemas or folders that match the name or name fragment.
Object Enter the full or partial name of a table. The page displays a list of sensitive fields in tables that
match the name or name fragment.
Field Enter the full or partial name of a field. The page displays a list of sensitive fields that match the
name or name fragment.
Protected Select Yes to view only protected sensitive fields. Select No to view only unprotected sensitive
fields.
Sensitive Select Yes to view fields that Data Privacy Management identified as sensitive by the Profiling job
step in a data store scan or that were identified as sensitive by importing CSV files. Select No to
view fields that you manually identified as not sensitive.
Conformance Move the sliders on each end of the bar to specify the minimum and maximum values for a
conformance score range. The page displays a list of sensitive fields that match the conformance
score range.
Verified Select Yes to display a list of sensitive fields that are verified for one of the following reasons:
- The fields have a conformance score in the auto-accept range.
- The fields have a conformance score in the validate range, but you changed the verified status
value to Yes and imported the changes from a CSV file.
Select No to display a list of sensitive fields that are not verified for one of the following reasons:
- The fields have a conformance score in the validate range, and the Mark columns within the
validate range as sensitive property is enabled for the data domain. By default, the verification
status for these fields is No.
- The fields have a conformance score in the validate range, and you retained the default verified
status value of No in an imported CSV file.
Origin For Cloudera Navigator, Informatica Data Engineering Integration, Informatica Cloud, and
Informatica PowerCenter data stores, select a data store from which the sensitive fields
proliferated.
Target For Cloudera Navigator, Informatica Data Engineering Integration, Informatica Cloud, and
Informatica PowerCenter data stores, select a data store to which the sensitive fields proliferated.
Name Enter the full or partial name of a file. The page displays a list of sensitive fields that match the
name or name fragment.
Created On Select a date range to view files created during a specific period of time.
Last Modified On Select a date range to view files last modified during a specific period of time.
Last Accessed On Select a date range to view files last accessed during a specific period of time.
Has Attachment Applies to the Email Server category. Select Yes to view only files with attachments. Select No
to view only files without attachments.
Attachment File Type/File Type Select one or more file types to view only the selected types of files. For Email Server sources, select one or more file types to view only files with the selected types of attachments.
Choose one or more types from the following file types:
- Avro
- Compressed Files
- Email
- Image
- JSON
- Microsoft Excel
- Microsoft PowerPoint
- Microsoft Word
- Parquet
- PDF
- Web Page Files
- XML
Protected Select Yes to view only protected sensitive files. Select No to view only unprotected sensitive
files.
Sensitive Select Yes to view files that Data Privacy Management identified as sensitive by the Profiling
job step in a data store scan or that were identified as sensitive by importing CSV files. Select
No to view files that you manually identified as not sensitive.
Domain Impressions Move the sliders on each end of the bar to specify the minimum and maximum values for a domain impressions range. The page displays a list of sensitive files that match the domain impressions range.
Verified Select Yes to display a list of sensitive files that are verified for one of the following reasons:
- The files have a conformance score in the auto-accept range.
- The files have a conformance score in the validate range, but you changed the verified
status value to Yes and imported the changes from a CSV file.
Select No to display a list of sensitive files that are not verified for one of the following
reasons:
- The files have a conformance score in the validate range, and the Mark columns within the
validate range as sensitive property is enabled for the data domain. By default, the
verification status for these files is No.
- The files have a conformance score in the validate range, and you retained the default
verified status value of No in an imported CSV file.
Confidence Level Choose a level to view files with a specific confidence level.
User Name Enter one user name. To open the User Profile page, click the user name.
Data Store Enter one data store name to view the list of users who accessed the data store.
Department Select a department to view the list of users in the selected department.
User Location Select a user location to view only users in that location.
User Activity Data Domain Select one or more data domains to view the list of users that accessed sensitive data in specific data domains.
User Name Enter one user name. To open the User Profile page, click the user name.
Data Store Enter one data store name to view the list of users who accessed the data store.
Department Select a department to view the list of users in the selected department.
User Location Select a user location to view only users in that location.
User Activity Data Domain Select one or more data domains to view the list of users that accessed sensitive data in specific data domains.
Save the Data.zip compressed file to a directory on your machine. If you export the file again, the file name
includes an incremental number such as Data(1).zip.
Extract the Data.zip file to view the CSV files that contain information displayed on each indicator on the
Security Dashboard.
The Data.zip file contains CSV files. Each CSV file contains the information in a workspace indicator. If you
filter the workspace before you export, the information in the CSV files matches the active filter conditions.
The following table lists the CSV files in the Data.zip file that you export from the Security Dashboard:
DataRiskCost.csv Contains the number of sensitive records and fields. Also contains the
risk cost for the current and previous months.
DataStoreStatusBar.csv Contains the number of data stores for each section of the Discovery
bar. Also contains the total number of data stores with the database
repository type.
ProtectedData.csv Contains the number of protected sensitive fields and the number of
unprotected sensitive fields. Also shows the numbers of protected and
unprotected sensitive fields for the previous month.
RiskScoreByMonth.csv Contains the risk score for the current and previous months.
SensitiveDataByLocationDetails.csv Contains the number of data stores and key scan results for each
region.
SensitiveDataProliferation.csv Contains the number of source and target data stores for each
classification policy.
SensitiveDataVsGroups.csv Contains the aggregate risk score for data stores in each data store
group that match each classification policy.
SensitivityLevelOfDataStores.csv Contains the number of data stores for each sensitivity level in the
current and previous months.
TopDataDomains.csv Contains a list of the data domains ordered from high to low number
of sensitive fields. Also includes the number of data stores in each
data domain, the number of protected and unprotected sensitive fields,
and the number of events triggered for each data domain.
TopDataStores.csv Contains a list of the data stores ordered from high to low risk scores.
Also includes the risk cost, number of sensitive fields, repository ID
and data owner name for each data store.
TopDepartments.csv Contains a list of departments ordered from high to low risk scores.
Also includes risk cost and number of protected and unprotected
fields for each department.
TopUsers.csv Contains a list of users. Contains the name and user name of the user.
Also contains the department the user belongs to, the number of data
stores the user can access, and the number of events the user
triggered.
Summary Workspaces
This chapter includes the following topics:
You access the summary workspaces from the Summary Analytics menu.
The Data Privacy Management Service updates the scan results on the summary workspaces after a scan job
completes. You can export the information from a summary workspace to your local machine.
You can click a specific value within an indicator to see the number of data stores and the scan results of the
data stores.
For example, from the By Sensitivity Level indicator, you can click the band that represents the Restricted
level. The scan results pane appears with the aggregate scan metrics for the data stores that match a
restricted classification policy. You can also click on the target count on the scan results pane to see the list
of target data stores on the Data Stores page. The target data stores contain sensitive data from source data
stores. The source data stores match a restricted classification policy.
The indicators on the Data Store Summary workspace act as a filter for the page. When you click on a bar or
band within one indicator, the Data Privacy Management Service filters the information in the other indicators
by that category.
You can use the export option to download the results you view on the Data Store Summary workspace for
further analyses.
Each indicator represents a group of data stores that share a common factor such as the same location,
proliferation status, sensitivity level, or department. Bars or bands within an indicator represent the
different categories for that factor. Numbers on a bar indicate the number of data stores for the
category.
For example, the By Classification Policy indicator lists the different classification policies. The number
of data stores that match each classification policy appears on the bar for that classification policy.
For example, the bands that form the circle in the By Sensitivity Level indicator represent the different
sensitivity levels that the Data Privacy Management Service can assign to a data store.
You can click a band or bar to filter the entire workspace by that category or level. The Data Privacy
Management Service filters the results in the other indicators to match the active filter.
In addition, the Details pane appears with key metrics such as the aggregate risk score, residual risk
cost, number of source and target data stores, and overall protection status for the active filter.
When you click the Targets count, the Data Stores page appears. The page shows the list of target data
stores to which sensitive fields proliferate from source data stores. The source data stores for these
targets belong to the group represented by the bar or band you selected.
Shows the number of data stores and the number of locations represented on the Data Stores Summary
workspace.
Actions menu
Use the Actions menu to export the results from the page. When you select the export option, Data
Privacy Management downloads a .zip file that contains CSV files. The CSV files contain information
that you view in each indicator.
Related Topics:
• “Data Stores Grid Page” on page 533
To view scan results for a department, click a department within any indicator. The scan results pane
appears with the aggregate scan results of the data stores in the department. You can click the target count
on the scan results pane to view the list of target data stores associated with the source data stores in the
department that you selected.
Shows the scan results by department for a scan factor. The following image shows the By Residual
Risk Cost indicator with a department selected:
Click a department bar to view details for the data stores in a department. The Details pane appears with
key metrics such as the aggregate risk score, residual risk cost, number of source and target data stores,
and overall protection status for the active filter.
When you click the Targets count, the Data Stores page appears. The page shows the list of target data
stores to which sensitive fields proliferate from source data stores. The source data stores for these
targets belong to the department you selected.
Department summary
Shows the number of departments and the number of data stores associated with a department.
Actions menu
Use the Actions menu to export the scan results from the page. When you select the export option, Data
Privacy Management downloads a .zip file that contains CSV files. The CSV files contain scan results by
department for a scan factor.
Event information by a category appears within an indicator. For instance, when you view the By User
indicator, you see the number of events created by each user starting with the user with the most events.
You can click a factor within an indicator to filter the results in the other indicators by that factor. For
example, you want to see the number of events triggered by user Matt G. When you click on Matt G in the By
User indicator, the Data Privacy Management Service updates the User Activity Summary workspace to show
the number of events by Matt G organized by categories such as data store group, sensitivity level, and
location.
1. Indicators
2. Filter condition
3. Summary
4. Actions menu
5. Time period
6. Details pane
Each indicator shows the number of events by a specific category. Data Privacy Management displays
the results within an indicator in different ways. For example, in the By User indicator, you see the
number of events next to each user. In the By Sensitivity Level indicator, you see the number of events
for a sensitivity level represented in a doughnut chart.
To filter the User Activity Summary workspace, click on a factor within an indicator. The Data Privacy
Management Service updates the results in the other indicators by that factor. The Data Privacy
Management Service also shows the details for the factor in the details pane.
Summary
Actions
From the Actions menu, you can export the information on the page. When you export, Data Privacy
Management downloads a compressed file named Data.zip to your local machine. The Data.zip file
contains CSV files. The CSV files contain the information displayed in the summary, indicator, and
details sections of the page.
Time period
By default, the User Activity Summary workspace displays user activity over the last 30 days. To change
the time period, select an option from the Display menu.
The following image shows the time-period section and the options in the Display menu:
To specify a predefined time period, select an option such as 90 days, from the Display menu. To
customize the time period, click Display > Custom Dates and specify the start and end dates.
Details pane
The Details pane appears when you click a bar or band within an indicator. The pane shows key metrics
such as the residual risk score, number of source and target data stores, and protection status. For
example, you filter the workspace by the user department Finance. The details pane appears with the
aggregate metrics for the data stores that the users in the Finance department accessed.
• Data Stores. When you click the link, the Data Stores page appears with a list of the data stores in the
group.
• Targets. When you click the link, the Data Stores page appears with a list of source data stores for
the targets.
• Impressions. When you click the link, the User Activity page appears with a list of events associated
with the data stores in the group.
• By User
• By Data Domain
• By Sensitivity Level
• By Department
• By Data Store Group
• By Risk
• By Location
• By Data Store
• By Day of the Week
By default, the workspace displays all events in every category. You can filter the information on the
workspace by any factor within an indicator. For example, you can filter by the PCI data domain to identify
any unusual activity on sensitive fields that match the PCI data domain.
To export information, select the Export option from the Actions menu.
When you export information, the Data Privacy Management Service saves a compressed file to the default
download directory on your machine. The name of the compressed file is Data.zip. If you export the file
again or export the file from a different page, the Data Privacy Management Service downloads a new file.
The file name includes an incremental number such as Data(1).zip.
The Data.zip file contains CSV files. The information in the CSV files matches the information on the page
you export from. When you export from a summary workspace, the Data.zip file contains CSV files. Each
CSV file contains the information displayed on an indicator on the workspace.
• Protection, 589
• Encryption Rules, 594
• Risk Simulation Plans, 601
Chapter 26
Protection
This chapter includes the following topics:
Protection Overview
In Data Privacy Management, you can protect sensitive data with Persistent Data Masking, and encryption
rules and keys. You can also decrypt encrypted data.
In Data Privacy Management, you scan data stores to identify sensitive data. Then you create protection
extensions on the Extensions workspace. With protection extensions, you configure connection properties for the protection applications once so that you can easily reuse the properties when you configure protection tasks and security policies.
You can also add protection extensions to data domains and specify the default protection rules that Data
Privacy Management applies to the sensitive fields in each data store that is included in the data domain.
You specify an active protection extension that Data Privacy Management supports for the protected data
store when you configure protection tasks. A protection task applies the default rules or access conditions of
the associated protection extension to sensitive fields in a data store. Each protection task is associated
with one data store.
You can use the following types of protection extensions:
• Encryption
• Persistent Data Masking - Big Data
• Persistent Data Masking - Remote Domain
You create encryption rules on the Encryption Rules workspace. To add encryption rules to data domains and
encryption protection tasks, you must associate encryption keys with the rules.
You can generate and manage 16-, 24-, or 32-byte PKCS #11 encryption keys with the Soft Hardware Security
Module (SoftHSM) key management tool that is included with Data Privacy Management. For more
information about generating encryption keys in the SoftHSM key management tool, see the "Encryption and
Decryption Management" chapter in the Informatica Data Privacy Management Installation and Configuration
Guide.
Related Topics:
• “Protecting Data” on page 400
• “Security Policy Properties” on page 450
• “Task Status” on page 492
• “Protection Job” on page 299
• “Adding or Updating a Protection Extension on the Data Domain Details Page” on page 74
Protection Prerequisites
Data Privacy Management uses several components to protect sensitive data and to display protection
information.
Verify that your role and permissions enable you to work with protection components.
The following table lists the default protection permissions for Data Privacy Management custom roles:
Data Owner View, add, and manage encryption rules and tasks. View extensions and protection
remote agents.
Operator View and run tasks. View encryption rules and protection remote agents.
Security Analyst View and manage encryption rules and tasks. View extensions and protection remote
agents. Manage risk simulation plans.
Security Manager View and manage tasks. View extensions, encryption rules, protection remote agents,
and subject registry. Manage risk simulation plans.
Technical Administrator View and delete tasks. View and manage extensions, encryption rules, and protection
remote agents.
For more information about Data Privacy Management roles and protection permissions, see the
Informatica Data Privacy Management Administrator Guide.
• Encryption
• Persistent Data Masking (PDM) - Big Data
• Persistent Data Masking (PDM) - Remote Domain
Protection extensions save connection details to the protection method services for easy re-use in data
domains, security policies, and protection tasks.
• On the Scans workspace, run Scan jobs for data stores to detect sensitive columns based on data
domain conditions, classification policy configurations, and default conformance criteria settings.
After Scan jobs identify sensitive fields, you can create, configure, and schedule protection tasks.
• On the Data Stores workspace, import Catalog metadata from a CSV file for the sensitive fields in a
data store. Then, import protection status for the sensitive fields from a CSV file. The sensitive fields
in the protection status file must contain Catalog metadata.
Related Topics:
• “Supported Data Store Types for Protection Extension Plugins” on page 66
• “How Data Privacy Management Determines Sensitivity Status” on page 249
• “Importing Catalog Metadata” on page 115
• “Importing Protection Statuses” on page 109
You can protect data on demand from several views in Data Privacy Management. You can also add a
protection action to a data store or user activity security policy. In the event of a security policy violation,
Data Privacy Management creates a protection task for each data store in the policy that has an active,
supported protection extension defined on the Extensions workspace.
Data protection occurs in the following general sequence in Data Privacy Management:
The following image shows an example process flow for several Data Privacy Management roles that have
privileges to protect sensitive data:
For more information about data protection roles and privileges, see the Informatica Data Privacy
Management Administrator Guide.
Related Topics:
• “Tasks Overview” on page 480
• “Task Status” on page 492
• “Extension Management” on page 73
Encryption Rules
This chapter includes the following topics:
For each encryption rule, you can create an encryption technique to change metadata, preserve metadata,
and preserve the data format.
When you associate an encryption protection extension with a data domain on the Data Domains workspace,
you specify default encryption rules and encryption keys to mask sensitive string data in data stores that
contain the data domain columns.
When you configure encryption protection tasks on the Tasks workspace, you specify encryption extensions,
encryption rules, and encryption keys to mask sensitive fields when the encryption tasks run.
You can generate and manage 16-, 24-, or 32-byte PKCS #11 encryption keys with the Soft Hardware Security
Module (SoftHSM) key management tool that is included with Data Privacy Management. For more
information about generating encryption keys in the SoftHSM key management tool, see the "Configuring
Encryption and Decryption" section in the Informatica Data Privacy Management Administrator Guide.
Related Topics:
• “Decrypting a Closed Encryption Task” on page 503
• “Adding or Updating a Protection Extension on the Data Domain Details Page” on page 74
• “Configuring Encryption Rules and Encryption Keys” on page 501
Encryption Rules Workspace
On the Encryption Rules workspace, you can create and manage encryption rules. The workspace includes a
page that displays a list of encryption rules and a detail page for each encryption rule.
You can use filters to quickly find the encryption rules that you want to manage.
To access the Encryption Rules workspace, click Manage > Encryption Rules.
The following image shows an example of the Encryption Rules list page:
1. Encryption rules
2. Encryption rule count
3. Clear Filter icon
4. Filter conditions
5. Clear Filter Condition icon
6. Encryption rule properties
7. Filter icon
8. New encryption rule button
9. Actions menu
Access the Encryption Rule Details page in one of the following ways:
• Click an encryption rule name. To edit the properties, click Edit on the Encryption Rule Details page.
• Select the check box next to an encryption rule and then select Open or Edit from the Actions menu.
You can specify encryption rules in data domains and encryption protection task properties.
The following image shows the encryption rule properties on the New_Encryption_Rule page:
Property Description
Name A name for the encryption rule that helps you easily identify the rule. The name cannot exceed
255 characters, contain a space, or contain the following special characters: ~!$%^&*()+
Description Optional. Long description of the encryption rule that cannot exceed 255 characters.
Do Not Encrypt Characters Optional. Displays when you select the Preserve format and metadata encryption technique. Enter a list of characters that encryption protection tasks configured with the encryption rule do not encrypt.
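If you create rules in bulk, for example through the CSV import described later in this chapter, you can pre-check rule names against the constraints above. A minimal sketch, assuming the constraint list is exactly as stated (at most 255 characters, no spaces, none of the listed special characters):

```python
# Characters that the rule name must not contain, per the property description above,
# plus the space character.
FORBIDDEN = set("~!$%^&*()+ ")

def is_valid_rule_name(name: str) -> bool:
    return 0 < len(name) <= 255 and not (set(name) & FORBIDDEN)

print(is_valid_rule_name("Mask_Email_Preserve_Format"))  # True
print(is_valid_rule_name("Bad name!"))                   # False: contains a space and "!"
```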
The following image shows a sample configuration of properties for an encryption rule that uses the Preserve
Format and Metadata encryption technique:
When you create or edit an encryption rule, you specify an encryption technique. The following table
describes the encryption technique options:
Encryption Technique Description
Preserve format and metadata Preserves the format and length of the source data. Replaces each alphanumeric and special character value with another value that retains the same case, numeric format, or special character format. Provides the option to specify characters that the encryption rule will not encrypt. You cannot exclude the following characters from encryption: <>&
For example, an encryption rule that preserves format and metadata for the Email Address data
domain, but does not exclude any characters from encryption, might change a column value of
[email protected] to Mpz849#dje!kuw.
In the same Email Address example, if you specify that Data Privacy Management cannot encrypt
the "@" and "." characters, the email address [email protected] might become [email protected]
after encryption.
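The behavior of the Preserve format and metadata technique can be pictured with a simple substitution sketch. This is not the algorithm that Data Privacy Management uses, and unlike real encryption it is random and not reversible; it only illustrates how character classes are kept and how excluded characters pass through. The sample input address is a hypothetical value:

```python
import random
import string

def preserve_format_mask(value, do_not_encrypt=""):
    # Illustration only: replace each character with a random character of the
    # same class, leaving any excluded characters unchanged.
    out = []
    for ch in value:
        if ch in do_not_encrypt:
            out.append(ch)                                  # excluded characters pass through
        elif ch.islower():
            out.append(random.choice(string.ascii_lowercase))
        elif ch.isupper():
            out.append(random.choice(string.ascii_uppercase))
        elif ch.isdigit():
            out.append(random.choice(string.digits))
        else:
            out.append(random.choice("#!?*"))               # stand-in for other special characters
    return "".join(out)

# With "@" and "." excluded, the masked value keeps its email-like shape.
print(preserve_format_mask("[email protected]", do_not_encrypt="@."))
```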
Encryption rules specify how Data Privacy Management masks sensitive string data in Cloudera Hive data
stores when associated encryption protection tasks run. You can create, edit, copy, delete, export, import, and
view encryption rules.
Important: You cannot delete an encryption rule that is associated with at least one data domain, an active
task, or a task that ran successfully.
The export file uses the same format that Data Privacy Management requires to import encryption rules.
When you import the revised CSV file, the Import job adds and updates the encryption rules in the Model
repository.
The following image shows the text view of a sample export file:
5. Optionally, update the encryption rule properties in the CSV file and save the file.
You can then import the file on the Encryption Rules workspace.
By default, Replace Duplicates with Items Imported is enabled. This option prevents creating duplicate
encryption rules by updating existing rules with new details from the imported file.
3. Click in the empty field.
The Open dialog box appears.
4. Browse to the directory that contains the CSV file with the encryption rule details you want to import,
select the file, and then click Open.
The file name appears in the field on the Import Encryption Rules page. A message indicates the number
of encryption rules that Data Privacy Management found in the CSV file.
5. Click Select.
6. Click Import.
For each risk simulation plan, you can adjust the protection status of sensitive data domains in scanned data
stores to simulate the impact on the potential cost to your organization in the event of a security breach.
The risk simulation indicators at the top of each plan detail page show both the current and estimated risk
score, protection status, and residual risk cost in the currency selected for your organization on the Settings
workspace. The current and estimated values for each data store in the plan appear in the columns on the
page.
After you save a risk simulation plan, you can edit the plan to protect more data domains, or select fewer
data domains to protect. You cannot remove protection from all data domains in a data store after you apply
and save the plan. To simulate and compare risk, create a new plan that includes the data store and do not
apply protection to any data domains.
You can use filters to quickly find the risk simulation plan that you want to manage.
To access the Risk Simulation Plans workspace, click Manage > Risk Simulation Plans.
Risk Simulation Plans List Page
The Risk Simulation Plans list page shows the risk simulation plans for scanned data stores in your
organization. By default, you view the Risk Simulation Plans list page when you access the Risk Simulation
Plans workspace.
The following image shows an example of the Risk Simulation Plans list page:
Access the Risk Simulation Plan Details page in one of the following ways:
• Click a risk simulation plan name. To edit the properties, click Edit on the Risk Simulation Plan Details
page.
• Select the check box next to a risk simulation plan and then select Open or Edit from the Actions menu.
The following image shows an example of a Risk Simulation Plan Details page:
Property Description
Risk Score Indicator The Current risk score shows the aggregate risk score of data stores in the plan, with a value between 0 and 100. Data Privacy Management calculates risk scores based on the following metrics for sensitive data in each scanned data store:
- Sensitivity level
- Protection status
- Number of sensitive fields
- Number of sensitive records
- Number of targets
- Residual risk cost
- Number of active users
- Number of active and inactive users with activity in Data Privacy Management
You can change the weight of each risk score factor from the Settings workspace. Data Privacy
Management recalculates the risk score of the scanned data stores and refreshes the values on
the Risk Score indicator.
The Estimated risk score shows the predicted risk score if you implement the selected data
protection changes in the plan.
Protection The Current protection status shows the percentage of sensitive fields in the data store that are
Status Indicator currently protected.
The Estimated protection status shows the percentage of sensitive fields in the data store that are
protected in the simulation.
For data store types that protection extensions do not support, the value is 0.
Residual Risk The Current residual risk cost shows the amount of money it would take to resolve the problems of
Cost Indicator a data breach for sensitive data that is currently unprotected in the data stores included in the
simulation plan.
The Estimated residual risk cost shows the predicted risk cost if you implement the selected data
protection changes in the simulation plan.
Selected Data The column shows the number of data domains selected in the simulation plan.
Domains
Protection The column shows the method used to protect sensitive data in each data store included in the
plan.
On the Risk Simulation Plan Details page, you can select sensitive data domains to protect for each scanned
data store included in the plan.
The following table describes the risk simulation plan protection properties:
Property Description
Data Domain: Required. Select one or more sensitive data domains for the selected data store to simulate the
impact that protecting data will have on the risk score, protection status, and residual risk cost indicators.
1. To create a new risk simulation plan, access the New Risk Simulation Plan page in one of the following
ways:
• Click Manage > Risk Simulation Plans. The Risk Simulation Plans workspace appears. Click New.
• On the Security Dashboard, click the Top Data Stores indicator label. The Data Stores Grid page
appears. Select one or more data stores to include in the plan. From the Actions menu, select
Simulate Risk.
Before you can simulate risk, scan the data stores you want to include in the simulation plan. The Scan job
discovers and classifies sensitive and personal data domains in data stores.
1. On the Risk Simulation Plans workspace, select the check box next to the risk simulation plan that you want to
edit.
2. From the Actions menu, select Edit.
The <Risk_Simulation_Plan_Name> page appears.
3. To edit the plan name, description, or tags, click the Properties tab.
4. On the Protection tab, you can add or remove sensitive data domains to protect for each data store in
the plan.
The risk indicators update with the changes.
Note: You cannot remove protection from all data domains that you selected and saved when you
created the plan. You can add data domains to simulate protection and risk scoring if you did not protect
all sensitive data domains when you created the plan. You can remove protection simulation for all but
one data domain in a data store. To simulate risk for a data store without protecting sensitive data,
create a new risk simulation plan and compare the results on the list page.
5. To add data stores to the plan, select Actions > Add Data Stores and select one or more scanned data
stores from the list.
6. To remove a data store from the plan, select the data store row. From the Actions menu, select Remove
Data Store.
A confirmation prompt appears. Click Yes, I'm Sure.
7. When finished, click Save.
Chapter 29
Subject Registry
This chapter includes the following topics:
Subject Registry maps individuals with their data and provides a search tool to quickly locate an individual
whose data you collect and store. You can search for the individual by name or any other identifying detail.
After you identify the correct individual, Subject Registry provides details of the data stores, schemas, tables,
fields, and files that contain the individual's data.
Use the results to promptly respond to requests and stay in compliance with data privacy regulations such as GDPR
and CCPA, including requirements such as the right to be forgotten, data portability, and breach notification.
The following image shows the Subject Registry workspace with a sample search result:
The following image shows the details for an individual's sensitive data:
Before you use Subject Registry, you must configure the database, define the entity, and define the SQL
query. In addition, for unstructured data stores, you must create a connection to a remote agent on the
Remote Agents workspace. Then, run a Subject Registry scan to build the index and identity map.
For more information on the prerequisite tasks for Subject Registry, see the Informatica Data Privacy
Management Administrator Guide.
Legal Hold
You can apply a legal hold flag to a data subject in the Subject Registry. A legal hold flag indicates an
organization's legal notification to retain specific data.
Based on a legal hold notification, you might choose to reject some subject requests such as the right-to-be-
forgotten requests.
You can apply and remove legal hold on a data subject from the Subject Registry page. If applied, a legal hold
notification appears next to the subject name on search results pages and subject details pages.
You can export legal hold information when you export details from the search results page or the Subject
Details page. Legal hold information is included in DSAR reports. If the subject is under legal hold, a prompt
appears to confirm if you want to generate a DSAR report.
For example, some privacy policies might apply to data subjects who are residents of countries in the
European Union but not to other data subjects. Identify residency information for data subjects and apply
specific privacy policies that are relevant to a data subject.
You can use more than one column as criteria for residency. For example, if you configure city and state
columns, both values appear in the Residency field separated by a comma.
To view Residency information on the Subject Registry search results page and on Subject Details pages,
configure the Residency attribute in the Subject Registry configuration files. On the Subject Details page,
residency information appears in the Additional Details section. Residency information also appears in DSAR
reports.
The order that you use in the configuration file determines the order of appearance of values.
For information on how to configure Residency in the Subject Registry configuration files, see the Informatica
Data Privacy Management Administrator Guide.
Category
A category is a kind or a grouping that you can use to classify data.
Data privacy regulations require that you associate subjects and subject information that you collect with a
category. For example, you can assign a data subject a subject category such as Employee or Vendor.
Personal data that you collect might fall under different categories such as Address, Family, and
Beneficiaries.
The subject category appears in the Subject Type field on Subject Details pages that you access from the
Subject Registry. Personal data categories appear in data store information on a Subject Details page. Click
the Fields/Files link to view the information. DSAR reports include the information to indicate the subject type
and the category of each personal data field.
You associate a category with a data domain. When a scan associates a sensitive field with a data domain,
the field is associated with the category. You import a list of data domain-to-master category assignments
from the Subject Registry page. If required, you can override the master category associated with specific
fields from the Sensitive Fields page.
Purpose
Purpose indicates the objective or reason for storing specific data. You can assign a purpose to a data store.
Data privacy regulations require that you collect data from a subject for a purpose and use the data only for
that purpose. For example, for payroll requirements you collect financial information such as bank account
details. You can assign the Payroll purpose to the data store and use the data for payroll requirements. The
purpose appears in data store information on a Subject Registry Subject Details page. Click the Fields/Files
link to view the information. DSAR reports include the purpose to indicate the reason for collecting the
specific data.
To access the Subject Registry workspace, click the Subject Registry icon on the Data Privacy Management
header.
The following image shows the Subject Registry icon on the header:
• Search
• Search results
• Summary metrics
The following image shows the Subject Registry workspace before you enter the search criteria:
The following image shows the Subject Registry workspace with a sample search and search results:
1. Search
2. Actions menu
3. Summary metrics
4. Search results
The following table describes the fields in the Subject Registry Search section:
Search <Entity>: Select the entity that contains the data subject you want to find. When you configure a JSON file,
you specify an entity to include in the search.
For example, if you create two JSON files, one for a Customers entity and another for an Employees entity, both
entities appear as options in the list.
Entity names also display as the Subject Type on the Subject Details page.
Main Search Field: Displays the main search criteria that you specify in the JSON file, and any match criteria that
you specify as Exact in the JSON file. For example, you specified the FullName field to search by an individual's
whole or partial first or last name.
Show Optional Fields: Displays the optional match criteria that you specified in the JSON file. For example, you
specified the City and PinCode fields to filter individuals by the city and zip or pin code in their home address.
You can import lists of category and purpose associations. You can export the summary metrics, search
results, and lists of master purposes and categories.
To export information on the Subject Registry workspace, select Export from the Actions menu.
To export search results, select Export > Search Results from the Actions menu. Download the Data.zip file
to a local directory. Extract the Data.zip file to view the CSV files that contain information about the
summary metrics and search results.
The following table describes the CSV files that you extract when you export search results from the Subject
Registry workspace:
SubjectRegistryPreferredRecord.csv: For the data subjects that appear in the search results, lists the information
that you configured to display in columns on the Subject Registry workspace. For example, the file might contain
columns that list each individual's name, company, city, age, Social Security number, and match score.
To export data category information, select Export > Categories from the Actions menu. To export data store
purpose information, select Export > Purposes from the Actions menu.
DataCategoriesMasterList.csv: Contains a list of data categories and the data domain associated with each data
category.
DataPurposeMasterList.csv: Contains a list of data stores and the purpose associated with each data store.
1. Create a CSV file with the following column headers and data. A sample file appears after this procedure.
• DataCategory. Enter the names of master data categories.
• DataDomain. For each data category, enter the name of the data domain to which you want to
associate the category. The data domain must exist. If the data domain does not exist, the import of
the row fails, and the import job continues.
• Action. Enter D to delete an association with the data domain. If you leave the column blank, the
system performs an upsert. An upsert updates an existing association or inserts a new one.
2. From the Actions menu, click Import > Categories.
3. Select the file from which you want to import data.
The Data Privacy Management Service validates properties of the import file, such as the file extension and
column headers.
If the file is valid, the user interface displays a count of how many records are in the file.
If the file is not valid, the user interface displays an error message. For example, an error can occur if the
file does not include valid column headings.
4. Click Import.
The Import job begins.
To check the status of the Import job, go to the Jobs workspace and view the job log for the Import File job
step. The job log shows how many records the job processed, rejected, skipped, deleted, inserted, or updated.
If the job rejected one or more rows in the CSV file, you can download a list of the rejected rows.
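For example, a category import file that follows this layout might contain rows like the following. The data domain
names are illustrative placeholders; the data domains must already exist in your environment:
DataCategory,DataDomain,Action
Address,FullAddress,
Family,DependentName,
Beneficiaries,BeneficiaryName,D
The first two rows perform an upsert because the Action column is blank. The last row deletes the association
between the Beneficiaries category and the BeneficiaryName data domain.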
1. Create a CSV file with the following column headers and data:
• DataStore. Enter the names of data stores.
• Purpose. For each data store, enter the purpose to which you want to associate the data store. The
data store must exist. If the data store does not exist, the import of the row fails, and the import job
continues.
2. From the Actions menu, click Import > Purpose.
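For example, a purpose import file might contain rows like the following. The data store names and the Marketing
purpose are illustrative placeholders; the Payroll purpose matches the example earlier in this chapter:
DataStore,Purpose
HR_Oracle_Prod,Payroll
CRM_SQLServer,Marketing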
To access the Subject Details page, click the name of a data subject that appears in the search results on the
Subject Registry workspace. If the subject is under legal hold, a legal hold indicator appears to the right of
the subject name.
The following image shows the Subject Details page for the data subject John Porry, who is under legal hold:
Section Description
Summary: The Subject Details summary in the top right of the page lists the number of data stores, data domains,
and fields or files that contain sensitive data about the individual.
Click the Download DSAR Report icon to download reports that Data Privacy Management created for the subject.
Select one of the following report formats:
- Detailed CSV
- Summary CSV
- Summary PDF
- Summary PDF Without Personal Data
Subject Details: Contains information about the data subject that you configured to display in columns on the
Subject Registry workspace. The example image lists the FullName, CompanyName, Address, City, PhoneNumber,
and Salary information for the data subject.
Additional Details: Contains additional information about the subject. Includes the subject type. The subject type
is the name of the entity file that you use in the Subject Registry search. Residency information appears if the
Subject Registry configuration includes the Residency attribute.
Requests: Displays a list of tasks created about the subject, such as DSAR reports or service management tickets.
For each task, displays the task name, description, type, name of the user who created the task, status, due date,
completed date, and download options.
DSAR report task names begin with the prefix DSAR. Data subject task names begin with the prefix DSR.
To view task details and properties on the Tasks workspace, click the task name.
For each DSAR report task in the list, the Download Options column displays one of the following messages:
- Available for download.
- Available for download until mm/dd/yyyy.
- Not available for download.
Data Stores: Lists the data stores that contain sensitive information about the subject. For each data store, also
lists the number of data domains and fields or files with sensitive data about the subject, the country, the data
owner name, and the third-party share status. If the information is shared with third parties, lists the names of the
third parties.
You can use the Group by menu to group the list by data domain or data owner name. By default, data stores are
listed by name in alphabetical order. To order the list by data domain names, select Data Domain. To order the list
by data owner names, select Data Owner.
To view the names of the data domains, click the arrow to the left of the data store name. To view the Proliferation
page for the data store, click the data store name. To view the Subject Registry Data Store Details page, click the
value in the Fields/Files column.
You access the Subject Registry Data Store Details page from the Subject Details page. On the Data Stores
pane, click the value in the Fields/Files column for a data store.
The following table describes the details about sensitive fields listed on the page:
Schema / Folder: Schema or folder name in the data store that includes the sensitive information about the data
subject.
Object: Name of the table or object that contains the sensitive field or file.
Field / File: Field or file name that contains sensitive information about the data subject.
Data Domains: Name of the data domain that includes the sensitive field or file.
Category: Name of the personal data category associated with the sensitive field or file.
Purpose: The purpose for collecting the sensitive information about the data subject.
To download a CSV file that contains the information displayed on the page, select Export from the Actions
menu. The default file name is SubjectRegistryFieldFiles.csv.
To verify that the report is ready to download, click the Refresh icon in the Requests pane on the Subject
Details page. When the DSAR task to create the report is complete, the task appears in the list and the
Download DSAR Report icon is enabled in the Summary panel.
Click the Download DSAR Report icon and select one of the following report formats:
• Detailed CSV
• Summary CSV
• Summary PDF
For example, if you download a DSAR report for an Employee named John Porry and you select the Summary
CSV report format, the report has the following file name: John_Porry_Employees_Summary_CSV.csv
The following table describes the data subject information included in the reports:
Subject Detail Description
Subject Type: The name of the entity file that includes the data subject.
Subject Name: The data subject's name in the format configured for the IsSubjectName attribute in the Subject
Registry configuration files.
If the IsSubjectName attribute is not configured or the configuration files do not contain the FullName, FirstName,
or LastName data domains, DSAR reports do not include the data subject's name.
Residency: The location where the data subject resides, in the format configured for the Residency attribute in the
Subject Registry configuration files. If the Residency attribute is set for more than one data domain, the report
lists the locations separated by a comma (,).
If multiple golden data stores are configured in the Subject Registry entity file and the Residency attribute is empty
in the primary golden data store, DSAR reports list the Residency attribute value in the non-primary golden data
stores. However, if the data subject does not exist in the non-primary golden data stores, DSAR reports do not list
a Residency attribute value.
Legal Hold: Indicates whether the data subject is under a legal hold. Values are Yes or No.
Data Category: Name of the data category associated with the sensitive field in the data domain in the same row.
Category names are case sensitive. Multiple values are separated by a comma (,) in the DSAR reports.
For unstructured data stores configured for Subject Registry, Data Privacy Management does not associate data
domains with categories. PDF report formats for unstructured data stores list the value of the Data Category
property as Uncategorized.
Data Domain: Name of the data domain that contains personal data about the individual.
Value: For each data domain, lists the sensitive data associated with the individual.
If the value is empty for a data domain, the DSAR reports on the Subject Details page do not include the Value
property. However, the DSAR report file that you access from the DSAR task Properties page includes the property
with an empty value.
Purpose: The reason that you collected the sensitive data about the individual.
If a data domain and associated category match exists in multiple data stores, the DSAR reports list the purposes
separated by a comma (,).
Unstructured data stores configured for Subject Registry cannot specify a purpose.
Shared: Indicates whether the data is shared with a third party. Values are Yes or No.
Shared With: If the data is shared, lists all third parties that have access to the personal data.
If a data domain and associated category match exists in multiple data stores, the DSAR reports list the third
parties separated by a comma (,).
The following table describes the additional data subject information in the Detailed CSV report format:
Data Source/File System: Name of the data store that contains subject data.
Schema/Folder: Schema or folder name in the data store that contains subject data.
Table/File: Table or file name in the schema or folder that contains subject data.
Column/File Type: Name of the column or file that contains subject data.
Time of Retrieval: The date and timestamp that you downloaded the DSAR report in the following format: Day Mth
dd hh:mm:ss IST yyyy.
For example: Thu Feb 13 07:44:07 IST 2020
You can customize the look and style of the PDF report format. Customize the template to change the image
that appears in the report or to move or delete subject information fields that appear by default. You can
delete or move fields around but cannot add fields that do not appear in the report by default.
Property Value
DSAR_HEADER_LOGO_PATH: Enter the complete path and file name of the custom image to use in the template
header.
DSAR_POWERED_BY_LOGO_PATH: Add the complete path and file name of the custom image to use in the
template footer.
DSAR_TEMPLATE_PATH: Add the complete path and file name of the template file that you create to customize
the report format.
For information about how to add and modify custom properties, see the Data Privacy Management
Service chapter in the Informatica Data Privacy Management Administrator Guide.
5. Restart the Data Privacy Management Service.
Download a DSAR PDF report to view the updated template style.
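For example, the custom properties on the Data Privacy Management Service might be set to name-value pairs like
the following. The paths and file names are illustrative placeholders; use the locations where you store the custom
images and template file:
DSAR_HEADER_LOGO_PATH=/opt/informatica/dpm/custom/header_logo.png
DSAR_POWERED_BY_LOGO_PATH=/opt/informatica/dpm/custom/footer_logo.png
DSAR_TEMPLATE_PATH=/opt/informatica/dpm/custom/dsar_template_file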
To export details about a data subject, select Export from the Actions menu. A new browser tab opens. A
Save As dialog box that contains a .zip file named Data.zip appears. Download the .zip file to a directory on
your machine and extract the CSV files that contain information about the data subject.
SubjectRegistryEntityAttributes.csv: Lists the information about the data subject that you configured to display in
columns on the Subject Registry workspace. For example, the file might contain columns that list the individual's
name, company, city, age, Social Security number, and risk score. The legal hold status also appears as a column
in the file.
Important: Domain discovery scans for subject registry must include schema and objects that are marked as
sensitive in an SQL file. If the SQL file does not include the schema and objects that are marked as sensitive,
the information in the DSAR reports might be incorrect.
When the DSAR task finishes and the report is ready to view, Data Privacy Management sends an email to the
address configured to receive DSAR task notifications. The email contains a link to the DSAR task Properties
page on the Tasks workspace.
You can download the DSAR report in one of the following ways:
• Click the link in the notification email to download the report from the DSAR task Properties page on the
Tasks workspace. The report has the following default file name: DSAR_REPORT.csv
• On the Subject Details page, click the DSAR Report Download icon to select a CSV or PDF file format and
download the report. The icon always downloads the most recent DSAR report created for the data
subject.
If the DSAR task failed for any data stores, Data Privacy Management sends a failure notification email to the
configured email address. The email contains details only about the first data store that failed in the DSAR
task.
To create a ServiceNow ticket for a data subject, at least one Active extension that uses the DSR ServiceNow
Plugin must be configured on the Extensions workspace.
To run a custom script for a data subject, at least one Active extension that uses the DSR Custom Plugin
must be configured on the Extensions workspace.
Sending an Email
On the Subject Details page, you can create a task that sends an email to fulfill a data subject request.
To send an email to fulfill a data subject request, at least one Active extension that uses the DSR Email
Plugin must be configured on the Extensions workspace.
Related Topics:
• “Email Action Properties” on page 375
Privacy Dashboard
This chapter includes the following topics:
The Privacy Dashboard is the default dashboard that opens when you log in to Data Privacy Management.
The dashboard opens on the Overview workspace. The Overview workspace displays an enterprise-level
summary of subject data and workspace indicators for key elements of subject data in the Subject Registry.
Workspace indicators display high-level information on key elements of subject data. Some dashboard
indicators contain links for accessing and managing the metric details from other pages. For example, you
can click the name of a subject request in the Subject Requests by Due Date indicator to view the task details
page.
Each indicator drills down to open a page with more detailed information. You can sort, filter, and view
information, and perform tasks from these pages.
From the Actions menu, you can export workspace details as a .zip file that contains a CSV file for each
indicator. You can also switch to the Security Dashboard, refresh the view, and customize the dashboard
display.
Enterprise-Level Summary of Subject Data
Review the enterprise-level summary to quickly view information on subject data in the Subject Registry.
The following image shows the summary indicators on the Overview workspace of the Privacy Dashboard:
You can view the total number of subjects and subject types in the Subject Registry. You can view
information on subject requests raised in a 12-month period from the current month. You can also view the
distribution of subjects across data stores and high-level information on how much data is shared with a
third party.
Some indicators update when you run a scan on golden data stores. To view updated information, from the
Actions menu click Refresh.
Subjects Indicator
The Subjects indicator gives you a quick insight into the total number of subjects in the Subject Registry, and
into the trend in subject count and application of legal holds on subjects. It displays the total number of
subjects across all golden data stores in the Subject Registry. It also indicates the number of subjects that
have legal holds applied.
The data updates when you run a subject registry scan that includes the Discover Subjects option.
• The figure to the right of the total number of subjects indicates the change in the number of subjects from
the previous scan. The subjects trend appears as zero if there were no subjects identified in the previous
scan.
• The figure to the right of the legal hold information indicates the change in the number of subjects with
legal holds applied from the previous scan. The legal holds trend appears as zero if there were no legal
holds identified in the previous scan.
Both values are in percent. If you apply a filter, trend values do not appear on the indicator.
To view data in the Subjects indicator, you must have run a subject registry scan with the Discover Subjects
option.
Note: Although you can update legal hold information on the Subject Details page, the information in the
summary indicator updates only when you run a scan on golden data stores.
The total count appears on the right with the trend in comparison with the previous month. The trend value is
in percent.
Rest the mouse over a month in the graph to view the number of requests raised in the month. The count
includes DSAR and DSR request types in any status including deleted requests.
Each shade on the graph represents a subject type. Rest the mouse over an area on the graph to view the
name of the subject type and the number of subjects it contains. The total number of subjects also appears
in the indicator.
The subject type information updates when you add or delete an entity file in the Subject Registry. The total
number of subjects updates when you run a scan on golden data stores.
To view data in the Data Stores indicator, you must have run a subject registry scan with the Discover
Subjects option. The indicator updates each time you run a subject registry scan with the Discover Subjects
or Link Subjects option.
The indicator also displays how many of these data stores are marked as shared with a third party. You can
indicate if a data store contains data that is shared when you create the data store.
Overview workspace indicators appear below the summary indicators. You can use the indicators to drill
down and access detailed information about subject data. You can click entries in some indicators to open
and manage the data in related pages.
The indicator lists the data stores with the total number of subjects, data domains, and sensitive fields in
each data store.
The list is sorted in descending order of the subject count. The number of data domains and sensitive fields
appears as zero if you have not run a domain discovery scan. You can use filters to view specific information
or refresh the page to view updated information after you run a scan.
You can click entries in the indicator to view the following pages:
To view more information about the data stores with subjects, click the indicator label to open the Data
Stores by Subjects page.
Related Topics:
• “Proliferation Page” on page 539
• “Sensitive Fields Page” on page 546
Summary information on the total number of data stores and the number of data stores shared with a third
party appears on the upper right corner of the page.
You can perform the following actions on the Data Stores by Subjects page:
You can click entries in specific columns to view the following pages:
You can click the column header to sort the data on any column except the Data Store Type column.
Apply filters
The Filter icon opens the Filter pane that includes page-specific filters in addition to the global filters.
The Data Stores by Subjects page includes the following page-specific filters:
Data Store Type Select one or more data store types to view the data stores based on type.
Third-Party Shared Choose a value to filter data based on whether the data stores are shared with a third party
or not.
Export information
Click Actions > Export to export data store information. The Export option downloads a Data.zip file
that includes the following .csv files:
Click Actions > Take Action to view manual action options. You can choose from the following options:
You can choose to view requests by week, month, or year. The X axis tracks the time period, and the Y axis
tracks the number of requests. Different shades in the bar graph indicate different subject request types.
Move the pointer over each bar to view different request types with the count in each time frame.
For example, the Week tab tracks the last seven days on the X axis and the number of requests on the Y axis.
The Month tab tracks the month by the current week and the last four weeks. A week is calculated from
Sunday through Saturday.
The Year tab tracks data over a 12-month period that includes the current month.
Click the indicator label to open a page of detailed subject request information sorted by type.
An indicator of the total number of requests appears in the upper-left corner. By default, the list also includes
closed and completed requests. You can choose to exclude closed and completed requests from the list.
The page includes the request name, subject name, subject type, request type, and information on when and
who last updated the request. The status and due date information also appears.
The page includes an All Requests tab that lists requests created and assigned to any user. The My Requests
tab lists requests created by you and requests assigned to you.
Choose from options in the Display list to filter on specified time periods, or enter a From and To date.
The dates correspond to the date of creation of the request.
You can click the column header to sort the data on any column.
Apply filters
The Filter icon opens the Filter pane that includes page-specific filters in addition to the global filters.
Request Name Enter the request name. You can enter a partial or complete name. The page updates with
data that matches the name or name fragment.
Subject Name Enter the subject name. You can enter a partial or complete name. The page updates with data that
matches the name or name fragment.
Request Type Select one or more request types to view specific request types.
Status Choose one or more status values to filter data on. If you include closed or completed tasks
in the filter, the Display Closed and Completed Requests option on the page is ignored.
Due Date Select a date range to view tasks with a due date within the specified date range.
Export information
Click Actions > Export to export the information in a .csv file. The Export option downloads a
SubjectRequests.csv file. If you apply a filter, select data stores on the page, or sort the data, only the
filtered or selected data is exported and in the order of sorting.
When you create a data store on the Data Stores workspace, you can associate the data store with a
geographic region. The Subject Data by Location indicator shows data stores that contain subject data by
region. The indicator includes tabs for each region. The ? tab includes data stores that you do not assign to a
region.
Use the region tabs to view information on each region. Click the title to expand the indicator. You can click a
region on the map to expand it and drill down further to view the location and data store and subject
information.
The indicator shows the number of data stores in each region and the number of subjects in the data store.
By default, the indicator displays the region with the most data stores that contain subjects.
Data stores that are not part of the Subject Registry or that do not contain any subjects do not appear in the
indicator.
You can perform the following actions on the expanded Subject Data Locations page:
An indicator of the total number of data stores appears in the upper-left corner.
Summary information on the total number of data stores and the number of data stores shared with a third
party appears on the upper right corner of the page.
You can perform the following actions on the Subject Data by Locations page:
Access related pages
You can click entries in specific columns to view the following pages:
You can click the column header to sort the data on any column except the Data Store Type column.
Apply filters
The Filter icon opens the Filter pane that includes page-specific filters in addition to the global filters.
The Subject Data by Locations page includes the following page-specific filters:
Data Store Type Select one or more data store types to view the data stores based on type.
Third-Party Shared Choose a value to filter data based on whether the data stores are shared with a third party
or not.
Export information
Click Actions > Export to export data store information. The Export option downloads a Data.zip file
that includes the following .csv files:
Click Actions > Take Action to view manual action options. You can choose from the following options:
The indicator includes the name of the task, type of task, status, due date, and the last updated date. Tasks
that are in closed or completed status do not appear in the indicator.
Tasks include DSAR and DSR tasks. The Type field indicates the specific action that you choose in the task.
For a DSAR task, the Due Date column indicates the date of creation of the task. For DSR tasks, the column
indicates the due date.
You can perform the following actions on the Subject Requests by Due Date indicator:
You can click a task name to open the specific task details page. The task page that opens depends on
the type of task. You can review the details on the Task Details page and take further action as required.
Apply filters
Related Topics:
• “DSAR Task Details Page” on page 486
• “Custom Task Details Page” on page 486
• “Service Management Task Details Page” on page 490
• “Email Task Details Page” on page 487
An indicator of the total number of requests appears in the upper-left corner. By default, the list does not
include closed and completed requests. Only requests that are open, in progress, failed, or completed with
errors appear in the list. You can choose to include closed and completed requests in the list.
In addition to information that appears on the indicator on the Privacy Dashboard, the page includes the
subject name, subject type, and information on when and who last updated the request.
The page includes an All Requests tab that lists requests created and assigned to any user. The My Requests
tab lists requests created by you and requests assigned to you.
You can click the column header to sort the data on any column.
Apply filters
The Filter icon opens the Filter pane that includes page-specific filters in addition to the global filters.
The Subject Requests by Due Date Details page includes the following page-specific filters:
Request Name Enter the request name. You can enter a partial or complete name. The page updates with
data that matches the name or name fragment.
Subject Name Enter the subject name. You can enter a partial or complete name. The page updates with data that
matches the name or name fragment.
Request Type Select one or more request types to view requests based on the request type.
Status Choose one or more status values to filter data on. If you include closed or completed tasks
in the filter, the Display Closed and Completed Requests option on the page is ignored.
Due Date Select a date range to view tasks with a due date within the specified date range.
Export information
Click Actions > Export to export the information in a .csv file. The Export option downloads a
SubjectRequests.csv file. If you apply a filter, select data stores on the page, or sort the data, only the
filtered or selected data is exported and in the order of sorting.
You can perform the following tasks from the Actions menu:
• To customize the workspace indicators that display on the dashboard, click the Customize Dashboard
option.
- Click On or Off to display or hide an indicator and then click Save to save the changes.
- To change the order of display of the indicators, rest the cursor over the indicator that you want to move.
You can also click and hold on the indicator. The Move icon appears. Drag the indicator to the required
location on the page. Adjacent indicators move to adjust to the changed location.
• To export information for each indicator and summary indicator on the workspace as a CSV file, click
Export. Extract the ZIP file to view the CSV files.
• To refresh the dashboard data, click Refresh.
• To switch to the Security dashboard, click Switch to Security Dashboard.
On the Filter panes for the drill-down pages, a dashed line separates the global conditions from the page-
specific filter conditions. The page-specific filter conditions appear above the dashed line.
When you apply filters, the subject data in all indicators updates based on the criteria.
Data Store Name Enter the data store name. You can enter a partial or complete name. The dashboard data
updates with data that matches the name or name fragment.
Classification Policy Select one or more classification policies to view subject data related to classification
policies.
Region Filters based on one or more geographic regions. When you select a country or data store
location without specifying a region, a dash appears in the region check box to indicate that
the region contains one or more countries or data store locations that you did not select.
Country When you select a region first, Data Privacy Management automatically selects the countries in
the region. You can clear any of the selections. You can also include other countries outside
the region.
Data Store Location When you select a region first, Data Privacy Management automatically selects the data store
locations in the region.
The .zip file contains CSV files. Each CSV file contains the information in a workspace indicator. If you
customize the workspace before you export, the information in the CSV files matches the active indicators.
The following table lists the CSV files in the DPM-PrivacyDashBoardExport<timestamp>.zip file that you
export from the Overview workspace:
DataStores.csv: Contains the number of data stores that include subject data and the number of these data stores
that are shared with a third party. Includes both golden and transaction data stores.
SubjectDataByLocation.csv: Contains the number of data stores for each region with the number of subjects in
each data store.
SubjectRequests.csv: Contains the number of subject requests raised each month for the last year. Also includes
the change from the previous month to the current month. The change is indicated in percent.
SubjectRequestsByDueDate.csv: Contains a list of subject requests. Also includes the type of request, status, due
date, and the date it was last updated.
SubjectRequestsByType_<Week/Month/Year>.csv: Contains a list of all subject request types raised during the
week, month, or year. Also includes the number of each request type raised.
Subjects.csv: Contains the total number of subjects in the Subject Registry and the number of subjects under legal
hold. Also includes information on the trends in change of subject count and legal holds.
SubjectTypes.csv: Contains a list of subject types in the Subject Registry and the number of subjects in each
subject type.
TopDataStoresBySubjects.csv: Contains a list of up to 25 data stores ordered from high to low subject count. Also
includes the number of data domains and files/fields.
Troubleshooting
This appendix includes the following topics:
Data Stores
If you encounter a problem when creating, editing, importing, or synchronizing a data store, review the following
list of previously reported issues and their solutions before contacting Informatica Global Customer Support.
When I import a data store, the import job issues a warning and the data store is not created or
updated.
This issue occurs either if the Enterprise Unified Metadata server is down or the data store already exists in
Enterprise Unified Metadata.
Verify that the Enterprise Unified Metadata server is up. Verify that the data store details in the import file are
correct. Import the file again.
Review the list of data stores in Enterprise Unified Metadata to see if the secondary data store is on the list.
If the secondary data store is still on the list, then wait for some time and unmerge again.
When I create an Amazon Redshift data store, the test connection fails.
This issue occurs when the Amazon Redshift jar file is not available in the $INFA_HOME/java/jre/lib/ext
folder.
Copy the Amazon Redshift jar file to the $INFA_HOME/java/jre/lib/ext folder. Then, restart the domain.
When I test the connection to an SAP data store, I get the error Cannot load metadata adapter.
Verify that you installed SAP JCo correctly. For information on how to install SAP JCo, see the Informatica
Data Privacy Management Administrator Guide.
When I test the connection to an SAP data store, I get the error Connect to SAP gateway failed.
To resolve this error, verify that you entered correct values for the SAP connection properties, especially the
System Number property.
Jobs
If you encounter a problem with jobs in Data Privacy Management, consult the job logs before contacting
Informatica Global Customer Support. A job log contains informational, warning, and error messages that a
job encountered.
Each job step has a separate log. The logs include information about the tasks performed in each job step
and messages that determine the cause of errors. You can view and download the logs for each job step
from the Job Details page on the Jobs workspace.
The following list describes previously reported job errors and solutions:
The profiling job step in a Database Scan job fails due to an error in the underlying transport
layer.
The profiling job step of a database scan job fails when the Informatica domain and services use the Secure
Sockets Layer (SSL) protocol for secure communication.
The following excerpt shows some of the errors in the Data Privacy Management log files:
2015-12-09 15:57:35,444 INFO [DpmProfilerImpl]
[JOB_TYPE=SCAN:JOB_ID=2:JOB_STEP_ID=8:REPO_ID=1]--[Administrator] Data Store
Name=Oracle1_12c: Starting the data profile for the data store.
2015-12-09 15:57:35,771 INFO [EdrConnectionManager]
[JOB_TYPE=SCAN:JOB_ID=2:JOB_STEP_ID=8:REPO_ID=1]--[Administrator] Data Store
Name=DS_1_DPM: Creating an EDR connection named DS_1_DPM.
2015-12-09 15:57:35,834 INFO [JobProcessor] [Administrator] Get job step details
successful.
2015-12-09 15:57:36,127 INFO [JobProcessor] [Administrator] Getting job step logs for
the job step id {0} successful.
2015-12-09 15:57:37,420 INFO [JobProcessor] [Administrator] Get job step details
successful.
2015-12-09 15:57:37,880 INFO [JobProcessor] [Administrator] Getting job step logs for
the job step id {0} successful.
2015-12-09 15:57:43,735 INFO [JobProcessor] [Administrator] Get job step details
successful.
2015-12-09 15:57:43,803 INFO [JobProcessor] [Administrator] Getting job step logs for
the job step id {0} successful.
2015-12-09 15:57:45,460 ERROR [OSGIClientFactoryImpl] [DTF_0001] An error occurred in
the underlying transport layer: [[sendMSG]: Channel is in [7 : CLOSED] state.]
com.informatica.pcsf.datatransport.DataTransportException: [DTF_0001] An error occurred
in the underlying transport layer: [[sendMSG]: Channel is in [7 : CLOSED] state.]
at
com.informatica.pcsf.datatransport.impl.DataTransportChannelImpl.sendRequest(DataTranspor
tChannelImpl.java:75)
at
com.informatica.pcsf.servicesframework.client.impl.ClientFactoryImpl.sendHeartbeatToRecip
ient(ClientFactoryImpl.java:979)
at com.informatica.pcsf.servicesframework.client.impl.ClientFactoryImpl.access
$3(ClientFactoryImpl.java:957)
at com.informatica.pcsf.servicesframework.client.impl.ClientFactoryImpl
$HeartbeatSender.run(ClientFactoryImpl.java:939)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.beepcore.beep.core.BEEPException: [sendMSG]: Channel is in [7 : CLOSED]
state.
at org.beepcore.beep.core.ChannelImpl.sendMsgInternal(ChannelImpl.java:399)
com.informatica.ds.ms.service.MappingExecutionWaitForCompletionState.onEvent(MappingExecu
tionWaitForCompletionState.java:50)
at com.informatica.ds.ms.service.StateMachine.dispatch(StateMachine.java:32)
at com.informatica.ds.ms.service.MappingServiceImpl
$FutureNotificationTask.run(MappingServiceImpl.java:1416)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.informatica.jdtm.oop.OOPPPUncheckedException: [JDTM_10014] Internal
error. The DTM process encountered an exception because of the following error:
[com.informatica.dtm.transport.DTFUncheckedException: [JDTM_10005] Internal error. The
Integration Service could not create the DTM process. Contact Informatica Global
Customer Support.]. Contact Informatica Global Customer Support.
at com.informatica.jdtm.oop.DTMProcess.throwSevereException(DTMProcess.java:1016)
at com.informatica.jdtm.oop.DTMProcess.joinOnCreationTask(DTMProcess.java:512)
at
com.informatica.jdtm.oop.MappingDispatcher.findOrCreateProcess(MappingDispatcher.java:144
)
at com.informatica.jdtm.oop.MappingDispatcher.createDTM(MappingDispatcher.java:201)
at com.informatica.powercenter.sdk.dtm.OOPPPDTM.<init>(OOPPPDTM.java:45)
at
com.informatica.platform.ldtm.executor.edtm.ExecutorRuntime.init(ExecutorRuntime.java:168
)
at
com.informatica.platform.ldtm.executor.edtm.EdtmExecutor.initializeRuntime(EdtmExecutor.j
ava:517)
at
com.informatica.platform.ldtm.executor.edtm.EdtmExecutor.createAndExeJdtm(EdtmExecutor.ja
va:408)
at com.informatica.platform.ldtm.executor.edtm.EdtmExecutor.run(EdtmExecutor.java:267)
at com.informatica.platform.ldtm.executor.ExecutionEngine
$SubmittedRunnable.run(ExecutionEngine.java:830)
... 5 more
2015-12-09 11:43:02.830 <MappingServiceFutureNotifyThreadPool_Thread_0> WARNING:
[MPSVCCMN_10093] The Mapping Service Module failed to run the job with ID
[5znE9J47EeWi5PHEi22VVw].
2015-12-09 11:57:19.252 <pool-6-thread-5> WARNING: [Worchcc_10001] The workflow database
connection is not configured.
2015-12-09 11:57:19.452 <pool-6-thread-4> WARNING: [Worchcc_10001] The workflow database
connection is not configured.
2015-12-09 11:57:47.569 <pool-6-thread-2> WARNING: [Worchcc_10001] The workflow database
connection is not configured.
2015-12-09 11:57:47.613 <pool-6-thread-5> WARNING: [Worchcc_10001] The workflow database
connection is not configured.
2015-12-09 11:58:38.700 <pool-6-thread-4> WARNING: [Worchcc_10001] The workflow database
connection is not configured.
2015-12-09 11:58:38.713 <pool-6-thread-2> WARNING: [Worchcc_10001] The workflow database
connection is not configured.
2015-12-09 11:58:56.718 <Update and Monitor Service Process Controller Thread> INFO:
[DS_10067] Deinitializing the Integration Service web application.
2015-12-09 11:58:56.723 <Update and Monitor Service Process Controller Thread> INFO:
[MPSVCCMN_10000] [MappingService] Mapping Service cancel operation completed..
2015-12-09 11:58:56.757 <Update and Monitor Service Process Controller Thread> INFO:
[WorchSS_10006] The stop action on the application listener was aborted.
2015-12-09 11:58:57.197 <Update and Monitor Service Process Controller Thread> SEVERE:
Internal error. The DTM process encountered an exception because of the following error:
[com.informatica.jdtm.oop.OOPPPUncheckedException: [JDTM_10014] Internal error. The DTM
process encountered an exception because of the following error:
[com.informatica.dtm.transport.DTFUncheckedException: [JDTM_10005] Internal error. The
Integration Service could not create the DTM process. Contact Informatica Global
Customer Support.]. Contact Informatica Global Customer Support.]. Contact Informatica
Global Customer Support.
2015-12-09 11:58:57.199 <Update and Monitor Service Process Controller Thread> INFO:
Shutting down statistics manager. Forced= 'true'.
2015-12-09 11:58:57.200 <Update and Monitor Service Process Controller Thread> INFO:
Statistics manager shut down.
2015-12-09 11:59:06.194 <MC_Thread_Factory-Thread_1> WARNING: [MonSDK_10045] Thread
[MC_Thread_Factory-Thread_1] is interrupted.
The profiling job step of a Database Scan job fails with an incorrect syntax message.
The profiling job step of a database scan job fails under the following conditions:
Update the odbc.ini files on the machines that host the Data Integration Service. Configure the following
entry:
EnableQuotedIdentifiers=1
Then, resume the job.
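For example, assuming the data source uses a DataDirect SQL Server wire protocol DSN, add the entry below the
existing driver settings in that DSN section of the odbc.ini file. The DSN name shown here is an illustrative
placeholder:
[SQLServer_Wire_Protocol]
EnableQuotedIdentifiers=1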
The SAP Scan job step fails because the user cannot open a file in the following
directory: /usr/sap/interfaces/.
The Scan job fails because the user does not have the complete set of authorizations. The following excerpt
shows a sample error message:
2017-08-09 22:12:19,102 [MonitorExecutionStatus] ERROR
com.infa.products.edc.scanners.profilescanner.BatchProfileExecutor- ParentName:
SAPSchema , SourceName: T006J , ErrorMessage: (java.util.concurrent.ExecutionException:
com.informatica.sdk.dtm.ExecutionException: [LDTM_0072] The following error occurred
while executing the RFC module: [ No authorization to open file /usr/sap/
interfaces/ ])
2017-08-09 22:12:19,102 [MonitorExecutionStatus] ERROR
com.infa.products.edc.scanners.profilescanner.BatchProfileExecutor- ParentName:
SAPSchema , SourceName: T006T , ErrorMessage: (java.util.concurrent.ExecutionException:
To resolve this issue, verify that the SAP user has the required authorizations. For the required list of user
authorizations, see the Informatica Data Privacy Management Administrator Guide.
The SAP Scan job step fails because the user cannot open a table.
The Scan job fails because the user does not have the authorization to access tables. The following excerpt
shows a sample error message:
2017-08-09 17:21:23,698 [MonitorExecutionStatus] ERROR
com.infa.products.edc.scanners.profilescanner.BatchProfileExecutor- ParentName:
SAPSchema , SourceName: T006A , ErrorMessage: (java.util.concurrent.ExecutionException:
com.informatica.sdk.dtm.ExecutionException: [LDTM_0072] The following error occurred
while executing the RFC module: [User ( DPMUSER ) is not authorized to access the
specified table T006A])
To resolve this issue, verify that the SAP user has the required authorizations. For the required list of user
authorizations, see the Informatica Data Privacy Management Administrator Guide.
• The number and types of child jobs created for a master Subject Registry job differ based on the scan
configuration.
• If one or more domain discovery scans fail, the status of the poll domain discovery job and the master
Subject Registry Scan job are set to failed.
• If the master job fails because a poll domain discovery job fails, identify and fix issues in the domain
discovery configuration and manually resume individual domain discovery jobs. After they finish
successfully, resume the master Subject Registry job.
• If a golden data store subject scan fails, the status of the subject scan job, the Scan Golden Data Stores
job, and the master job are set to failed. Identify and fix the issues in the subject scan and then resume
the master job. The master job triggers the required jobs to finish the scan.
• If a transaction data store subject scan fails, the status of the specific subject scan and the Scan
Transaction Data Stores job are set to failed and the rest of the scans continue. The status of the master
job is then set to failed. Identify and fix the issue in the subject scan and then resume the master job. The
master job triggers the failed jobs.
Protection
If you encounter a problem while performing tasks to protect sensitive data in Data Privacy Management,
review the following list of previously reported issues and their solutions before contacting Informatica
Global Customer Support.
A protection job that uses a Persistent Data Masking protection extension fails at the
ExportProjectXML job step.
The ExportProjectXML job step log file lists the following error or a similar error: cannot insert NULL into
("ILM_PLAN_PROPERTY"."VALUE"). This problem might occur for one of the following reasons:
• The TDM_CONNECTION is not created in the Test Data Management (TDM) application.
Before running the first protection job that includes a Persistent Data Masking protection extension, log in
to TDM. The first time you log in, TDM creates the TDM_CONNECTION and sets it as a storage
connection.
• A primary key or unique index is not configured for the sources on which data masking is applied.
Verify that the sources with data masking all contain a primary key or unique index.
• One or all of the following data attributes might be missing: precision, datatype, and scale.
Verify that the source data contains the required attributes for masking.
In the PowerCenter Workflow Manager, edit the ODBC connections and remove the SET QUOTED_IDENTIFIER
ON property from the SQL environment variable if the Table section in the job step log file lists the following
error:
ERROR : Sat Sep 09 02:06:20 2017 [pmilmcmnlibs_1762] : TRANSF_1_1_1 [ERROR] SET
QUOTED_IDENTIFIER ON
ORA-00922: missing or invalid option
Database driver error...
Function Name : executeDirect
SQL Stmt : SET QUOTED_IDENTIFIER ON
Oracle Fatal Error
Database driver error...
Function Name : ExecuteDirect
Oracle Fatal Error
A protection job that uses a Persistent Data Masking protection extension fails at the
ExecuteWorkflow job step with invalid data.
In the Data Privacy Management Jobs workspace, view the Job Details and select the ExecuteWorkflow job
step. In the Tables pane for the job ID, download the session log for the table that caused the job to fail. The
session log contains the following message: The session reached the error threshold for the Data
Masking transformation.
This error occurs because some data in the table is not in a valid format. PowerCenter validates data before
applying the masking rules for columns with Email address, Phone, IP address, SSN, Credit Card, and URL
datatypes. To remedy this error, apply one of the following methods:
• Verify that columns with these datatypes contain data that is in the correct format.
• Update the Stop on Error value to a higher number if ignoring errors is an acceptable option.
• Update the Stop on Error value to 0 to ignore all data errors.
• Update the error and null handling parameters in the masking rule so that the protection extension will
proceed when it encounters this problem.
A protection job that uses a Persistent Data Masking protection extension fails when the job
includes metadata with special characters or letters in camel case.
This problem might occur because the protection job was not configured to enable special characters or
letters in camel case.
In the Data Privacy Management Tasks workspace, select the task and then select Actions > Schedule
Protection Job. In the Advanced Settings section on the Runtime Configuration page, ensure that the Enable
Special Characters in Metadata check box is selected.
Scans
If you encounter a problem during a scan, review the following list of previously reported issues and their
solutions before contacting Informatica Global Customer Support.
When scanning an unstructured data store, the scan job fails at the Load job step even though the
test connection for the data store is successful.
This issue occurs because the source directory is not mounted on the Data Privacy Management and Hadoop
nodes.
Mount the source directory on the Data Privacy Management and Hadoop nodes. Verify that the Path
property on the data store UI page shows the correct mount point.
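For reference, a minimal sketch of mounting the source directory over NFS on one node, assuming a hypothetical
file server nfshost01 that exports /export/unstructured and a local mount point of /mnt/unstructured. Substitute
the host, export path, and mount options used in your environment:
# Create the mount point and mount the NFS export
mkdir -p /mnt/unstructured
mount -t nfs nfshost01:/export/unstructured /mnt/unstructured
# Verify that the mount is active before you rerun the scan
mount | grep /mnt/unstructured
Repeat the mount on every Data Privacy Management and Hadoop node, and confirm that the Path property on the
data store page points to the same mount point.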
When scanning an unstructured data store, the Load job step keeps running even though the test
connection for the data store is successful.
This issue might be because of one of the following reasons:
• The CIFS or NFS mount server that hosts the source might be down.
• If the server is CIFS, the password for the user account that accesses the unstructured data might have
changed.
Check whether the CIFS or NFS mount server is down. If the server is CIFS and the password has changed,
remount the directory on the Hadoop cluster.
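For example, a minimal sketch of remounting a CIFS share with updated credentials, assuming a hypothetical
server cifshost01, a share named unstructured, a mount point of /mnt/unstructured, and cifs-utils installed on
the node. Substitute your own server, share, credentials, and options:
# Unmount the stale share, then remount it with the new password
umount /mnt/unstructured
mount -t cifs //cifshost01/unstructured /mnt/unstructured -o username=<scan user>,password=<new password>,domain=<domain>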
When scanning a Microsoft OneDrive or SharePoint data store with a remote agent, I get a 429
error.
This issue occurs when there is more than one request to scan the data store at the same time.
For Microsoft OneDrive data stores, perform the following steps to increase the retry count and retry interval
of the Scan job, and then run the scan again:
1. Create a file named onedrive-connector.properties and save the file in the following directory:
siagent/WEB-INF/classes/
2. Add the following two properties in the file:
od.retryCount=<number of times to retry the failed request>
od.retryWaitTime=<number of milliseconds between retries>
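For example, the following onedrive-connector.properties content retries a throttled request five times and
waits 30 seconds between retries. The values are only illustrative; tune them to your scan volume and the
Microsoft throttling limits you encounter:
od.retryCount=5
od.retryWaitTime=30000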
For Microsoft SharePoint data stores, perform the following steps to increase the retry count and retry
interval of the Scan job, and then run the scan again:
1. Create a file named sharepoint-connector.properties and save the file in the following directory:
siagent/WEB-INF/classes/
2. Add the following two properties in the file:
sp.retryCount=<number of times to retry the failed request>
sp.retryWaitTime=<number of milliseconds between retries>
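Similarly, an illustrative sharepoint-connector.properties with the same example values:
sp.retryCount=5
sp.retryWaitTime=30000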
Also, if a file or folder name in a Microsoft OneDrive or SharePoint data store begins with a pound symbol
(#), the Browse job step might fail or enter an infinite loop.
Remove the pound symbol (#) from all file and folder names included in the Scan job, and run the scan again.
When I run a PCRS scan, the Collect Connections job step completes with a warning. The data
store is not created in Enterprise Unified Metadata.
This issue occurs when Data Privacy Management cannot connect to Enterprise Unified Metadata.
Verify that the connection properties to Enterprise Unified Metadata on the data store UI page are correct.
Then scan the data store again.
When I select the Discover Subject or Link Subject options from the Scan Type list, I get a warning
message.
Data Privacy Management validates the Subject Registry configuration in the following ways:
• Data Privacy Management verifies settings that have a predefined set of values and numeric value
settings that are bound by a specific range. For example, Data Privacy Management verifies that the
MatchType setting is specified with a valid value such as "Fuzzy." If the setting is not valid, a warning
message appears.
• Data Privacy Management verifies references to other settings in the entity file. For example, it verifies
that the field name that is specified in the MatchFields section of the entity file, such as FieldName,
matches a field that is defined in the EntityFields section of the entity file. If Data Privacy Management cannot
find a match, a warning message appears.
• Data Privacy Management verifies the configuration against other objects defined in Data Privacy
Management. For example, it verifies that the FieldName specified in the EntityFields section of the entity
file is a valid data domain name. If the data domain that matches the field name is not present, a warning
message appears.
When a warning message appears, review the following Subject Registry configuration settings in the entity
file:
• Verify that settings that have predefined values are set to a valid value.
• Verify that numeric value settings are specified within the valid range.
• Verify references to other settings within the entity file.
• Verify that data domain names and other object names are valid.
Appendix B: Updating Keystore and Truststore Certificates to Maintain Secure Communication
This appendix includes the following topics:
• Overview
• Updating the Keystore and Truststore with Informatica Certificates
• Updating the Keystore and Truststore with OpenSSL-Generated Certificates
Overview
When you need to change the current keystore and truststore certificates for a business reason or because
the certificates have expired, import the new certificates to maintain secure communication. You can import
either the default certificate from Informatica or a certificate you created using OpenSSL.
Note: To create new keystore and truststore files, refer to the Informatica How-To Article How to Create
Keystore and Truststore Files for Secure Communication in the Informatica Domain.
To delete the existing certificates from the keystore and truststore, use the following commands:
• keytool -delete -alias infa_dflt -keystore
$INFA_HOME/tomcat/conf/Default.keystore -storepass changeit
• keytool -delete -alias infa_dflt -keystore
$INFA_HOME/services/shared/security/infa_truststore.jks -storepass
pass2038@infaSSL
4. Generate or locate the new certificate.
Use the following command to generate the new certificate:
keytool -genkey -alias infa_dflt -keyalg RSA -keypass changeit -storepass changeit -keystore $INFA_HOME/tomcat/conf/Default.keystore -dname CN=<host name>,OU=<organizational unit>,O=<organization>,L=<city>,S=<state>,C=<two-letter country code>
Example:
keytool -genkey -alias infa_dflt -keyalg RSA -keypass changeit -storepass changeit -keystore $INFA_HOME/tomcat/conf/Default.keystore -dname CN=psvilxilmt01.informatica.com,OU=Data Privacy Management,O=Informatica,L=RedwoodCity,S=California,C=US
5. Import the new certificate to the Default.keystore file.
Use the following command:
keytool -import -noprompt -file <new certificate name> -alias infa_dflt -keystore
$INFA_HOME/tomcat/conf/Default.keystore -storepass changeit
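To confirm that the import succeeded, you can optionally list the infa_dflt entry and review the certificate
details. A quick check, assuming the same keystore path and password shown above:
keytool -list -v -alias infa_dflt -keystore $INFA_HOME/tomcat/conf/Default.keystore -storepass changeit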