Informatica Application Service Guide
This product includes software licensed under the terms at http://www.tcl.tk/software/tcltk/license.html, http://www.bosrup.com/web/overlib/?License, http://www.stlport.org/doc/license.html, http://asm.ow2.org/license.html, http://www.cryptix.org/LICENSE.TXT, http://hsqldb.org/web/hsqlLicense.html, http://httpunit.sourceforge.net/doc/license.html, http://jung.sourceforge.net/license.txt, http://www.gzip.org/zlib/zlib_license.html, http://www.openldap.org/software/release/license.html, http://www.libssh2.org, http://slf4j.org/license.html, http://www.sente.ch/software/OpenSourceLicense.html, http://fusesource.com/downloads/license-agreements/fuse-message-broker-v-5-3-license-agreement; http://antlr.org/license.html; http://aopalliance.sourceforge.net/; http://www.bouncycastle.org/licence.html; http://www.jgraph.com/jgraphdownload.html; http://www.jcraft.com/jsch/LICENSE.txt; http://jotm.objectweb.org/bsd_license.html; http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231; http://www.slf4j.org/license.html; http://nanoxml.sourceforge.net/orig/copyright.html; http://www.json.org/license.html; http://forge.ow2.org/projects/javaservice/, http://www.postgresql.org/about/licence.html, http://www.sqlite.org/copyright.html, http://www.tcl.tk/software/tcltk/license.html, http://www.jaxen.org/faq.html, http://www.jdom.org/docs/faq.html, http://www.slf4j.org/license.html; http://www.iodbc.org/dataspace/iodbc/wiki/iODBC/License; http://www.keplerproject.org/md5/license.html; http://www.toedter.com/en/jcalendar/license.html; http://www.edankert.com/bounce/index.html; http://www.net-snmp.org/about/license.html; http://www.openmdx.org/#FAQ; http://www.php.net/license/3_01.txt; http://srp.stanford.edu/license.txt; http://www.schneier.com/blowfish.html; http://www.jmock.org/license.html; http://xsom.java.net; http://benalman.com/about/license/; https://github.com/CreateJS/EaselJS/blob/master/src/easeljs/display/Bitmap.js; http://www.h2database.com/html/license.html#summary; http://jsoncpp.sourceforge.net/LICENSE; http://jdbc.postgresql.org/license.html; http://protobuf.googlecode.com/svn/trunk/src/google/protobuf/descriptor.proto; https://github.com/rantav/hector/blob/master/LICENSE; http://web.mit.edu/Kerberos/krb5-current/doc/mitK5license.html; http://jibx.sourceforge.net/jibx-license.html; https://github.com/lyokato/libgeohash/blob/master/LICENSE; https://github.com/hjiang/jsonxx/blob/master/LICENSE; https://code.google.com/p/lz4/; https://github.com/jedisct1/libsodium/blob/master/LICENSE; http://one-jar.sourceforge.net/index.php?page=documents&file=license; https://github.com/EsotericSoftware/kryo/blob/master/license.txt; http://www.scala-lang.org/license.html; https://github.com/tinkerpop/blueprints/blob/master/LICENSE.txt; http://gee.cs.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/intro.html; https://aws.amazon.com/asl/; https://github.com/twbs/bootstrap/blob/master/LICENSE; https://sourceforge.net/p/xmlunit/code/HEAD/tree/trunk/LICENSE.txt; https://github.com/documentcloud/underscore-contrib/blob/master/LICENSE; and https://github.com/apache/hbase/blob/master/LICENSE.txt.
This product includes software licensed under the Academic Free License (http://www.opensource.org/licenses/afl-3.0.php), the Common Development and Distribution License (http://www.opensource.org/licenses/cddl1.php), the Common Public License (http://www.opensource.org/licenses/cpl1.0.php), the Sun Binary Code License Agreement Supplemental License Terms, the BSD License (http://www.opensource.org/licenses/bsd-license.php), the new BSD License (http://opensource.org/licenses/BSD-3-Clause), the MIT License (http://www.opensource.org/licenses/mit-license.php), the Artistic License (http://www.opensource.org/licenses/artistic-license-1.0), and the Initial Developer's Public License Version 1.0 (http://www.firebirdsql.org/en/initial-developer-s-public-license-version-1-0/).
This product includes software copyright 2003-2006 Joe Walnes, 2006-2007 XStream Committers. All rights reserved. Permissions and limitations regarding this software are subject to terms available at http://xstream.codehaus.org/license.html. This product includes software developed by the Indiana University Extreme! Lab. For further information please visit http://www.extreme.indiana.edu/.
This product includes software Copyright (c) 2013 Frank Balluffi and Markus Moeller. All rights reserved. Permissions and limitations regarding this software are subject
to terms of the MIT license.
See patents at https://www.informatica.com/legal/patents.html.
DISCLAIMER: Informatica LLC provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the implied
warranties of noninfringement, merchantability, or use for a particular purpose. Informatica LLC does not warrant that this software or documentation is error free. The
information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation is
subject to change at any time without notice.
NOTICES
This Informatica product (the "Software") includes certain drivers (the "DataDirect Drivers") from DataDirect Technologies, an operating company of Progress Software
Corporation ("DataDirect") which are subject to the following terms and conditions:
1. THE DATADIRECT DRIVERS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT
INFORMED OF THE POSSIBILITIES OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT
LIMITATION, BREACH OF CONTRACT, BREACH OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.
Part Number: IN-SVG-10000-0001
Table of Contents

Preface
    Informatica Resources
        Informatica My Support Portal
        Informatica Documentation
        Informatica Product Availability Matrixes
        Informatica Web Site
        Informatica How-To Library
        Informatica Knowledge Base
        Informatica Support YouTube Channel
        Informatica Marketplace
        Informatica Velocity
        Informatica Global Customer Support
Output Files
Process Where DTM Instances Run
    In the Data Integration Service Process
    In Separate DTM Processes on the Local Node
    In Separate DTM Processes on Remote Nodes
Single Node
Grid
Logs
Chapter 12: High Availability for the PowerCenter Integration Service
    High Availability for the PowerCenter Integration Service Overview
    Resilience
        PowerCenter Integration Service Client Resilience
        External Component Resilience
    Restart and Failover
        Running on a Single Node
        Running on a Primary Node
        Running on a Grid
    Recovery
        Stopped, Aborted, or Terminated Workflows
        Running Workflows
        Suspended Workflows
    PowerCenter Integration Service Failover and Recovery Configuration
Index
Preface
The Informatica Application Service Guide is written for Informatica users who need to configure application services. It assumes that you have a basic working knowledge of Informatica and of the environment in which the application services run.
Informatica Resources
Informatica My Support Portal
As an Informatica customer, the first step in reaching out to Informatica is through the Informatica My Support
Portal at https://mysupport.informatica.com. The My Support Portal is the largest online data integration
collaboration platform with over 100,000 Informatica customers and partners worldwide.
As a member, you can:
- Search the Knowledge Base, find product documentation, access how-to documents, and watch support videos.
- Find your local Informatica User Group Network and collaborate with your peers.
Informatica Documentation
The Informatica Documentation team makes every effort to create accurate, usable documentation. If you
have questions, comments, or ideas about this documentation, contact the Informatica Documentation team
through email at [email protected]. We will use your feedback to improve our
documentation. Let us know if we can contact you regarding your comments.
The Documentation team updates documentation as needed. To get the latest documentation for your
product, navigate to Product Documentation from https://mysupport.informatica.com.
Informatica Marketplace
The Informatica Marketplace is a forum where developers and partners can share solutions that augment,
extend, or enhance data integration implementations. By leveraging any of the hundreds of solutions
available on the Marketplace, you can improve your productivity and speed up time to implementation on
your projects. You can access Informatica Marketplace at http://www.informaticamarketplace.com.
Informatica Velocity
You can access Informatica Velocity at https://mysupport.informatica.com. Developed from the real-world
experience of hundreds of data management projects, Informatica Velocity represents the collective
knowledge of our consultants who have worked with organizations from around the world to plan, develop,
deploy, and maintain successful data management solutions. If you have questions, comments, or ideas
about Informatica Velocity, contact Informatica Professional Services at [email protected].
Informatica Global Customer Support
The telephone numbers for Informatica Global Customer Support are available from the Informatica web site at http://www.informatica.com/us/services-and-training/support-services/global-support-centers/.
CHAPTER 1
Analyst Service
This chapter includes the following topics:
Configuration Prerequisites
Data Integration Services. The Analyst Service manages the connection to a Data Integration Service that
runs profiles, scorecards, and mapping specifications in the Analyst tool. The Analyst Service also
manages the connection to a Data Integration Service that runs workflows.
Model Repository Service. The Analyst Service manages the connection to a Model Repository Service
for the Analyst tool. The Analyst tool connects to the Model repository database to create, update, and
delete projects and objects in the Analyst tool.
Search Service. The Analyst Service manages the connection to the Search Service that enables and
manages searches in the Analyst tool. The Analyst Service identifies the associated Search Service
based on the Model Repository Service associated with the Analyst Service.
Metadata Manager Service. The Analyst Service manages the connection to a Metadata Manager Service
that runs data lineage for scorecards in the Analyst tool.
Profiling warehouse database. The Analyst tool identifies the profiling warehouse database. The Data
Integration Service writes profile data and scorecard results to the database.
Flat file cache directory. The Analyst Service manages the connection to the directory that stores
uploaded flat files that you import for reference tables and flat file data sources in the Analyst tool.
Business Glossary export file directory. The Analyst Service manages the connection to the directory that
stores the business glossary as a file after you export it from the Analyst tool.
Business Glossary asset attachment directory. The Analyst Service identifies the directory that stores any
attachment that an Analyst tool user attaches to a Business Glossary asset.
Informatica Analyst. The Analyst Service defines the URL for the Analyst tool.
Configuration Prerequisites
Before you configure the Analyst Service, you can complete the prerequisite tasks for the service. You can
also choose to complete these tasks after you create an Analyst Service.
Perform the following tasks before you configure the Analyst Service:
Create and enable the associated Data Integration Services, Model Repository Service, and Metadata
Manager Service.
Identify a directory for the flat file cache to upload flat files.
Identify a keystore file to configure the Transport Layer Security protocol for the Analyst Service.
Data Integration Services. You can associate up to two Data Integration Services with the Analyst Service.
Associate a Data Integration Service to run mapping specifications, profiles, and scorecards. Associate a
Data Integration Service to run workflows. You can associate the same Data Integration Service to run
mapping specifications, profiles, scorecards, and workflows.
Model Repository Service. When you create an Analyst Service, you assign a Model Repository Service to
the Analyst Service. You cannot assign the same Model Repository Service to another Analyst Service.
Metadata Manager Service. You can associate a Metadata Manager Service with the Analyst Service to
perform data lineage analysis on scorecards.
Search Service. The Analyst Service determines the associated Search Service based on the Model
Repository Service associated with the Analyst Service. If you modify the Analyst Service, you must
recycle the Search Service.
Attachments Directory
Create a directory to store attachments that the Business Glossary data steward adds to Glossary assets.
For example, you can create a directory named "BGattachmentsdirectory" in the following location:
<InformaticaInstallationDir>\server
Keystore File
A keystore file contains the keys and certificates required if you enable secure communication and use the
HTTPS protocol for the Analyst Service.
You can create the keystore file when you install the Informatica services or you can create a keystore file
with keytool. keytool is a utility that generates and stores private or public key pairs and associated
certificates in a file called a keystore. When you generate a public or private key pair, keytool wraps the
public key into a self-signed certificate. You can use the self-signed certificate or use a certificate signed by a
certificate authority.
Note: You must use a certified keystore file. If you do not use a certified keystore file, security warnings and
error messages for the browser appear when you access the Analyst tool.
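For example, you might create a keystore file that contains a self-signed certificate by running a keytool command similar to the following sketch. The alias, keystore file name, validity period, and password are illustrative placeholders rather than values that the Analyst Service requires:
keytool -genkeypair -alias analyst -keyalg RSA -keysize 2048 -validity 365 -keystore analyst_keystore.jks -storepass <keystore password>
keytool prompts you for the distinguished name fields of the certificate unless you also supply them with the -dname option.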
Complete. Allows the jobs to run to completion before disabling the service.
Abort. Tries to stop all jobs before aborting them and disabling the service.
Note: The Model Repository Service and the Data Integration Service must be running before you recycle the
Analyst Service.
General Properties
Logging Options
Run-Time Properties
Custom Properties
If you update any of the properties, recycle the Analyst Service for the modifications to take effect.
Logging Options
Logging options include properties for the severity level for Service logs. Configure the Log Level property to
set the logging level. The following values are valid:
- Fatal. Writes FATAL messages to the log. FATAL messages include nonrecoverable system failures that cause the service to shut down or become unavailable.
- Error. Writes FATAL and ERROR code messages to the log. ERROR messages include connection failures, failures to save or retrieve metadata, and service errors.
- Warning. Writes FATAL, WARNING, and ERROR messages to the log. WARNING errors include recoverable system failures or warnings.
- Info. Writes FATAL, INFO, WARNING, and ERROR messages to the log. INFO messages include system and service change messages.
- Trace. Writes FATAL, TRACE, INFO, WARNING, and ERROR code messages to the log. TRACE messages log user request failures.
- Debug. Writes FATAL, DEBUG, TRACE, INFO, WARNING, and ERROR messages to the log. DEBUG messages are user request logs.
Run-time Properties
Run-time properties include the Data Integration Service associated with the Analyst Service and the flat file
cache directory.
The Analyst Service has the following run-time properties:
Temporary directory to store the Microsoft Excel export file before the Analyst tool makes it available for
download via the browser.
Advanced Properties
Custom Properties
Environment Variables
If you update any of the process properties, restart the Analyst Service for the modifications to take effect.
m for megabytes.
g for gigabytes.
Default is 768 megabytes. Specify 2 gigabytes if you run the Analyst Service on a 64-bit machine.
JVM Command Line Options
Java Virtual Machine (JVM) command line options to run Java-based programs. When you configure the
JVM options, you must set the Java SDK classpath, Java SDK minimum memory, and Java SDK
maximum memory properties.
To enable the Analyst Service to communicate with a Hadoop cluster on a particular Hadoop distribution,
add the following property to the JVM Command Line Options:
-DINFA_HADOOP_DIST_DIR=<Hadoop installation directory>\<HadoopDistributionName>
For example, to enable the Analyst Service to communicate with a Hadoop cluster on Cloudera CDH 5.2,
add the following property:
-DINFA_HADOOP_DIST_DIR=..\..\services\shared\hadoop\cloudera_cdh5u2
1. In the Administrator tool, click the Manage tab > Services and Nodes view.
2. On the Domain Navigator Actions menu, click New > Analyst Service.
   The New Analyst Service window appears.
5. Select Enable Service to enable the service after you create it.
6. Click Next.
9. Click Next.
13. Click Finish.
If you did not choose to enable the service earlier, you must recycle the service to start it.
CHAPTER 2
Content Management Service
Reference tables
You use reference tables to verify the accuracy or structure of input data values in data quality
transformations.
The Content Management Service also compiles rule specifications into mapplets.
Use the Administrator tool to administer the Content Management Service. Recycle the Content Management
Service to start it.
Update a probabilistic model or a classifier model on the master Content Management Service machine.
When you update a model, the master Content Management Service updates the corresponding model
file on any node that you associate with the Model repository.
Note: If you add a node to a domain and you create a Content Management Service on the node, run the
infacmd cms ResyncData command. The command updates the node with probabilistic model files or
classifier model files from the master Content Management Service machine.
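As an illustration only, a ResyncData run from the command line might look like the following sketch. The domain, user, password, and service names are placeholders, and the options shown are the common infacmd connection options (-dn, -un, -pd, -sn); verify the complete syntax in the Informatica Command Reference:
infacmd cms ResyncData -dn MyDomain -un Administrator -pd Password -sn CMS_NewNode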
Reference tables
The Content Management Service identifies the database that stores data values for the reference table
objects in the associated Model repository.
Rule specifications
The Content Management Service manages the compilation of rule specifications into mapplets. When
you compile a rule specification in the Analyst tool, the Analyst Service selects a Content Management
Service to generate the mapplet. The Analyst tool uses the Model Repository Service configuration to
select the Content Management Service.
Synchronization Operations
The master Content Management Service stores a list of the Content Management Services in the domain.
When the master Content Management Service synchronizes with the domain services, the master Content
Management Service copies the current model files sequentially to each domain node. If a node is
unavailable, the master Content Management Service moves the node to the end of the list and synchronizes
with the next node on the list. After the synchronization operation copies the files to all available Content
Management Service machines, the operation ends.
To verify that a synchronization operation succeeded on a node, browse the directory structure on the node
and find the probabilistic or classifier model files. Compare the files with the files on the master Content
Management Service machine.
Informatica uses the following directory paths as the default locations for the files:
[Informatica_install_directory]/tomcat/bin/ner
[Informatica_install_directory]/tomcat/bin/classifier
The file names have the following extensions:
Probabilistic model files: .ner
Classifier model files: .classifier
Note: The time required to synchronize the model files depends on the number of files on the master Content
Management Service machine. The ResyncData command copies model files in batches of 15 files at a time.
The user name that the Content Management Service uses to communicate with the Model repository has
the Administrator role on the associated Model Repository Service.
All Data Integration Services associated with the Model repository are available.
The reference data warehouse stores data for the reference table objects in a single Model repository.
Note: The purge operation reads the Model repository that the current Content Management Service
identifies, and it deletes any reference table that the Model repository does not use. If the reference data
warehouse stores reference data for any other Model repository, the purge operation deletes all tables that
belong to the other repository. To prevent accidental data loss, the purge operation does not delete tables if
the Model repository does not contain a reference table object.
1. In the Administrator tool, click the Manage tab > Services and Nodes view.
To prevent accidental data loss, the purge operation does not delete tables if the Model repository does not contain a reference table object.
Note: To delete unused reference tables at the command prompt, run the infacmd cms Purge command.
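For example, a purge run from the command line might look like the following sketch, where the names are placeholders and the options follow the common infacmd connection-option pattern; check the Command Reference for the full syntax:
infacmd cms Purge -dn MyDomain -un Administrator -pd Password -sn ContentManagementService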
1. In the Administrator tool, click the Manage tab > Services and Nodes view.
2. In the Domain Navigator, select Content Management Service > Disable to stop the service.
   When you disable the Content Management Service, you must choose the mode to disable it in. You can choose one of the following options:
   - Complete. Allows the jobs to run to completion before disabling the service.
   - Abort. Tries to stop all jobs before aborting them and disabling the service.
3. Click the Recycle button to restart the service. The Data Integration Service must be running before you recycle the Content Management Service.
You recycle the Content Management Service in the following cases:
Recycle the Content Management Service after you add or update address reference data files or
after you change the file location for probabilistic or classifier model data files.
Recycle the Content Management Service and the associated Data Integration Service after you
update the address validation properties, reference data location, identity cache directory, or identity
index directory on the Content Management Service.
When you update the reference data location on the Content Management Service, recycle the
Analyst Service associated with the Model Repository Service that the Content Management Service
uses. Open a Developer tool or Analyst tool application to refresh the reference data location stored
by the application.
General properties
Multi-service options
Logging options
Custom properties
If you update a property, restart the Content Management Service to apply the update.
General Properties
General properties for the Content Management Service include the name and description of the Content
Management Service, and the node in the Informatica domain that the Content Management Service runs on.
You configure these properties when you create the Content Management Service.
The following table describes the general properties for the service:
- Name. Name of the service. The name is not case sensitive and must be unique within the domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following special characters:
  `~%^*+={}\;:'"/?.,<>|!()][
  You cannot change the name of the service after you create it.
- Description.
- Node. Node on which the service runs. If you change the node, you must recycle the Content Management Service.
- License.
Multi-Service Options
The Multi-service options indicate whether the current service is the master Content Management Service in
a domain.
The following table describes the single property under multi-service options:
- Master CMS. Indicates whether the current service is the master Content Management Service in the domain.

- Data Integration Service. Data Integration Service associated with the Content Management Service. The Data Integration Service reads reference data configuration information from the Content Management Service. Recycle the Content Management Service if you associate another Data Integration Service with the Content Management Service.
- Username. User name that the Content Management Service uses to connect to the Model Repository Service. To perform reference table management tasks in the Model repository, the user that the property identifies must have the Model Repository Service Administrator role. The reference table management tasks include purge operations on orphaned reference tables. Not available for a domain with Kerberos authentication.
- Password. Password that the Content Management Service uses to connect to the Model Repository Service. Not available for a domain with Kerberos authentication.
- Database connection name for the database that stores reference data values for the reference data objects defined in the associated Model repository. The database stores reference data object row values. The Model repository stores metadata for reference data objects.
- Path to the directory that stores reference data during the import process.
Logging Options
Configure the Log Level property to set the logging level.
The following values are valid:
- Fatal. Writes FATAL messages to the log. FATAL messages include nonrecoverable system failures that cause the service to shut down or become unavailable.
- Error. Writes FATAL and ERROR code messages to the log. ERROR messages include connection failures, failures to save or retrieve metadata, and service errors.
- Warning. Writes FATAL, WARNING, and ERROR messages to the log. WARNING errors include recoverable system failures or warnings.
- Info. Writes FATAL, INFO, WARNING, and ERROR messages to the log. INFO messages include system and service change messages.
- Trace. Writes FATAL, TRACE, INFO, WARNING, and ERROR code messages to the log. TRACE messages log user request failures.
- Debug. Writes FATAL, DEBUG, TRACE, INFO, WARNING, and ERROR messages to the log. DEBUG messages are user request logs.
Identity properties
Advanced properties
Custom properties
If you update any of the Content Management Service process properties, restart the Content Management
Service for the modifications to take effect.
Note: The Content Management Service does not currently use the Content Management Service Security
Options properties.
- HTTP Port. Unique HTTP port number for the Reporting and Dashboards Service. Default is 8105. Recycle the service if you change the HTTP port number.
- HTTPS Port. HTTPS port number that the service runs on when you enable the Transport Layer Security (TLS) protocol. Use a different port number than the HTTP port number. Recycle the service if you change the HTTPS port number.
- Keystore File. Path and file name of the keystore file that contains the private or public key pairs and associated certificates. Required if you enable TLS and use HTTPS connections for the service.
- Keystore Password.
- SSL Protocol.
Address Validation Properties
- License. License key to activate validation reference data. You might have more than one key, for example, if you use batch reference data and geocoding reference data. Enter keys as a comma-delimited list. The property is empty by default.
- Reference Data Location. Location of the address reference data files. Enter the full path to the files. Install all address reference data files to a single location. The property is empty by default.
- Full Pre-Load Countries. List of countries for which all batch, CAMEO, certified, interactive, or supplementary reference data is loaded into memory before address validation begins. Enter the three-character ISO country codes in a comma-separated list. For example, enter DEU,FRA,USA. Enter ALL to load all data sets. The property is empty by default. Load the full reference database to increase performance. Some countries, such as the United States, have large databases that require significant amounts of memory.
- Partial Pre-Load Countries. List of countries for which batch, CAMEO, certified, interactive, or supplementary reference metadata and indexing structures are loaded into memory before address validation begins. Enter the three-character ISO country codes in a comma-separated list. For example, enter DEU,FRA,USA. Enter ALL to partially load all data sets. The property is empty by default. Partial preloading increases performance when not enough memory is available to load the complete databases into memory.
- No Pre-Load Countries.
- Full Pre-Load Geocoding Countries. List of countries for which all geocoding reference data is loaded into memory before address validation begins. Enter the three-character ISO country codes in a comma-separated list. For example, enter DEU,FRA,USA. Enter ALL to load all data sets. The property is empty by default. Load all reference data for a country to increase performance when processing addresses from that country. Some countries, such as the United States, have large data sets that require significant amounts of memory.
- Partial Pre-Load Geocoding Countries. List of countries for which geocoding reference metadata and indexing structures are loaded into memory before address validation begins. Enter the three-character ISO country codes in a comma-separated list. For example, enter DEU,FRA,USA. Enter ALL to partially load all data sets. The property is empty by default. Partial preloading increases performance when not enough memory is available to load the complete databases into memory.
- No Pre-Load Geocoding Countries. List of countries for which no geocoding reference data is loaded into memory before address validation begins. Enter the three-character ISO country codes in a comma-separated list. For example, enter DEU,FRA,USA. Default is ALL.
- Full Pre-Load Suggestion List Countries. List of countries for which all suggestion list reference data is loaded into memory before address validation begins. Enter the three-character ISO country codes in a comma-separated list. For example, enter DEU,FRA,USA. Enter ALL to load all data sets. The property is empty by default. Load the full reference database to increase performance. Some countries, such as the United States, have large databases that require significant amounts of memory.
- Partial Pre-Load Suggestion List Countries. List of countries for which the suggestion list reference metadata and indexing structures are loaded into memory before address validation begins. Enter the three-character ISO country codes in a comma-separated list. For example, enter DEU,FRA,USA. Enter ALL to partially load all data sets. The property is empty by default. Partial preloading increases performance when not enough memory is available to load the complete databases into memory.
- No Pre-Load Suggestion List Countries. List of countries for which no suggestion list reference data is loaded into memory before address validation begins. Enter the three-character ISO country codes in a comma-separated list. For example, enter DEU,FRA,USA. Default is ALL.
- Full Pre-Load Address Code Countries. List of countries for which all address code lookup reference data is loaded into memory before address validation begins. Enter the three-character ISO country codes in a comma-separated list. For example, enter DEU,FRA,USA. Enter ALL to load all data sets. The property is empty by default. Load the full reference database to increase performance. Some countries, such as the United States, have large databases that require significant amounts of memory.
- Partial Pre-Load Address Code Countries. List of countries for which the address code lookup reference metadata and indexing structures are loaded into memory before address validation begins. Enter the three-character ISO country codes in a comma-separated list. For example, enter DEU,FRA,USA. Enter ALL to partially load all data sets. The property is empty by default. Partial preloading increases performance when not enough memory is available to load the complete databases into memory.
- No Pre-Load Address Code Countries. List of countries for which no address code lookup reference data is loaded into memory before address validation begins. Enter the three-character ISO country codes in a comma-separated list. For example, enter DEU,FRA,USA. Default is ALL.
- Preloading Method. Determines how the Data Integration Service preloads address reference data into memory. The MAP method and the LOAD method both allocate a block of memory and then read reference data into this block. However, the MAP method can share reference data between multiple processes. Default is MAP.
- Max Result Count. Maximum number of addresses that address validation can return in suggestion list mode. Set a maximum number in the range 1 through 100. Default is 20.
- Memory Usage. Number of megabytes of memory that the address validation library files can allocate. Default is 4096.
- Max Address Object Count. Maximum number of address validation instances to run at the same time. Default is 3. Set a value that is greater than or equal to the Maximum Parallelism value on the Data Integration Service.
- Max Thread Count. Maximum number of threads that address validation can use. Set to the total number of cores or threads available on a machine. Default is 2.
- Cache Size. Size of cache for databases that are not preloaded. Caching reserves memory to increase lookup performance in reference data that has not been preloaded. Set the cache size to LARGE unless all the reference data is preloaded or you need to reduce the amount of memory usage. Enter one of the following options for the cache size in uppercase letters:
  - NONE. No cache. Enter NONE if all reference databases are preloaded.
  - SMALL. Reduced cache size.
  - LARGE. Standard cache size.
  Default is LARGE.
- SendRight Report Location. Location to which an address validation mapping writes a SendRight report and any log file that relates to the report. You generate a SendRight report to verify that a set of New Zealand address records meets the certification standards of New Zealand Post. Enter a local path on the machine that hosts the Data Integration Service that runs the mapping. By default, address validation writes the report file to the bin directory of the Informatica installation. If you enter a relative path, the Content Management Service appends the path to the bin directory.
By default, the Content Management Service applies the ALL value to the options that indicate no data
preload. If you accept the default options, the Data Integration Service reads the address reference data
from files in the directory structure when the mapping runs.
The address validation process properties must indicate a preload method for each type of address
reference data that a mapping specifies. If the Data Integration Service cannot determine a preload policy
for a type of reference data, it ignores the reference data when the mapping runs.
The Data Integration Service can use a different method to load data for each country. For example, you
can specify full preload for United States suggestion list data and partial preload for United Kingdom
suggestion list data.
The Data Integration Service can use a different preload method for each type of data. For example, you
can specify full preload for United States batch data and partial preload for United States address code
data.
Full preload settings supersede partial preload settings, and partial preload settings supersede settings
that indicate no data preload.
For example, you might configure the following options:
Full Pre-Load Geocoding Countries: DEU
No Pre-Load Geocoding Countries: ALL
The options specify that the Data Integration Service loads German geocoding data into memory and
does not load geocoding data for any other country.
The Data Integration Service loads the types of address reference data that you specify in the address
validation process properties. The Data Integration Service does not read the mapping metadata to
identify the address reference data that the mapping specifies.
Identity Properties
The identity properties specify the location of the identity population files and the default locations of the
temporary files that identity match analysis can generate. The locations on each property are local to the
Data Integration Service that runs the identity match mapping. The Data Integration Service must have write
access to each location.
The following table describes the identity properties:
- Reference Data Location. The path identifies a parent directory. Install the population files to a directory with the name default below the directory that the property specifies.
- Cache Directory. Path to the directory that contains the temporary data files that the Data Integration Service generates during identity analysis. The Data Integration Service creates the directory at run time if the Match transformation in the mapping does not specify the directory. The property sets the following default path: ./identityCache. You can specify a relative path, or you can specify a fully qualified path to a directory that the Data Integration Service can write to. The relative path is relative to the tomcat/bin directory on the Data Integration Service machine.
- Index Directory. Path to the directory that contains the temporary index files that the Data Integration Service generates during identity analysis. Identity match analysis uses the index to sort records into groups before match analysis. The Data Integration Service creates the directory at run time if the Match transformation in the mapping does not specify the directory. The property sets the following default location: ./identityIndex. You can specify a relative path, or you can specify a fully qualified path to a directory that the Data Integration Service can write to. The relative path is relative to the tomcat/bin directory on the Data Integration Service machine.
Advanced Properties
The advanced properties define the maximum heap size and the Java Virtual Machine (JVM) memory settings.
The following table describes the advanced properties for the service process:
- Maximum Heap Size. Amount of RAM allocated to the Java Virtual Machine (JVM) that runs the service. Use this property to increase the memory available to the service. Append one of the following letters to the value to specify the units:
  - b for bytes
  - k for kilobytes
  - m for megabytes
  - g for gigabytes
Note: If you use Informatica Developer to compile probabilistic models, increase the default maximum heap
size value to 3 gigabytes.
NLP Options
The NLP Options property provides the location of probabilistic model and classifier model files on the
Informatica services machine. Probabilistic models and classifier models are types of reference data. Use the
models in transformations that perform Natural Language Processing (NLP) analysis.
The following table describes the NLP Options property:
- NER File Location. Path to the probabilistic model files. The property reads a relative path from the following directory in the Informatica installation: /tomcat/bin. The default value is ./ner, which indicates the following directory: /tomcat/bin/ner.
- Classifier File Location. Path to the classifier model files. The property reads a relative path from the following directory in the Informatica installation: /tomcat/bin. The default value is ./classifier, which indicates the following directory: /tomcat/bin/classifier.
5. Set the location for the service. You can create the service in a folder on the domain. Click Browse to create a folder.
6. Select the node that you want the service to run on.
7. Specify a Data Integration Service and Model Repository Service to associate with the Content Management Service.
8. Enter a username and password that the Content Management Service can use to connect to the Model Repository Service.
9. Select the database that the Content Management Service can use to store reference data.
10. Click Next.
11. Optionally, select Enable Service to enable the service after you create it.
    Note: Do not configure the Transport Layer Security properties. The properties are reserved for future use.
12. Click Finish.
If you did not choose to enable the service, you must recycle the service to start it.
CHAPTER 3
Data Integration Service
Runs profiles and generates previews for profiles in the Analyst tool and the Developer tool.
Runs scorecards for the profiles in the Analyst tool and the Developer tool.
Runs SQL data services and web services in the Developer tool.
Caches data objects for mappings and SQL data services deployed in an application.
Runs SQL queries that end users run against an SQL data service through a third-party JDBC or ODBC
client tool.
Create and configure a Data Integration Service in the Administrator tool. You can create one or more Data
Integration Services on a node. Based on your license, the Data Integration Service can be highly available.
Set up the databases that the Data Integration Service connects to.
If the domain uses Kerberos authentication and you set the service principal level at the process level,
create a keytab file for the Data Integration Service.
The following table describes the database connections that you must create before you create the Data
Integration Service:
- Data object cache database. To access the data object cache, create the data object cache connection for the Data Integration Service.
- Workflow database. To store run-time metadata for workflows, create the workflow database connection for the Data Integration Service.
- Profiling warehouse database. To create and run profiles and scorecards, create the profiling warehouse database connection for the Data Integration Service. To create and run profiles and scorecards, select this instance of the Data Integration Service when you configure the run-time properties of the Analyst Service.
5. On the New Data Integration Service - Step 1 of 14 page, enter the following properties:
   - Name. Name of the service. The name is not case sensitive and must be unique within the domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following special characters:
     `~%^*+={}\;:'"/?.,<>|!()][
   - Description.
   - Location. Domain and folder where the service is created. Click Browse to choose a different folder. You can move the service after you create it.
   - License.
   - Assign. Select Node to configure the service to run on a node. If your license includes grid, you can create a grid and assign the service to run on the grid after you create the service.
   - Node.
   - Backup Nodes. If your license includes high availability, nodes on which the service can run if the primary node is unavailable.
   - Model Repository Service.
   - Username. User name that the service uses to access the Model Repository Service. Enter the Model repository user that you created. Not available for a domain with Kerberos authentication.
   - Password. Password for the Model repository user. Not available for a domain with Kerberos authentication.
   - Security Domain. LDAP security domain for the Model repository user. The field appears when the Informatica domain contains an LDAP security domain. Not available for a domain with Kerberos authentication.
6. Click Next.
   The New Data Integration Service - Step 2 of 14 page appears.
7. Enter the HTTP port number to use for the Data Integration Service.
8. Accept the default values for the remaining security properties. You can configure the security properties after you create the Data Integration Service.
11. Click Next.
    The New Data Integration Service - Step 3 of 14 page appears.
12. Set the Launch Job Options property to one of the following values:
    - In the service process. Configure when you run SQL data service and web service jobs. SQL data service and web service jobs typically achieve better performance when the Data Integration Service runs jobs in the service process.
    - In separate local processes. Configure when you run mapping, profile, and workflow jobs. When the Data Integration Service runs jobs in separate local processes, stability increases because an unexpected interruption to one job does not affect all other jobs.
    If you configure the Data Integration Service to run on a grid after you create the service, you can configure the service to run jobs in separate remote processes.
13. Accept the default values for the remaining execution options and click Next.
    The New Data Integration Service - Step 4 of 14 page appears.
14. If you created the data object cache database for the Data Integration Service, click Select to select the cache connection. Select the data object cache connection that you created for the service to access the database.
15. Accept the default values for the remaining properties on this page and click Next.
    The New Data Integration Service - Step 5 of 14 page appears.
16. For optimal performance, enable the Data Integration Service modules that you plan to use.
    The following table lists the Data Integration Service modules that you can enable:
    - Workflow Orchestration Service module. Runs workflows.
17. Click Next.
    The New Data Integration Service - Step 6 of 14 page appears.
    You can configure the HTTP proxy server properties to redirect HTTP requests to the Data Integration Service. You can configure the HTTP configuration properties to filter the web services client machines that can send requests to the Data Integration Service. You can configure these properties after you create the service.
18. Accept the default values for the HTTP proxy server and HTTP configuration properties and click Next.
    The New Data Integration Service - Step 7 of 14 page appears.
    The Data Integration Service uses the result set cache properties to use cached results for SQL data service queries and web service requests. You can configure the properties after you create the service.
19. Accept the default values for the result set cache properties and click Next.
    The New Data Integration Service - Step 8 of 14 page appears.
20. If you created the profiling warehouse database for the Data Integration Service, select the Profiling Service module.
21. If you created the workflow database for the Data Integration Service, select the Workflow Orchestration Service module.
23. Click Next.
    The New Data Integration Service - Step 11 of 14 page appears.
24. If you created the profiling warehouse database for the Data Integration Service, click Select to select the database connection. Select the profiling warehouse connection that you created for the service to access the database.
26. Click Next.
    The New Data Integration Service - Step 12 of 14 page appears.
27. Accept the default values for the advanced profiling properties and click Next.
    The New Data Integration Service - Step 14 of 14 page appears.
28. If you created the workflow database for the Data Integration Service, click Select to select the database connection. Select the workflow database connection that you created for the service to access the database.
29. Click Finish.
    The domain creates and enables the Data Integration Service.
After you create the service through the wizard, you can edit the properties or configure other properties.
General Properties
The general properties of a Data Integration Service include the name, license, and node assignment.
The following table describes the general properties for the service:
- Name. Name of the service. The name is not case sensitive and must be unique within the domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following special characters:
  `~%^*+={}\;:'"/?.,<>|!()][
  You cannot change the name of the service after you create it.
- Description.
- License.
- Assign.
- Node.
- Grid. Name of the grid on which the Data Integration Service runs if the service runs on a grid. Click the grid name to view the grid configuration.
- Backup Nodes. If your license includes high availability, nodes on which the service can run if the primary node is unavailable.

- Model Repository Service. Service that stores run-time metadata required to run mappings and SQL data services.
- User Name. User name to access the Model repository. The user must have the Create Project privilege for the Model Repository Service. Not available for a domain with Kerberos authentication.
- Password.
Execution Options
The following table describes the execution options for the Data Integration Service:
Property
Description
Runs jobs in the Data Integration Service process, in separate DTM processes on
the local node, or in separate DTM processes on remote nodes. Configure the
property based on whether the Data Integration Service runs on a single node or a
grid and based on the types of jobs that the service runs.
Choose one of the following options:
- In the service process. Configure when you run SQL data service and web service
jobs on a single node or on a grid where each node has both the service and
compute roles.
- In separate local processes. Configure when you run mapping, profile, and workflow
jobs on a single node or on a grid where each node has both the service and
compute roles.
- In separate remote processes. Configure when you run mapping, profile, and
workflow jobs on a grid where nodes have a different combination of roles. If you
choose this option when the Data Integration Service runs on a single node, then the
service runs jobs in separate local processes.
Maximum number of jobs that each Data Integration Service process can run
concurrently. Jobs include data previews, mappings, profiling jobs, SQL queries,
and web service requests. For example, a Data Integration Service grid includes
three running service processes. If you set the value to 10, each Data Integration
Service process can run up to 10 jobs concurrently. A total of 30 jobs can run
concurrently on the grid. Default is 10.
Maximum amount of memory, in bytes, that the Data Integration Service can
allocate for running all requests concurrently when the service runs jobs in the
Data Integration Service process. When the Data Integration Service runs jobs in
separate local or remote processes, the service ignores this value. If you do not
want to limit the amount of memory the Data Integration Service can allocate, set
this property to 0.
If the value is greater than 0, the Data Integration Service uses the property to
calculate the maximum total memory allowed for running all requests concurrently.
The Data Integration Service calculates the maximum total memory as follows:
Maximum Memory Size + Maximum Heap Size + memory required for loading
program components
Default is 0.
Note: If you run profiles or data quality mappings, set this property to 0.
57
Property
Description
Maximum Parallelism
Maximum number of parallel threads that process a single mapping pipeline stage.
When you set the value greater than 1, the Data Integration Service enables
partitioning for mappings, column profiling, and data domain discovery. The
service dynamically scales the number of partitions for a mapping pipeline at run
time. Increase the value based on the number of CPUs available on the nodes
where jobs run.
In the Developer tool, developers can change the maximum parallelism value for
each mapping. When maximum parallelism is set for both the Data Integration
Service and the mapping, the Data Integration Service uses the minimum value
when it runs the mapping.
Default is 1. Maximum is 64.
Note: Developers cannot change the maximum parallelism value for each profile.
When the Data Integration Service converts a profile job into one or more
mappings, the mappings always use Auto for the mapping maximum parallelism.
The file path to the Kerberos keytab file on the machine on which the Data
Integration Service runs.
Temporary Directories
Directory for temporary files created when jobs are run. Default is <home
directory>/disTemp.
Enter a list of directories separated by semicolons to optimize performance during
profile operations and during cache partitioning for Sorter transformations.
You cannot use the following characters in the directory path:
* ? < > " | , [ ]
Home Directory
Root directory accessible by the node. This is the root directory for other service
directories. Default is <Informatica installation directory>/
tomcat/bin. If you change the default value, verify that the directory exists.
You cannot use the following characters in the directory path:
* ? < > " | , [ ]
Cache Directory
Directory for index and data cache files for transformations. Default is <home
directory>/cache.
Enter a list of directories separated by semicolons to increase performance during
cache partitioning for Aggregator, Joiner, or Rank transformations.
You cannot use the following characters in the directory path:
* ? < > " | , [ ]
Source Directory
Directory for source flat files used in a mapping. Default is <home directory>/
source.
If the Data Integration Service runs on a grid, you can use a shared directory to
create one directory for source files. If you configure a different directory for each
node with the compute role, ensure that the source files are consistent among all
source directories.
You cannot use the following characters in the directory path:
* ? < > " | , [ ]
Target Directory
Default directory for target flat files used in a mapping. Default is <home
directory>/target.
Enter a list of directories separated by semicolons to increase performance when
multiple partitions write to the flat file target.
You cannot use the following characters in the directory path:
* ? < > " | , [ ]
Directory for reject files. Reject files contain rows that were rejected when running
a mapping. Default is <home directory>/reject.
You cannot use the following characters in the directory path:
* ? < > " | , [ ]
The PowerCenter Big Data Edition home directory on every data node created by
the Hadoop RPM install. Type /
<PowerCenterBigDataEditionInstallationDirectory>/
Informatica.
Hadoop Distribution Directory
The directory containing a collection of Hive and Hadoop JARS on the cluster from
the RPM Install locations. The directory contains the minimum set of JARS
required to process Informatica mappings in a Hadoop environment. Type /
<PowerCenterBigDataEditionInstallationDirectory>/
Informatica/services/shared/hadoop/
[Hadoop_distribution_name].
The Hadoop distribution directory on the Data Integration Service node. The
contents of the Data Integration Service Hadoop distribution directory must be
identical to the Hadoop distribution directory on the data nodes. Type
<Informatica installation directory>/Informatica/services/
shared/hadoop/[Hadoop_distribution_name].
Description
The number of milliseconds that the Data Integration Service waits before cleaning
up cache storage after a refresh. Default is 3,600,000.
Cache Connection
The database connection name for the database that stores the data object cache.
Select a valid connection object name.
Maximum Concurrent
Refresh Requests
Maximum number of cache refreshes that can occur at the same time. Limit the
concurrent cache refreshes to maintain system resources.
Indicates that the Data Integration Service can use cache data for a logical data
object used as a source or a lookup in another logical data object during a cache
refresh. If false, the Data Integration Service accesses the source resources even
if you enabled caching for the logical data object used as a source or a lookup.
For example, logical data object LDO3 joins data from logical data objects LDO1
and LDO2. A developer creates a mapping that uses LDO3 as the input and
includes the mapping in an application. You enable caching for LDO1, LDO2, and
LDO3. If you enable nested logical data object caching, the Data Integration
Service uses cache data for LDO1 and LDO2 when it refreshes the cache table for
LDO3. If you do not enable nested logical data object caching, the Data Integration
Service accesses the source resources for LDO1 and LDO2 when it refreshes the
cache table for LDO3.
Default is False.
Logging Properties
The following table describes the log level properties:
Property
Description
Log Level
Configure the Log Level property to set the logging level. The following values are
valid:
- Fatal. Writes FATAL messages to the log. FATAL messages include nonrecoverable
system failures that cause the service to shut down or become unavailable.
- Error. Writes FATAL and ERROR code messages to the log. ERROR messages
include connection failures, failures to save or retrieve metadata, and service errors.
- Warning. Writes FATAL, WARNING, and ERROR messages to the log. WARNING
errors include recoverable system failures or warnings.
- Info. Writes FATAL, INFO, WARNING, and ERROR messages to the log. INFO
messages include system and service change messages.
- Trace. Writes FATAL, TRACE, INFO, WARNING, and ERROR code messages to the
log. TRACE messages log user request failures.
- Debug. Writes FATAL, DEBUG, TRACE, INFO, WARNING, and ERROR messages to
the log. DEBUG messages are user request logs.
Deployment Options
The following table describes the deployment options for the Data Integration Service:
Property
Description
Determines whether to enable and start each application after you deploy it to a
Data Integration Service. Default Deployment mode affects applications that you
deploy from the Developer tool, command line, and Administrator tool.
Choose one of the following options:
- Enable and Start. Enable the application and start the application.
- Enable Only. Enable the application but do not start the application.
- Disable. Do not enable the application.
Description
Allow Caching
Allows data object caching for all pass-through connections in the Data Integration
Service. Populates data object cache using the credentials from the connection
object.
Note: When you enable data object caching with pass-through security, you might
allow users access to data in the cache database that they might not have in an
uncached environment.
Modules
By default, all Data Integration Service modules are enabled. You can disable some of the modules.
You might want to disable a module if you are testing and you have limited resources on the computer. You
can save memory by limiting the Data Integration Service functionality. Before you disable a module, you
must disable the Data Integration Service.
The following table describes the Data Integration Service modules:
Module
Description
Runs SQL queries from a third-party client tool to an SQL data service.
Runs workflows.
Description
Authenticated user name for the HTTP proxy server. This is required if the
proxy server requires authentication.
Password for the authenticated user. The Service Manager encrypts the
password. This is required if the proxy server requires authentication.
Description
Allowed IP Addresses
Denied IP Addresses
Security protocol that the Data Integration Service uses. Select one of the
following values:
- HTTP. Requests to the service must use an HTTP URL.
- HTTPS. Requests to the service must use an HTTPS URL.
- HTTP&HTTPS. Requests to the service can use either an HTTP or an HTTPS
URL.
When you set the HTTP protocol type to HTTPS or HTTP&HTTPS, you
enable Transport Layer Security (TLS) for the service.
You can also enable TLS for each web service deployed to an application.
When you enable HTTPS for the Data Integration Service and enable TLS
for the web service, the web service uses an HTTPS URL. When you enable
HTTPS for the Data Integration Service and do not enable TLS for the web
service, the web service can use an HTTP URL or an HTTPS URL. If you
enable TLS for a web service and do not enable HTTPS for the Data
Integration Service, the web service does not start.
Default is HTTP.
Description
The prefix for the names of all result set cache files stored on disk. Default is
RSCACHE.
Enable Encryption
Indicates whether result set cache files are encrypted using 128-bit AES
encryption. Valid values are true or false. Default is true.
Description
Maximum Notification Thread Pool Size
Maximum number of concurrent job completion notifications that the Mapping Service Module
sends to external clients after the Data Integration Service completes jobs. The Mapping
Service Module is a component in the Data Integration Service that manages requests sent to
run mappings. Default is 5.
Maximum Memory Per Request
The behavior of Maximum Memory Per Request depends on the following Data Integration
Service configurations:
- The service runs jobs in separate local or remote processes, or the service property Maximum
Memory Size is 0 (default).
Maximum Memory Per Request is the maximum amount of memory, in bytes, that the Data
Integration Service can allocate to all transformations that use auto cache mode in a single
request. The service allocates memory separately to transformations that have a specific cache
size. The total memory used by the request can exceed the value of Maximum Memory Per
Request.
- The service runs jobs in the Data Integration Service process, and the service property Maximum
Memory Size is greater than 0.
Maximum Memory Per Request is the maximum amount of memory, in bytes, that the Data
Integration Service can allocate to a single request. The total memory used by the request cannot
exceed the value of Maximum Memory Per Request.
Default is 536,870,912.
Requests include mappings and mappings run from Mapping tasks within a workflow.
Description
Profiling Warehouse
Database
Maximum Ranks
Maximum Patterns
Maximum Profile
Execution Pool Size
Maximum DB Connections
Location where the Data Integration Service exports profile results file.
Maximum amount of memory, in bytes, that the Data Integration Service can
allocate for each mapping run for a single profile request.
If the Data Integration Service and Analyst Service run on different nodes, both
services must be able to access this location. Otherwise, the export fails.
Default is 536,870,912.
Pattern Threshold
Percentage
Maximum # Value
Frequency Pairs
Maximum length of a string that the Profiling Service can process. Default is 255.
Maximum Numeric
Precision
Maximum Concurrent
Profile Jobs
The maximum number of concurrent profile threads used to run a profile on flat
files and relational sources. If left blank, the Profiling Service plug-in determines
the best number based on the set of running jobs and other environment factors.
Maximum Concurrent
Columns
Maximum number of columns that you can combine for profiling flat files in a single
execution pool thread. Default is 5.
Maximum Concurrent
Profile Threads
The maximum number of concurrent execution pool threads used to run a profile
on flat files. Default is 1.
Number of threads from the Maximum Execution Pool Size that are reserved for
priority requests. Default is 1.
SQL Properties
The following table describes the SQL properties:
Property
Description
Number of milliseconds that the DTM instance stays open after it completes the last request.
Identical SQL queries can reuse the open instance. Use the keep alive time to increase
performance when the time required to process the SQL query is small compared to the
initialization time for the DTM instance. If the query fails, the DTM instance terminates.
Must be greater than or equal to 0. 0 means that the Data Integration Service does not keep
the DTM instance in memory. Default is 0.
You can also set this property for each SQL data service that is deployed to the Data
Integration Service. If you set this property for a deployed SQL data service, the value for the
deployed SQL data service overrides the value you set for the Data Integration Service.
Table Storage
Connection
Relational database connection that stores temporary tables for SQL data services. By default,
no connection is selected.
Maximum Memory Per Request
The behavior of Maximum Memory Per Request depends on the following Data Integration
Service configurations:
- The service runs jobs in separate local or remote processes, or the service property Maximum
Memory Size is 0 (default).
Maximum Memory Per Request is the maximum amount of memory, in bytes, that the Data
Integration Service can allocate to all transformations that use auto cache mode in a single
request. The service allocates memory separately to transformations that have a specific cache
size. The total memory used by the request can exceed the value of Maximum Memory Per
Request.
- The service runs jobs in the Data Integration Service process, and the service property Maximum
Memory Size is greater than 0.
Maximum Memory Per Request is the maximum amount of memory, in bytes, that the Data
Integration Service can allocate to a single request. The total memory used by the request cannot
exceed the value of Maximum Memory Per Request.
Default is 50,000,000.
Skip Log Files
Prevents the Data Integration Service from generating log files when the SQL data service
request completes successfully and the tracing level is set to INFO or higher. Default is false.
Property
Description
Workflow Connection
The connection name of the database that stores the run-time configuration data
for the workflows that the Data Integration Service runs. Select a database on the
Connections view.
Create the workflow database contents before you run a workflow. To create the
contents, use the Actions menu options for the Data Integration Service in the
Administrator tool.
Note: After you create the Data Integration Service, recycle the service before you
create the workflow database contents.
Description
DTM Keep Alive Time
Number of milliseconds that the DTM instance stays open after it completes the last request.
Web service requests that are issued against the same operation can reuse the open instance.
Use the keep alive time to increase performance when the time required to process the request
is small compared to the initialization time for the DTM instance. If the request fails, the DTM
instance terminates.
Must be greater than or equal to 0. 0 means that the Data Integration Service does not keep the
DTM instance in memory. Default is 5000.
You can also set this property for each web service that is deployed to the Data Integration
Service. If you set this property for a deployed web service, the value for the deployed web
service overrides the value you set for the Data Integration Service.
Logical URL
Prefix for the WSDL URL if you use an external HTTP load balancer. For example,
http://loadbalancer:8080
The Data Integration Service requires an external HTTP load balancer to run a web service on a
grid. If you run the Data Integration Service on a single node, you do not need to specify the
logical URL.
Maximum Memory Per Request
The behavior of Maximum Memory Per Request depends on the following Data Integration
Service configurations:
- The service runs jobs in separate local or remote processes, or the service property Maximum
Memory Size is 0 (default).
Maximum Memory Per Request is the maximum amount of memory, in bytes, that the Data
Integration Service can allocate to all transformations that use auto cache mode in a single request.
The service allocates memory separately to transformations that have a specific cache size. The
total memory used by the request can exceed the value of Maximum Memory Per Request.
- The service runs jobs in the Data Integration Service process, and the service property Maximum
Memory Size is greater than 0.
Maximum Memory Per Request is the maximum amount of memory, in bytes, that the Data
Integration Service can allocate to a single request. The total memory used by the request cannot
exceed the value of Maximum Memory Per Request.
Default is 50,000,000.
Skip Log Files
Prevents the Data Integration Service from generating log files when the web service request
completes successfully and the tracing level is set to INFO or higher. Default is false.
Description
HTTP Port
Unique HTTP port number for the Data Integration Service process when the
service uses the HTTP protocol.
Default is 8095.
HTTPS Port
Unique HTTPS port number for the Data Integration Service process when the
service uses the HTTPS protocol.
When you set an HTTPS port number, you must also define the keystore file
that contains the required keys and certificates.
Description
Maximum Concurrent
Requests
Maximum number of HTTP or HTTPS connections that can wait in a queue for
this Data Integration Service process. Default is 100.
Keystore File
Path and file name of the keystore file that contains the keys and certificates
required if you use HTTPS connections for the Data Integration Service. You
can create a keystore file with a keytool. keytool is a utility that generates and
stores private or public key pairs and associated certificates in a keystore file.
You can use the self-signed certificate or use a certificate signed by a certificate
authority.
If you run the Data Integration Service on a grid, the keystore file on each node
in the grid must contain the same keys.
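For example, the following keytool command is a sketch of how you might create a keystore file that contains
a self-signed certificate. The alias, key algorithm, file path, and password are placeholders that you replace
with values for your environment:
keytool -genkeypair -alias infa_dis -keyalg RSA -keysize 2048 -validity 365 -keystore /path/to/infa_keystore.jks -storepass <keystore_password>
If you use a certificate signed by a certificate authority instead, import the signed certificate into the
same keystore file.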
Keystore Password
Truststore File
Path and file name of the truststore file that contains authentication certificates
trusted by the Data Integration Service.
If you run the Data Integration Service on a grid, the truststore file on each node
in the grid must contain the same keys.
Truststore Password
SSL Protocol
Property
Description
Maximum number of bytes allowed for the total result set cache file storage.
Default is 0.
Maximum number of bytes allocated for a single result set cache instance in
memory. Default is 0.
Maximum number of bytes allocated for the total result set cache storage in
memory. Default is 0.
Maximum number of result set cache instances allowed for this Data
Integration Service process. Default is 0.
Advanced Properties
The following table describes the Advanced properties:
Property
Description
Amount of RAM allocated to the Java Virtual Machine (JVM) that runs the Data
Integration Service. Use this property to increase the performance. Append one of
the following letters to the value to specify the units:
- b for bytes.
- k for kilobytes.
- m for megabytes.
- g for gigabytes.
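For example, enter 512m to allocate 512 megabytes of RAM to the JVM, or 2g to allocate 2 gigabytes.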
Java Virtual Machine (JVM) command line options to run Java-based programs.
When you configure the JVM options, you must set the Java SDK classpath, Java
SDK minimum memory, and Java SDK maximum memory properties.
Logging Options
The following table describes the logging options for the Data Integration Service process:
Property
Description
Log Directory
SQL Properties
The following table describes the SQL properties:
Property
Description
Maximum # of Concurrent
Connections
Limits the number of database connections that the Data Integration Service
can make for SQL data services. Default is 100.
Environment Variables
You can configure environment variables for the Data Integration Service process.
The following table describes the environment variables:
Property
Description
Environment Variable
Execution Options
The default value for each execution option on the Compute view is defined by the same execution option on
the Properties view. When the Data Integration Service runs on multiple nodes, you can override the
execution options to define different values for each node with the compute role. The DTM instances that run
on the node use the overridden values.
You can override the following execution options on the Compute view:
Home Directory
Temporary Directories
Cache Directory
Source Directory
Target Directory
When you override an execution option for a specific node, the Administrator tool displays a green checkmark
next to the overridden property. The Edit Execution Options dialog box displays a reset option next to each
overridden property. Select Reset to remove the overridden value and use the value defined for the Data
Integration Service on the Properties view.
The following image shows that the Temporary Directories property has an overridden value in the Edit
Execution Options dialog box:
Related Topics:
Environment Variables
When a Data Integration Service grid runs jobs in separate remote processes, you can configure environment
variables for DTM processes that run on nodes with the compute role.
Note: If the Data Integration Service runs on a single node or on a grid that runs jobs in the service process
or in separate local processes, any environment variables that you define on the Compute view are ignored.
When a node in the grid has the compute role only, configure environment variables for DTM processes on
the Compute view.
When a node in the grid has both the service and compute roles, you configure environment variables for the
Data Integration Service process that runs on the node on the Processes view. You configure environment
variables for DTM processes that run on the node on the Compute view. DTM processes inherit the
environment variables defined for the Data Integration Service process. You can override an environment
variable value for DTM processes. Or, you can define specific environment variables for DTM processes.
Consider the following examples:
You define EnvironmentVar1=A on the Processes view and define EnvironmentVar1=B on the Compute
view. The Data Integration Service process that runs on the node uses the value A for the environment
variable. The DTM processes that run on the node use the value B.
You define EnvironmentVar1 on the Processes view and define EnvironmentVar2 on the Compute view.
The Data Integration Service process that runs on the node uses EnvironmentVar1. The DTM processes
that run on the node use both EnvironmentVar1 and EnvironmentVar2.
Description
Environment Variable
The Data Integration Service process fails and the primary node is not available.
Grid
When the Data Integration Service runs on a grid, the restart and failover behavior depends on whether
the master or worker service process becomes unavailable.
If the master service process shuts down unexpectedly, the Service Manager tries to restart the process.
If the Service Manager cannot restart the process, the Service Manager elects another node to run the
master service process. The remaining worker service processes register themselves with the new
master. The master service process then reconfigures the grid to run on one less node.
If a worker service process shuts down unexpectedly, the Service Manager tries to restart the process. If
the Service Manager cannot restart the process, the master service process reconfigures the grid to run
on one less node.
The Service Manager restarts the Data Integration Service process based on domain property values set for
the amount of time spent trying to restart the service and the maximum number of attempts to try within the
restart period.
The Data Integration Service clients are resilient to temporary connection failures during restart and failover
of the service.
CHAPTER 4
Service Components
Compute Component
Single Node
Grid
Logs
The Data Integration Service can run on a single node or on a grid. A grid is an alias assigned to a group of
nodes that run jobs. When you run a job on a grid, you improve scalability and performance by distributing
jobs to processes running on multiple nodes in the grid.
1.
A Data Integration Service client sends a request to a service module to run a job.
2.
3.
4.
5.
Service Components
The service components of the Data Integration Service include modules that manage requests from client
tools. They also include managers that manage application deployment, caches, and job optimizations.
The service components run within the Data Integration Service process. The Data Integration Service
process must run on a node with the service role. A node with the service role can run application services.
Client Tools
Developer tool
Analyst tool
Run a mapping.
Developer tool
Command line
Developer tool
Developer tool
Sample third-party client tools include SQuirreL SQL Client, DBClient, and MySQL ODBC Client.
When you preview or run a mapping, the client tool sends the request and the mapping to the Data
Integration Service. The Mapping Service Module sends the mapping to the LDTM for optimization and
compilation. The LDTM passes the compiled mapping to a DTM instance, which generates the preview data
or runs the mapping.
When you preview data contained in an SQL data service in the Developer tool, the Developer tool sends the
request to the Data Integration Service. The Mapping Service Module sends the SQL statement to the LDTM
for optimization and compilation. The LDTM passes the compiled SQL statement to a DTM instance, which
runs the SQL statement and generates the preview data.
When you preview a web service operation mapping in the Developer tool, the Developer tool sends the
request to the Data Integration Service. The Mapping Service Module sends the operation mapping to the
LDTM for optimization and compilation. The LDTM passes the compiled operation mapping to a DTM
instance, which runs the operation mapping and generates the preview data.
When you run a profile in the Analyst tool or the Developer tool, the application sends the request to the Data
Integration Service. The Profiling Service Module converts the profile into one or more mappings. The
Profiling Service Module sends the mappings to the LDTM for optimization and compilation. The LDTM
passes the compiled mappings to DTM instances that get the profiling rules and run the profile.
When you run a scorecard in the Analyst tool or the Developer tool, the application sends the request to the
Data Integration Service. The Profiling Service Module converts the scorecard into one or more mappings.
The Profiling Service Module sends the mappings to the LDTM for optimization and compilation. The LDTM
passes the compiled mappings to DTM instances that generate a scorecard for the profile.
To create and run profiles and scorecards, you must associate the Data Integration Service with a profiling
warehouse. The Profiling Service Module stores profiling data and metadata in the profiling warehouse.
Deployment Manager
The Deployment Manager is the component in the Data Integration Service that manages applications. When you
deploy an application, the Deployment Manager manages the interaction between the Data Integration
Service and the Model Repository Service.
The Deployment Manager starts and stops an application. The Deployment Manager validates the mappings,
workflows, web services, and SQL data services in the application and their dependent objects when you
deploy the application.
After validation, the Deployment Manager stores application run-time metadata in the Model repository. Run-time
metadata includes information to run the mappings, workflows, web services, and SQL data services in
the application.
The Deployment Manager creates a separate set of run-time metadata in the Model repository for each
application. When the Data Integration Service runs application objects, the Deployment Manager retrieves
the run-time metadata and makes it available to the DTM.
Compute Component
The compute component of the Data Integration Service is the execution Data Transformation Manager
(DTM). The DTM extracts, transforms, and loads data to complete a data transformation job.
The DTM must run on a node with the compute role. A node with the compute role can perform computations
requested by application services.
Preview transformations.
Generate scorecards.
Transforming data
The DTM allocates CPU resources only when a DTM task needs a thread. When a task completes or if a task
is idle, the task returns the thread to a thread pool. The DTM reuses the threads in the thread pool for other
DTM tasks.
Processing Threads
When the DTM runs mappings, it uses reader, transformation, and writer pipelines that run in parallel to
extract, transform, and load data.
The DTM separates a mapping into pipeline stages and uses one reader thread, one transformation thread,
and one writer thread to process each stage. Each pipeline stage runs in one of the following threads:
Reader thread that controls how the DTM extracts data from the source.
Transformation thread that controls how the DTM processes data in the pipeline.
Writer thread that controls how the DTM loads data to the target.
Because the pipeline contains three stages, the DTM can process three sets of rows concurrently and
optimize mapping performance. For example, while the reader thread processes the third row set, the
transformation thread processes the second row set, and the writer thread processes the first row set.
If you have the partitioning option, the Data Integration Service can maximize parallelism for mappings and
profiles. When you maximize parallelism, the DTM separates a mapping into pipeline stages and uses
multiple threads to process each stage.
Output Files
The DTM generates output files when it runs mappings, mappings included in a workflow, profiles, SQL
queries to an SQL data service, or web service operation requests. Based on transformation cache settings
and target types, the DTM can create cache, reject, target, and temporary files.
By default, the DTM stores output files in the directories defined by execution options for the Data Integration
Service.
Data objects and transformations in the Developer tool use system parameters to access the values of these
Data Integration Service directories. By default, the system parameters are assigned to flat file directory,
cache file directory, and temporary file directory fields.
For example, when a developer creates an Aggregator transformation in the Developer tool, the CacheDir
system parameter is the default value assigned to the cache directory field. The value of the CacheDir
system parameter is defined in the Cache Directory property for the Data Integration Service. Developers
can remove the default system parameter and enter a different value for the cache directory. However, jobs
fail to run if the Data Integration Service cannot access the directory.
In the Developer tool, developers can change the default system parameters to define different directories for
each transformation or data object.
Cache Files
The DTM creates at least one cache file for each Aggregator, Joiner, Lookup, Rank, and Sorter
transformation included in a mapping, profile, SQL data service, or web service operation mapping.
If the DTM cannot process a transformation in memory, it writes the overflow values to cache files. When the
job completes, the DTM releases cache memory and usually deletes the cache files.
By default, the DTM stores cache files for Aggregator, Joiner, Lookup, and Rank transformations in the list of
directories defined by the Cache Directory property for the Data Integration Service. The DTM creates index
and data cache files. It names the index file PM*.idx, and the data file PM*.dat.
The DTM stores the cache files for Sorter transformations in the list of directories defined by the Temporary
Directories property for the Data Integration Service. The DTM creates one sorter cache file.
Reject Files
The DTM creates a reject file for each target instance in a mapping or web service operation mapping. If the
DTM cannot write a row to the target, the DTM writes the rejected row to the reject file. If the reject file does
not contain any rejected rows, the DTM deletes the reject file when the job completes.
By default, the DTM stores reject files in the directory defined by the Rejected Files Directory property for the
Data Integration Service. The DTM names reject files based on the name of the target data object. The
default name for reject files is <file_name>.bad.
Target Files
If a mapping or web service operation mapping writes to a flat file target, the DTM creates the target file
based on the configuration of the flat file data object.
By default, the DTM stores target files in the list of directories defined by the Target Directory property for the
Data Integration Service. The DTM names target files based on the name of the target data object. The
default name for target files is <file_name>.out.
Temporary Files
The DTM can create temporary files when it runs mappings, profiles, SQL queries, or web service operation
mappings. When the jobs complete, the temporary files are usually deleted.
By default, the DTM stores temporary files in the list of directories defined by the Temporary Directories
property for the Data Integration Service. The DTM also stores the cache files for Sorter transformations in
the list of directories defined by the Temporary Directories property.
Data Integration Service Configuration
Types of Jobs
In the Data Integration Service process
SQL data service and web service jobs on a single node or on a grid
where each node has both the service and compute roles.
Advantages:
SQL data service and web service jobs typically achieve better
performance when the Data Integration Service runs jobs in the
service process.
In separate DTM processes on the local node
In separate DTM processes on remote nodes
Grid
Note: Ad hoc jobs, with the exception of profiles, can run in the Data Integration Service process or in
separate DTM processes on the local node. Ad hoc jobs include mappings run from the Developer tool or
previews, scorecards, or drill downs on profile results run from the Developer tool or Analyst tool. If you
configure a Data Integration Service grid to run jobs in separate remote processes, the service runs ad hoc
jobs in separate local processes.
Single Node
When the Data Integration Service runs on a single node, the service and compute components of the Data
Integration Service run on the same node. The node must have both the service and compute roles.
A Data Integration Service that runs on a single node can run DTM instances in the Data Integration Service
process or in separate DTM processes. Configure the service based on the types of jobs that the service
runs.
If you run the Data Integration Service on a single node and you have the high availability option, you can
configure back-up nodes in case the primary node becomes unavailable. High availability enables the Service
Manager and the Data Integration Service to react to network failures and failures of the Data Integration
Service. If a Data Integration Service becomes unavailable, the Service Manager can restart the service on
the same node or on a back-up node.
Grid
If your license includes grid, you can configure the Data Integration Service to run on a grid. A grid is an alias
assigned to a group of nodes that run jobs.
When the Data Integration Service runs on a grid, you improve scalability and performance by distributing
jobs to processes running on multiple nodes in the grid. The Data Integration Service is also more resilient
when it runs on a grid. If a service process shuts down unexpectedly, the Data Integration Service remains
available as long as another service process runs on another node.
When the Data Integration Service runs on a grid, the service and compute components of the Data
Integration Service can run on the same node or on different nodes, based on how you configure the grid and
the node roles. Nodes in a Data Integration Service grid can have a combination of the service only role, the
compute only role, and both the service and compute roles.
A Data Integration Service that runs on a grid can run DTM instances in the Data Integration Service process,
in separate DTM processes on the same node, or in separate DTM processes on remote nodes. Configure
the service based on the types of jobs that the service runs.
Logs
The Data Integration Service generates log events about service configuration and processing and about the
jobs that the DTM runs.
The Data Integration Service generates the following types of log events:
Service log events
The Data Integration Service process generates log events about service configuration, processing, and
failures. These log events are collected by the Log Manager in the domain. You can view the logs for the
Data Integration Service on the Logs tab of the Administrator tool.
Job log events
The DTM generates log events about the jobs that it runs. The DTM generates log events for the
following jobs:
Previews, profiles, scorecards, or mappings run from the Analyst tool or the Developer tool
Deployed mappings
Workflows
You can view the logs for these jobs on the Monitor tab of the Administrator tool.
When the DTM runs, it generates log events for the job that it is running. The DTM bypasses the Log
Manager and sends the log events to log files. The DTM stores the log files in the Log Directory property
specified for the Data Integration Service process. Log files have a .log file name extension.
If you created a custom location for logs before upgrading to the current version of Informatica, the Data
Integration Service continues to write logs to that location after you upgrade. When you create a new
Data Integration Service, the Data Integration Service writes logs to the default location unless you
specify a different location.
86
When the Workflow Service Module runs a workflow, it generates log events for the workflow. The
Workflow Service Module bypasses the Log Manager and sends the log events to log files. The Workflow
Service Module stores the log files in a folder named workflow in the log directory that you specify for
the Data Integration Service process.
When a Mapping task in a workflow starts a DTM instance to run a mapping, the DTM generates log
events for the mapping. The DTM stores the log files in a folder named mappingtask in the log directory
that you specify for the Data Integration Service process.
CHAPTER 5
Single node
When you enable a Data Integration Service that runs on a single node, a service process starts on the
node.
Grid
When you enable a Data Integration Service that runs on a grid, a service process starts on each node
in the grid that has the service role.
Primary and back-up nodes
When you enable a Data Integration Service configured to run on primary and back-up nodes, a service
process is available to run on each node, but only the service process on the primary node starts. For
example, you have the high availability option and you configure a Data Integration Service to run on a
primary node and two back-up nodes. You enable the Data Integration Service, which enables a service
process on each of the three nodes. A single process runs on the primary node, and the other processes
on the back-up nodes maintain standby status.
Note: The associated Model Repository Service must be started before you can enable the Data Integration
Service.
When you disable the Data Integration Service, you shut down the Data Integration Service and disable all
service processes. If you are running the Data Integration Service on a grid, you disable all service processes
on the grid.
When you disable the Data Integration Service, you must choose the mode to disable it in. You can choose
one of the following options:
Complete. Stops all applications and cancels all jobs within each application. Waits for all jobs to cancel
before disabling the service.
Abort. Stops all applications and tries to cancel all jobs before aborting them and disabling the service.
When you recycle the Data Integration Service, the Service Manager restarts the service. When the Service
Manager restarts the Data Integration Service, it also restores the state of each application associated with
the Data Integration Service.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
On the Manage tab Actions menu, click one of the following options:
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
On the Manage tab Actions menu, click one of the following options:
Disable Process to disable the service process. Choose the mode to disable the service process in.
When you configure directories for the source and output files, you configure the paths for the home directory
and its subdirectories. The default value of the Home Directory property is <Informatica installation
directory>/tomcat/bin. If you change the default value, verify that the directory exists.
By default, the following directories have values relative to the home directory:
Temporary directories
Cache directory
Source directory
Target directory
You can define a different directory relative to the home directory. Or, you can define an absolute directory
outside the home directory.
If you define a different absolute directory, use the correct syntax for the operating system:
On Windows, enter an absolute path beginning with a drive letter, colon, and backslash. For example:
C:\<Informatica installation directory>\tomcat\bin\MyHomeDir
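On UNIX, enter an absolute path beginning with a slash. For example:
/<Informatica installation directory>/tomcat/bin/MyHomeDir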
Data objects and transformations in the Developer tool use system parameters to access the values of these
Data Integration Service directories. By default, the system parameters are assigned to flat file directory,
cache file directory, and temporary file directory fields.
For example, when a developer creates an Aggregator transformation in the Developer tool, the CacheDir
system parameter is the default value assigned to the cache directory field. The value of the CacheDir
system parameter is defined in the Cache Directory property for the Data Integration Service. Developers
can remove the default system parameter and enter a different value for the cache directory. However, jobs
fail to run if the Data Integration Service cannot access the directory.
You can configure the Source Directory property to use a shared directory to create one directory for
source files.
If you run mappings that manage metadata changes in flat file sources and if the Data Integration Service
grid is configured to run jobs in separate remote processes, you must configure the Source Directory
property to use a shared directory.
If you run other types of mappings or if you run mappings that manage metadata changes in flat file
sources on any other Data Integration Service grid configuration, you can configure different source
directories for each node with the compute role. Replicate all source files in all of the source directories.
If you run mappings that use a persistent lookup cache, you must configure the Cache Directory property
to use a shared directory. If no mappings use a persistent lookup cache, you can configure the cache
directory to have a different directory for each node with the compute role.
You can configure the Target Directory, Temporary Directories, and Reject File Directory properties to
have different directories for each node with the compute role.
To configure a shared directory, configure the directory in the Execution Options on the Properties view. You
can configure a shared directory for the home directory so that all source and output file directories use the
same shared home directory. Or, you can configure a shared directory for a specific source or output file
directory. Remove any overridden values for the same execution option on the Compute view.
To configure different directories for each node with the compute role, configure the directory in the
Execution Options on the Compute view.
Configure the Control File Directory property for each flat file data object to use a shared directory to
create one directory for control files.
Configure the Control File Directory property for each flat file data object to use an identical directory
path that is local to each node with the service role. Replicate all control files in the identical directory on
each node with the service role.
Log Directory
Configure the directory for log files on the Processes view for the Data Integration Service. Data Integration
Service log files include files that contain service log events and files that contain job log events.
By default, the log directory for each Data Integration Service process is within the Informatica installation
directory on the node.
Configure each service process with identical absolute paths to the shared directories. If you use a mapped
or mounted drive, the absolute path to the shared location must also be identical.
For example, a newly elected master service process cannot access previous log files when nodes use the
following drives for the log directory:
A newly elected master service process also cannot access previous log files when nodes use the following
drives for the log directory:
Related Topics:
Preview jobs
Profiling jobs
For example, if you run two jobs from the same deployed application, two DTM instances are created in the
same DTM process. If you run a preview job, the DTM instance is created in a different DTM process.
When a DTM process finishes running a job, the process closes the DTM instance. When the DTM process
finishes running all jobs, the DTM process is released to the pool as an idle DTM process. An idle DTM
process is available to run any type of job.
You cannot use the Maximum Memory Size property for the Data Integration Service to limit the amount
of memory that the service allocates to run jobs. If you set the maximum memory size, the Data
Integration Service ignores it.
If the Data Integration Service runs on UNIX, the hosts file on each node with the compute role and on
each node with both the service and compute roles must contain a localhost entry. If the hosts file does not
contain a localhost entry, jobs that run in separate processes fail. Windows does not require a localhost
entry in the hosts file.
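For example, a minimal localhost entry in the /etc/hosts file on a UNIX node looks like the following line:
127.0.0.1    localhost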
If you configure connection pooling, each DTM process maintains its own connection pool library. All DTM
instances running in the DTM process can use the connection pool library. The number of connection pool
libraries depends on the number of running DTM processes.
number of idle connection instances, the process drops the active connection instance instead of releasing it
to the pool.
The DTM process or the Data Integration Service process drops an idle connection instance from the pool
when the following conditions are true:
When you update the user name, password, or connection string for a database connection that has
connection pooling enabled, the updates take effect immediately. Subsequent connection requests use the
updated information. Also, the connection pool library drops all idle connections and restarts the connection
pool. It does not return any connection instances that are active at the time of the restart to the connection
pool when complete.
If you update any other database connection property, you must restart the Data Integration Service to apply
the updates.
Minimum Connections: 2
Maximum Connections: 4
When a DTM process runs five jobs, it uses the following process to maintain the connection pool:
1.
The DTM process receives a request to process five jobs at 11:00 a.m., and it creates five connection
instances.
2.
The DTM process completes processing at 11:30 a.m., and it releases four connections to the
connection pool as idle connections.
3.
4.
At 11:32 a.m., the maximum idle time is met for the idle connections, and the DTM process drops two
idle connections.
5.
The DTM process maintains two idle connections because the minimum connection pool size is two.
Adabas
IMS
Sequential
VSAM
To define a connection to a PowerExchange Listener, include a NODE statement in the DBMOVER file on the
Data Integration Service machine. Then define a database connection and associate the connection with the
Listener. The Location property specifies the Listener node name. Define database connection pooling
properties in the Pooling view for a database connection.
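For example, the following NODE statement is a sketch of a Listener definition in the DBMOVER file. The node
name, host name, and port number are placeholders for the values of your PowerExchange Listener:
NODE=(listener1,TCPIP,zoshost01,2480)
In the database connection, you would then set the Location property to listener1.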
In particular, IMS netport jobs that use connection pooling might result in constraint issues. Because the
program specification block (PSB) is scheduled for a longer period of time when netport connections are
pooled, resource constraints can occur in the following cases:
A netport job on another port might try to read a separate database in the same PSB, but the scheduling
limit is reached.
The netport runs as a DL/1 job, and you attempt to restart the database within the IMS/DC environment
after the mapping finishes running. The database restart fails, because the database is still allocated to
the netport DL/1 region.
Processing in a second mapping or a z/OS job flow relies on the database being available when the first
mapping has finished running. If pooling is enabled, there is no guarantee that the database is available.
You might need to build a PSB that includes multiple IMS databases that the Data Integration Service
accesses. In this case, resource constraint issues are more severe because pooled netport jobs tie up
multiple IMS databases for long periods.
This requirement might apply because you can include up to ten NETPORT statements in a DBMOVER
file. Also, PowerExchange data maps cannot include program communication block (PCB) and PSB
values that PowerExchange can use dynamically.
TCPIP_SHOW_POOLING
Writes diagnostic information to the PowerExchange log file. Include the TCPIP_SHOW_POOLING
statement in the DBMOVER file on the Data Integration Service machine.
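For example, add the following statement to the DBMOVER file to enable the diagnostic messages:
TCPIP_SHOW_POOLING=Y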
If TCPIP_SHOW_POOLING=Y, PowerExchange writes message PWX-33805 to the PowerExchange log
file each time a connection is returned to a PowerExchange connection pool.
Message PWX-33805 provides the following information:
Hits. Number of times that PowerExchange found a connection in a PowerExchange connection pool
that it could reuse.
Misses. Number of times that PowerExchange could not find a connection in a PowerExchange
connection pool that it could reuse.
Expired. Number of connections that were discarded from a PowerExchange connection pool
because the maximum idle time was exceeded.
Discarded pool full. Number of connections that were discarded from a PowerExchange connection
pool because the pool was full.
Discarded error. Number of connections that were discarded from a PowerExchange connection pool
due to an error condition.
Read from flat file, IBM DB2 for LUW, or Oracle sources.
Run transformations.
Each mapping contains one or more pipelines. A pipeline consists of a Read transformation and all the
transformations that receive data from that Read transformation. The Data Integration Service separates a
mapping pipeline into pipeline stages and then performs the extract, transformation, and load for each
pipeline stage in parallel.
Partition points mark the boundaries in a pipeline and divide the pipeline into stages. For every mapping
pipeline, the Data Integration Service adds a partition point after the Read transformation and before the
Write transformation to create multiple pipeline stages.
Each pipeline stage runs in one of the following threads:
Reader thread that controls how the Data Integration Service extracts data from the source.
Transformation thread that controls how the Data Integration Service processes data in the pipeline.
Writer thread that controls how the Data Integration Service loads data to the target.
The following figure shows a mapping separated into a reader pipeline stage, a transformation pipeline stage,
and a writer pipeline stage:
Because the pipeline contains three stages, the Data Integration Service can process three sets of rows
concurrently and optimize mapping performance. For example, while the reader thread processes the third
row set, the transformation thread processes the second row set, and the writer thread processes the first
row set.
The following table shows how multiple threads can concurrently process three sets of rows:
Reader Thread      Transformation Thread      Writer Thread
Row Set 1          -                          -
Row Set 2          Row Set 1                  -
Row Set 3          Row Set 2                  Row Set 1
Row Set 4          Row Set 3                  Row Set 2
Row Set n          Row Set n-1                Row Set n-2
If the mapping pipeline contains transformations that perform complicated calculations, processing the
transformation pipeline stage can take a long time. To optimize performance, the Data Integration Service
adds partition points before some transformations to create an additional transformation pipeline stage.
In the preceding image, maximum parallelism for the Data Integration Service is three. Maximum parallelism
for the mapping is Auto. The Data Integration Service separates the mapping into four pipeline stages and
uses a total of 12 threads to run the mapping. The Data Integration Service performs the following tasks at
each of the pipeline stages:
At the reader pipeline stage, the Data Integration Service queries the Oracle database system to discover
that both source tables, source A and source B, have two database partitions. The Data Integration
Service uses one reader thread for each database partition.
At the first transformation pipeline stage, the Data Integration Service redistributes the data to group rows
for the join condition across two threads.
At the second transformation pipeline stage, the Data Integration Service determines that three threads
are optimal for the Aggregator transformation. The service redistributes the data to group rows for the
aggregate expression across three threads.
At the writer pipeline stage, the Data Integration Service does not need to redistribute the rows across the
target partition point. All rows in a single partition stay in that partition after crossing the target partition
point.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
5.
6.
Click OK.
7.
Configure the result set cache properties in the Data Integration Service process properties.
2.
Configure the cache expiration period in the SQL data service properties.
3.
Configure the cache expiration period in the web service operation properties. If you want the Data
Integration Service to cache the results by user, enable WS-Security in the web service properties.
The Data Integration Service purges result set caches in the following situations:
When the result set cache expires, the Data Integration Service purges the cache.
When you restart an application or run the infacmd dis purgeResultSetCache command, the Data
Integration Service purges the result set cache for objects in the application.
When you restart a Data Integration Service, the Data Integration Service purges the result set cache for
objects in applications that run on the Data Integration Service.
When you change the permissions for a user, the Data Integration Service purges the result set cache
associated with that user.
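For example, the following command is a sketch of purging the result set cache for one application from the
command line. The domain, service, user, and application names are placeholders, and the option names follow
common infacmd conventions; verify the exact syntax in the Command Reference:
infacmd dis purgeResultSetCache -dn MyDomain -sn MyDataIntegrationService -un Administrator -pd <password> -a MyApplication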
Configure the data object cache database connection in the cache properties for the Data Integration
Service.
2.
Enable caching in the properties of logical data objects or virtual tables in an application.
By default, the Data Object Cache Manager component of the Data Integration Service manages the cache
tables for logical data objects and virtual tables in the data object cache database. When the Data Object
Cache Manager manages the cache, it inserts all data into the cache tables with each refresh. If you want to
incrementally update the cache tables, you can choose to manage the cache tables yourself using a
database client or other external tool. After enabling data object caching, you can configure a logical data
object or virtual table to use a user-managed cache table.
Cache Tables
The Data Object Cache Manager is the component of the Data Integration Service that creates and manages
cache tables in a relational database.
You can use the following database types to store data object cache tables:
IBM DB2
Oracle
After the database administrator sets up the data object cache database, use the Administrator tool to create
a connection to the database. Then, you configure the Data Integration Service to use the cache database
connection.
When data object caching is enabled, the Data Object Cache Manager creates a cache table when you start
the application that contains the logical data object or virtual table. It creates one table in the cache database
for each cached logical data object or virtual table in an application. The Data Object Cache Manager uses a
prefix of CACHE to name each table.
Objects within an application share cache tables, but objects in different applications do not. If one logical
data object or virtual table is used in multiple applications, the Data Object Cache Manager creates a
separate cache table for each instance of the object.
1. Configure the cache database connection in the cache properties for the Data Integration Service. The Data Object Cache Manager creates the cache tables in this database.
2. Enable caching in the properties of logical data objects or virtual tables in an application. When you enable caching, you can also configure the Data Integration Service to generate indexes on the cache tables based on a column. Indexes can increase the performance of queries on the cache database.
2.
3. Select the application that contains the logical data object or virtual table for which you want to enable caching.
4.
5. Expand the application, and select the logical data object or virtual table.
6. In the Logical Data Object Properties or Virtual Table Properties area, click Edit.
The Edit Properties dialog box appears.
7.
8. In the Cache Refresh Period property, enter the amount of time in minutes that the Data Object Cache Manager waits before refreshing the cache.
For example, if you enter 720, the Data Object Cache Manager refreshes the cache every 12 hours. If you leave the default value of zero, the Data Object Cache Manager does not refresh the cache according to a schedule. You must manually refresh the cache using the infacmd dis RefreshDataObjectCache command.
9.
10. Click OK.
11. To generate indexes on the cache table based on a column, expand the logical data object or virtual table.
a. Select a column, and then click Edit in the Logical Data Object Column Properties or Virtual Table Column Properties area.
The Edit Column Properties dialog box appears.
b.
12.
Refresh the cache
To refresh the cache, use the infacmd dis RefreshDataObjectCache command. If a user queries an SQL data service during a cache refresh, the Data Integration Service returns information from the existing cache.
Abort a refresh
To abort a cache refresh, use the infacmd dis CancelDataObjectCacheRefresh command. If you abort a cache refresh, the Data Object Cache Manager restores the existing cache.
Purge the cache
To purge the cache, use the infacmd dis PurgeDataObjectCache command. You must disable the application before you purge the cache.
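For example, commands similar to the following refresh and purge the cache for a logical data object in a deployed application. The domain, service, user, application, and object names are placeholders, and the exact option names can vary by infacmd version:
infacmd dis RefreshDataObjectCache -dn MyDomain -sn MyDataIntegrationService -un Administrator -pd MyPassword -a MyApplication -o MyLogicalDataObject
infacmd dis PurgeDataObjectCache -dn MyDomain -sn MyDataIntegrationService -un Administrator -pd MyPassword -a MyApplication -o MyLogicalDataObject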
2.
3.
4. In the Navigator, expand an application and select Logical Data Objects or SQL Data Services.
5. Select an SQL data service, click the Virtual Tables view, and then select a table row.
2.
3. Select the application that contains the logical data object or virtual table for which you want to use a user-managed cache table.
4.
5. Expand the application, and select the logical data object or virtual table.
6. In the Logical Data Object Properties or Virtual Table Properties area, click Edit.
The Edit Properties dialog box appears.
7. Enter the name of the user-managed cache table that you created in the data object cache database.
When you enter a cache table name, the Data Object Cache Manager does not generate the cache for the object and ignores the cache refresh period.
The following figure shows a logical data object configured to use a user-managed cache table:
8. Click OK.
9.
Business intelligence tools can query the temporary table instead of the SQL data service, resulting in increased performance.
To implement temporary tables, the Informatica administrator and the business intelligence tool user perform
the following separate tasks:
Step 1. The Informatica administrator creates a connection for the Data Integration Service.
In the Administrator tool, create a connection to the SQL data service. Edit the SQL Properties of the Data Integration Service and select a relational database connection for the Table Storage Connection property. Recycle the Data Integration Service.
Step 2. The business intelligence tool user creates a connection for the SQL data service.
In a business intelligence tool, create a connection to the SQL data service. The connection uses the
Informatica ODBC or JDBC driver.
Step 3. Queries from the business intelligence tool create and use temporary tables.
While the connection is active, the business intelligence tool issues queries to the SQL data service.
These queries create and use temporary tables to store large amounts of data that the complex query
produces. When the connection ends, the database drops the temporary table.
Literal data
Literals describe a user or system-supplied string or value that is not an identifier or keyword. Use strings, numbers, dates, or boolean values when you insert literal data into a temporary table. Use the following statement format to insert literal data into a temporary table:
INSERT INTO <TABLENAME> <OPTIONAL COLUMN LIST> VALUES (<VALUE LIST>), (<VALUE LIST>)
For example, INSERT INTO temp_dept (dept_id, dept_name, location) VALUES (2, 'Marketing', 'Los Angeles').
Query data
You can query an SQL data service and insert data from the query into a temporary table. Use the following statement format to insert query data into a temporary table:
INSERT INTO <TABLENAME> <OPTIONAL COLUMN LIST> <SELECT QUERY>
For example, INSERT INTO temp_dept(dept_id, dept_name, location) SELECT dept_id, dept_name, location from dept where dept_id = 99.
You can use a set operator, such as UNION, in the SQL statement when you insert query data into a temporary table. Use the following statement format when you use a set operator:
INSERT INTO <TABLENAME> <OPTIONAL COLUMN LIST> (<SELECT QUERY> <SET OPERATOR> <SELECT QUERY>)
For example, INSERT INTO temp_dept select * from north_america_dept UNION select * from asia_dept.
- You can specify schema and default schema for a temporary table.
- You can place the primary key, NULL, NOT NULL, and DEFAULT constraints on a temporary table.
- You cannot place a foreign key or CHECK and UNIQUE constraints on a temporary table.
- You cannot issue a query that contains a common table expression or a correlated subquery against a temporary table.
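For example, a statement similar to the following creates a temporary table with a primary key, a NOT NULL constraint, and a DEFAULT constraint. The table name, column names, and data types are illustrative and must match types that the SQL data service supports:
CREATE TABLE temp_dept (dept_id INT PRIMARY KEY, dept_name VARCHAR(50) NOT NULL, location VARCHAR(50) DEFAULT 'Unknown')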
1. In the Administrator tool, click the Manage tab > Services and Nodes view.
2. In the Domain Navigator, select a Data Integration Service that has an associated profiling warehouse.
3. To create profiling warehouse content, click the Actions menu on the Manage tab and select Profiling Warehouse Database Contents > Create.
4. To delete profiling warehouse content, click the Actions menu on the Manage tab and select Profiling Warehouse Database Contents > Delete.
Database Management
Periodically review and manage the growth of the profiling warehouse database. You can remove profile information that you no longer need and monitor or maintain the profiling warehouse tables.
The need for maintenance depends on the scenario, such as a short-term project or profile results that you no longer need. You can remove unused profile results and recover the disk space that the results use so that you can reuse the database space for other purposes.
Purge
Purges profile and scorecard results from the profiling warehouse.
The infacmd ps Purge command uses the following syntax:
Purge
<-DomainName|-dn> domain_name
[<-Gateway|-hp> gateway_name]
[<-NodeName|-nn> node_name]
<-UserName|-un> user_name
<-Password|-pd> Password
[<-SecurityDomain|-sdn> security_domain]
<-MrsServiceName|-msn> MRS_name
<-DsServiceName|-dsn> data_integration_service_name
<-ObjectType|-ot> object_type
<-ObjectPathAndName|-opn> MRS_object_path
[<-RetainDays|-rd> results_retain_days]
[<-ProjectFolderPath|-pf> project_folder_path]
[<-ProfileTaskName|-pt> profile_task_name]
[<-Recursive|-r> recursive]
[<-PurgeAllResults|-pa> purge_all_results]
The following table describes infacmd ps Purge options and arguments:
-DomainName (-dn)
Argument: domain_name
You can set the domain name with the -dn option or the environment variable INFA_DEFAULT_DOMAIN. If you set a domain name with both methods, the -dn option takes precedence.
-Gateway (-hp)
Argument: gateway_name
-NodeName (-nn)
Argument: node_name
-UserName (-un)
Argument: user_name
-Password (-pd)
Argument: Password
Required if you specify the user name. Password for the user name. The password is case sensitive. You can set a password with the -pd option or the environment variable INFA_DEFAULT_DOMAIN_PASSWORD. If you set a password with both methods, the password set with the -pd option takes precedence.
-SecurityDomain (-sdn)
Argument: security_domain
-MrsServiceName (-msn)
Argument: MRS_name
-DsServiceName (-dsn)
Argument: data_integration_service_name
-ObjectType (-ot)
Argument: object_type
-ObjectPathAndName (-opn)
Argument: MRS_object_path
-RetainDays (-rd)
Argument: results_retain_days
-ProjectFolderPath (-pf)
Argument: project_folder_path
-ProfileTaskName (-pt)
Argument: profile_task_name
-Recursive (-r)
Argument: recursive
-PurgeAllResults (-pa)
Argument: purge_all_results
Optional. Set this option to purge all results for the profile or scorecard object.
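For example, a command similar to the following purges profile results that are older than 10 days. The domain, user, service, and object names are placeholders, and the object type value and object path format depend on your environment:
infacmd ps Purge -dn MyDomain -un Administrator -pd MyPassword -msn MyModelRepositoryService -dsn MyDataIntegrationService -ot profile -opn MyProject/MyFolder/MyProfile -rd 10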
Tablespace Recovery
As part of the regular profile operations, the Data Integration Service writes profile results to the profiling warehouse and deletes results from the profiling warehouse. The indexes and base tables can become fragmented over time. You need to reclaim the unused disk space, especially for Index Organized Tables in an Oracle database.
Most of the profiling warehouse tables contain a relatively small amount of data, so you do not need to recover the tablespace and index space for them.
The following tables store large amounts of profile data, and deleting rows from them can leave the tables fragmented:
- IDP_FIELD_VERBOSE_SMRY_DATA
- IDP_VERBOSE_FIELD_DTL_RES
When you perform the tablespace recovery, ensure that no user runs a profile task. After you recover the
data, update the database statistics to reflect the changed structure.
IBM DB2
The recommendation is to shut down the Data Integration Service when you reorganize the tables and
indexes.
To recover the database for a table, run the following command:
REORG TABLE <TABLE NAME>
REORG INDEXES ALL FOR TABLE <TABLE NAME> ALLOW WRITE ACCESS CLEANUP ONLY ALL
Oracle
You can rebuild Index Organized Tables in Oracle. This action reclaims unused fragments inside the index and applies to the IDP_FIELD_VERBOSE_SMRY_DATA and IDP_VERBOSE_FIELD_DTL_RES profiling warehouse tables.
To recover the database for a table, run the following command:
ALTER TABLE <Table Name> MOVE ONLINE
Database Statistics
Update the database statistics to allow the database to quickly run the queries on the profiling warehouse.
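For example, statements similar to the following update statistics for one of the large profiling warehouse tables. The schema name is a placeholder, and the preferred syntax and required privileges depend on your database version.
IBM DB2:
RUNSTATS ON TABLE MYSCHEMA.IDP_FIELD_VERBOSE_SMRY_DATA WITH DISTRIBUTION AND INDEXES ALL
Oracle:
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'MYSCHEMA', tabname => 'IDP_FIELD_VERBOSE_SMRY_DATA', cascade => TRUE)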
Example
The Finance department wants to configure a web service to accept web service requests from a range of IP
addresses. To configure the Data Integration Service to accept web service requests from machines in a
local network, enter the following expression as an allowed IP Address:
192\.168\.1\.[0-9]*
The Data Integration Service accepts requests from machines with IP addresses that match this pattern. The
Data Integration Service refuses to process requests from machines with IP addresses that do not match this
pattern.
Pass-through Security
Pass-through security is the capability to connect to an SQL data service or an external source with the client
user credentials instead of the credentials from a connection object.
Users might have access to different sets of data based on their job in the organization. Client systems restrict access to databases by user name and password. When you create an SQL data service, you might combine data from different systems to create one view of the data. However, when you define the connection to the SQL data service, the connection has one user name and password.
If you configure pass-through security, you can restrict users from some of the data in an SQL data service
based on their user name. When a user connects to the SQL data service, the Data Integration Service
ignores the user name and the password in the connection object. The user connects with the client user
name or the LDAP user name.
A web service operation mapping might need to use a connection object to access data. If you configure
pass-through security and the web service uses WS-Security, the web service operation mapping connects to
a source using the user name and password provided in the web service SOAP request.
Configure pass-through security for a connection in the connection properties of the Administrator tool or with
infacmd dis UpdateServiceOptions. You can set pass-through security for connections to deployed
applications. You cannot set pass-through security in the Developer tool. Only SQL data services and web
services recognize the pass-through security configuration.
For more information about configuring security for SQL data services, see the Informatica How-To Library
article "How to Configure Security for SQL Data Services":
http://communities.informatica.com/docs/DOC-4507.
Example
An organization combines employee data from multiple databases to present a single view of employee data
in an SQL data service. The SQL data service contains data from the Employee and Compensation
databases. The Employee database contains name, address, and department information. The
Compensation database contains salary and stock option information.
A user might have access to the Employee database but not the Compensation database. When the user
runs a query against the SQL data service, the Data Integration Service replaces the credentials in each
database connection with the user name and the user password. The query fails if the user includes salary
information from the Compensation database.
1. Select a connection.
2.
3.
4. To choose pass-through security for the connection, select the Pass-through Security Enabled option.
5. Optionally, select the Data Integration Service for which you want to enable object caching for pass-through security.
6.
7.
8. Select Allow Caching to allow data object caching for the SQL data service or web service. This applies to all connections.
9. Click OK.
You must recycle the Data Integration Service to enable caching for the connections.
CHAPTER 6
Grid for Mappings, Profiles, and Workflows that Run in Local Mode, 132
Grid for Mappings, Profiles, and Workflows that Run in Remote Mode, 137
All machines that represent nodes with the compute role or nodes with both the service and compute roles
must have installations of the native database client software associated with the databases that the Data
Integration Service accesses. For example, you run mappings that read from and write to an Oracle
database. You must install and configure the same version of the Oracle client on all nodes in the grid that
have the compute role and all nodes in the grid that have both the service and compute roles.
For more information about establishing native connectivity between the Data Integration Service and a
database, see Configure Native Connectivity on Service Machines on page 413.
The Data Integration Service manages requests and runs jobs on the following nodes in the grid:
On Node1, the master service process manages application deployment and logging. The master service
process also acts as a worker service process and completes jobs. The Data Integration Service
dispatches a preview request directly to the service process on Node1. The service process creates a
DTM instance to run the preview job. SQL data service and web service jobs can also run on Node1.
On Node2, the Data Integration Service dispatches SQL queries and web service requests directly to the
worker service process. The worker service process creates a separate DTM instance to run each job and
complete the request. Preview jobs can also run on Node2.
On Node3, the Data Integration Service dispatches two preview requests from a different user login than
the preview1 request directly to the worker service process. The worker service process creates a
separate DTM instance to run each preview job. SQL data service and web service jobs can also run on
Node3.
Rules and Guidelines for Grids that Run Jobs in the Service
Process
Consider the following rules and guidelines when you configure a Data Integration Service grid to run SQL
data service, web service, and preview jobs in the Data Integration Service process:
If the grid contains nodes with the compute role only, the Data Integration Service cannot start.
If the grid contains nodes with the service role only, jobs that are dispatched to the service process on the
node fail to run.
Configure environment variables for the Data Integration Service processes on the Processes view for
the service. The Data Integration Service ignores any environment variables configured on the Compute
view.
1. Create a grid for SQL data service and web service jobs.
2.
3. Configure the Data Integration Service to run jobs in the service process.
4.
5.
6. Optionally, configure properties for each Data Integration Service process that runs on a node in the grid.
7. Optionally, configure compute properties for each DTM instance that can run on a node in the grid.
8.
2.
3.
4.
5.
Name
Name of the grid. The name is not case sensitive and must be unique within the domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following special characters:
`~%^*+={}\;:'"/?.,<>|!()][
Description
Nodes
Path
6. Click OK.
1. On the Services and Nodes view, select the Data Integration Service in the Domain Navigator.
2.
3.
4.
5.
6. Click OK.
1. On the Services and Nodes view, select the Data Integration Service in the Domain Navigator.
2.
3.
4. For the Launch Job Options property, select In the service process.
5. Click OK.
Complete the following steps in the Administrator tool to configure the Data Integration Service to
communicate with the external HTTP load balancer:
a.
On the Services and Nodes view, select the Data Integration Service in the Domain Navigator.
b.
c.
d.
2.
Enter the logical URL for the external HTTP load balancer, and then click OK.
Configure the external load balancer to distribute requests to all nodes in the grid that have both the
service and compute roles.
1. On the Services and Nodes view, select the Data Integration Service in the Domain Navigator.
2.
3. Select a node to configure the shared log directory for that node.
4.
5.
6. Click OK.
7. Repeat the steps for each node listed in the Processes tab to configure each service process with identical absolute paths to the shared directories.
The Data Integration Service manages requests and runs jobs on the following nodes in the grid:
On Node1, the master service process runs the workflow instance and non-mapping tasks. The master
service process dispatches mappings included in Mapping tasks from workflow1 to the worker service
processes on Node2 and Node3. The master service process also acts as a worker service process and
completes jobs. The Data Integration Service dispatches a preview request directly to the service process
on Node1. The service process creates a DTM instance within a separate DTM process to run the preview
job. Mapping and profile jobs can also run on Node1.
On Node2, the worker service process creates a DTM instance within a separate DTM process to run
mapping1 from workflow1. Ad hoc jobs can also run on Node2.
On Node3, the worker service process creates a DTM instance within a separate DTM process to run
mapping2 from workflow1. Ad hoc jobs can also run on Node3.
Rules and Guidelines for Grids that Run Jobs in Local Mode
Consider the following rules and guidelines when you configure a Data Integration Service grid to run jobs in
separate local processes:
If the grid contains nodes with the compute role only, the Data Integration Service cannot start.
If the grid contains nodes with the service role only, jobs that are dispatched to the service process on the
node fail to run.
Configure environment variables for the Data Integration Service processes on the Processes view for
the service. The Data Integration Service ignores any environment variables configured on the Compute
view.
1. Create a grid for mappings, profiles, and workflows that run in separate local processes.
2.
3. Configure the Data Integration Service to run jobs in separate local processes.
4.
5. Optionally, configure properties for each Data Integration Service process that runs on a node in the grid.
6. Optionally, configure compute properties for each DTM instance that can run on a node in the grid.
7.
2.
3.
4.
5.
Name
Name of the grid. The name is not case sensitive and must be unique within the domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following special characters:
`~%^*+={}\;:'"/?.,<>|!()][
Description
Nodes
Path
6. Click OK.
1. On the Services and Nodes view, select the Data Integration Service in the Domain Navigator.
2.
3.
4.
5.
6. Click OK.
1. On the Services and Nodes view, select the Data Integration Service in the Domain Navigator.
2.
3.
4. For the Launch Job Options property, select In separate local processes.
5. Click OK.
On the Services and Nodes view, select the Data Integration Service in the Domain Navigator.
2.
3. Select a node to configure the shared log directory for that node.
4.
5.
6. Click OK.
7. Repeat the steps for each node listed in the Processes tab to configure each service process with identical absolute paths to the shared directories.
Communicates with the Resource Manager Service to manage the grid of available compute nodes.
When the Service Manager on a node with the compute role starts, the Service Manager registers the
node with the Resource Manager Service.
Orchestrates worker service process requests and dispatches mappings to worker compute nodes.
The master compute node also acts as a worker compute node and can run mappings.
DTM processes on worker compute nodes
The Data Integration Service designates the remaining nodes with the compute role as worker compute
nodes. The Service Manager on a worker compute node runs mappings in separate DTM processes
started within containers.
Service role
A Data Integration Service process runs on each node with the service role. Service components within
the Data Integration Service process run workflows and profiles, and perform mapping optimization and
compilation.
Compute role
DTM processes run on each node with the compute role. The DTM processes run deployed mappings,
mappings run by Mapping tasks within a workflow, and mappings converted from a profile.
Both service and compute roles
A Data Integration Service process and DTM processes run on each node with both the service and
compute roles. At least one node with both service and compute roles is required to run ad hoc jobs, with
the exception of profiles. Ad hoc jobs include mappings run from the Developer tool or previews,
scorecards, or drill downs on profile results run from the Developer tool or Analyst tool. The Data
Integration Service runs these job types in separate DTM processes on the local node.
In addition, nodes with both roles can complete all of the tasks that a node with the service role only or a
node with the compute role only can complete. For example, a workflow can run on a node with the
service role only or on a node with both the service and compute roles. A deployed mapping can run on
a node with the compute role only or on a node with both the service and compute roles.
The following table lists the job types that run on nodes based on the node role. For each job type, such as running workflows or running profiles, the table indicates whether the job type runs on nodes with the service role, the compute role, or both the service and compute roles.
Note: If you associate a Content Management Service with the Data Integration Service to run mappings that
read reference data, each node in the grid must have both the service and compute roles.
Job Types
When a Data Integration Service grid runs jobs in separate remote processes, how the Data Integration
Service runs each job depends on the job type.
The Data Integration Service balances the workload across the nodes in the grid based on the following job
types:
Workflows
When you run a workflow instance, the master service process runs the workflow instance and non-mapping tasks. The master service process uses round robin to dispatch each mapping within a Mapping task to a worker service process. The LDTM component of the worker service process optimizes and compiles the mapping. The worker service process then communicates with the master compute node to dispatch the compiled mapping to a separate DTM process running on a worker compute node.
Deployed mappings
When you run a deployed mapping, the master service process uses round robin to dispatch each
mapping to a worker service process. The LDTM component of the worker service process optimizes
and compiles the mapping. The worker service process then communicates with the master compute
node to dispatch the compiled mapping to a separate DTM process running on a worker compute node.
Profiles
When you run a profile, the master service process converts the profiling job into multiple mapping jobs
based on the advanced profiling properties of the Data Integration Service. The master service process
then distributes the mappings across the worker service processes. The LDTM component of the worker
service process optimizes and compiles the mapping. The worker service process then communicates
with the master compute node to dispatch the compiled mapping to a separate DTM process running on
a worker compute node.
Ad hoc jobs, with the exception of profiles
When you run an ad hoc job, with the exception of profiles, the Data Integration Service uses round robin
to dispatch the first request directly to a worker service process that runs on a node with both the service
and compute roles. The worker service process runs the job in a separate DTM process on the local
node. To ensure faster throughput, the Data Integration Service bypasses the master service process.
When you run additional ad hoc jobs from the same login, the Data Integration Service dispatches the
requests to the same worker service process.
Note: Informatica does not recommend running SQL queries or web service requests on a Data Integration
Service grid that is configured to run jobs in separate remote processes. SQL data service and web service
jobs typically achieve better performance when the Data Integration Service runs jobs in the service process.
If you do run SQL queries and web service requests on a Data Integration Service grid configured to run jobs
in separate remote processes, these job types run on the nodes in the grid with both the service and compute
roles. The Data Integration Service runs these job types in separate DTM processes on the local node. For
web service requests, you must configure the external HTTP load balancer to distribute requests to nodes
that have both the service and compute roles.
The Data Integration Service manages requests and runs jobs on the following nodes in the grid:
On Node1, the master service process runs the workflow instance and non-mapping tasks. The master
service process dispatches a mapping included in a Mapping task from workflow1 to the worker service
process on Node2. The master service process also acts as a worker service process and can optimize
and compile mappings. Profile jobs can also run on Node1.
On Node2, the worker service process optimizes and compiles the mapping. The worker service process
then communicates with the master compute node on Node3 to dispatch the compiled mapping to a
worker compute node. The Data Integration Service dispatches a preview request directly to the worker
service process on Node2. The service process creates a DTM instance within a separate DTM process
on Node2 to run the preview job. Node2 also serves as a worker compute node and can run compiled
mappings.
On Node3, the Service Manager on the master compute node orchestrates requests to run mappings. The
master compute node also acts as a worker compute node and runs the mapping from workflow1 in a
separate DTM process started within a container.
Rules and Guidelines for Grids that Run Jobs in Remote Mode
Consider the following rules and guidelines when you configure a Data Integration Service grid to run jobs in
separate remote processes:
The grid must contain at least one node with both the service and compute roles to run an ad hoc job, with
the exception of profiles. The Data Integration Service runs these job types in a separate DTM process on
the local node. Add additional nodes with both the service and compute roles so that these job types can
be distributed to service processes running on other nodes in the grid.
To support failover for the Data Integration Service, the grid must contain at least two nodes that have the
service role.
If you associate a Content Management Service with the Data Integration Service to run mappings that
read reference data, each node in the grid must have both the service and compute roles.
The grid cannot include two nodes that are defined on the same host machine.
Informatica does not recommend assigning multiple Data Integration Services to the same grid nor
assigning one node to multiple Data Integration Service grids.
If a worker compute node is shared across multiple grids, mappings dispatched to the node might fail due
to an over allocation of the node's resources. If a master compute node is shared across multiple grids,
the log events for the master compute node are also shared and might become difficult to troubleshoot.
To recycle the Data Integration Service, select the service in the Domain Navigator and click Recycle the
Service.
2. Create a grid for mappings, profiles, and workflows that run in separate remote processes.
3.
4. Configure the Data Integration Service to run jobs in separate remote processes.
5.
6.
7. Optionally, configure properties for each Data Integration Service process that runs on a node with the service role.
8. Optionally, configure compute properties for each DTM instance that can run on a node with the compute role.
9.
Note: Before you can disable the service role on a node, you must shut down all application service
processes running on the node and remove the node as a primary or back-up node for any application
service. You cannot disable the service role on a gateway node.
1. In the Administrator tool, click the Manage tab > Services and Nodes view.
2. In the Domain Navigator, select a node that you plan to add to the grid.
3.
4. Select or clear the service and compute roles to update the node role.
5. Click OK.
6. If you disabled the compute role, the Disable Compute Role dialog box appears. Perform the following steps:
a.
b.
7.
Abort. Tries to stop all jobs before aborting them and disabling the role.
Click OK.
Repeat the steps to update the node role for each node that you plan to add to the grid.
At least one node with both the service and compute roles to run previews and to run ad hoc jobs, with the
exception of profiles.
If you associate a Content Management Service with the Data Integration Service to run mappings that read
reference data, each node in the grid must have both the service and compute roles.
1.
2.
3.
4.
5.
Name
Name of the grid. The name is not case sensitive and must be unique within the domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following special characters:
`~%^*+={}\;:'"/?.,<>|!()][
Description
Nodes
Path
6. Click OK.
1. On the Services and Nodes view, select the Data Integration Service in the Domain Navigator.
2.
3.
4.
5.
6. Click OK.
1. On the Services and Nodes view, select the Data Integration Service in the Domain Navigator.
2.
3.
4. For the Launch Job Options property, select In separate remote processes.
5. Click OK.
2. Select the Resource Manager Service in the Domain Navigator, and click Recycle the Service.
1. On the Services and Nodes view, select the Data Integration Service in the Domain Navigator.
2.
3. Select a node to configure the shared log directory for that node.
4.
5.
6. Click OK.
7. Repeat the steps for each node listed in the Processes tab to configure each service process with identical absolute paths to the shared directories.
The DTM section of the log concludes with the following line:
### End Grid Task [gtid-1443479776986-1-79777626-99] Segment [s0] Tasklet [t-0]
Attempt [1]
The Data Integration Service cannot reuse DTM processes because you run jobs from different deployed
applications.
For example, you have configured a Data Integration Service grid that contains a single compute node. You
want to concurrently run two mappings from different applications. Because the mappings are in different
applications, the Data Integration Service runs the mappings in separate DTM processes, which requires two
containers. The machine that represents the compute node has four cores. Only one container can be
initialized, and so the two mappings cannot run concurrently. You can override the compute node attributes
to specify that the Resource Manager Service can allocate eight cores for jobs that run on the compute node.
Then, two DTM processes can run at the same time and the two mappings can run concurrently.
Use caution when you override compute node attributes. Specify values that are close to the actual resources
available on the machine so that you do not overload the machine. Configure the values such that the
memory requirements for the total number of concurrent mappings do not exceed the actual resources. A
mapping that runs in one thread requires one core. A mapping can use the amount of memory configured in
the Maximum Memory Per Request property for the Data Integration Service modules.
To override compute node attributes, run the infacmd rms SetComputeNodeAttributes command for a
specified node.
The command includes the following options and arguments:
- -MaxCores (-mc). Argument: max_number_of_cores_to_allocate.
- The maximum memory option. Argument: max_memory_in_mb_to_allocate.
After you override compute node attributes, you must recycle the Data Integration Service for the changes to
take effect. To reset an option to its default value, specify -1 as the value.
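For example, a command similar to the following allocates eight cores for jobs that run on a compute node. The domain, user, and node names are placeholders, and the connection options follow the usual infacmd pattern:
infacmd rms SetComputeNodeAttributes -dn MyDomain -un Administrator -pd MyPassword -nn Node3 -mc 8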
1. Create a grid where each node in the grid includes both the service and compute roles.
2. Create a Data Integration Service and assign the service to run on the grid. Configure the Data Integration Service to run jobs in separate local or remote processes.
3. Create a Content Management Service and a new Data Integration Service to run on each node in the grid.
4. Associate each Content Management Service with the Data Integration Service that runs on the same node.
5. Associate each Content Management Service and Data Integration Service with the same Model Repository Service that the Data Integration Service on the grid is associated with.
The Content Management Service provides reference data information to all Data Integration Service
processes that run on the same node and that are associated with the same Model Repository Service.
The following image shows an example domain that contains three nodes. A total of three Data Integration
Services, two Content Management Services, and one Model Repository Service exist in the domain:
A Data Integration Service named DIS_grid. DIS_grid is assigned to run on the grid. A DIS_grid process
runs on each node in the grid. When you run a job on the grid, the DIS_grid processes run the job.
A Data Integration Service named DIS1 and a Content Management Service named CMS1 assigned to
run on Node1. CMS1 is associated with DIS1.
A Data Integration Service named DIS2 and a Content Management Service named CMS2 assigned to
run on Node2. CMS2 is associated with DIS2.
A Model Repository Service named MRS1 assigned to run on Node3. Each Data Integration Service and
Content Management Service in the domain is associated with MRS1. In this example, the Model
Repository Service runs on a node outside of the Data Integration Service grid. However, the Model
Repository Service can run on any node in the domain.
When you increase the pool size value, the Data Integration Service uses more hardware resources such as
CPU, memory, and system I/O. Set this value based on the resources available on the nodes in the grid. For
example, consider the number of CPUs on the machines where Data Integration Service processes run and
the amount of memory that is available to the Data Integration Service.
Note: If the Data Integration Service grid runs jobs in separate remote processes, additional concurrent jobs
might not run on compute nodes after you increase the value of this property. You might need to override
compute node attributes to increase the number of concurrent jobs on each compute node. For more
information, see Override Compute Node Attributes to Increase Concurrent Jobs on page 146.
Editing a Grid
You can edit a grid to change the description, add nodes to the grid, or remove nodes from the grid.
Before you remove a node from the grid, disable the Data Integration Service process running on the node.
1. In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4. Click OK.
5. If you added or removed a node from a Data Integration Service grid configured to run jobs in separate remote processes, recycle the Data Integration Service for the changes to take effect.
Deleting a Grid
You can delete a grid from the domain if the grid is no longer required.
Before you delete a grid, disable the Data Integration Service running on the grid.
1.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
Troubleshooting a Grid
I enabled a Data Integration Service that runs on a grid, but one of the service processes failed to start.
When you enable a Data Integration Service that runs on a grid, a service process starts on each node in the
grid that has the service role. A service process might fail to start for the following reasons:
Another process running on the machine is using the HTTP port number assigned to the service process.
On the Processes view for the Data Integration Service, enter a unique HTTP port number for the service
process. Then, enable the service process running on that node.
A job failed to run on a Data Integration Service grid. Which logs do I review?
If the Data Integration Service grid is configured to run jobs in the service process or in separate local
processes, review the following logs in this order:
1.
2.
Data Integration Service log accessible from the Service view of the Logs tab.
Includes log events about service configuration, processing, and failures.
If the Data Integration Service grid is configured to run jobs in separate remote processes, additional
components write log files. Review the following logs in this order:
1.
2.
Data Integration Service log accessible from the Service view of the Logs tab.
Includes log events about service configuration, processing, and failures. The Data Integration Service
log includes the following message which indicates the host name and port number of the master
compute node:
INFO: [GRIDCAL_0204] The Integration Service [<MyDISName>] elected a new master
compute node [<HostName>:<PortNumber>].
3.
Master compute node log accessible in the cadi_services_0.log file located in the log directory
configured for the master compute node.
Includes log events written by the Service Manager on the master compute node about managing the
grid of compute nodes and orchestrating worker service process requests. The master compute node
logs are not accessible from the Administrator tool.
4.
Resource Manager Service log accessible from the Service view of the Logs tab.
Includes log events about service configuration and processing and about nodes with the compute role
that register with the service.
5.
Container management log accessible from the Domain view of the Logs tab. Select Container
Management for the category.
Includes log events about how the Service Manager manages containers on nodes with the compute
role.
A mapping that ran in a separate remote process has an incomplete log file.
When a mapping runs on a Data Integration Service grid configured to run jobs in separate remote
processes, the Data Integration Service writes two files for the mapping log. The worker service process that
optimizes and compiles the mapping on the service node writes log events to one log file. The DTM process
that runs the mapping on the compute node writes log events to another log file. When you access the
mapping log, the Data Integration Service consolidates the two files into a single log file.
The mapping has completed, but the DTM process failed to send the complete log file to the master Data
Integration Service process.
The DTM process might fail to send the complete DTM log because of a network error or because the
worker compute node unexpectedly shut down. The DTM process sends the log file to the Data
Integration Service process in multiple sections. The DTM section of the log begins and ends with the
following lines:
###
### <MyWorkerComputeNodeName>
###
### Start Grid Task [gtid-1443479776986-1-79777626-99] Segment [s0] Tasklet [t-0]
Attempt [1]
....
### End Grid Task [gtid-1443479776986-1-79777626-99] Segment [s0] Tasklet [t-0]
Attempt [1]
If these lines are not included in the mapping log or if the beginning line is included but not the ending
line, then the DTM process failed to send the complete log file. To resolve the issue, you can find the DTM
log files written to the following directory on the node where the master Data Integration Service process
runs:
<Informatica installation directory>/logs/<node name>/services/DataIntegrationService/
disLogs/logConsolidation/<mappingName>_<jobID>_<timestamp>
If the job ID folder is empty, find the log file that the DTM process temporarily writes on the worker
compute node.
To find the temporary DTM log file on the worker compute node, find the following message in the first
section of the mapping log:
INFO: [GCL_5] The grid task [gtid-1443479776986-1-79777626-99] cluster logs can be
found at [./1443479776986/taskletlogs/gtid-1443479776986-1-79777626-99].
The listed directory is a subdirectory of the following default log directory configured for the worker compute
node:
<Informatica installation directory>/logs/<node name>/dtmLogs/
CHAPTER 7
Applications, 153
Mappings, 158
Workflows, 167
Applications View
To manage deployed applications, select a Data Integration Service in the Navigator and then click the
Applications view.
The Applications view displays the applications that have been deployed to a Data Integration Service. You
can view the objects in the application and the properties. You can start and stop an application, an SQL data
service, and a web service in the application. You can also back up and restore an application.
The Applications view shows the applications in alphabetic order. The Applications view does not show
empty folders. Expand the application name in the top panel to view the objects in the application.
When you select an application or object in the top panel of the Applications view, the bottom panel displays
read-only general properties and configurable properties for the selected object. The properties change
based on the type of object you select.
When you select physical data objects, you can click a column heading in the lower panel to sort the list of
objects. You can use the filter bar to filter the list of objects.
Refresh the Applications view to see the latest applications and their states.
Applications
The Applications view displays the applications that users deployed to a Data Integration Service. You can
view the objects in the application and application properties. You can deploy, enable, rename, start, back
up, and restore an application.
Application State
The Applications view shows the state for each application deployed to the Data Integration Service.
An application can have one of the following states:
Disabled. The application is disabled from running. If you recycle the Data Integration Service, the
application will not start.
Application Properties
Application properties include read-only general properties and a property to configure whether the
application starts when the Data Integration Service starts.
The following table describes the read-only general properties for applications:
Name
Description
Type
Location
The location of the application. This includes the domain and Data Integration Service name.
Deployment Date
Created By
Unique Identifier
Creation Date
Last Modified By
Creation Domain
Deployed By
The following table describes the configurable application property:
Startup Type
Determines whether an application starts when the Data Integration Service starts. When you enable the application, the application starts by default when you start or recycle the Data Integration Service.
Choose Disabled to prevent the application from starting. You cannot manually start an application if it is disabled.
Deploying an Application
Deploy an object to an application archive file if you want to check the application into version control or if
your organization requires that administrators deploy objects to Data Integration Services.
1.
2. Select a Data Integration Service, and then click the Applications view.
3.
4.
5.
6. Click Add More Files if you want to deploy multiple application files.
You can add up to 10 files.
7.
8. To select additional Data Integration Services, select them in the Data Integration Services panel. To choose all Data Integration Services, select the box at the top of the list.
9.
10. To continue working while deployment is in progress, you can click Deploy in Background.
The deployment process continues in the background.
11. If a name conflict occurs, choose one of the following options to resolve the conflict:
- Rename the new application. Enter the new application name if you select this option.
If you replace or update the existing application and the existing application is running, select the Force Stop the Existing Application if it is Running option to stop the existing application. You cannot update or replace an existing application that is running. When you stop an application, all running objects in the application are aborted.
After you select an option, click OK.
12. Click Close.
You can also deploy an application file using the infacmd dis deployApplication program.
Enabling an Application
An application must be enabled to run before you can start it. When you enable a Data Integration Service,
the enabled applications start automatically.
You can configure a default deployment mode for a Data Integration Service. When you deploy an application
to a Data Integration Service, the property determines the application state after deployment. An application
might be enabled or disabled. If an application is disabled, you can enable it manually. If the application is
enabled after deployment, the SQL data services, web services, and workflows are also enabled.
1.
2.
In the Applications view, select the application that you want to enable.
3.
4.
Renaming an Application
Rename an application to change the name. You can rename an application when the application is not
running.
1.
2.
In the Application view, select the application that you want to rename.
3.
4.
Starting an Application
You can start an application from the Administrator tool.
An application must be running before you can start or access an object in the application. You can start the
application from the Applications Actions menu if the application is enabled to run.
1.
2.
In the Applications view, select the application that you want to start.
3.
Backing Up an Application
You can back up an application to an XML file. The backup file contains all the property settings for the
application. You can restore the application to another Data Integration Service.
You must stop the application before you back it up.
1.
2.
3.
4.
5.
If you click Save, enter an XML file name and choose the location to back up the application.
The Administrator tool backs up the application to an XML file in the location you choose.
Restoring an Application
You can restore an application from an XML backup file. The file must be an XML backup file that you created with the Backup option.
1.
In the Domain Navigator, select a Data Integration Service that you want to restore the application to.
2.
3.
4.
5.
6.
Keep the existing application and discard the new application. The Administrator tool does not restore
the file.
Replace the existing application with the new application. The Administrator tool restores the backup
application to the Data Integration Service.
7.
Rename the new application. Choose a different name for the application you are restoring.
2.
3.
4.
Description
Name
Description
Type
Location
The location of the logical data object. This includes the domain and Data Integration Service
name.
The following table describes the configurable logical data object properties:
Property
Description
Enable Caching
Cache the logical data object in the data object cache database.
Cache Refresh Period
Cache Table Name
The name of the user-managed table from which the Data Integration Service accesses the
logical data object cache. A user-managed cache table is a table in the data object cache
database that you create, populate, and manually refresh when needed.
If you specify a cache table name, the Data Object Cache Manager does not manage the
cache for the object and ignores the cache refresh period.
If you do not specify a cache table name, the Data Object Cache Manager manages the
cache for the object.
The following table describes the configurable logical data object column properties:
Property
Description
Create Index
Enables the Data Integration Service to generate indexes for the cache table based on this
column. Default is false.
Description
Name
Type
Mappings
The Applications view displays mappings included in applications that have been deployed to the Data
Integration Service.
Mapping properties include read-only general properties and properties to configure the settings that the Data Integration Service uses when it runs the mappings in the application.
The following table describes the read-only general properties for mappings:
Property
Description
Name
Description
Type
Location
The location of the mapping. This includes the domain and Data Integration
Service name.
Description
Date format
Date/time format that the Data Integration Service uses when the mapping converts strings to dates.
Default is MM/DD/YYYY HH24:MI:SS.
Tracing level
Overrides the tracing level for each transformation in the mapping. The tracing
level determines the amount of information the Data Integration Service sends to
the mapping log files.
Choose one of the following tracing levels:
- None. The Data Integration Service uses the tracing levels set in the mapping.
- Terse. The Data Integration Service logs initialization information, error messages,
and notification of rejected data.
- Normal. The Data Integration Service logs initialization and status information, errors
encountered, and skipped rows due to transformation row errors. It summarizes
mapping results, but not at the level of individual rows.
- Verbose Initialization. In addition to normal tracing, the Data Integration Service logs
additional initialization details, names of index and data files used, and detailed
transformation statistics.
- Verbose Data. In addition to verbose initialization tracing, the Data Integration
Service logs each row that passes into the mapping. The Data Integration Service
also notes where it truncates string data to fit the precision of a column and provides
detailed transformation statistics. The Data Integration Service writes row data for all
rows in a block when it processes a transformation.
Default is None.
Property
Description
Optimization level
Controls the optimization methods that the Data Integration Service applies to a
mapping as follows:
- None. The Data Integration Service does not optimize the mapping.
- Minimal. The Data Integration Service applies the early projection optimization
method to the mapping.
- Normal. The Data Integration Service applies the early projection, early selection,
and predicate optimization methods to the mapping.
- Full. The Data Integration Service applies the early projection, early selection,
predicate optimization, and semi-join optimization methods to the mapping.
Default is Normal.
Sort order
Order in which the Data Integration Service sorts character data in the mapping.
Default is Binary.
Virtual tables
Virtual columns
The Applications view displays read-only general properties for SQL data services and the objects contained
in the SQL data services. Properties that appear in the view depend on the object type.
The following table describes the read-only general properties for SQL data services, virtual tables, virtual
columns, and virtual stored procedures:
Property
Description
Name
Description
Short description of the selected object. Appears for all object types.
Type
Location
The location of the selected object. This includes the domain and Data Integration Service name.
Appears for all object types.
Property
Description
JDBC URL
JDBC connection string used to access the SQL data service. The SQL data service contains
virtual tables that you can query. It also contains virtual stored procedures that you can run.
Appears for SQL data services.
Column Type
The following table describes the configurable SQL data service properties:
Property
Description
Startup Type
Determines whether the SQL data service is enabled to run when the application starts or
when you start the SQL data service. Enter ENABLED to allow the SQL data service to run.
Enter DISABLED to prevent the SQL data service from running.
Trace Level
Level of error messages written to the log files. Choose one of the following message levels:
- OFF
- SEVERE
- WARNING
- INFO
- FINE
- FINEST
- ALL
Default is INFO.
Connection Timeout
Maximum number of milliseconds to wait for a connection to the SQL data service. Default is 3,600,000.
Request Timeout
Maximum number of milliseconds for an SQL request to wait for an SQL data service response. Default is 3,600,000.
Sort Order
Sort order that the Data Integration Service uses for sorting and comparing data when running in Unicode mode. You can choose the sort order based on your code page. When the Data Integration Service runs in ASCII mode, it ignores the sort order value and uses a binary sort order. Default is binary.
Maximum Active Connections
Result Set Cache Expiration Period
The number of milliseconds that the result set cache is available for use. If set to -1, the cache never expires. If set to 0, result set caching is disabled. Changes to the expiration period do not apply to existing caches. If you want all caches to use the same expiration period, purge the result set cache after you change the expiration period. Default is 0.
DTM Keep Alive Time
Number of milliseconds that the DTM instance stays open after it completes the last request. Identical SQL queries can reuse the open instance. Use the keep alive time to increase performance when the time required to process the SQL query is small compared to the initialization time for the DTM instance. If the query fails, the DTM instance terminates.
Must be an integer. A negative integer value means that the DTM Keep Alive Time for the Data Integration Service is used. 0 means that the Data Integration Service does not keep the DTM instance in memory. Default is -1.
Optimization
Level
The optimizer level that the Data Integration Service applies to the object. Enter the numeric
value that is associated with the optimizer level that you want to configure. You can enter one
of the following numeric values:
- 0. The Data Integration Service does not apply optimization.
- 1. The Data Integration Service applies the early projection optimization method.
- 2. The Data Integration Service applies the early projection, early selection, push-into, and
predicate optimization methods.
- 3. The Data Integration Service applies the cost-based, early projection, early selection, push-into, predicate, and semi-join optimization methods.
Description
Enable Caching
Cache Refresh
Period
Cache Table
Name
The name of the user-managed table from which the Data Integration Service accesses the
virtual table cache. A user-managed cache table is a table in the data object cache database
that you create, populate, and manually refresh when needed.
If you specify a cache table name, the Data Object Cache Manager does not manage the
cache for the object and ignores the cache refresh period.
If you do not specify a cache table name, the Data Object Cache Manager manages the
cache for the object.
Description
Create Index
Enables the Data Integration Service to generate indexes for the cache table based on this
column. Default is false.
Deny With
When you use column level security, this property determines whether to substitute the restricted
column value or to fail the query. If you substitute the column value, you can choose to substitute
the value with NULL or with a constant value.
Select one of the following options:
- ERROR. Fails the query and returns an error when an SQL query selects a restricted column.
- NULL. Returns a null value for a restricted column in each row.
- VALUE. Returns a constant value for a restricted column in each row.
Insufficient
Permission
Value
The constant that the Data Integration Service returns for a restricted column.
Description
The number of milliseconds that the result set cache is available for
use. If set to -1, the cache never expires. If set to 0, result set caching
is disabled. Changes to the expiration period do not apply to existing
caches. If you want all caches to use the same expiration period, purge
the result set cache after you change the expiration period. Default is
0.
2.
In the Applications view, select the SQL data service that you want to enable.
3.
4.
2.
In the Applications view, select the SQL data service that you want to rename.
3.
4.
Web Services
The Applications view displays web services included in applications that have been deployed to a Data
Integration Service. You can view the operations in the web service and configure properties that the Data
Integration Service uses to run a web service. You can enable and rename a web service.
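As an illustration, a deployed web service operation can be invoked by posting a SOAP request to the service endpoint described by its WSDL. The following minimal Java sketch uses a hypothetical endpoint URL, operation name, and namespace; the actual values come from the WSDL URL shown in the Applications view.
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class CallWebServiceOperation {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint; use the address published in the web service WSDL.
        URL endpoint = new URL("http://dis-host:7333/ws/MyApplication/MyWebService");
        String soapEnvelope =
            "<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\">"
            + "<soapenv:Body>"
            + "<getCustomer xmlns=\"http://example.com/ws\"><customerId>1001</customerId></getCustomer>"
            + "</soapenv:Body></soapenv:Envelope>";
        HttpURLConnection conn = (HttpURLConnection) endpoint.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(soapEnvelope.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP status: " + conn.getResponseCode());
    }
}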
Property
Description
Name
Description
Type
Location
The location of the selected object. This includes the domain and Data Integration Service name.
Appears for all objects.
WSDL URL
WSDL URL used to connect to the web service. Appears for web services.
Description
Startup Type
Determines whether the web service is enabled to run when the application starts or when
you start the web service.
Trace Level
Level of error messages written to the run-time web service log. Choose one of the following
message levels:
- OFF. The DTM process does not write messages to the web service run-time logs.
- SEVERE. SEVERE messages include errors that might cause the web service to stop running.
- WARNING. WARNING messages include recoverable failures or warnings. The DTM process
writes WARNING and SEVERE messages to the web service run-time log.
- INFO. INFO messages include web service status messages. The DTM process writes INFO,
WARNING and SEVERE messages to the web service run-time log.
- FINE. FINE messages include data processing errors for the web service request. The DTM
process writes FINE, INFO, WARNING and SEVERE messages to the web service run-time log.
- FINEST. FINEST messages are used for debugging. The DTM process writes FINEST, FINE,
INFO, WARNING and SEVERE messages to the web service run-time log.
- ALL. The DTM process writes FINEST, FINE, INFO, WARNING and SEVERE messages to the
web service run-time log.
Default is INFO.
Request Timeout
Maximum number of milliseconds that the Data Integration Service runs an operation
mapping before the web service request times out. Default is 3,600,000.
Maximum
Concurrent
Requests
Maximum number of requests that a web service can process at one time. Default is 10.
Sort Order
Sort order that the Data Integration Service uses to sort and compare data when running in
Unicode mode.
Enable Transport
Layer Security
Indicates that the web service must use HTTPS. If the Data Integration Service is not
configured to use HTTPS, the web service will not start.
Enable WS-Security
Enables the Data Integration Service to validate the user credentials and verify that the user
has permission to run each web service operation.
Optimization
Level
The optimizer level that the Data Integration Service applies to the object. Enter the numeric
value that is associated with the optimizer level that you want to configure. You can enter
one of the following numeric values:
- 0. The Data Integration Service does not apply optimization.
- 1. The Data Integration Service applies the early projection optimization method.
- 2. The Data Integration Service applies the early projection, early selection, push-into, and
predicate optimization methods.
- 3. The Data Integration Service applies the cost-based, early projection, early selection, push-into, predicate, and semi-join optimization methods.
DTM Keep Alive Time
Number of milliseconds that the DTM instance stays open after it completes the last request.
Web service requests that are issued against the same operation can reuse the open
instance. Use the keep alive time to increase performance when the time required to process
the request is small compared to the initialization time for the DTM instance. If the request
fails, the DTM instance terminates.
Must be an integer. A negative integer value means that the DTM Keep Alive Time for the
Data Integration Service is used. 0 means that the Data Integration Service does not keep
the DTM instance in memory. Default is -1.
SOAP Output
Precision
Maximum number of characters that the Data Integration Service generates for the response
message. The Data Integration Service truncates the response message when the response
message exceeds the SOAP output precision. Default is 200,000.
SOAP Input
Precision
Maximum number of characters that the Data Integration Service parses in the request
message. The web service request fails when the request message exceeds the SOAP input
precision. Default is 200,000.
Description
The number of milliseconds that the result set cache is available for
use. If set to -1, the cache never expires. If set to 0, result set caching
is disabled. Changes to the expiration period do not apply to existing
caches. If you want all caches to use the same expiration period, purge
the result set cache after you change the expiration period. Default is
0.
2.
In the Applications view, select the web service that you want to enable.
3.
4.
2.
In the Applications view, select the web service that you want to rename.
3.
4.
Workflows
The Applications view displays workflows included in applications that have been deployed to a Data
Integration Service. You can view workflow properties, enable a workflow, and start a workflow.
Workflow Properties
Workflow properties include read-only general properties.
The following table describes the read-only general properties for workflows:
Property
Description
Name
Description
Type
Location
The location of the workflow. This includes the domain and Data Integration Service name.
Enabling a Workflow
Before you can run instances of the workflow, the Data Integration Service must be running and the workflow
must be enabled.
Enable a workflow to allow users to run instances of the workflow. Disable a workflow to prevent users from
running instances of the workflow. When you disable a workflow, the Data Integration Service aborts any
running instances of the workflow.
When a deployed application is enabled by default, the workflows in the application are also enabled.
When a deployed application is disabled by default, the workflows are also disabled. When you enable the
application manually, each workflow in the application is also enabled.
1.
2.
In the Applications view, select the workflow that you want to enable.
3.
Starting a Workflow
After you deploy a workflow, you can run an instance of the workflow from the deployed application in the
Administrator tool.
1.
In the Administrator tool, click the Data Integration Service on which you deployed the workflow.
2.
3.
Expand the application that contains the workflow you want to start.
4.
5.
6.
Optionally, browse and select a parameter file for the workflow run.
7.
Select Show Workflow Monitoring if you want to view the workflow graph for the workflow run.
8.
Click OK.
CHAPTER 8
The following figure shows the Metadata Manager components managed by the Metadata Manager Service
on a node in an Informatica domain:
Metadata Manager application. The Metadata Manager application is a web-based application. Use
Metadata Manager to browse and analyze metadata from disparate source repositories. You can load,
browse, and analyze metadata from application, business intelligence, data integration, data modeling,
and relational metadata sources.
PowerCenter repository for Metadata Manager. Contains the metadata objects used by the PowerCenter
Integration Service to load metadata into the Metadata Manager warehouse. The metadata objects
include sources, targets, sessions, and workflows.
PowerCenter Repository Service. Manages connections to the PowerCenter repository for Metadata
Manager.
PowerCenter Integration Service. Runs the workflows in the PowerCenter repository to read from
metadata sources and load metadata into the Metadata Manager warehouse.
Metadata Manager repository. Contains the Metadata Manager warehouse and models. The Metadata
Manager warehouse is a centralized metadata warehouse that stores the metadata from metadata
sources. Models define the metadata that Metadata Manager extracts from metadata sources.
Metadata sources. The application, business intelligence, data integration, data modeling, and database
management sources that Metadata Manager extracts metadata from.
Note: The procedure to configure the Metadata Manager Service varies based on the operating mode of the
PowerCenter Repository Service and on whether the PowerCenter repository contents are created or not.
1.
Set up the Metadata Manager repository database. Set up a database for the Metadata Manager
repository. You supply the database information when you create the Metadata Manager Service.
2.
Create a PowerCenter Repository Service and PowerCenter Integration Service (Optional). You can use
an existing PowerCenter Repository Service and PowerCenter Integration Service, or you can create
them. If you want to create the application services to use with Metadata Manager, create the services in the
following order:
a.
PowerCenter Repository Service. Create a PowerCenter Repository Service but do not create
contents. Start the PowerCenter Repository Service in exclusive mode.
b.
PowerCenter Integration Service. Create the PowerCenter Integration Service. The service will not
start because the PowerCenter Repository Service does not have content. You enable the
PowerCenter Integration Service after you create and configure the Metadata Manager Service.
3.
Create the Metadata Manager Service. Use the Administrator tool to create the Metadata Manager
Service.
4.
Configure the Metadata Manager Service. Configure the properties for the Metadata Manager Service.
5.
Create repository contents. The steps to create repository contents differ based on the code page of the
Metadata Manager and PowerCenter repositories.
If the code page is Latin-based, then create contents for the Metadata Manager repository and restore
the PowerCenter repository. Use the Metadata Manager Service Actions menu to create the contents
for both repositories.
If the code page is not Latin-based, then create the repository contents in the following order:
a.
Restore the PowerCenter repository. Use the Metadata Manager Service Actions menu to restore
the PowerCenter repository. When you restore the PowerCenter repository, enable the option to
automatically restart the PowerCenter Repository Service in normal mode.
b.
Create the Metadata Manager repository contents. Use the Metadata Manager Service Actions
menu to create the contents.
6.
Enable the PowerCenter Integration Service. Enable the associated PowerCenter Integration Service for
the Metadata Manager Service.
7.
Optionally, create a Reporting Service. To run reports on the Metadata Manager repository, create a
Reporting Service. After you create the Reporting Service, you can log in to Data Analyzer and run
reports against the Metadata Manager repository.
8.
Optionally, create a Reporting and Dashboards Service. To run reports on the Metadata Manager
repository, create a Reporting and Dashboards Service. After you create a Reporting and Dashboards
Service, add a reporting source to run reports against the data in the data source.
9.
Enable the Metadata Manager Service. Enable the Metadata Manager Service in the Informatica domain.
10.
Create or assign users. Create users and assign them privileges for the Metadata Manager Service, or
assign existing users privileges for the Metadata Manager Service.
Note: You can use a Metadata Manager Service and the associated Metadata Manager repository in one
Informatica domain. After you create the Metadata Manager Service and Metadata Manager repository in one
domain, you cannot create a second Metadata Manager Service to use the same Metadata Manager
repository. You also cannot back up and restore the repository to use with a different Metadata Manager
Service in a different domain.
2.
3.
4.
Enter values for the Metadata Manager Service general properties, and click Next.
5.
Enter values for the Metadata Manager Service database properties, and click Next.
6.
Enter values for the Metadata Manager Service security properties, and click Finish.
Property
Description
Name
Name of the Metadata Manager Service. The name is not case sensitive and must be unique
within the domain. It cannot exceed 128 characters or begin with @. It also cannot contain
spaces or the following special characters:
` ~ % ^ * + = { } \ ; : ' " / ? . , < > | ! ( ) ] [
Description
Location
Domain and folder where the service is created. Click Browse to choose a different folder. You
can move the Metadata Manager Service after you create it.
License
Node
Node in the Informatica domain that the Metadata Manager Service runs on.
Associated
Integration
Service
PowerCenter Integration Service used by Metadata Manager to load metadata into the Metadata
Manager warehouse.
Repository
User Name
User account for the PowerCenter repository. Use the repository user account you configured
for the PowerCenter Repository Service. For a list of the required privileges for this user, see
Privileges for the Associated PowerCenter Integration Service User on page 187 .
Repository
Password
Security
Domain
Name of the security domain to which the PowerCenter repository user belongs.
Database Type
Property
Description
Code Page
Metadata Manager repository code page. The Metadata Manager Service and Metadata
Manager application use the character set encoded in the repository code page when writing
data to the Metadata Manager repository.
Note: The Metadata Manager repository code page, the code page on the machine where the
associated PowerCenter Integration Service runs, and the code page for any database
management and PowerCenter resources that you load into the Metadata Manager warehouse
must be the same.
Connect String
Native connect string to the Metadata Manager repository database. The Metadata Manager
Service uses the connect string to create a connection object to the Metadata Manager
repository in the PowerCenter repository.
Database User
User account for the Metadata Manager repository database. Set up this account with the
appropriate database client tools.
Database
Password
Password for the Metadata Manager repository database user. Must be in 7-bit ASCII.
Tablespace
Name
Tablespace name for Metadata Manager repositories on IBM DB2. When you specify the
tablespace name, the Metadata Manager Service creates all repository tables in the same
tablespace. You cannot use spaces in the tablespace name.
To improve repository performance on IBM DB2 EEE repositories, specify a tablespace name
with one node.
Database
Hostname
Database Port
SID/Service
Name
Indicates whether the Database Name property contains an Oracle full service name or SID.
Database
Name
Full service name or SID for Oracle databases. Service name for IBM DB2 databases.
Database name for Microsoft SQL Server databases.
Additional
JDBC
Parameters
Additional JDBC parameters that you want to append to the database connection URL. Enter
the parameters as name=value pairs separated by semicolon characters (;). For example:
param1=value1;param2=value2
You can use this property to specify the following information:
- Backup server location. If you use a database server that is highly available such as Oracle RAC,
enter the location of a backup server.
- Oracle Advanced Security Option (ASO) parameters. If the Metadata Manager repository database
is an Oracle database that uses ASO, enter the following additional parameters:
EncryptionLevel=[encryption level];EncryptionTypes=[encryption types];DataIntegrityLevel=[data integrity level];DataIntegrityTypes=[data integrity types]
The parameter values must match the values in the sqlnet.ora file on the machine where the
Metadata Manager Service runs.
- Authentication information for Microsoft SQL Server.
Note: The Metadata Manager Service does not support the alternateID option for DB2.
To authenticate the user credentials with Windows authentication and establish a trusted
connection to a Microsoft SQL Server repository, enter the following text:
AuthenticationMethod=ntlm;LoadLibraryPath=[directory containing DDJDBCx64Auth04.dll]
The resulting connection URL has the following form:
jdbc:informatica:sqlserver://[host]:[port];DatabaseName=[DB name];AuthenticationMethod=ntlm;LoadLibraryPath=[directory containing DDJDBCx64Auth04.dll]
When you use a trusted connection to connect to a Microsoft SQL Server database, the Metadata
Manager Service connects to the repository with the credentials of the user logged in to the
machine on which the service is running.
To start the Metadata Manager Service as a Windows service with a trusted connection, configure
the Windows service properties to log on with a trusted user account.
Secure JDBC
Parameters
Secure JDBC parameters that you want to append to the database connection URL. Use this
property to specify secure connection parameters such as passwords. The Administrator tool
does not display secure parameters or parameter values in the Metadata Manager Service
properties. Enter the parameters as name=value pairs separated by semicolon characters (;).
For example:
param1=value1;param2=value2
If secure communication is enabled for the Metadata Manager repository database, enter the
secure JDBC parameters in this property.
Port Number
Port number the Metadata Manager application runs on. Default is 10250.
Enable
Secured
Socket Layer
Indicates that you want to configure SSL security protocol for the Metadata Manager
application. If you enable this option, you must create a keystore file that contains the required
keys and certificates.
You can create a keystore file with keytool. keytool is a utility that generates and stores private
or public key pairs and associated certificates in a keystore file. When you generate a public or
private key pair, keytool wraps the public key into a self-signed certificate. You can use the
self-signed certificate or use a certificate signed by a certificate authority.
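For example, a self-signed keystore might be created with a keytool command similar to the following, where the alias, keystore file name, password, and validity period are illustrative values:
keytool -genkeypair -alias metadatamanager -keyalg RSA -keystore mm_keystore.jks -storepass <password> -validity 365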
Keystore File
Keystore file that contains the keys and certificates required if you use the SSL security
protocol with the Metadata Manager application. Required if you select Enable Secured Socket
Layer.
Keystore
Password
Password for the keystore file. Required if you select Enable Secured Socket Layer.
Native connect string syntax and examples by database type:
- IBM DB2: dbname (example: mydatabase)
- Microsoft SQL Server: servername@dbname (example: sqlserver@mydatabase)
  Note: If you do not specify the connect string in the syntax specified, you must specify the ODBC entry specified for the data source.
- Oracle: example: oracle.world
Note: The Metadata Manager Service uses the DataDirect drivers included with the Informatica installation.
Informatica does not support the use of any other database driver.
Metadata Manager repository. Create the Metadata Manager warehouse tables and import models for
metadata sources into the Metadata Manager repository.
PowerCenter repository. Restore a repository backup file packaged with PowerCenter to the PowerCenter
repository database. The repository backup file includes the metadata objects used by Metadata Manager
to load metadata into the Metadata Manager warehouse. When you restore the repository, the Service
Manager creates a folder named Metadata Load in the PowerCenter repository. The Metadata Load folder
contains the metadata objects, including sources, targets, sessions, and workflows.
The tasks you complete depend on whether the Metadata Manager repository contains contents or if the
PowerCenter repository contains the PowerCenter objects for Metadata Manager.
The following table describes the tasks you must complete for each repository:
Repository
Condition
Action
Metadata Manager
repository
Metadata Manager
repository
Has content.
No action.
PowerCenter
repository
PowerCenter
repository
Has content.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the Metadata Manager Service for which the Metadata Manager
repository has no content.
3.
4.
Optionally, choose to restore the PowerCenter repository. You can restore the repository if the
PowerCenter Repository Service runs in exclusive mode and the repository does not contain contents.
5.
Click OK.
The activity log displays the results of the create contents operation.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the Metadata Manager Service for which the PowerCenter repository
has no contents.
3.
4.
5.
Click OK.
The activity log displays the results of the restore repository operation.
If the Metadata Manager repository contains information that you want to save, back up the repository with the database client or mmRepoCmd before you delete it.
1.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the Metadata Manager Service for which you want to delete Metadata
Manager repository content.
3.
4.
Enter the user name and password for the database account.
5.
Click OK.
The activity log displays the results of the delete contents operation.
General properties. Include the name and description of the service, the license object for the service, and
the node where the service runs.
Metadata Manager Service properties. Include port numbers for the Metadata Manager application and
the Metadata Manager Agent, and the Metadata Manager file location.
Database properties. Include database properties for the Metadata Manager repository.
Configuration properties. Include the HTTP security protocol and keystore file, and maximum concurrent
and queued requests to the Metadata Manager application.
Connection pool properties. Metadata Manager maintains a connection pool for connections to the
Metadata Manager repository. Connection pool properties include the number of active available
connections to the Metadata Manager repository database and the amount of time that Metadata Manager
holds database connection requests in the connection pool.
Advanced properties. Include properties for the Java Virtual Machine (JVM) memory settings, and
Metadata Manager Browse and Load tab options.
Custom properties. Configure custom properties that are unique to specific environments.
If you update any of the properties, restart the Metadata Manager Service for the modifications to take effect.
General Properties
To edit the general properties, select the Metadata Manager Service in the Navigator, select the Properties
view, and then click Edit in the General Properties section.
The following table describes the general properties for the service:
Property
Description
Name
Name of the service. The name is not case sensitive and must be unique within the
domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces
or the following special characters:
`~%^*+={}\;:'"/?.,<>|!()][
You cannot change the name of the service after you create it.
Description
License
Node
Node on which the service runs. To assign the Metadata Manager Service to a different
node, you must first disable the service.
2.
3.
Select another node for the Node property, and then click OK.
4.
5.
Change the Metadata Manager File Location property to a location that is accessible from the new node,
and then click OK.
6.
Copy the contents of the Metadata Manager file location directory on the original node to the location on
the new node.
7.
If the Metadata Manager Service is running in HTTPS security mode, click Edit in the Configuration
Properties section. Change the Keystore File location to a location that is accessible from the new node,
and then click OK.
8.
Description
Port Number
Port number that the Metadata Manager application runs on. Default is 10250.
Agent Port
Port number for the Metadata Manager Agent when the Metadata Manager Service runs on
Windows. The agent uses this port to communicate with metadata source repositories. Default
is 10251.
If the Metadata Manager Service runs on UNIX, you must install the Metadata Manager Agent
on a separate Windows machine.
Metadata
Manager File
Location
Location of the files used by the Metadata Manager application. Files include the following file
types:
- Index files. Index files created by Metadata Manager required to search the Metadata Manager
warehouse.
- Log files. Log files generated by Metadata Manager when you load resources.
- Parameter files. Files generated by Metadata Manager and used by PowerCenter workflows.
- Repository backup files. Metadata Manager repository backup files that are generated by the
mmRepoCmd command line program.
Location that Metadata Manager uses to store graph database files for data lineage.
By default, Metadata Manager stores the graph database files in the following directory:
<Informatica services installation directory>\services
\MetadataManagerService\mm_files\<Metadata Manager Service name>
If you change the Metadata Manager file location, copy the contents of the directory to the new location.
If you configure a shared file location, the location must be accessible to all nodes running a Metadata
Manager Service and to all users of the Metadata Manager application.
To decrease the load times for Cloudera Navigator resources, ensure that the Metadata Manager file
location directory is on a disk with a fast input/output rate.
To change the Metadata Manager lineage graph location, you must disable the Metadata Manager
Service, copy the contents of the directory to the new location, and then restart the Metadata Manager
Service.
The lineage graph location must be accessible to all nodes that run the Metadata Manager Service and to
the Informatica domain administrator user account.
Database Properties
You can edit the Metadata Manager repository database properties. Select the Metadata Manager Service in
the Navigator, select the Properties view, and then click Edit in the Database Properties section.
The following table describes the database properties for a Metadata Manager repository database:
Property
Description
Database
Type
Type of database for the Metadata Manager repository. To apply changes, restart the Metadata
Manager Service.
Code Page
Metadata Manager repository code page. The Metadata Manager Service and Metadata
Manager use the character set encoded in the repository code page when writing data to the
Metadata Manager repository. To apply changes, restart the Metadata Manager Service.
Note: The Metadata Manager repository code page, the code page on the machine where the
associated PowerCenter Integration Service runs, and the code page for any database
management and PowerCenter resources that you load into the Metadata Manager warehouse
must be the same.
Connect
String
Native connect string to the Metadata Manager repository database. The Metadata Manager
Service uses the connection string to create a target connection to the Metadata Manager
repository in the PowerCenter repository.
To apply changes, restart the Metadata Manager Service.
Database
User
User account for the Metadata Manager repository database. Set up this account using the
appropriate database client tools. To apply changes, restart the Metadata Manager Service.
Database
Password
Password for the Metadata Manager repository database user. Must be in 7-bit ASCII. To apply
changes, restart the Metadata Manager Service.
Tablespace
Name
Tablespace name for the Metadata Manager repository on IBM DB2. When you specify the
tablespace name, the Metadata Manager Service creates all repository tables in the same
tablespace. You cannot use spaces in the tablespace name. To apply changes, restart the
Metadata Manager Service.
To improve repository performance on IBM DB2 EEE repositories, specify a tablespace name
with one node.
Database
Hostname
Host name for the Metadata Manager repository database. To apply changes, restart the
Metadata Manager Service.
Database Port
Port number for the Metadata Manager repository database. To apply changes, restart the
Metadata Manager Service.
SID/Service
Name
Indicates whether the Database Name property contains an Oracle full service name or an SID.
Database
Name
Full service name or SID for Oracle databases. Service name for IBM DB2 databases.
Database name for Microsoft SQL Server databases. To apply changes, restart the Metadata
Manager Service.
Additional
JDBC
Parameters
Additional JDBC parameters that you want to append to the database connection URL. Enter
the parameters as name=value pairs separated by semicolon characters (;). For example:
param1=value1;param2=value2
You can use this property to specify the following information:
- Backup server location. If you use a database server that is highly available such as Oracle RAC,
enter the location of a backup server.
- Oracle Advanced Security Option (ASO) parameters. If the Metadata Manager repository database
is an Oracle database that uses ASO, enter the following additional parameters:
EncryptionLevel=[encryption level];EncryptionTypes=[encryption types];DataIntegrityLevel=[data integrity level];DataIntegrityTypes=[data integrity types]
The parameter values must match the values in the sqlnet.ora file on the machine where the
Metadata Manager Service runs.
- Authentication information for Microsoft SQL Server.
Note: The Metadata Manager Service does not support the alternateID option for DB2.
To authenticate the user credentials using Windows authentication and establish a trusted
connection to a Microsoft SQL Server repository, enter the following text:
AuthenticationMethod=ntlm;LoadLibraryPath=[directory containing DDJDBCx64Auth04.dll]
The resulting connection URL has the following form:
jdbc:informatica:sqlserver://[host]:[port];DatabaseName=[DB name];AuthenticationMethod=ntlm;LoadLibraryPath=[directory containing DDJDBCx64Auth04.dll]
When you use a trusted connection to connect to a Microsoft SQL Server database, the Metadata
Manager Service connects to the repository with the credentials of the user logged in to the
machine on which the service is running.
To start the Metadata Manager Service as a Windows service using a trusted connection, configure
the Windows service properties to log on using a trusted user account.
Secure JDBC
Parameters
Secure JDBC parameters that you want to append to the database connection URL. Use this
property to specify secure connection parameters such as passwords. The Administrator tool
does not display secure parameters or parameter values in the Metadata Manager Service
properties. Enter the parameters as name=value pairs separated by semicolon characters (;).
For example:
param1=value1;param2=value2
If secure communication is enabled for the Metadata Manager repository database, enter the
secure JDBC parameters in this property.
To update the secure JDBC parameters, click Modify Secure JDBC Parameters and enter the
new values.
EncryptionMethod
Encryption method for data transfer between Metadata Manager and the database server. Must be set to
SSL.
TrustStore
Path and file name of the truststore file that contains the SSL certificate of the database server.
TrustStorePassword
Password used to access the truststore file.
HostNameInCertificate
Host name of the machine that hosts the secure database. If you specify a host name, the Metadata
Manager Service validates the host name included in the connection string against the host name in the
SSL certificate.
ValidateServerCertificate
Indicates whether the Metadata Manager Service validates the certificate that the database server
presents. If you set this parameter to true, the Metadata Manager Service validates the certificate. If you
specify the HostNameInCertificate parameter, the Metadata Manager Service also validates the host
name in the certificate.
If you set this parameter to false, the Metadata Manager Service does not validate the certificate that the
database server presents. The Metadata Manager Service ignores any truststore information that you
specify.
KeyStore
Path and file name of the keystore file that contains the SSL certificates that the Metadata Manager
Service presents to the database server.
KeyStorePassword
Password used to access the keystore file.
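For example, a Secure JDBC Parameters value that combines these parameters might look like the following, where the truststore path, password, and host name are illustrative:
EncryptionMethod=SSL;TrustStore=/opt/ssl/db_truststore.jks;TrustStorePassword=<password>;HostNameInCertificate=dbhost.example.com;ValidateServerCertificate=true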
Configuration Properties
To edit the configuration properties, select the Metadata Manager Service in the Navigator, select the
Properties view, and then click Edit in the Configuration Properties section.
The following table describes the configuration properties for a Metadata Manager Service:
Property
Description
URLScheme
Indicates the security protocol that you configure for the Metadata Manager
application: HTTP or HTTPS.
Keystore File
Keystore file that contains the keys and certificates required if you use the SSL
security protocol with the Metadata Manager application. You must use the same
security protocol for the Metadata Manager Agent if you install it on another
machine.
Keystore Password
MaxConcurrentRequests
MaxQueueLength
Maximum queue length for incoming connection requests when all possible request
processing threads are in use by the Metadata Manager application. Metadata
Manager refuses client requests when the queue is full. Default is 500.
You can use the MaxConcurrentRequests property to set the number of clients that can connect to Metadata
Manager. You can use the MaxQueueLength property to set the number of client requests Metadata Manager
can process at one time.
You can change the parameter values based on the number of clients that you expect to connect to Metadata
Manager. For example, you can use smaller values in a test environment. In a production environment, you
can increase the values. If you increase the values, more clients can connect to Metadata Manager, but the
connections might use more system resources.
Description
Maximum Active
Connections
Number of active connections that are available to the Metadata Manager repository database. The
Metadata Manager application maintains a connection pool for connections to the repository
database.
Increase the number of maximum active connections when you increase the number of
maximum concurrent resource loads. For example, if you set the Max Concurrent Resource
Load property to 10, Informatica recommends that you also set this property to 50 or more.
Default is 20.
Maximum Wait
Time
Amount of time in seconds that Metadata Manager holds database connection requests in
the connection pool. If Metadata Manager cannot process the connection request to the
repository within the wait time, the connection fails.
Default is 180.
Advanced Properties
To edit the advanced properties, select the Metadata Manager Service in the Navigator, select the
Properties view, and then click Edit in the Advanced Properties section.
The following table describes the advanced properties for a Metadata Manager Service:
Property
Description
Max Heap
Size
Amount of RAM in megabytes allocated to the Java Virtual Machine (JVM) that runs Metadata
Manager. Use this property to increase the performance of Metadata Manager.
For example, you can use this value to increase the performance of Metadata Manager during
indexing.
Note: If you create Cloudera Navigator resources, set this property to at least 4096 MB (4 GB).
Default is 1024.
Maximum
Catalog Child
Objects
Number of child objects that appear in the Metadata Manager metadata catalog for any parent
object. The child objects can include folders, logical groups, and metadata objects. Use this
option to limit the number of child objects that appear in the metadata catalog for any parent
object.
Default is 100.
Error Severity
Level
Level of error messages written to the Metadata Manager Service log. Specify one of the
following message levels:
- Fatal
- Error
- Warning
- Info
- Trace
- Debug
When you specify a severity level, the log includes all errors at that level and above. For
example, if the severity level is Warning, the log includes fatal, error, and warning messages.
Use Trace or Debug if Informatica Global Customer Support instructs you to use that logging
level for troubleshooting purposes.
Default is Error.
Max
Concurrent
Resource
Load
Maximum number of resources that Metadata Manager can load simultaneously. Maximum is
10.
Metadata Manager adds resource loads to the load queue in the order that you request the
loads. If you simultaneously load more than the maximum, Metadata Manager adds the
resource loads to the load queue in a random order. For example, you set the property to 5 and
schedule eight resource loads to run at the same time. Metadata Manager adds the eight loads
to the load queue in a random order. Metadata Manager simultaneously processes the first five
resource loads in the queue. The last three resource loads wait in the load queue.
If a resource load succeeds, fails and cannot be resumed, or fails during the path building task
and can be resumed, Metadata Manager removes the resource load from the queue. Metadata
Manager starts processing the next load waiting in the queue.
If a resource load fails when the PowerCenter Integration Service runs the workflows and the
workflows can be resumed, the resource load is resumable. Metadata Manager keeps the
resumable load in the load queue until the timeout interval is exceeded or until you resume the
failed load. Metadata Manager includes a resumable load due to a failure during workflow
processing in the concurrent load count.
Default is 3.
Note: If you increase the number of maximum concurrent resource loads, increase the number
of maximum active connections to the Metadata Manager repository database. For example, if
you set this property to 10, Informatica recommends that you also set the Maximum Active
Connections property to 50 or more.
Timeout
Interval
Amount of time in minutes that Metadata Manager holds a resumable resource load in the load
queue. You can resume a resource load within the timeout period if the load fails when
PowerCenter runs the workflows and the workflows can be resumed. If you do not resume a
failed load within the timeout period, Metadata Manager removes the resource from the load
queue.
Default is 30.
Note: If a resource load fails during the path building task, you can resume the failed load at
any time.
The following table describes the associated PowerCenter Integration Service properties:
Property
Description
Associated Integration
Service
Name of the PowerCenter Integration Service that you want to use with Metadata
Manager.
Repository User Name
Name of the PowerCenter repository user that has the required privileges. Not
available for a domain with Kerberos authentication.
Repository Password
Password for the PowerCenter repository user. Not available for a domain with
Kerberos authentication.
Security Domain
Name of the security domain to which the PowerCenter repository user belongs.
To perform these tasks, the user must have the required privileges and permissions for the domain,
PowerCenter Repository Service, and Metadata Manager Service.
The following table lists the required privileges and permissions that the PowerCenter repository user for the
associated PowerCenter Integration Service must have:
Service
Privileges
Permissions
Domain
PowerCenter
Repository
Service
Metadata Manager
Service
Load Resource
In the PowerCenter repository, the user who creates a folder or connection object is the owner of the object.
The object owner or a user assigned the Administrator role for the PowerCenter Repository Service can
delete repository folders and connection objects. If you change the associated PowerCenter Integration
Service user, you must assign this user as the owner of the following repository objects in the PowerCenter
Client:
The Metadata Load folder and all profiling folders created by the Metadata Manager Service
CHAPTER 9
Informatica Developer. Informatica Developer connects to the Model Repository Service to create, update,
and delete objects. Informatica Developer and Informatica Analyst share objects in the Model repository.
Informatica Analyst. Informatica Analyst connects to the Model Repository Service to create, update, and
delete objects. Informatica Developer and Informatica Analyst client applications share objects in the
Model repository.
Data Integration Service. When you start a Data Integration Service, it connects to the Model Repository
Service. The Data Integration Service connects to the Model Repository Service to run or preview project
components. The Data Integration Service also connects to the Model Repository Service to store runtime metadata in the Model repository. Application configuration and objects within an application are
examples of run-time metadata.
Note: A Model Repository Service can be associated with one Analyst Service and multiple Data Integration
Services.
The following figure shows how a Model repository client connects to the Model repository database:
1. A Model repository client sends a repository connection request to the master gateway node, which is the entry point to the domain.
2. The Service Manager sends back the host name and port number of the node running the Model Repository Service. In the diagram,
the Model Repository Service is running on node A.
3. The repository client establishes a TCP/IP connection with the Model Repository Service process on node A.
4. The Model Repository Service process communicates with the Model repository database over JDBC. The Model Repository Service
process stores objects in or retrieves objects from the Model repository database based on requests from the Model repository client.
Note: The Model repository tables have an open architecture. Although you can view the repository tables,
never manually edit them through other utilities. Informatica is not responsible for corrupted data that is
caused by customer alteration of the repository tables or data within those tables.
Each Model repository must have its own schema. Two Model repositories or the Model repository and the
domain configuration database cannot share the same schema.
Note: The Model Repository Service uses the DataDirect drivers included with the Informatica installation.
Informatica does not support the use of any other database driver.
If the repository is in an IBM DB2 9.7 database, verify that IBM DB2 Version 9.7 Fix Pack 7 or a later fix
pack is installed.
On the IBM DB2 instance where you create the database, set the following parameters to ON:
- DB2_SKIPINSERTED
- DB2_EVALUNCOMMITTED
- DB2_SKIPDELETED
- AUTO_RUNSTATS
Set the following database configuration parameters:
- applheapsz: 8192
- appl_ctl_heap_sz: 8192 (IBM DB2 9.5 only)
- logfilsiz: 8000
- maxlocks: 98
- locklist: 50000
- auto_stmt_stats: ON
Set the NPAGES parameter to at least 5000. The NPAGES parameter determines the number of pages in
the tablespace.
Verify that the database user has CREATETAB, CONNECT, and BINDADD privileges.
Informatica does not support IBM DB2 table aliases for repository tables. Verify that table aliases have not
been created for any tables in the database.
In the DataDirect Connect for JDBC utility, update the DynamicSections parameter to 3000.
The default value for DynamicSections is too low for the Informatica repositories. Informatica requires a
larger DB2 package than the default. When you set up the DB2 database for the domain configuration
repository or a Model repository, you must set the DynamicSections parameter to at least 3000. If the
DynamicSections parameter is set to a lower number, you can encounter problems when you install or run
Informatica services.
For more information about updating the DynamicSections parameter, see Appendix D, Updating the
DynamicSections Parameter of a DB2 Database on page 447.
The database user account must have the CONNECT, CREATE TABLE, and CREATE VIEW privileges.
Verify that the database user has the CONNECT, RESOURCE, and CREATE VIEW privileges.
Informatica does not support Oracle public synonyms for repository tables. Verify that public synonyms
have not been created for any tables in the database.
have one Model Repository Service process configured for each node. The Model Repository Service runs
the Model Repository Service process on the primary node.
Note: When you enable the Model Repository Service, the machine on which the service runs requires at
least 750 MB of free memory. If enough free memory is not available, the service might fail to start.
When you enable a Model Repository Service that runs on a single node, a service process starts on the
node. When you enable a Model Repository Service configured to run on primary and back-up nodes, a
service process is available to run on each node, but it might not start. For example, you have the high
availability option and you configure a Model Repository Service to run on a primary node and two back-up
nodes. You enable the Model Repository Service, which enables a service process on each of the three
nodes. A single process runs on the primary node, and the other processes on the back-up nodes maintain
standby status.
When you disable the Model Repository Service, you shut down the Model Repository Service and disable all
service processes.
When you disable the Model Repository Service, you must choose the mode to disable it in. You can choose
one of the following options:
Complete. Allows the service operations to run to completion before disabling the service.
Abort. Tries to stop all service operations before aborting them and disabling the service.
When you recycle the Model Repository Service, the Service Manager restarts the Model Repository Service.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
On the Manage tab Actions menu, click one of the following options:
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
On the Manage tab Actions menu, click one of the following options:
Disable Process to disable the service process. Choose the mode to disable the service process in.
General properties
Search properties
Advanced properties
Cache properties
Versioning properties
Custom properties
If you update any of the properties, you must restart the Model Repository Service for the modifications to
take effect.
If you modify the repository database for a Model Repository Service that is configured for monitoring, then
you must restart the domain. If you do not restart the domain after you modify the repository database, then
the Model Repository Service does not resume statistics collection.
Description
Name
Name of the service. The name is not case sensitive and must be unique within the
domain. It cannot exceed 128 characters or begin with @. It also cannot contain
spaces or the following special characters:
`~%^*+={}\;:'"/?.,<>|!()][
You cannot change the name of the service after you create it.
Description
License
Node
Backup Nodes
If your license includes high availability, nodes on which the service can run if the
primary node is unavailable.
Description
Database Type
Username
Password
Connection String
The JDBC connection string used to connect to the Model repository database.
Use the following JDBC connect string syntax for each supported database:
- IBM DB2: jdbc:informatica:db2://<host_name>:<port_number>;DatabaseName=<database_name>;BatchPerformanceWorkaround=true;DynamicSections=3000
- Microsoft SQL Server that uses the default instance: jdbc:informatica:sqlserver://<host_name>:<port_number>;DatabaseName=<database_name>;SnapshotSerializable=true
- Microsoft SQL Server that uses a named instance: jdbc:informatica:sqlserver://<host_name>\<named_instance_name>;DatabaseName=<database_name>;SnapshotSerializable=true
- Oracle: jdbc:informatica:oracle://<host_name>:<port_number>;SID=<database_name>;MaxPooledStatements=20;CatalogOptions=0;BatchPerformanceWorkaround=true
If the Model repository database is secured with the SSL protocol, you must enter
the secure database parameters.
Enter the parameters as name=value pairs separated by semicolon characters
(;). For example:
param1=value1;param2=value2
Dialect
The SQL dialect for a particular database. The dialect maps java objects to
database objects.
For example:
org.hibernate.dialect.Oracle9Dialect
Driver
Database Schema
Database Tablespace
The tablespace name for a particular database. For a multi-partition IBM DB2
database, the tablespace must span a single node and a single partition.
Description
EncryptionMethod
ValidateServerCertificate
HostNameInCertificate
Optional. Host name of the machine that hosts the secure database. If you
specify a host name, Informatica validates the host name included in the
connection string against the host name in the SSL certificate.
cryptoProtocolVersion
Required for Oracle if the Informatica domain runs on AIX and the Oracle
database encryption level is set to TLS. Set the parameter to
cryptoProtocolVersion=TLSv1,TLSv1.1,TLSv1.2.
Description
TrustStore
Required. Path and file name of the truststore file that contains the SSL
certificate for the database.
If you do not include the path for the truststore file, Informatica looks for the
file in the following default directory: <Informatica installation
directory>/tomcat/bin
TrustStorePassword
Required. Password for the truststore file for the secure database.
Note: Informatica appends the secure JDBC parameters to the JDBC connection string. If you include the
secure JDBC parameters directly in the connection string, do not enter any parameter in the Secure JDBC
Parameters field.
Description
Search Analyzer
Fully qualified Java class name of the factory class if you used a factory class when
you created a custom search analyzer.
If you use a custom search analyzer, enter the name of either the search analyzer
class or the search analyzer factory class.
Description
Amount of RAM allocated to the Java Virtual Machine (JVM) that runs the Model
Repository Service. Use this property to increase the performance. Append one of
the following letters to the value to specify the units:
- b for bytes.
- k for kilobytes.
- m for megabytes.
- g for gigabytes.
Java Virtual Machine (JVM) command line options to run Java-based programs.
When you configure the JVM options, you must set the Java SDK classpath, Java
SDK minimum memory, and Java SDK maximum memory properties.
You must set the following JVM command line options:
- Xms. Minimum heap size. Default value is 256 m.
- MaxPermSize. Maximum permanent generation size. Default is 128 m.
- Dfile.encoding. File encoding. Default is UTF-8.
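Taken together, these defaults correspond to a command line such as -Xms256m -XX:MaxPermSize=128m -Dfile.encoding=UTF-8, shown here with the standard JVM option prefixes; verify the exact syntax for your environment.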
Description
Enable Cache
Enables the Model Repository Service to store Model repository objects in cache
memory. To apply changes, restart the Model Repository Service.
JVM options for the Model Repository Service cache. To configure the amount of
memory allocated to cache, configure the maximum heap size. This field must
include the maximum heap size, specified by the -Xmx option. The default value
and minimum value for the maximum heap size is -Xmx128m. The options you
configure apply when Model Repository Service cache is enabled. To apply
changes, restart the Model Repository Service. The options you configure in this
field do not apply to the JVM that runs the Model Repository Service.
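For example, to allocate 512 MB of memory to the Model Repository Service cache, you might set this field to -Xmx512m.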
Note: While the Model repository synchronizes its contents with the version control system for the first time,
the Model repository is unavailable. Model repository users must close all editable objects before the process
starts.
The following table describes the versioning properties for the Model Repository Service:
Property
Description
The supported version control system that you want to connect to. You can choose
Perforce or SVN.
Host
The URL, IP address, or host name of the machine where the Perforce version
control system runs.
When you configure SVN as the version control system, this option is not available.
URL
Port
Required. Port number that the version control system host uses to listen for
requests from the Model Repository Service.
Path to the root directory of the version control system that stores the Model
repository objects.
Note: When you complete editing Versioning properties, the Model repository
connects to the version control system and generates the specified directory if the
directory does not exist yet.
Only one Model Repository Service can use this directory.
For Perforce, use the syntax:
//directory/path
where directory is the Perforce directory root, and path is the remainder of the
path to the root directory of Model repository objects.
Example:
//depot/Informatica/repository_copy
When you configure SVN as the version control system, this option is not available.
Note: If you change the depot path after you synchronize the Model repository with
the version control system, version history for objects in the Model repository is lost.
Username
Password
Search properties
Audit properties
Custom properties
Environment variables
Description
The following table describes the performance properties for the Model Repository Service process:
Property
Description
Hibernate Connection
Pool Size
Minimum number of connections a pool will maintain at any given time. Equivalent
to the c3p0 minPoolSize property. Default is 1.
Size of the c3p0 global cache for prepared statements. This property controls the
total number of statements cached. Equivalent to the c3p0 maxStatements property.
Default is 1000.
The Model Repository Service uses the value of this property to set the c3p0
maxStatementsPerConnection property based on the number of connections set in
the Hibernate Connection Pool Size property.
Description
Audit Enabled
Description
Repository Logging
Directory
The directory that stores logs for Log Persistence Configuration or Log Persistence
SQL. To disable the logs, do not specify a logging directory. These logs are not the
repository logs that appear in the Log Viewer. Default is blank.
Log Level
Log Persistence
Configuration to File
Description
Environment Variables
The Model Repository Service fails and the primary node is not available.
The Service Manager restarts the Model Repository Service based on domain property values set for the
amount of time spent trying to restart the service and the maximum number of attempts to try within the
restart period.
Model Repository Service clients are resilient to temporary connection failures during failover and restart of
the service.
2.
3.
To create the repository content, on the Manage tab Actions menu, click Repository Contents >
Create.
4.
Or, to delete repository content, on the Manage tab Actions menu, click Repository Contents >
Delete.
If you delete and create new repository content for a Model Repository Service that is configured for
monitoring, then you must restart the domain after you create new content. If you do not restart the domain,
then the Model Repository Service does not resume statistics collection.
the search index. If you need to recover the repository, you can restore the content of the repository from this
file.
When you back up a repository, the Model Repository Service writes the file to the service backup directory.
The service backup directory is a subdirectory of the node backup directory with the name of the Model
Repository Service. For example, a Model Repository Service named MRS writes repository backup files to
the following location:
<node_backup_directory>\MRS
You specify the node backup directory when you set up the node. View the general properties of the node to
determine the path of the backup directory. The Model Repository Service uses the extension .mrep for all
Model repository backup files.
To ensure that the Model Repository Service creates a consistent backup file, the backup operation blocks all
other repository operations until the backup completes. You might want to schedule repository backups when
users are not logged in.
To restore the backup file of a Model Repository Service to a different Model Repository Service, you must
copy the backup file and place it in the backup directory of the Model Repository Service to which you want to
restore the backup. For example, you want to restore the backup file of a Model Repository Service named
MRS1 to a Model Repository Service named MRS2. You must copy the backup file of MRS1 from
<node_backup_directory>\MRS1 and place the file in <node_backup_directory>\MRS2.
Note: When you back up and then delete the contents of a Model repository, you must restart the Model
Repository Service before you restore the contents from the backup. If you try to restore the Model repository
contents and have not recycled the service, you may get an error related to search indices.
2.
3.
On the Manage tab Actions menu, click Repository Contents > Back Up.
The Back Up Repository Contents dialog box appears.
4.
Description
Username
Password
SecurityDomain
Description
5.
6.
Click OK.
The Model Repository Service writes the backup file to the service backup directory.
2.
3.
On the Manage tab Actions menu, click Repository Contents > Restore.
The Restore Repository Contents dialog box appears.
4.
5.
6.
Option
Description
Username
Password
Security Domain
Click OK.
If the Model Repository Service is configured for monitoring, then you must recycle the Model Repository
Service. If you do not recycle the Model Repository Service, then the service does not resume statistics
collection.
2.
3.
On the Manage tab Actions menu, click Repository Contents > View Backup Files.
The View Repository Backup Files dialog box appears and shows the backup files for the Model
Repository Service.
You can change the default search analyzer. You can use a packaged search analyzer or you can create and
use a custom search analyzer.
The Model Repository Service stores the index files in the search index root directory that you define for the
service process. The Model Repository Service updates the search index files each time a user saves,
modifies, or deletes a Model repository object. You must manually update the search index if you change the
search analyzer, if you create a Model Repository Service to use existing repository content, if you upgrade
the Model Repository Service, or if the search index files become corrupted.
2.
If you use a factory class when you extend the Analyzer class, the factory class implementation must
have a public method with the following signature:
public org.apache.lucene.analysis.Analyzer createAnalyzer(Properties settings)
The Model Repository Service uses the factory to connect to the search analyzer. A sketch of such a factory appears after these steps.
3.
Place the custom search analyzer and required .jar files in the following directory:
<Informatica_Installation_Directory>/services/ModelRepositoryService
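The following Java sketch shows a minimal factory that satisfies this signature. The class name and the returned analyzer are examples only; the sketch assumes a Lucene version in which StandardAnalyzer has a no-argument constructor, and it ignores the settings argument.

import java.util.Properties;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

// Minimal sketch of a search analyzer factory. Only the createAnalyzer signature is required;
// the class name and the choice of analyzer are illustrative.
public class SampleAnalyzerFactory {

    // The Model Repository Service calls this method to obtain the search analyzer.
    public Analyzer createAnalyzer(Properties settings) {
        // This sketch ignores the settings and returns a packaged Lucene analyzer.
        return new StandardAnalyzer();
    }
}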
In the Administrator tool, select the Services and Nodes view on the Manage tab.
2.
3.
To use one of the packaged search analyzers, specify the fully qualified Java class name of the search
analyzer in the Model Repository Service search properties.
4.
To use a custom search analyzer, specify the fully qualified Java class name of either the search
analyzer or the search analyzer factory in the Model Repository Service search properties.
5.
6.
Click Actions > Search Index > Re-Index on the Manage tab Actions menu to re-index the search
index.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
To re-index after changing the search analyzer, creating the Model Repository Service to use existing
repository content, or upgrading the Model Repository Service, click Actions > Search Index > Re-Index on the Manage tab Actions menu.
4.
To correct corrupted search index files, complete the following steps on the Manage tab Actions menu:
a.
Click Actions > Search Index > Delete to delete the corrupted search index.
b.
Click Actions > Search Index > Create to create a search index.
c.
Click Actions > Search Index > Re-Index to re-index the search index.
2.
3.
4.
5.
6.
7.
8.
Specify the level of logging in the Repository Logging Severity Level field.
9.
Click OK.
Creating a project.
Creating a folder.
2.
3.
4.
5.
6.
7.
8.
Click OK.
Configuring Cache
1.
2.
3.
4.
5.
6.
Specify the amount of memory allocated to cache in the Cache JVM Options field.
7.
8.
To retain version history, manually copy the contents of the version control system directory to the new
version control system location, change versioning properties, and then recycle the Model Repository
Service.
To discard version history, change versioning properties, recycle the Model Repository Service, and then
re-synchronize the Model repository with the new version control system type or location.
Note: When you change Model repository properties, you must recycle the Model Repository Service for your
changes to take effect. Ask users to save changes and close Model repository objects that they have open
for editing. While synchronization is in progress, the Model repository is unavailable.
The following image shows the process of configuring, synchronizing, and re-synchronizing the Model
repository with a version control system:
1.
2.
Synchronize the Model repository contents with the version control system.
3.
4.
a.
b.
Change the version control system type and restart the Model Repository Service.
c.
To retain version history, copy the contents of the existing version control system directory to the
new version control system, and configure the Model repository for the new location.
To discard version history, re-synchronize the Model repository to the new version control
system.
If you use Perforce as the version control system, you can change the Perforce host or port number. If
you use Subversion, you can change the URL.
5.
6.
a.
b.
Change the version control system location and restart the Model Repository Service.
c.
To retain version history, copy the contents of the existing version control system directory to the
new version control system location, and configure the Model repository for the new location.
To discard version history, re-synchronize the Model repository to the new version control
system host or URL.
b.
Change the version control system directory and restart the Model Repository Service.
c.
To retain version history, copy the contents of the existing version control system directory to the
new directory, and configure the Model repository for the new location.
To discard version history, re-synchronize the Model repository to the new version control
system directory.
b.
c.
You can perform these tasks from the command line or from the Administrator tool.
Instruct Model repository users to save changes to and close repository objects.
2.
3.
Select the Model repository to synchronize with the version control system.
4.
5.
Click OK.
The Model Repository Service copies the contents of the repository to the version control system
directory. During synchronization, the Model repository is unavailable.
When synchronization is complete, versioning is active for Model repository objects. All Model repository
objects are checked in to the version control system. Users can check out, check in, view version history, and
retrieve historical versions of objects.
After the Model repository is synchronized with the version control system, you cannot disable version control
system integration.
The Perforce version control system fails to check in some objects, with an error about excessively long
object path names.
Due to Windows OS limitations on the number of characters in a file path, Model repository objects with long
path and file names fail when you try to check them in. The Perforce error message reads "Submit aborted"
and says the file path exceeds the internal length limit.
To work around this problem, limit the length of directory names in the path to the Perforce depot, and limit
the length of project, folder, and object names in the Model repository. Shorter names in all instances help
limit the total number of characters in the object path name.
Objects View
You can view and manage repository objects from the Objects tab of the Model Repository Service.
The following image shows the Objects tab with a filter on the Type column:
Note: If a Model repository is not integrated with a version control system, the Checked out on column is
replaced with Locked on, and the Checked out by column is replaced with Locked by.
When you manage Model repository objects, you filter the list of objects and then select an action:
1.
When you open the Objects tab, the display is empty. Enter filter criteria in the filter bar and then click
the Filter icon to get a list of objects to manage. For example, to display a list of objects with Type
names beginning with "ma," type ma in the filter bar, and then click the Filter icon.
2.
Select one or more objects. Then right-click a selected object and select an action, or click one of the
action icons.
To assign the checked-out objects to other team members, complete the following steps:
1.
Filter the list of checked out objects to list all the objects that abcar has checked out.
2.
3.
Select the remainder of the objects and reassign them to user zovar.
Any changes that abcar made are retained. User zovar can continue development on the objects, or
check in the objects without additional changes. User zovar can also choose to undo the check-out of
the objects and lose any changes that abcar made.
The Perforce version control system fails to check in some objects, with an error about excessively long
object path names.
Due to Windows OS limitations on the number of characters in a file path, Model repository objects with long
path and file names fail when you try to check them in. The Perforce error message reads "Submit aborted"
and says the file path exceeds the internal length limit.
To work around this problem, limit the length of directory names in the path to the Perforce depot, and limit
the length of project, folder, and object names in the Model repository. Shorter names in all instances help
limit the total number of characters in the object path name.
Alternatively, you can install Informatica or the Perforce instance on non-Windows hosts that do not have this
limitation.
2.
In the Administrator tool, click the Manage tab > Services and Nodes view.
3.
On the Domain Actions menu, click New > Model Repository Service.
4.
In the properties view, enter the general properties for the Model Repository Service.
5.
Click Next.
6.
7.
8.
9.
10.
Do Not Create New Content. Select this option if the specified database contains existing content for
the Model repository. This is the default.
Create New Content. Select this option to create content for the Model repository in the specified
database.
Click Finish.
If you created the Model Repository Service to use existing content, select the Model Repository Service
in the Navigator, and then click Actions > Search Index > Re-Index on the Manage tab Actions menu.
CHAPTER 10
Enable or disable the PowerCenter Integration Service. Enable the PowerCenter Integration Service to
run sessions and workflows. You might disable the PowerCenter Integration Service to prevent users from
running sessions and workflows while performing maintenance on the machine or modifying the
repository.
Configure normal or safe mode. Configure the PowerCenter Integration Service to run in normal or safe
mode.
Configure the PowerCenter Integration Service properties. Configure the PowerCenter Integration Service
properties to change behavior of the PowerCenter Integration Service.
Configure the associated repository. You must associate a repository with a PowerCenter Integration
Service. The PowerCenter Integration Service uses the mappings in the repository to run sessions and
workflows.
Configure the PowerCenter Integration Service processes. Configure service process properties for each
node, such as the code page and service process variables.
Remove a PowerCenter Integration Service. You may need to remove a PowerCenter Integration Service
if it becomes obsolete.
Based on your license, the PowerCenter Integration Service can be highly available.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
On the Domain Navigator Actions menu, click New > PowerCenter Integration Service.
The New Integration Service dialog box appears.
3.
Description
Name
Description
Property
Description
Location
Domain and folder where the service is created. Click Browse to choose a
different folder. You can also move the PowerCenter Integration Service to a
different folder after you create it.
License
Node
Assign
Grid
Primary Node
Backup Nodes
Associated Repository
Service
Repository Password
Password for the user. Required when you select an associated PowerCenter
Repository Service.
Security Domain
Security domain for the user. Required when you select an associated
PowerCenter Repository Service. To apply changes, restart the PowerCenter
Integration Service.
The Security Domain field appears when the Informatica domain contains an
LDAP security domain.
4.
Click Finish.
You must specify a PowerCenter Repository Service before you can enable the PowerCenter Integration
Service.
You can specify the code page for each PowerCenter Integration Service process node and select the
Enable Service option to enable the service. If you do not specify the code page information now, you
can specify it later. You cannot enable the PowerCenter Integration Service until you assign the code
page for each PowerCenter Integration Service process node.
5.
Click OK.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
Select a process.
5.
6.
7.
Complete. Allows the sessions and workflows to run to completion before shutting down the service.
Stop. Stops all sessions and workflows and then shuts down the service.
Abort. Tries to stop all sessions and workflows before aborting them and shutting down the service.
When you enable the PowerCenter Integration Service, the service starts. The associated PowerCenter
Repository Service must be started before you can enable the PowerCenter Integration Service. If you enable
a PowerCenter Integration Service when the associated PowerCenter Repository Service is not running, the
following error appears:
The Service Manager could not start the service due to the following error: [DOM_10076]
Unable to enable service [<Integration Service>] because of dependent services
[<PowerCenter Repository Service>] are not initialized.
If the PowerCenter Integration Service is unable to start, the Service Manager keeps trying to start the
service until it reaches the maximum restart attempts defined in the domain properties. For example, if you
try to start the PowerCenter Integration Service without specifying the code page for each PowerCenter
Integration Service process, the domain tries to start the service. The service does not start without
specifying a valid code page for each PowerCenter Integration Service process. The domain keeps trying to
start the service until it reaches the maximum number of attempts.
If the service fails to start, review the logs for this PowerCenter Integration Service to determine the reason
for failure and fix the problem. After you fix the problem, you must disable and re-enable the PowerCenter
Integration Service to start it.
To enable or disable a PowerCenter Integration Service:
1.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
On the Manage tab Actions menu, select Disable Service to disable the service or select Enable
Service to enable the service.
4.
To disable and immediately enable the PowerCenter Integration Service, select Recycle.
Operating Mode
You can run the PowerCenter Integration Service in normal or safe operating mode. Normal mode provides
full access to users with permissions and privileges to use a PowerCenter Integration Service. Safe mode
limits user access to the PowerCenter Integration Service and workflow activity during environment migration
or PowerCenter Integration Service maintenance activities.
Run the PowerCenter Integration Service in normal mode during daily operations. In normal mode, users with
workflow privileges can run workflows and get session and workflow information for workflows assigned to
the PowerCenter Integration Service.
You can configure the PowerCenter Integration Service to run in safe mode or to fail over in safe mode.
When you enable the PowerCenter Integration Service to run in safe mode or when the PowerCenter
Integration Service fails over in safe mode, it limits access and workflow activity to allow administrators to
perform migration or maintenance activities.
Run the PowerCenter Integration Service in safe mode to control which workflows a PowerCenter Integration
Service runs and which users can run workflows during migration and maintenance activities. Run in safe
mode to verify a production environment, manage workflow schedules, or maintain a PowerCenter Integration
Service. In safe mode, users that have the Administrator role for the associated PowerCenter Repository
Service can run workflows and get information about sessions and workflows assigned to the PowerCenter
Integration Service.
Normal Mode
When you enable a PowerCenter Integration Service to run in normal mode, the PowerCenter Integration
Service begins running scheduled workflows. It also completes workflow failover for any workflows that failed
while in safe mode, recovers client requests, and recovers any workflows configured for automatic recovery
that failed in safe mode.
Users with workflow privileges can run workflows and get session and workflow information for workflows
assigned to the PowerCenter Integration Service.
When you change the operating mode from safe to normal, the PowerCenter Integration Service begins
running scheduled workflows and completes workflow failover and workflow recovery for any workflows
configured for automatic recovery. You can use the Administrator tool to view the log events about the
scheduled workflows that started, the workflows that failed over, and the workflows recovered by the
PowerCenter Integration Service.
Safe Mode
In safe mode, access to the PowerCenter Integration Service is limited. You can configure the PowerCenter
Integration Service to run in safe mode or to fail over in safe mode:
Enable in safe mode. Enable the PowerCenter Integration Service in safe mode to perform migration or
maintenance activities. When you enable the PowerCenter Integration Service in safe mode, you limit
access to the PowerCenter Integration Service.
When you enable a PowerCenter Integration Service in safe mode, you can choose to have the
PowerCenter Integration Service complete, abort, or stop running workflows. In addition, the operating
mode on failover also changes to safe.
Fail over in safe mode. Configure the PowerCenter Integration Service process to fail over in safe mode
during migration or maintenance activities. When the PowerCenter Integration Service process fails over
to a backup node, it restarts in safe mode and limits workflow activity and access to the PowerCenter
Integration Service. The PowerCenter Integration Service restores the state of operations for any
workflows that were running when the service process failed over, but does not fail over or automatically
recover the workflows. You can manually recover the workflow.
After the PowerCenter Integration Service fails over in safe mode during normal operations, you can
correct the error that caused the PowerCenter Integration Service process to fail over and restart the
service in normal mode.
The behavior of the PowerCenter Integration Service when it fails over in safe mode is the same as when you
enable the PowerCenter Integration Service in safe mode. All scheduled workflows, including workflows
scheduled to run continuously or start on service initialization, do not run. The PowerCenter Integration
Service does not fail over schedules or workflows, does not automatically recover workflows, and does not
recover client requests.
Test a development environment. Run the PowerCenter Integration Service in safe mode to test a
development environment before migrating to production. You can run workflows that contain session and
command tasks to test the environment. Run the PowerCenter Integration Service in safe mode to limit
access to the PowerCenter Integration Service when you run the test sessions and command tasks.
Manage workflow schedules. During migration, you can unschedule workflows that only run in a
development environment. You can enable the PowerCenter Integration Service in safe mode, unschedule
the workflow, and then enable the PowerCenter Integration Service in normal mode. After you enable the
service in normal mode, the workflows that you unscheduled do not run.
Troubleshoot the PowerCenter Integration Service. Configure the PowerCenter Integration Service to fail
over in safe mode and troubleshoot errors when you migrate or test a production environment configured
for high availability. After the PowerCenter Integration Service fails over in safe mode, you can correct the
error that caused the PowerCenter Integration Service to fail over.
Perform maintenance on the PowerCenter Integration Service. When you perform maintenance on a
PowerCenter Integration Service, you can limit the users who can run workflows. You can enable the
PowerCenter Integration Service in safe mode, change PowerCenter Integration Service properties, and
verify the PowerCenter Integration Service functionality before allowing other users to run workflows. For
example, you can use safe mode to test changes to the paths for PowerCenter Integration Service files for
PowerCenter Integration Service processes.
Workflow Tasks
The following table describes the tasks that users with the Administrator role can perform when the
PowerCenter Integration Service runs in safe mode:
Task
Task Description
Run workflows.
Start, stop, abort, and recover workflows. The workflows may contain session or
command tasks required to test a development or production environment.
Unschedule workflows.
Monitor PowerCenter
Integration Service
properties.
Recover workflows.
Workflow schedules. Scheduled workflows remain scheduled, but they do not run if the PowerCenter
Integration Service is running in safe mode. This includes workflows scheduled to run continuously and
run on service initialization.
Workflow schedules do not fail over when a PowerCenter Integration Service fails over in safe mode. For
example, you configure a PowerCenter Integration Service to fail over in safe mode. The PowerCenter
Integration Service process fails for a workflow scheduled to run five times, and it fails over after it runs
the workflow three times. The PowerCenter Integration Service does not complete the remaining
workflows when it fails over to the backup node. The PowerCenter Integration Service completes the
workflows when you enable the PowerCenter Integration Service in safe mode.
Workflow failover. When a PowerCenter Integration Service process fails over in safe mode, workflows do
not fail over. The PowerCenter Integration Service restores the state of operations for the workflow. When
you enable the PowerCenter Integration Service in normal mode, the PowerCenter Integration Service
fails over the workflow and recovers it based on the recovery strategy for the workflow.
Workflow recovery. The PowerCenter Integration Service does not recover workflows when it runs in safe
mode or when the operating mode changes from normal to safe.
The PowerCenter Integration Service recovers a workflow that failed over in safe mode when you change
the operating mode from safe to normal, depending on the recovery strategy for the workflow. For
example, you configure a workflow for automatic recovery and you configure the PowerCenter Integration
Service to fail over in safe mode. If the PowerCenter Integration Service process fails over, the workflow is
not recovered while the PowerCenter Integration Service runs in safe mode. When you enable the
PowerCenter Integration Service in normal mode, the workflow fails over and the PowerCenter Integration
Service recovers it.
You can manually recover the workflow if the workflow fails over in safe mode. You can recover the
workflow after the resilience timeout for the PowerCenter Integration Service expires.
Client request recovery. The PowerCenter Integration Service does not recover client requests when it
fails over in safe mode. For example, you stop a workflow and the PowerCenter Integration Service
process fails over before the workflow stops. The PowerCenter Integration Service process does not
recover your request to stop the workflow when the workflow fails over.
When you enable the PowerCenter Integration Service in normal mode, it recovers the client requests.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
5.
To run the PowerCenter Integration Service in normal mode, set OperatingMode to Normal.
To run the service in safe mode, set OperatingMode to Safe.
6.
7.
Click OK.
8.
The PowerCenter Integration Service starts in the selected mode. The service status at the top of the content
pane indicates when the service has restarted.
General properties. Assign a license and configure the PowerCenter Integration Service to run on a grid
or nodes.
PowerCenter Integration Service properties. Set the values for the PowerCenter Integration Service
variables.
Advanced properties. Configure advanced properties that determine security and control the behavior of
sessions and logs.
Operating mode configuration. Set the PowerCenter Integration Service to start in normal or safe mode
and to fail over in normal or safe mode.
Compatibility and database properties. Configure the source and target database properties, such as the
maximum number of connections, and configure properties to enable compatibility with previous versions
of PowerCenter.
Configuration properties. Configure the configuration properties, such as the data display format.
HTTP proxy properties. Configure the connection to the HTTP proxy server.
Custom properties. Configure custom properties that are unique to specific environments.
To view the properties, select the PowerCenter Integration Service in the Navigator and click Properties view.
To modify the properties, edit the section for the property you want to modify.
General Properties
The amount of system resources that the PowerCenter Integration Service uses depends on how you set up
the PowerCenter Integration Service. You can configure a PowerCenter Integration Service to run on a grid
or on nodes. You can view the system resource usage of the PowerCenter Integration Service using the
PowerCenter Workflow Monitor.
When you use a grid, the PowerCenter Integration Service distributes workflow tasks and session threads
across multiple nodes. You can increase performance when you run sessions and workflows on a grid. If you
choose to run the PowerCenter Integration Service on a grid, select the grid. You must have the server grid
option to run the PowerCenter Integration Service on a grid. You must create the grid before you can select
the grid.
If you configure the PowerCenter Integration Service to run on nodes, choose one or more PowerCenter
Integration Service process nodes. If you have only one node and it becomes unavailable, the domain cannot
accept service requests. With the high availability option, you can run the PowerCenter Integration Service on
multiple nodes. To run the service on multiple nodes, choose the primary and backup nodes.
To edit the general properties, select the PowerCenter Integration Service in the Navigator, and then click the
Properties view. Edit the General Properties section. To apply changes, restart the PowerCenter
Integration Service.
The following table describes the general properties for the service:
Property
Description
Name
Name of the service. The name is not case sensitive and must be unique within the domain.
It cannot exceed 128 characters or begin with @. It also cannot contain spaces or the
following special characters:
`~%^*+={}\;:'"/?.,<>|!()][
You cannot change the name of the service after you create it.
Description
License
Assign
Grid
Name of the grid on which the PowerCenter Integration Service runs. Required if you run the
PowerCenter Integration Service on a grid.
Primary Node
Primary node on which the PowerCenter Integration Service runs. Required if you run the
PowerCenter Integration Service on nodes and you specify at least one backup node. You
can select any node in the domain.
Backup Node
Backup node on which the PowerCenter Integration Service can run. If the primary node
becomes unavailable, the PowerCenter Integration Service runs on a backup node. You can
select multiple nodes as backup nodes. Available if you have the high availability option and
you run the PowerCenter Integration Service on nodes.
Description
DataMovementMode
$PMSuccessEmailUser
Service variable that specifies the email address of the user to receive email
messages when a session completes successfully. Use this variable for the
Email User Name attribute for success email. If multiple email addresses are
associated with a single user, messages are sent to all of the addresses.
If the Integration Service runs on UNIX, you can enter multiple email addresses
separated by a comma. If the Integration Service runs on Windows, you can
enter multiple email addresses separated by a semicolon or use a distribution
list. The PowerCenter Integration Service does not expand this variable when
you use it for any other email type.
$PMFailureEmailUser
Service variable that specifies the email address of the user to receive email
messages when a session fails to complete. Use this variable for the Email User
Name attribute for failure email. If multiple email addresses are associated with
a single user, messages are sent to all of the addresses.
If the Integration Service runs on UNIX, you can enter multiple email addresses
separated by a comma. If the Integration Service runs on Windows, you can
enter multiple email addresses separated by a semicolon or use a distribution
list. The PowerCenter Integration Service does not expand this variable when
you use it for any other email type.
$PMSessionLogCount
Service variable that specifies the number of session logs the PowerCenter
Integration Service archives for the session.
Minimum value is 0. Default is 0.
$PMWorkflowLogCount
Service variable that specifies the number of workflow logs the PowerCenter
Integration Service archives for the workflow.
Minimum value is 0. Default is 0.
$PMSessionErrorThreshold
Service variable that specifies the number of non-fatal errors the PowerCenter
Integration Service allows before failing the session. Non-fatal errors include
reader, writer, and DTM errors. If you want to stop the session on errors, enter
the number of non-fatal errors you want to allow before stopping the session.
The PowerCenter Integration Service maintains an independent error count for
each source, target, and transformation. Use to configure the Stop On option in
the session properties.
Defaults to 0. If you use the default setting 0, non-fatal errors do not cause the
session to stop.
Advanced Properties
You can configure the properties that control the behavior of PowerCenter Integration Service security,
sessions, and logs. To edit the advanced properties, select the PowerCenter Integration Service in the
Navigator, and then click the Properties view. Edit the Advanced Properties section.
The following table describes the advanced properties:
Property
Description
Level of error logging for the domain. These messages are written to the Log
Manager and log files. Specify one of the following message levels:
-
Default is INFO.
Resilience Timeout
Limit on Resilience
Timeouts
Number of seconds that the service holds on to resources for resilience purposes.
This property places a restriction on clients that connect to the service. Any
resilience timeouts that exceed the limit are cut off at the limit. If blank, the value is
derived from the domain-level settings.
Valid values are between 0 and 2,592,000, inclusive. Default is 180 seconds.
Appends a timestamp to messages that are written to the workflow log. Default is
No.
Allow Debugging
Allows you to run debugger sessions from the Designer. Default is Yes.
LogsInUTF8
TrustStore
Enables the use of operating system profiles. You can select this option if the
PowerCenter Integration Service runs on UNIX. To apply changes, restart the
PowerCenter Integration Service.
Enter the value for TrustStore using the following syntax:
<path>/<filename>
For example:
./Certs/trust.keystore
ClientStore
Property
Description
JCEProvider
IgnoreResourceRequirements
Ignores task resource requirements when distributing tasks across the nodes of a
grid. Used when the PowerCenter Integration Service runs on a grid. Ignored when
the PowerCenter Integration Service runs on a node.
Enable this option to cause the Load Balancer to ignore task resource requirements.
It distributes tasks to available nodes whether or not the nodes have the resources
required to run the tasks.
Disable this option to cause the Load Balancer to match task resource requirements
with node resource availability when distributing tasks. It distributes tasks to nodes
that have the required resources.
Default is Yes.
Level of run-time information stored in the repository. Specify one of the following
levels:
- None. PowerCenter Integration Service does not store any session or workflow runtime information in the repository.
- Normal. PowerCenter Integration Service stores workflow details, task details, session
statistics, and source and target statistics in the repository. Default is Normal.
- Verbose. PowerCenter Integration Service stores workflow details, task details,
session statistics, source and target statistics, partition details, and performance
details in the repository.
To store session performance details in the repository, you must also configure the
session to collect performance details and write them to the repository.
The PowerCenter Workflow Monitor shows run-time statistics stored in the
repository.
Property
Description
Flushes session recovery data for the recovery file from the operating system buffer
to the disk. For real-time sessions, the PowerCenter Integration Service flushes the
recovery data after each flush latency interval. For all other sessions, the
PowerCenter Integration Service flushes the recovery data after each commit
interval or user-defined commit. Use this property to prevent data loss if the
PowerCenter Integration Service is not able to write recovery data for the recovery
file to the disk.
Specify one of the following levels:
- Auto. PowerCenter Integration Service flushes recovery data for all real-time sessions
with a JMS or WebSphere MQ source and a non-relational target.
- Yes. PowerCenter Integration Service flushes recovery data for all sessions.
- No. PowerCenter Integration Service does not flush recovery data. Select this option if
you have highly available external systems or if you need to optimize performance.
Description
OperatingMode
OperatingModeOnFailover
Operating mode of the PowerCenter Integration Service when the service process
fails over to another node.
Description
PMServer3XCompatibility
JoinerSourceOrder6xCompatibility
AggregateTreatRowAsInsert
DateHandling40Compatibility
TreatCHARasCHARonRead
Property
Description
NumOfDeadlockRetries
DeadlockSleep
Configuration Properties
You can configure session and miscellaneous properties, such as whether to enforce code page
compatibility.
To edit the configuration properties, select the PowerCenter Integration Service in the Navigator, and then
click the Properties view > Configuration Properties > Edit.
The following table describes the configuration properties:
Property
Description
XMLWarnDupRows
Writes duplicate row warnings and duplicate rows for XML targets to the
session log.
Default is Yes.
CreateIndicatorFiles
Creates indicator files when you run a workflow with a flat file target.
Default is No.
Property
Description
OutputMetaDataForFF
TreatDBPartitionAsPassThrough
ExportSessionLogLibName
TreatNullInComparisonOperatorsAs
Default is NULL.
WriterWaitTimeOut
Property
Description
DateDisplayFormat
ValidateDataCodePages
Description
HttpProxyServer
HttpProxyPort
HttpProxyUser
Authenticated user name for the HTTP proxy server. This is required if the proxy server
requires authentication.
HttpProxyPassword
Password for the authenticated user. This is required if the proxy server requires
authentication.
HttpProxyDomain
Operating system user name. Configure the operating system user that the PowerCenter Integration
Service uses to run workflows.
Service process variables. Configure service process variables in the operating system profile to specify
different output file locations based on the profile assigned to the workflow.
Environment variables. Configure environment variables that the PowerCenter Integration Services uses
at run time.
On UNIX, verify that setuid is enabled on the file system that contains the Informatica installation. If
necessary, remount the file system with setuid enabled.
2.
Enable operating system profiles in the advanced properties section of the PowerCenter Integration
Service properties.
Note: You can use the default umask value 0022. Or, set the value to 0027 or 0077 for better security.
3.
Configure pmimpprocess on every node where the PowerCenter Integration Service runs. pmimpprocess
is a tool that the DTM process, command tasks, and parameter files use to switch between operating
system users.
4.
Create the operating system profiles on the Security page of the Administrator tool.
On the Security tab Actions menu, select Configure operating system profiles.
5.
6.
To configure pmimpprocess:
1.
2.
Enter the following information at the command line to log in as the administrator user:
Enter the following commands to set the owner and group to the administrator user:
chown <administrator user name> pmimpprocess
chgrp <administrator user name> pmimpprocess
4.
pmimpprocess
pmimpprocess
Description
Associated
Repository Service
Repository User
Name
User name to access the repository. To apply changes, restart the PowerCenter
Integration Service.
Not available for a domain with Kerberos authentication.
Repository
Password
Password for the user. To apply changes, restart the PowerCenter Integration Service.
Security Domain
Security domain for the user. To apply changes, restart the PowerCenter Integration
Service.
The Security Domain field appears when the Informatica domain contains an LDAP
security domain.
General properties
Custom properties
Environment variables
General properties include the code page and directories for PowerCenter Integration Service files and Java
components.
To configure the properties, select the PowerCenter Integration Service in the Administrator tool and click the
Processes view. When you select a PowerCenter Integration Service process, the detail panel displays the
properties for the service process.
Code Pages
You must specify the code page of each PowerCenter Integration Service process node. The node where the
process runs uses the code page when it extracts, transforms, or loads data.
Before you can select a code page for a PowerCenter Integration Service process, you must select an
associated repository for the PowerCenter Integration Service. The code page for each PowerCenter
Integration Service process node must be a subset of the repository code page. When you edit this property,
the field displays code pages that are a subset of the associated PowerCenter Repository Service code page.
When you configure the PowerCenter Integration Service to run on a grid or a backup node, you can use a
different code page for each PowerCenter Integration Service process node. However, all code pages for
the PowerCenter Integration Service process nodes must be compatible.
Configuring $PMRootDir
When you configure the PowerCenter Integration Service process variables, you specify the paths for the root
directory and its subdirectories. You can specify an absolute directory for the service process variables. Make
sure all directories specified for service process variables exist before running a workflow.
Set the root directory in the $PMRootDir service process variable. The syntax for $PMRootDir is different for
Windows and UNIX:
On Windows, enter a path beginning with a drive letter, colon, and backslash. For example:
C:\Informatica\<infa_version>\server\infa_shared
On UNIX, enter an absolute path beginning with a slash. For example:
/Informatica/<infa_version>/server/infa_shared
You can use $PMRootDir to define subdirectories for other service process variable values. For example, set
the $PMSessionLogDir service process variable to $PMRootDir/SessLogs.
Recovery also fails when nodes use the following drives for the storage directory:
To use the mapped or mounted drives successfully, both nodes must use the same drive.
Java transformation
General Properties
The following table describes the general properties:
Property
Description
Codepage
$PMRootDir
Root directory accessible by the node. This is the root directory for other service
process variables. It cannot include the following special characters:
*?<>|,
Default is <Installation_Directory>\server\infa_shared.
The installation directory is based on the service version of the service that you
created. When you upgrade the PowerCenter Integration Service, the $PMRootDir is
not updated to the upgraded service version installation directory.
$PMSessionLogDir
Default directory for session logs. It cannot include the following special characters:
*?<>|,
Default is $PMRootDir/SessLogs.
$PMBadFileDir
Default directory for reject files. It cannot include the following special characters:
*?<>|,
Default is $PMRootDir/BadFiles.
Property
Description
$PMCacheDir
$PMTargetFileDir
Default directory for target files. It cannot include the following special characters:
*?<>|,
Default is $PMRootDir/TgtFiles.
$PMSourceFileDir
Default directory for source files. It cannot include the following special characters:
*?<>|,
Default is $PMRootDir/SrcFiles.
Note: If you use Metadata Manager, use the default value. Metadata Manager stores
transformed metadata for packaged resource types in files in the $PMRootDir/SrcFiles
directory. If you change this property, Metadata Manager cannot retrieve the
transformed metadata when you load a packaged resource.
$PMExtProcDir
Default directory for external procedures. It cannot include the following special
characters:
*?<>|,
Default is $PMRootDir/ExtProc.
$PMTempDir
Default directory for temporary files. It cannot include the following special characters:
*?<>|,
Default is $PMRootDir/Temp.
$PMWorkflowLogDir
Default directory for workflow logs. It cannot include the following special characters:
*?<>|,
Default is $PMRootDir/WorkflowLogs.
$PMLookupFileDir
Default directory for lookup files. It cannot include the following special characters:
*?<>|,
Default is $PMRootDir/LkpFiles.
$PMStorageDir
Default directory for state of operation files. The PowerCenter Integration Service uses
these files for recovery if you have the high availability option or if you enable a
workflow for recovery. These files store the state of each workflow and session
operation. It cannot include the following special characters:
*?<>|,
Default is $PMRootDir/Storage.
Java SDK classpath. You can set the classpath to any JAR files you need to run a
session that requires Java components. The PowerCenter Integration Service appends
the values you set to the system CLASSPATH. For more information, see Directories
for Java Components on page 239.
Property
Description
Environment Variables
The database client path on a node is controlled by an environment variable.
Set the database client path environment variable for the PowerCenter Integration Service process if the
PowerCenter Integration Service process requires a different database client than another PowerCenter
Integration Service process that is running on the same node. For example, the service version of each
PowerCenter Integration Service running on the node requires a different database client version. You can
configure each PowerCenter Integration Service process to use a different value for the database client
environment variable.
The database client code page on a node is usually controlled by an environment variable. For example,
Oracle uses NLS_LANG, and IBM DB2 uses DB2CODEPAGE. All PowerCenter Integration Services and
PowerCenter Repository Services that run on this node use the same environment variable. You can
configure a PowerCenter Integration Service process to use a different value for the database client code
page environment variable than the value set for the node.
You might want to configure the code page environment variable for a PowerCenter Integration Service
process for the following reasons:
A PowerCenter Integration Service and PowerCenter Repository Service running on the node require
different database client code pages. For example, you have a Shift-JIS repository that requires that the
code page environment variable be set to Shift-JIS. However, the PowerCenter Integration Service reads
from and writes to databases using the UTF-8 code page. The PowerCenter Integration Service requires
that the code page environment variable be set to UTF-8.
Set the environment variable on the node to Shift-JIS. Then add the environment variable to the
PowerCenter Integration Service process properties and set the value to UTF-8.
Multiple PowerCenter Integration Services running on the node use different data movement modes. For
example, you have one PowerCenter Integration Service running in Unicode mode and another running in
ASCII mode on the same node. The PowerCenter Integration Service running in Unicode mode requires
that the code page environment variable be set to UTF-8. For optimal performance, the PowerCenter
Integration Service running in ASCII mode requires that the code page environment variable be set to 7-bit ASCII.
Set the environment variable on the node to UTF-8. Then add the environment variable to the properties
of the PowerCenter Integration Service process running in ASCII mode and set the value to 7-bit ASCII.
If the PowerCenter Integration Service uses operating system profiles, environment variables configured in
the operating system profile override the environment variables set in the general properties for the
PowerCenter Integration Service process.
2.
3.
Configure the PowerCenter Integration Service processes for the nodes in the grid. If the PowerCenter
Integration Service uses operating system profiles, all nodes on the grid must run on UNIX.
4.
Assign resources to nodes. You assign resources to a node to allow the PowerCenter Integration
Service to match the resources required to run a task or session thread with the resources available on a
node.
After you configure the grid and PowerCenter Integration Service, you configure a workflow to run on the
PowerCenter Integration Service assigned to a grid.
Creating a Grid
To create a grid, create the grid object and assign nodes to the grid. You can assign a node to more than one
grid.
When you create a grid for the Data Integration Service, the nodes assigned to the grid must have specific
roles depending on the types of jobs that the Data Integration Service runs. For more information, see Grid
Configuration by Job Type on page 125.
1.
2.
3.
4.
5.
Description
Name
Name of the grid. The name is not case sensitive and must be unique within the domain. It
cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following
special characters:
`~%^*+={}\;:'"/?.,<>|!()][
Description
Nodes
Path
6.
Click OK.
In the Administrator tool, select the PowerCenter Integration Service Properties tab.
2.
3.
Select the grid you want to assign to the PowerCenter Integration Service.
Verify the shared storage location. Verify that the shared storage location is accessible to each node in
the grid. If the PowerCenter Integration Service uses operating system profiles, the operating system user
must have access to the shared storage location.
Configure the service process. Configure $PMRootDir to the shared location on each node in the grid.
Configure service process variables with identical absolute paths to the shared directories on each node
in the grid. If the PowerCenter Integration Service uses operating system profiles, the service process
variables you define in the operating system profile override the service process variable setting for every
node. The operating system user must have access to the $PMRootDir configured in the operating system
profile on every node in the grid.
2.
3.
4.
Configure the following service process settings for each node in the grid:
Code pages. For accurate data movement and transformation, verify that the code pages are
compatible for each service process. Use the same code page for each node where possible.
Service process variables. Configure the service process variables the same for each service
process. For example, the setting for $PMCacheDir must be identical on each node in the grid.
Directories for Java components. Point to the same Java directory to ensure that Java components
are available to objects that access Java, such as Custom transformations that use Java coding.
Resources
Informatica resources are the database connections, files, directories, node names, and operating system
types required by a task. You can configure the PowerCenter Integration Service to check resources. When
you do this, the Load Balancer matches the resources available to nodes in the grid with the resources
required by the workflow. It dispatches tasks in the workflow to nodes where the required resources are
available. If the PowerCenter Integration Service is not configured to run on a grid, the Load Balancer ignores
resource requirements.
For example, if a session uses a parameter file, it must run on a node that has access to the file. You create
a resource for the parameter file and make it available to one or more nodes. When you configure the
session, you assign the parameter file resource as a required resource. The Load Balancer dispatches the
Session task to a node that has the parameter file resource. If no node has the parameter file resource
available, the session fails.
Resources for a node can be predefined or user-defined. Informatica creates predefined resources during
installation. Predefined resources include the connections available on a node, node name, and operating
system type. When you create a node, all connection resources are available by default. Disable the
connection resources that are not available on the node. For example, if the node does not have Oracle client
libraries, disable the Oracle Application connections. If the Load Balancer dispatches a task to a node where
the required resources are not available, the task fails. You cannot disable or remove node name or
operating system type resources.
User-defined resources include file/directory and custom resources. Use file/directory resources for
parameter files or file server directories. Use custom resources for any other resources available to the node,
such as database client version.
The following table lists the types of resources you use in Informatica:
Type
Predefined/
UserDefined
Description
Connection
Predefined
Node Name
Predefined
Operating
System Type
Predefined
Custom
User-defined
File/Directory
User-defined
Any resource for files or directories, such as a parameter file or a file server
directory.
For example, a Session task requires a file resource if it accesses a session
parameter file.
You configure resources required by Session, Command, and predefined Event-Wait tasks in the task
properties.
You define resources available to a node on the Resources tab of the node in the Administrator tool.
Note: When you define a resource for a node, you must verify that the resource is available to the node. If
the resource is not available and the PowerCenter Integration Service runs a task that requires the resource,
the task fails.
You can view the resources available to all nodes in a domain on the Resources view of the domain. The
Administrator tool displays a column for each node. It displays a checkmark when a resource is available for
a node.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
5.
On the Manage tab Actions menu, click Enable Selected Resource or Disable Selected Resource.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
5.
6.
7.
Click OK.
To remove a custom or file/directory resource, select a resource and click Delete Selected Resource on
the Manage tab Actions menu.
1.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
Troubleshooting a Grid
I changed the nodes assigned to the grid, but the Integration Service to which the grid is assigned does
not show the latest Integration Service processes.
When you change the nodes in a grid, the Service Manager performs the following transactions in the domain
configuration database:
1.
Updates the grid based on the node changes. For example, if you add a node, the node appears in the
grid.
2.
Updates the Integration Services to which the grid is assigned. All nodes with the service role in the grid
appear as service processes for the Integration Service.
If the Service Manager cannot update an Integration Service and the latest service processes do not appear
for the Integration Service, restart the Integration Service. If that does not work, reassign the grid to the
Integration Service.
Dispatch mode. The dispatch mode determines how the Load Balancer dispatches tasks. You can
configure the Load Balancer to dispatch tasks in a simple round-robin fashion, in a round-robin fashion
using node load metrics, or to the node with the most available computing resources.
Service level. Service levels establish dispatch priority among tasks that are waiting to be dispatched. You
can create different service levels that a workflow developer can assign to workflows.
You configure the following Load Balancer settings for each node:
Resources. When the PowerCenter Integration Service runs on a grid, the Load Balancer can compare
the resources required by a task with the resources available on each node. The Load Balancer
dispatches tasks to nodes that have the required resources. You assign required resources in the task
properties. You configure available resources using the Administrator tool or infacmd.
CPU profile. In adaptive dispatch mode, the Load Balancer uses the CPU profile to rank the computing
throughput of each CPU and bus architecture in a grid. It uses this value to ensure that more powerful
nodes get precedence for dispatch.
Resource provision thresholds. The Load Balancer checks one or more resource provision thresholds to
determine if it can dispatch a task. The Load Balancer checks different thresholds depending on the
dispatch mode.
Round-robin. The Load Balancer dispatches tasks to available nodes in a round-robin fashion. It checks
the Maximum Processes threshold on each available node and excludes a node if dispatching a task
causes the threshold to be exceeded. This mode is the least compute-intensive and is useful when the
load on the grid is even and the tasks to dispatch have similar computing requirements. A sketch of this
node selection appears after the mode descriptions.
Metric-based. The Load Balancer evaluates nodes in a round-robin fashion. It checks all resource
provision thresholds on each available node and excludes a node if dispatching a task causes the
thresholds to be exceeded. The Load Balancer continues to evaluate nodes until it finds a node that can
accept the task. This mode prevents overloading nodes when tasks have uneven computing requirements.
Adaptive. The Load Balancer ranks nodes according to current CPU availability. It checks all resource
provision thresholds on each available node and excludes a node if dispatching a task causes the
thresholds to be exceeded. This mode prevents overloading nodes and ensures the best performance on
a grid that is not heavily loaded.
The dispatch modes compare as follows:
Round-Robin. Checks the Maximum Processes threshold only. Does not use task statistics or the CPU profile. Does not allow bypass in the dispatch queue.
Metric-Based. Checks all resource provision thresholds. Uses task statistics. Does not use the CPU profile. Does not allow bypass in the dispatch queue.
Adaptive. Checks all resource provision thresholds. Uses task statistics and the CPU profile. Allows bypass in the dispatch queue.
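The following Python sketch is an illustrative model of how the dispatch modes differ when selecting a node; it is not PowerCenter code. The node names, process counts, and CPU figures are assumptions chosen for the example.

    # Illustrative model of node selection under different dispatch modes.
    # Node names, process counts, and CPU figures are assumptions for the example.
    nodes = [
        {"name": "node1", "running_processes": 4, "max_processes": 10, "cpu_available_pct": 35},
        {"name": "node2", "running_processes": 9, "max_processes": 10, "cpu_available_pct": 80},
        {"name": "node3", "running_processes": 2, "max_processes": 10, "cpu_available_pct": 55},
    ]

    def round_robin(start=0):
        # Check nodes in rotation order and dispatch to the first node whose
        # Maximum Processes threshold is not exceeded by one more task.
        for i in range(len(nodes)):
            node = nodes[(start + i) % len(nodes)]
            if node["running_processes"] + 1 <= node["max_processes"]:
                return node["name"]
        return None  # every node is at its Maximum Processes threshold

    def adaptive():
        # Rank nodes by current CPU availability and take the most available node
        # that passes the threshold check. Metric-based mode keeps the round-robin
        # rotation but checks all three thresholds instead of only Maximum Processes.
        eligible = [n for n in nodes if n["running_processes"] + 1 <= n["max_processes"]]
        return max(eligible, key=lambda n: n["cpu_available_pct"])["name"] if eligible else None

    print(round_robin())  # node1: next node in rotation with capacity
    print(adaptive())     # node2: most available CPU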
The Load Balancer dispatches tasks for execution in the order the Workflow Manager or scheduler submits
them. The Load Balancer does not bypass any tasks in the dispatch queue. Therefore, if a resource intensive
task is first in the dispatch queue, all other tasks with the same service level must wait in the queue until the
Load Balancer dispatches the resource intensive task.
In adaptive dispatch mode, the order in which the Load Balancer dispatches tasks from the dispatch queue
depends on the task requirements and dispatch priority. For example, if multiple tasks with the same service
level are waiting in the dispatch queue and adequate computing resources are not available to run a resource
intensive task, the Load Balancer reserves a node for the resource intensive task and keeps dispatching less
intensive tasks to other nodes.
Service Levels
Service levels establish priorities among tasks that are waiting to be dispatched.
When the Load Balancer has more tasks to dispatch than the PowerCenter Integration Service can run at the
time, the Load Balancer places those tasks in the dispatch queue. When multiple tasks are waiting in the
dispatch queue, the Load Balancer uses service levels to determine the order in which to dispatch tasks from
the queue.
Service levels are domain properties. Therefore, you can use the same service levels for all repositories in a
domain. You create and edit service levels in the domain properties or using infacmd.
When you create a service level, a workflow developer can assign it to a workflow in the Workflow Manager.
All tasks in a workflow have the same service level. The Load Balancer uses service levels to dispatch tasks
from the dispatch queue. For example, you create two service levels:
Service level Low has dispatch priority 10 and maximum dispatch wait time 7,200 seconds.
Service level High has dispatch priority 2 and maximum dispatch wait time 1,800 seconds.
When multiple tasks are in the dispatch queue, the Load Balancer dispatches tasks with service level High
before tasks with service level Low because service level High has a higher dispatch priority. If a task with
service level Low waits in the dispatch queue for two hours, the Load Balancer changes its dispatch priority
to the maximum priority so that the task does not remain in the dispatch queue indefinitely.
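The following sketch models the queue ordering described above, using the Low and High service levels from the example. It is illustrative only; the escalation value of 1 for the maximum priority is an assumption, and only the relative ordering matters.

    # Illustrative model of service-level ordering in the dispatch queue.
    # Lower dispatch priority numbers dispatch first; a task that exceeds its
    # maximum dispatch wait time is escalated to the maximum priority.
    MAX_PRIORITY = 1  # assumed value used to represent "maximum priority"

    service_levels = {
        "High": {"priority": 2, "max_wait_seconds": 1800},
        "Low": {"priority": 10, "max_wait_seconds": 7200},
    }

    queue = [
        {"task": "s_low_1", "level": "Low", "waited_seconds": 7300},  # waited past 7,200 seconds
        {"task": "s_high_1", "level": "High", "waited_seconds": 60},
        {"task": "s_low_2", "level": "Low", "waited_seconds": 120},
    ]

    def effective_priority(task):
        level = service_levels[task["level"]]
        if task["waited_seconds"] >= level["max_wait_seconds"]:
            return MAX_PRIORITY  # the task cannot wait in the queue indefinitely
        return level["priority"]

    for task in sorted(queue, key=effective_priority):
        print(task["task"])
    # Dispatch order: s_low_1 (escalated), s_high_1, s_low_2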
The Administrator tool provides a default service level named Default with a dispatch priority of 5 and
maximum dispatch wait time of 1,800 seconds. You can update the default service level, but you cannot
delete it.
When you remove a service level, the Workflow Manager does not update tasks that use the service level. If
a workflow service level does not exist in the domain, the Load Balancer dispatches the tasks with the default
service level.
2.
3.
4.
5.
Click OK.
6.
To remove a service level, click the Remove button for the service level you want to remove.
Configuring Resources
When you configure the PowerCenter Integration Service to run on a grid and to check resource
requirements, the Load Balancer dispatches tasks to nodes based on the resources available on each node.
You configure the PowerCenter Integration Service to check available resources in the PowerCenter
Integration Service properties in Informatica Administrator.
You assign resources required by a task in the task properties in the PowerCenter Workflow Manager.
You define the resources available to each node in the Administrator tool. Define the following types of
resources:
Connection. Any resource installed with PowerCenter, such as a plug-in or a connection object. When you
create a node, all connection resources are available by default. Disable the connection resources that
are not available to the node.
File/Directory. A user-defined resource that defines files or directories available to the node, such as
parameter files or file server directories.
Custom. A user-defined resource that identifies any other resources available to the node. For example,
you may use a custom resource to identify a specific database client version.
Enable and disable available resources on the Resources tab for the node in the Administrator tool or using
infacmd.
Maximum CPU run queue length. The maximum number of runnable threads waiting for CPU resources
on the node. The Load Balancer does not count threads that are waiting on disk or network I/Os. If you set
this threshold to 2 on a 4-CPU node that has four threads running and two runnable threads waiting, the
Load Balancer does not dispatch new tasks to this node.
This threshold limits context switching overhead. You can set this threshold to a low value to preserve
computing resources for other applications. If you want the Load Balancer to ignore this threshold, set it to
a high number such as 200. The default value is 10.
The Load Balancer uses this threshold in metric-based and adaptive dispatch modes.
Maximum memory %. The maximum percentage of virtual memory allocated on the node relative to the
total physical memory size. If you set this threshold to 120% on a node, and virtual memory usage on the
node is above 120%, the Load Balancer does not dispatch new tasks to the node.
The default value for this threshold is 150%. Set this threshold to a value greater than 100% to allow the
allocation of virtual memory to exceed the physical memory size when dispatching tasks. If you want the
Load Balancer to ignore this threshold, set it to a high number such as 1,000.
The Load Balancer uses this threshold in metric-based and adaptive dispatch modes.
Maximum processes. The maximum number of running processes allowed for each PowerCenter
Integration Service process that runs on the node. This threshold specifies the maximum number of
running Session or Command tasks allowed for each PowerCenter Integration Service process that runs
on the node. For example, if you set this threshold to 10 when two PowerCenter Integration Services are
running on the node, the maximum number of Session tasks allowed for the node is 20 and the maximum
number of Command tasks allowed for the node is 20. Therefore, the maximum number of processes that
can run simultaneously is 40.
The default value for this threshold is 10. Set this threshold to a high number, such as 200, to cause the
Load Balancer to ignore it. To prevent the Load Balancer from dispatching tasks to the node, set this
threshold to 0.
The Load Balancer uses this threshold in all dispatch modes.
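To make the threshold checks concrete, the following Python sketch evaluates the three resource provision thresholds for a single node. The threshold values mirror the documented defaults; the usage figures are assumptions, and the comparison logic is an approximation of the documented behavior rather than the Load Balancer's implementation.

    # Illustrative check of the three resource provision thresholds for one node.
    # Threshold values mirror the documented defaults; the usage figures are made up.
    node_usage = {
        "cpu_run_queue_length": 3,   # runnable threads waiting for CPU
        "virtual_memory_pct": 110,   # allocated virtual memory vs. physical memory
        "running_processes": 7,      # Session/Command tasks for one service process
    }

    thresholds = {
        "max_cpu_run_queue_length": 10,  # default 10
        "max_memory_pct": 150,           # default 150%
        "max_processes": 10,             # default 10 per Integration Service process
    }

    def can_dispatch(usage, limits):
        # Dispatch only if adding one more task does not exceed any threshold.
        return (
            usage["cpu_run_queue_length"] < limits["max_cpu_run_queue_length"]
            and usage["virtual_memory_pct"] < limits["max_memory_pct"]
            and usage["running_processes"] + 1 <= limits["max_processes"]
        )

    print(can_dispatch(node_usage, thresholds))  # True

    # With Maximum Processes set to 10 and two Integration Services on the node,
    # each service process can run 10 Session and 10 Command tasks, so up to
    # 2 * (10 + 10) = 40 processes can run simultaneously, as described above.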
CHAPTER 11
Grids, 265
PowerCenter Integration Service process. The PowerCenter Integration Service starts one or more
PowerCenter Integration Service processes to run and monitor workflows. When you run a workflow, the
PowerCenter Integration Service process starts and locks the workflow, runs the workflow tasks, and
starts the process to run sessions.
Load Balancer. The PowerCenter Integration Service uses the Load Balancer to dispatch tasks. The Load
Balancer dispatches tasks to achieve optimal performance. It may dispatch tasks to a single node or
across the nodes in a grid.
Data Transformation Manager (DTM) process. The PowerCenter Integration Service starts a DTM process
to run each Session and Command task within a workflow. The DTM process performs session
validations, creates threads to initialize the session, read, write, and transform data, and handles pre- and
post-session operations.
The PowerCenter Integration Service can achieve high performance using symmetric multi-processing
systems. It can start and run multiple tasks concurrently. It can also concurrently process partitions within a
single session. When you create multiple partitions within a session, the PowerCenter Integration Service
creates multiple database connections to a single source and extracts a separate range of data for each
connection. It also transforms and loads the data in parallel.
Runs workflow tasks and evaluates the conditional links connecting tasks.
When you start the PowerCenter Integration Service. When you start the PowerCenter Integration
Service, it queries the repository for a list of workflows configured to run on it.
When you save a workflow. When you save a workflow assigned to a PowerCenter Integration Service to
the repository, the PowerCenter Integration Service process adds the workflow to or removes the
workflow from the schedule queue.
conditions, such as success or failure. Based on the result of the evaluation, the PowerCenter Integration
Service process runs successive links and tasks.
Load Balancer
The Load Balancer dispatches tasks to achieve optimal performance and scalability. When you run a
workflow, the Load Balancer dispatches the Session, Command, and predefined Event-Wait tasks within the
workflow. The Load Balancer matches task requirements with resource availability to identify the best node to
run a task. It dispatches the task to a PowerCenter Integration Service process running on the node. It may
dispatch tasks to a single node or across nodes.
The Load Balancer dispatches tasks in the order it receives them. When the Load Balancer needs to dispatch
more Session and Command tasks than the PowerCenter Integration Service can run, it places the tasks it
cannot run in a queue. When nodes become available, the Load Balancer dispatches tasks from the queue in
the order determined by the workflow service level.
The following concepts describe Load Balancer functionality:
Dispatch process. The Load Balancer performs several steps to dispatch tasks.
Resources. The Load Balancer can use PowerCenter resources to determine if it can dispatch a task to a
node.
Resource provision thresholds. The Load Balancer uses resource provision thresholds to determine
whether it can start additional tasks on a node.
Dispatch mode. The dispatch mode determines how the Load Balancer selects nodes for dispatch.
Service levels. When multiple tasks are waiting in the dispatch queue, the Load Balancer uses service
levels to determine the order in which to dispatch tasks from the queue.
Dispatch Process
The Load Balancer uses different criteria to dispatch tasks depending on whether the PowerCenter
Integration Service runs on a node or a grid.
When the PowerCenter Integration Service runs on a single node, the Load Balancer performs the following steps to dispatch a task:
1. The Load Balancer checks resource provision thresholds on the node. If dispatching the task causes any threshold to be exceeded, the Load Balancer places the task in the dispatch queue, and it dispatches the task later. The Load Balancer checks different thresholds depending on the dispatch mode.
2. The Load Balancer dispatches all tasks to the node that runs the master PowerCenter Integration Service process.
When the PowerCenter Integration Service runs on a grid, the Load Balancer performs the following steps to dispatch a task:
1. The Load Balancer verifies which nodes are currently running and enabled.
2. If you configure the PowerCenter Integration Service to check resource requirements, the Load Balancer identifies nodes that have the PowerCenter resources required by the tasks in the workflow.
3. The Load Balancer verifies that the resource provision thresholds on each candidate node are not exceeded. If dispatching the task causes a threshold to be exceeded, the Load Balancer places the task in the dispatch queue, and it dispatches the task later. The Load Balancer checks thresholds based on the dispatch mode.
4. The Load Balancer selects a node based on the dispatch mode.
Resources
You can configure the PowerCenter Integration Service to check the resources available on each node and
match them with the resources required to run the task. If you configure the PowerCenter Integration Service
to run on a grid and to check resources, the Load Balancer dispatches a task to a node where the required
PowerCenter resources are available. For example, if a session uses an SAP source, the Load Balancer
dispatches the session only to nodes where the SAP client is installed. If no available node has the required
resources, the PowerCenter Integration Service fails the task.
You configure the PowerCenter Integration Service to check resources in the Administrator tool.
You define resources available to a node in the Administrator tool. You assign resources required by a task in
the task properties.
The PowerCenter Integration Service writes resource requirements and availability information in the
workflow log.
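The resource matching described above can be pictured as a simple set comparison, as in the following illustrative sketch. The node and resource names are assumptions; actual resources are defined on the node and in the task properties.

    # Illustrative matching of task resource requirements to node resources.
    # Node and resource names are assumptions for the example.
    node_resources = {
        "node1": {"SAP client", "Oracle client"},
        "node2": {"Oracle client"},
    }

    task_requirements = {"s_read_sap": {"SAP client"}}

    def candidate_nodes(task_name):
        required = task_requirements[task_name]
        return [node for node, available in node_resources.items() if required <= available]

    print(candidate_nodes("s_read_sap"))  # ['node1']
    # If the list were empty, the PowerCenter Integration Service would fail the task.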
Resource Provision Thresholds
Maximum CPU Run Queue Length. The maximum number of runnable threads waiting for CPU resources
on the node. The Load Balancer excludes the node if the maximum number of waiting threads is
exceeded.
The Load Balancer checks this threshold in metric-based and adaptive dispatch modes.
Maximum Memory %. The maximum percentage of virtual memory allocated on the node relative to the
total physical memory size. The Load Balancer excludes the node if dispatching the task causes this
threshold to be exceeded.
The Load Balancer checks this threshold in metric-based and adaptive dispatch modes.
Maximum Processes. The maximum number of running processes allowed for each PowerCenter
Integration Service process that runs on the node. The Load Balancer excludes the node if dispatching
the task causes this threshold to be exceeded.
The Load Balancer checks this threshold in all dispatch modes.
If all nodes in the grid have reached the resource provision thresholds before any PowerCenter task has
been dispatched, the Load Balancer dispatches tasks one at a time to ensure that PowerCenter tasks are still
executed.
You define resource provision thresholds in the node properties.
Dispatch Mode
The dispatch mode determines how the Load Balancer selects nodes to distribute workflow tasks. The Load
Balancer uses the following dispatch modes:
Round-robin. The Load Balancer dispatches tasks to available nodes in a round-robin fashion. It checks
the Maximum Processes threshold on each available node and excludes a node if dispatching a task
causes the threshold to be exceeded. This mode is the least compute-intensive and is useful when the
load on the grid is even and the tasks to dispatch have similar computing requirements.
Metric-based. The Load Balancer evaluates nodes in a round-robin fashion. It checks all resource
provision thresholds on each available node and excludes a node if dispatching a task causes the
thresholds to be exceeded. The Load Balancer continues to evaluate nodes until it finds a node that can
accept the task. This mode prevents overloading nodes when tasks have uneven computing requirements.
Adaptive. The Load Balancer ranks nodes according to current CPU availability. It checks all resource
provision thresholds on each available node and excludes a node if dispatching a task causes the
thresholds to be exceeded. This mode prevents overloading nodes and ensures the best performance on
a grid that is not heavily loaded.
When the Load Balancer runs in metric-based or adaptive mode, it uses task statistics to determine whether
a task can run on a node. The Load Balancer averages statistics from the last three runs of the task to
estimate the computing resources required to run the task. If no statistics exist in the repository, the Load
Balancer uses default values.
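The following sketch illustrates the averaging of the last three run statistics. The metric names, values, and defaults are assumptions made for the example; they are not the statistics PowerCenter actually stores.

    # Illustrative averaging of the last three run statistics for a task.
    from statistics import mean

    last_runs = [
        {"cpu_pct": 12, "memory_mb": 35},
        {"cpu_pct": 18, "memory_mb": 42},
        {"cpu_pct": 15, "memory_mb": 40},
    ]

    DEFAULTS = {"cpu_pct": 15, "memory_mb": 40}  # assumed values used when no statistics exist

    def estimated_requirements(runs):
        if not runs:
            return DEFAULTS          # no statistics in the repository: use defaults
        recent = runs[-3:]           # only the last three runs are averaged
        return {metric: mean(run[metric] for run in recent) for metric in recent[0]}

    print(estimated_requirements(last_runs))  # {'cpu_pct': 15, 'memory_mb': 39}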
In adaptive dispatch mode, the Load Balancer can use the CPU profile for the node to identify the node with
the most computing resources.
You configure the dispatch mode in the domain properties.
Service Levels
Service levels establish priority among tasks that are waiting to be dispatched.
When the Load Balancer has more Session and Command tasks to dispatch than the PowerCenter
Integration Service can run at the time, the Load Balancer places the tasks in the dispatch queue. When
nodes become available, the Load Balancer dispatches tasks from the queue. The Load Balancer uses
service levels to determine the order in which to dispatch tasks from the queue.
You create and edit service levels in the domain properties in the Administrator tool. You assign service
levels to workflows in the workflow properties in the PowerCenter Workflow Manager.
query, target query, lookup database query, and stored procedure call text convert from the source,
target, lookup, or stored procedure data code page to the UCS-2 character set without loss of data in
conversion. If the PowerCenter Integration Service encounters an error when converting data, it writes
an error message to the session log.
Verifies connection object permissions
After validating the session code pages, the DTM verifies permissions for connection objects used in the
session. The DTM verifies that the user who started or scheduled the workflow has execute permissions
for connection objects associated with the session.
Starts worker DTM processes
The DTM sends a request to the PowerCenter Integration Service process to start worker DTM
processes on other nodes when the session is configured to run on a grid.
Runs pre-session operations
After verifying connection object permissions, the DTM runs pre-session shell commands. The DTM then
runs pre-session stored procedures and SQL commands.
Runs the processing threads
After initializing the session, the DTM uses reader, transformation, and writer threads to extract,
transform, and load data. The number of threads the DTM uses to run the session depends on the
number of partitions configured for the session.
Runs post-session operations
After the DTM runs the processing threads, it runs post-session SQL commands and stored procedures.
The DTM then runs post-session shell commands.
Sends post-session email
When the session finishes, the DTM composes and sends email that reports session completion or
failure. If the DTM terminates abnormally, the PowerCenter Integration Service process sends post-session email.
Note: If you use operating system profiles, the PowerCenter Integration Service runs the DTM process as the
operating system user you specify in the operating system profile.
Processing Threads
The DTM allocates process memory for the session and divides it into buffers. This is also known as buffer
memory. The DTM uses multiple threads to process data in a session. The main DTM thread is called the
master thread.
The master thread creates and manages other threads. The master thread for a session can create mapping,
pre-session, post-session, reader, transformation, and writer threads.
For each target load order group in a mapping, the master thread can create several threads. The types of
threads depend on the session properties and the transformations in the mapping. The number of threads
depends on the partitioning information for each target load order group in the mapping.
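The following sketch shows one way to reason about thread counts per target load order group. It assumes exactly one reader, one transformation, and one writer thread per partition; because the master thread can create more than one transformation thread per partition, treat the transformation count as a lower bound.

    # Illustrative thread counts per target load order group.
    # Assumes one reader, one transformation, and one writer thread per partition;
    # the master thread can create additional transformation threads.
    def thread_counts(groups):
        return {
            group: {"reader": partitions, "transformation": partitions, "writer": partitions}
            for group, partitions in groups.items()
        }

    # A mapping with a single-partition group and a three-partition group.
    print(thread_counts({"target_load_order_group_1": 1, "target_load_order_group_2": 3}))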
The following figure shows the threads the master thread creates for a simple mapping that contains one
target load order group:
The mapping contains a single partition. In this case, the master thread creates one reader, one
transformation, and one writer thread to process the data. The reader thread controls how the PowerCenter
Integration Service process extracts source data and passes it to the source qualifier, the transformation
thread controls how the PowerCenter Integration Service process handles the data, and the writer thread
controls how the PowerCenter Integration Service process loads data to the target.
When the pipeline contains only a source definition, source qualifier, and a target definition, the data
bypasses the transformation threads, proceeding directly from the reader buffers to the writer. This type of
pipeline is a pass-through pipeline.
The following figure shows the threads for a pass-through pipeline with one partition:
Thread Types
The master thread creates different types of threads for a session. The types of threads the master thread
creates depend on the pre- and post-session properties, as well as the types of transformations in the
mapping.
The master thread can create the following types of threads:
Mapping threads
Reader threads
Transformation threads
Writer threads
Mapping Threads
The master thread creates one mapping thread for each session. The mapping thread fetches session and
mapping information, compiles the mapping, and cleans up after session execution.
Reader Threads
The master thread creates reader threads to extract source data. The number of reader threads depends on
the partitioning information for each pipeline. The number of reader threads equals the number of partitions.
Relational sources use relational reader threads, and file sources use file reader threads.
The PowerCenter Integration Service creates an SQL statement for each reader thread to extract data from a
relational source. For file sources, the PowerCenter Integration Service can create multiple threads to read a
single source.
Transformation Threads
The master thread creates one or more transformation threads for each partition. Transformation threads
process data according to the transformation logic in the mapping.
The master thread creates transformation threads to transform data received in buffers by the reader thread,
move the data from transformation to transformation, and create memory caches when necessary. The
number of transformation threads depends on the partitioning information for each pipeline.
Transformation threads store transformed data in a buffer drawn from the memory pool for subsequent
access by the writer thread.
If the pipeline contains a Rank, Joiner, Aggregator, Sorter, or a cached Lookup transformation, the
transformation thread uses cache memory until it reaches the configured cache size limits. If the
transformation thread requires more space, it pages to local cache files to hold additional data.
When the PowerCenter Integration Service runs in ASCII mode, the transformation threads pass character
data in single bytes. When the PowerCenter Integration Service runs in Unicode mode, the transformation
threads use double bytes to move character data.
Writer Threads
The master thread creates writer threads to load target data. The number of writer threads depends on the
partitioning information for each pipeline. If the pipeline contains one partition, the master thread creates one
writer thread. If it contains multiple partitions, the master thread creates multiple writer threads.
Each writer thread creates connections to the target databases to load data. If the target is a file, each writer
thread creates a separate file. You can configure the session to merge these files.
If the target is relational, the writer thread takes data from buffers and commits it to session targets. When
loading targets, the writer commits data based on the commit interval in the session properties. You can
configure a session to commit data based on the number of source rows read, the number of rows written to
the target, or the number of rows that pass through a transformation that generates transactions, such as a
Transaction Control transformation.
Pipeline Partitioning
When running sessions, the PowerCenter Integration Service process can achieve high performance by
partitioning the pipeline and performing the extract, transformation, and load for each partition in parallel. To
accomplish this, use the following session and PowerCenter Integration Service configuration:
Configure the session with multiple partitions.
Install the PowerCenter Integration Service on a machine with multiple CPUs.
You can configure the partition type at most transformations in the pipeline. The PowerCenter Integration Service can partition data using round-robin, hash, key-range, database partitioning, or pass-through partitioning.
You can also configure a session for dynamic partitioning to enable the PowerCenter Integration Service to
set partitioning at run time. When you enable dynamic partitioning, the PowerCenter Integration Service
scales the number of session partitions based on factors such as the source database partitions or the
number of nodes in a grid.
For relational sources, the PowerCenter Integration Service creates multiple database connections to a
single source and extracts a separate range of data for each connection.
The PowerCenter Integration Service transforms the partitions concurrently and passes data between the partitions as needed to perform operations such as aggregation. When the PowerCenter Integration Service
loads relational data, it creates multiple database connections to the target and loads partitions of data
concurrently. When the PowerCenter Integration Service loads data to file targets, it creates a separate file
for each partition. You can choose to merge the target files.
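As an illustration of how separate data ranges map to separate connections, the following sketch builds one query per partition for a key-range partitioned relational source. The table name, key column, and ranges are assumptions, and the generated SQL is a simplification of what the PowerCenter Integration Service actually issues.

    # Illustrative per-partition source queries for key-range partitioning.
    # The table, key column, and ranges are assumptions; the SQL is simplified.
    ranges = [(1, 100000), (100001, 200000), (200001, 300000)]

    def partition_queries(table, key):
        # One database connection and one key range per partition.
        return [
            f"SELECT * FROM {table} WHERE {key} BETWEEN {low} AND {high}"
            for low, high in ranges
        ]

    for query in partition_queries("ORDERS", "ORDER_ID"):
        print(query)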
DTM Processing
When you run a session, the DTM process reads source data and passes it to the transformations for
processing. To help understand DTM processing, consider the following DTM process actions:
Reading source data. The DTM reads the sources in a mapping at different times depending on how you
configure the sources, transformations, and targets in the mapping.
Blocking data. The DTM sometimes blocks the flow of data at a transformation in the mapping while it
processes a row of data from a different source.
Block processing. The DTM reads and processes a block of rows at a time.
The following figure shows a mapping that contains two target load order groups and three source pipelines:
In the mapping, the DTM processes the target load order groups sequentially. It first processes Target Load
Order Group 1 by reading Source A and Source B at the same time. When it finishes processing Target Load
Order Group 1, the DTM begins to process Target Load Order Group 2 by reading Source C.
Blocking Data
You can include multiple input group transformations in a mapping. The DTM passes data to the input groups
concurrently. However, sometimes the transformation logic of a multiple input group transformation requires
that the DTM block data on one input group while it waits for a row from a different input group.
Blocking is the suspension of the data flow into an input group of a multiple input group transformation. When
the DTM blocks data, it reads data from the source connected to the input group until it fills the reader and
transformation buffers. After the DTM fills the buffers, it does not read more source rows until the
transformation logic allows the DTM to stop blocking the source. When the DTM stops blocking a source, it
processes the data in the buffers and continues to read from the source.
The DTM blocks data at one input group when it needs a specific row from a different input group to perform
the transformation logic. After the DTM reads and processes the row it needs, it stops blocking the source.
Block Processing
The DTM reads and processes a block of rows at a time. The number of rows in the block depends on the row size and the DTM buffer size. In the following circumstances, the DTM processes one row in a block (see the sketch after this list):
Log row errors. When you log row errors, the DTM processes one row in a block.
Connect CURRVAL. When you connect the CURRVAL port in a Sequence Generator transformation, the
session processes one row in a block. For optimal performance, connect only the NEXTVAL port in
mappings.
Configure array-based mode for Custom transformation procedure. When you configure the data access
mode for a Custom transformation procedure to be row-based, the DTM processes one row in a block. By
default, the data access mode is array-based, and the DTM processes multiple rows in a block.
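The following sketch is an illustrative estimate of block-size behavior, not the DTM's actual algorithm. It assumes a block holds roughly the buffer block size divided by the row size, and returns one row for the circumstances listed above.

    # Illustrative estimate of rows per block, plus the documented one-row cases.
    # The division-based estimate is an assumption, not the DTM's actual algorithm.
    def rows_per_block(buffer_block_size_bytes, row_size_bytes,
                       row_error_logging=False, currval_connected=False,
                       custom_row_based=False):
        if row_error_logging or currval_connected or custom_row_based:
            return 1  # the DTM processes one row in a block in these cases
        return max(1, buffer_block_size_bytes // row_size_bytes)

    print(rows_per_block(64 * 1024, 512))                          # 128
    print(rows_per_block(64 * 1024, 512, row_error_logging=True))  # 1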
Grids
When you run a PowerCenter Integration Service on a grid, a master service process runs on one node and
worker service processes run on the remaining nodes in the grid. The master service process runs the
workflow and workflow tasks, and it distributes the Session, Command, and predefined Event-Wait tasks to
itself and other nodes. A DTM process runs on each node where a session runs. If you run a session on a
grid, a worker service process can run multiple DTM processes on different nodes to distribute session
threads.
Workflow on a Grid
When you run a workflow on a grid, the PowerCenter Integration Service designates one service process as
the master service process, and the service processes on other nodes as worker service processes. The
master service process can run on any node in the grid.
The master service process receives requests, runs the workflow and workflow tasks including the Scheduler,
and communicates with worker service processes on other nodes. Because it runs on the master service
process node, the Scheduler uses the date and time for the master service process node to start scheduled
workflows. The master service process also runs the Load Balancer, which dispatches tasks to nodes in the
grid.
Worker service processes running on other nodes act as Load Balancer agents. The worker service process
runs predefined Event-Wait tasks within its process. It starts a process to run Command tasks and a DTM
process to run Session tasks.
The master service process can also act as a worker service process. So the Load Balancer can distribute
Session, Command, and predefined Event-Wait tasks to the node that runs the master service process or to
other nodes.
For example, you have a workflow that contains two Session tasks, a Command task, and a predefined
Event-Wait task.
The following figure shows an example of service process distribution when you run the workflow on a grid
with three nodes:
When you run the workflow on a grid, the PowerCenter Integration Service process distributes the tasks in
the following way:
On Node 1, the master service process starts the workflow and runs workflow tasks other than the
Session, Command, and predefined Event-Wait tasks. The Load Balancer dispatches the Session,
Command, and predefined Event-Wait tasks to other nodes.
On Node 2, the worker service process starts a process to run a Command task and starts a DTM process
to run Session task 1.
On Node 3, the worker service process runs a predefined Event-Wait task and starts a DTM process to
run Session task 2.
Session on a Grid
When you run a session on a grid, the master service process runs the workflow and workflow tasks,
including the Scheduler. Because it runs on the master service process node, the Scheduler uses the date
and time for the master service process node to start scheduled workflows. The Load Balancer distributes
Command tasks as it does when you run a workflow on a grid. In addition, when the Load Balancer
dispatches a Session task, it distributes the session threads to separate DTM processes.
The master service process starts a temporary preparer DTM process that fetches the session and prepares
it to run. After the preparer DTM process prepares the session, it acts as the master DTM process, which
monitors the DTM processes running on other nodes.
The worker service processes start the worker DTM processes on other nodes. The worker DTM runs the
session. Multiple worker DTM processes running on a node might be running multiple sessions or multiple
partition groups from a single session depending on the session configuration.
For example, you run a workflow on a grid that contains one Session task and one Command task. You also
configure the session to run on the grid.
The following figure shows the service process and DTM distribution when you run a session on a grid on
three nodes:
When the PowerCenter Integration Service process runs the session on a grid, it performs the following
tasks:
On Node 1, the master service process runs workflow tasks. It also starts a temporary preparer DTM
process, which becomes the master DTM process. The Load Balancer dispatches the Command task and
session threads to nodes in the grid.
On Node 2, the worker service process runs the Command task and starts the worker DTM processes that
run the session threads.
On Node 3, the worker service process starts the worker DTM processes that run the session threads.
System Resources
To allocate system resources for read, transformation, and write processing, you should understand how the
PowerCenter Integration Service allocates and uses system resources. The PowerCenter Integration Service
uses the following system resources:
CPU usage
Cache memory
CPU Usage
The PowerCenter Integration Service process performs read, transformation, and write processing for a
pipeline in parallel. It can process multiple partitions of a pipeline within a session, and it can process multiple
sessions in parallel.
If you have a symmetric multi-processing (SMP) platform, you can use multiple CPUs to concurrently process
session data or partitions of data. This provides increased performance, as true parallelism is achieved. On a
single processor platform, these tasks share the CPU, so there is no parallelism.
The PowerCenter Integration Service process can use multiple CPUs to process a session that contains
multiple partitions. The number of CPUs used depends on factors such as the number of partitions, the
number of threads, the number of available CPUs, and the amount of resources required to process the
mapping.
Cache Memory
The DTM process creates in-memory index and data caches to temporarily store data used by the following
transformations:
Rank transformation
Joiner transformation
You can configure memory size for the index and data cache in the transformation properties. By default, the
PowerCenter Integration Service determines the amount of memory to allocate for caches. However, you can
manually configure a cache size for the data and index caches.
By default, the DTM creates cache files in the directory configured for the $PMCacheDir service process
variable. If the DTM requires more space than it allocates, it pages to local index and data files.
The DTM process also creates an in-memory cache to store data for the Sorter transformations and XML
targets. You configure the memory size for the cache in the transformation properties. By default, the
PowerCenter Integration Service determines the cache size for the Sorter transformation and XML target at
run time. The PowerCenter Integration Service allocates a minimum value of 16,777,216 bytes for the Sorter
transformation cache and 10,485,760 bytes for the XML target. The DTM creates cache files in the directory
configured for the $PMTempDir service process variable. If the DTM requires more cache space than it
allocates, it pages to local cache files.
When processing large amounts of data, the DTM may create multiple index and data files. The session does
not fail if it runs out of cache memory and pages to the cache files. It does fail, however, if the local directory
for cache files runs out of disk space.
After the session completes, the DTM releases memory used by the index and data caches and deletes any
index and data files. However, if the session is configured to perform incremental aggregation or if a Lookup
transformation is configured for a persistent lookup cache, the DTM saves all index and data cache
information to disk for the next session run.
The PowerCenter Integration Service converts data from the source character set to UCS-2 before
processing, processes the data, and then converts the UCS-2 data to the target code page character set
before loading the data. The PowerCenter Integration Service allots two bytes for each character when
moving data through a mapping. It also treats all numerics as U.S. Standard and all dates as binary data.
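The two-bytes-per-character behavior can be demonstrated with a short Python check. UTF-16 little-endian encoding is used here as a stand-in for UCS-2, which is equivalent for characters in the Basic Multilingual Plane; the sample strings are arbitrary.

    # UCS-2 allots two bytes per character; UTF-16LE matches it for BMP characters.
    for text in ("A", "data", "café"):
        encoded = text.encode("utf-16-le")
        print(f"{text!r}: {len(text)} characters -> {len(encoded)} bytes")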
The PowerCenter Integration Service code page must be a subset of the PowerCenter repository code page.
Service process variables can be set in the following locations. Settings later in the list override settings earlier in the list:
1. PowerCenter Integration Service process properties. Service process variables set in the PowerCenter Integration Service process properties contain the default setting.
2. Operating system profile. Service process variables set in an operating system profile override service process variables set in the PowerCenter Integration Service properties. If you use operating system profiles, the PowerCenter Integration Service saves workflow recovery files to the $PMStorageDir configured in the PowerCenter Integration Service process properties. The PowerCenter Integration Service saves session recovery files to the $PMStorageDir configured in the operating system profile.
3. Parameter file. Service process variables set in parameter files override service process variables set in the PowerCenter Integration Service process properties or an operating system profile.
4. Session or workflow properties. Service process variables set in the session or workflow properties override service process variables set in the PowerCenter Integration Service properties, a parameter file, or an operating system profile.
For example, if you set the $PMSessionLogFile in the operating system profile and in the session properties,
the PowerCenter Integration Service uses the location specified in the session properties.
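The following sketch models the precedence rules with the $PMSessionLogFile example above. It is illustrative only; the paths are made-up values, and the lookup order mirrors the list, with later sources overriding earlier ones.

    # Illustrative resolution of a service process variable by order of precedence.
    # Later entries override earlier ones; the paths are made-up example values.
    sources = [
        ("service process properties", {"$PMSessionLogFile": "/infa/logs/default.log"}),
        ("operating system profile",   {"$PMSessionLogFile": "/infa/osprofile/session.log"}),
        ("parameter file",             {}),  # variable not set at this level
        ("session properties",         {"$PMSessionLogFile": "/infa/session/custom.log"}),
    ]

    def resolve(variable):
        value, origin = None, None
        for origin_name, settings in sources:
            if variable in settings:
                value, origin = settings[variable], origin_name
        return value, origin

    print(resolve("$PMSessionLogFile"))
    # ('/infa/session/custom.log', 'session properties')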
The PowerCenter Integration Service creates the following output files:
Workflow log
Session log
Reject files
Control file
Post-session email
Output file
Cache files
When the PowerCenter Integration Service process on UNIX creates any file other than a recovery file, it sets
the file permissions according to the umask of the shell that starts the PowerCenter Integration Service
process. For example, when the umask of the shell that starts the PowerCenter Integration Service process is
022, the PowerCenter Integration Service process creates files with rw-r--r-- permissions. To change the file
permissions, you must change the umask of the shell that starts the PowerCenter Integration Service process
and then restart it.
The PowerCenter Integration Service process on UNIX creates recovery files with rw------- permissions.
The PowerCenter Integration Service process on Windows creates files with read and write permissions.
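The umask arithmetic can be verified with a short Python calculation. The 022 value matches the example above; the sketch only computes the resulting permission string and does not create any files.

    # How a umask of 022 yields rw-r--r-- for files the service process creates.
    import stat

    base = 0o666           # files requested with read/write for owner, group, other
    umask = 0o022          # umask of the shell that starts the service process
    mode = base & ~umask   # permission bits set in the umask are removed

    print(oct(mode))                           # 0o644
    print(stat.filemode(stat.S_IFREG | mode))  # -rw-r--r--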
Workflow Log
The PowerCenter Integration Service process creates a workflow log for each workflow it runs. It writes
information in the workflow log such as initialization of processes, workflow task run information, errors
encountered, and workflow run summary. Workflow log error messages are categorized into severity levels.
You can configure the PowerCenter Integration Service to suppress writing messages to the workflow log file.
You can view workflow logs from the PowerCenter Workflow Monitor. You can also configure the workflow to
write events to a log file in a specified directory.
As with PowerCenter Integration Service logs and session logs, the PowerCenter Integration Service process
enters a code number into the workflow log file message along with message text.
Session Log
The PowerCenter Integration Service process creates a session log for each session it runs. It writes
information in the session log such as initialization of processes, session validation, creation of SQL
commands for reader and writer threads, errors encountered, and load summary. The amount of detail in the
session log depends on the tracing level that you set. You can view the session log from the PowerCenter
Workflow Monitor. You can also configure the session to write the log information to a log file in a specified
directory.
As with PowerCenter Integration Service logs and workflow logs, the PowerCenter Integration Service
process enters a code number along with message text.
Session Details
When you run a session, the PowerCenter Workflow Manager creates session details that provide load
statistics for each target in the mapping. You can monitor session details during the session or after the
session completes. Session details include information such as table name, number of rows written or
rejected, and read and write throughput. To view session details, double-click the session in the PowerCenter
Workflow Monitor.
You can also view performance details in the PowerCenter Workflow Monitor if you configure the session to
collect performance details.
Reject Files
By default, the PowerCenter Integration Service process creates a reject file for each target in the session.
The reject file contains rows of data that the writer does not write to targets.
The writer may reject a row in the following circumstances:
A field in the row was truncated or overflowed, and the target database is configured to reject truncated or
overflowed data.
By default, the PowerCenter Integration Service process saves the reject file in the directory entered for the
service process variable $PMBadFileDir in the PowerCenter Workflow Manager, and names the reject file
target_table_name.bad.
Note: If you enable row error logging, the PowerCenter Integration Service process does not create a reject
file.
Control File
When you run a session that uses an external loader, the PowerCenter Integration Service process creates a
control file and a target flat file. The control file contains information about the target flat file such as data
format and loading instructions for the external loader. The control file has an extension of .ctl. The
PowerCenter Integration Service process creates the control file and the target flat file in the PowerCenter
Integration Service variable directory, $PMTargetFileDir, by default.
Email
You can compose and send email messages by creating an Email task in the Workflow Designer or Task
Developer. You can place the Email task in a workflow, or you can associate it with a session. The Email task
allows you to automatically communicate information about a workflow or session run to designated
recipients.
Email tasks in the workflow send email depending on the conditional links connected to the task. For post-session email, you can create two different messages, one to be sent if the session completes successfully,
the other if the session fails. You can also use variables to generate information about the session name,
status, and total rows loaded.
Indicator File
If you use a flat file as a target, you can configure the PowerCenter Integration Service to create an indicator
file for target row type information. For each target row, the indicator file contains a number to indicate
whether the row was marked for insert, update, delete, or reject. The PowerCenter Integration Service
process names this file target_name.ind and stores it in the PowerCenter Integration Service variable
directory, $PMTargetFileDir, by default.
Output File
If the session writes to a target file, the PowerCenter Integration Service process creates the target file based
on a file target definition. By default, the PowerCenter Integration Service process names the target file
based on the target definition name. If a mapping contains multiple instances of the same target, the
PowerCenter Integration Service process names the target files based on the target instance name.
The PowerCenter Integration Service process creates this file in the PowerCenter Integration Service
variable directory, $PMTargetFileDir, by default.
Cache Files
When the PowerCenter Integration Service process creates memory cache, it also creates cache files. The
PowerCenter Integration Service process creates cache files for the following mapping objects:
Aggregator transformation
Joiner transformation
Rank transformation
Lookup transformation
Sorter transformation
XML target
By default, the DTM creates the index and data files for Aggregator, Rank, Joiner, and Lookup
transformations and XML targets in the directory configured for the $PMCacheDir service process variable.
The PowerCenter Integration Service process names the index file PM*.idx, and the data file PM*.dat. The
PowerCenter Integration Service process creates the cache file for a Sorter transformation in the
$PMTempDir service process variable directory.
aggregation. By default, the DTM creates the index and data files in the directory configured for the
$PMCacheDir service process variable. The PowerCenter Integration Service process names the index file PMAGG*.idx and the data file PMAGG*.dat.
CHAPTER 12
Resilience, 274
Recovery, 278
Restart and failover. If the PowerCenter Integration Service process becomes unavailable, the Service
Manager can restart the process or fail it over to another node.
Recovery. When the PowerCenter Integration Service restarts or fails over a service process, it can
automatically recover interrupted workflows that are configured for recovery.
Resilience
Based on your license, the PowerCenter Integration Service is resilient to the temporary unavailability of
PowerCenter Integration Service clients and external components such as databases and FTP servers.
The PowerCenter Integration Service tries to reconnect to PowerCenter Integration Service clients within the
PowerCenter Integration Service resilience timeout period. The PowerCenter Integration Service resilience
timeout period is based on the resilience properties that you configure for the PowerCenter Integration
Service, PowerCenter Integration Service clients, and the domain. The PowerCenter Integration Service tries
to reconnect to external components within the resilience timeout for the database or FTP connection object.
Example
You configure a retry period of 180 for an Oracle relational database connection object. If the PowerCenter
Integration Service loses connectivity to the database during the initial connection or when it reads data from
the database, it tries to reconnect for 180 seconds. If it cannot reconnect to the database, the session fails.
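The retry behavior described in the example can be modeled as a loop bounded by the retry period, as in the following sketch. This is not an Informatica API; connect_fn is a placeholder for whatever call establishes the database connection.

    # Illustrative retry loop bounded by a connection retry period; not an Informatica API.
    # connect_fn stands in for whatever call establishes the database connection.
    import time

    def connect_with_retry(connect_fn, retry_period_seconds=180, wait_seconds=5):
        deadline = time.monotonic() + retry_period_seconds
        while True:
            try:
                return connect_fn()
            except ConnectionError:
                if time.monotonic() + wait_seconds > deadline:
                    raise  # retry period exhausted: the session fails
                time.sleep(wait_seconds)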
If a completed task reported a completed status to the PowerCenter Integration Service process prior to
the PowerCenter Integration Service failure, the task will not restart.
If a completed task did not report a completed status to the PowerCenter Integration Service process prior
to the PowerCenter Integration Service failure, the task will restart.
Normal. When you restart the process, the workflow fails over on the same node. The PowerCenter
Integration Service can recover the workflow based on the workflow state and recovery strategy. If
the workflow is enabled for high availability recovery, the PowerCenter Integration Service restores
the state of operation for the workflow and recovers the workflow from the point of interruption. The
PowerCenter Integration Service performs failover and recovers the schedules, requests, and
workflows. If a scheduled workflow is not enabled for high availability recovery, the PowerCenter
Integration Service removes the workflow from the schedule.
Safe. When you restart the process, the workflow does not fail over and the PowerCenter Integration
Service does not recover the workflow. It performs failover and recovers the schedules, requests, and
workflows when you enable the service in normal mode.
Service
When the PowerCenter Integration Service becomes unavailable, you must enable the service and start
the service processes. You can manually recover workflows and sessions based on the state and the
configured recovery strategy.
The workflows that run after you start the service processes depend on the operating mode:
Normal. Workflows start if they are configured to run continuously or on initialization. You must
reschedule all other workflows.
Safe. Scheduled workflows do not start. You must enable the service in normal mode for the
scheduled workflows to run.
Node
When the node becomes unavailable, the restart and failover behavior is the same as restart and failover
for the service process, based on the operating mode.
Normal. The PowerCenter Integration Service can recover the workflow based on the workflow state
and recovery strategy. If the workflow was enabled for high availability recovery, the PowerCenter
Integration Service restores the state of operation for the workflow and recovers the workflow from
the point of interruption. The PowerCenter Integration Service performs failover and recovers the
schedules, requests, and workflows. If a scheduled workflow is not enabled for high availability
recovery, the PowerCenter Integration Service removes the workflow from the schedule.
Safe. The PowerCenter Integration Service does not run scheduled workflows and it disables
schedule failover, automatic workflow recovery, workflow failover, and client request recovery. It
performs failover and recovers the schedules, requests, and workflows when you enable the service
in normal mode.
Service
When the PowerCenter Integration Service becomes unavailable, you must enable the service and start
the service processes. You can manually recover workflows and sessions based on the state and the
configured recovery strategy. Workflows start if they are configured to run continuously or on
initialization. You must reschedule all other workflows.
The workflows that run after you start the service processes depend on the operating mode:
Normal. Workflows start if they are configured to run continuously or on initialization. You must
reschedule all other workflows.
Safe. Scheduled workflows do not start. You must enable the service in normal mode to run the
scheduled workflows.
Node
When the node becomes unavailable, the failover behavior is the same as the failover for the service
process, based on the operating mode.
Running on a Grid
When a service is running on a grid, the failover behavior depends on the following sources of failure:
Master Service Process
If you disable the master service process, the Service Manager elects another node to run the master
service process. If the master service process shuts down unexpectedly, the Service Manager tries to
restart the process before electing another node to run the master service process.
The master service process then reconfigures the grid to run on one less node. The PowerCenter
Integration Service restores the state of operation, and the workflow fails over to the newly elected
master service process.
The PowerCenter Integration Service can recover the workflow based on the workflow state and
recovery strategy. If the workflow was enabled for high availability recovery, the PowerCenter Integration
Service restores the state of operation for the workflow and recovers the workflow from the point of
interruption. When the PowerCenter Integration Service restores the state of operation for the service, it
restores workflow schedules, service requests, and workflows. The PowerCenter Integration Service
performs failover and recovers the schedules, requests, and workflows.
If a scheduled workflow is not enabled for high availability recovery, the PowerCenter Integration Service
removes the workflow from the schedule.
Worker Service Process
If you disable a worker service process, the master service process reconfigures the grid to run on one
less node. If the worker service process shuts down unexpectedly, the Service Manager tries to restart
the process before the master service process reconfigures the grid.
After the master service process reconfigures the grid, it can recover tasks based on task state and
recovery strategy.
Because workflows do not run on the worker service process, workflow failover is not applicable.
Service
When the PowerCenter Integration Service becomes unavailable, you must enable the service and start
the service processes. You can manually recover workflows and sessions based on the state and the
configured recovery strategy. Workflows start if they are configured to run continuously or on
initialization. You must reschedule all other workflows.
Node
When the node running the master service process becomes unavailable, the failover behavior is the
same as the failover for the master service process. When the node running the worker service process
becomes unavailable, the failover behavior is the same as the failover for the worker service process.
Note: You cannot configure a PowerCenter Integration Service to fail over in safe mode when it runs on a
grid.
Recovery
Based on your license, the PowerCenter Integration Service can automatically recover workflows and tasks
based on the recovery strategy, the state of the workflows and tasks, and the PowerCenter Integration
Service operating mode.
Running Workflows
You can configure automatic task recovery in the workflow properties. When you configure automatic task
recovery, the PowerCenter Integration Service can recover terminated tasks while the workflow is running.
You can also configure the number of times that the PowerCenter Integration Service tries to recover the
task. If the PowerCenter Integration Service cannot recover the task within the configured number of recovery attempts, the task and the workflow are terminated.
The PowerCenter Integration Service behavior for task recovery does not depend on the operating mode.
Suspended Workflows
The PowerCenter Integration Service can restore the workflow state after a suspended workflow fails over to
another node if you enable recovery in the workflow properties.
If a service process shuts down while a workflow is suspended, the PowerCenter Integration Service marks
the workflow as terminated. It fails the workflow over to another node, and changes the workflow state to
terminated. The PowerCenter Integration Service does not recover any workflow task. You can fix the errors
that caused the workflow to suspend, and manually recover the workflow.
Process state information includes information about which node was running the master PowerCenter
Integration Service process and which node was running each session. You can configure the PowerCenter
Integration Service to store process state information on a cluster file system or in the PowerCenter
repository database.
CHAPTER 13
Create a database for the repository tables. Before you can create the repository tables, you need to
create a database to store the tables. If you create a PowerCenter Repository Service for an existing
repository, you do not need to create a new database. You can use the existing database, as long as it
meets the minimum requirements for a repository database.
Create the PowerCenter Repository Service. Create the PowerCenter Repository Service to manage the
repository. When you create a PowerCenter Repository Service, you can choose to create the repository
tables. If you do not create the repository tables, you can create them later or you can associate the
PowerCenter Repository Service with an existing repository.
Configure the PowerCenter Repository Service. After you create a PowerCenter Repository Service, you
can configure its properties. You can configure properties such as the error severity level or maximum
user connections.
Based on your license, the PowerCenter Repository Service can be highly available.
Determine repository requirements. Determine whether the repository needs to be version-enabled and
whether it is a local, global, or standalone repository.
Verify license. Verify that you have a valid license to run application services. Although you can create a
PowerCenter Repository Service without a license, you need a license to run the service. In addition, you
need a license to configure some options related to version control and high availability.
Determine code page. Determine the code page to use for the PowerCenter repository. The PowerCenter
Repository Service uses the character set encoded in the repository code page when writing data to the
repository. The repository code page must be compatible with the code pages for the PowerCenter Client
and all application services in the Informatica domain.
Tip: After you create the PowerCenter Repository Service, you cannot change the code page in the
PowerCenter Repository Service properties. To change the repository code page after you create the
PowerCenter Repository Service, back up the repository and restore it to a new PowerCenter Repository
Service. When you create the new PowerCenter Repository Service, you can specify a compatible code
page.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the folder where you want to create the PowerCenter Repository
Service.
Note: If you do not select a folder, you can move the PowerCenter Repository Service into a folder after
you create it.
3.
In the Domain Actions menu, click New > PowerCenter Repository Service.
Property
Description
Name
Description
Location
Domain and folder where the service is created. Click Select Folder to choose a
different folder. You can also move the PowerCenter Repository Service to a
different folder after you create it.
License
License that allows use of the service. If you do not select a license when you create
the service, you can assign a license later. The options included in the license
determine the selections you can make for the repository. For example, you must
have the team-based development option to create a versioned repository. Also, you
need the high availability option to run the PowerCenter Repository Service on more
than one node.
Node
Node on which the service process runs. Required if you do not select a license with
the high availability option. If you select a license with the high availability option, this
property does not appear.
Primary Node
Node on which the service process runs by default. Required if you select a license
with the high availability option. This property appears if you select a license with the
high availability option.
Backup Nodes
Nodes on which the service process can run if the primary node is unavailable.
Optional if you select a license with the high availability option. This property appears
if you select a license with the high availability option.
Database Type
Code Page
Repository code page. The PowerCenter Repository Service uses the character set
encoded in the repository code page when writing data to the repository. You cannot
change the code page in the PowerCenter Repository Service properties after you
create the PowerCenter Repository Service.
Connect String
Native connection string the PowerCenter Repository Service uses to access the
repository database. For example, use servername@dbname for Microsoft SQL
Server and dbname.world for Oracle.
Username
Account for the repository database. Set up this account using the appropriate
database client tools.
Password
TablespaceName
Tablespace name for IBM DB2 and Sybase repositories. When you specify the
tablespace name, the PowerCenter Repository Service creates all repository tables
in the same tablespace. You cannot use spaces in the tablespace name.
To improve repository performance on IBM DB2 EEE repositories, specify a
tablespace name with one node.
Creation Mode
Enable the Repository Service
Enables the service. When you select this option, the service starts running when it
is created. Otherwise, you need to click the Enable button to run the service. You
need a valid license to run a PowerCenter Repository Service.
5.
If you create a PowerCenter Repository Service for a repository with existing content and the repository
existed in a different Informatica domain, verify that users and groups with privileges for the
PowerCenter Repository Service exist in the current domain.
The Service Manager periodically synchronizes the list of users and groups in the repository with the
users and groups in the domain configuration database. During synchronization, users and groups that
do not exist in the current domain are deleted from the repository. You can use infacmd to export users
and groups from the source domain and import them into the target domain.
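For example, the following command sequence exports users and groups from the source domain and then imports them into the target domain. The file name is a placeholder, and the exact option names for the export and import file are an assumption, so verify them against the infacmd isp ExportUsersAndGroups and ImportUsersAndGroups syntax in the Command Reference:
infacmd isp ExportUsersAndGroups -dn SourceDomain -un Administrator -pd <password> -ef users_groups.xml
infacmd isp ImportUsersAndGroups -dn TargetDomain -un Administrator -pd <password> -if users_groups.xml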
6.
Click OK.
Database
Connect String Syntax
Example
IBM DB2
<database name>
mydatabase
Microsoft SQL Server
servername@dbname
sqlserver@mydatabase
Oracle
dbname.world
oracle.world
Sybase
servername@dbname
sybaseserver@mydatabase
Node assignments. If you have the high availability option, configure the primary and backup nodes to run
the service.
Database properties. Configure repository database properties, such as the database user name,
password, and connection string.
Advanced properties. Configure advanced repository properties, such as the maximum connections and
locks on the repository.
Custom properties. Configure custom properties that are unique to specific environments.
To view and update properties, select the PowerCenter Repository Service in the Navigator. The Properties
tab for the service appears.
Node Assignments
If you have the high availability option, you can designate primary and backup nodes to run the service. By
default, the service runs on the primary node. If the node becomes unavailable, the service fails over to a
backup node.
General Properties
To edit the general properties, select the PowerCenter Repository Service in the Navigator, select the
Properties view, and then click Edit in the General Properties section.
The following table describes the general properties for the service:
Property
Description
Name
Name of the service. The name is not case sensitive and must be unique within the
domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces
or the following special characters:
`~%^*+={}\;:'"/?.,<>|!()][
You cannot change the name of the service after you create it.
Description
License
Primary Node
Node on which the service runs. To assign the PowerCenter Repository Service to a
different node, you must first disable the service.
Repository Properties
You can configure some of the repository properties when you create the service.
Property
Description
Operating Mode
Mode in which the PowerCenter Repository Service is running. Values are Normal and
Exclusive. Run the PowerCenter Repository Service in exclusive mode to perform some
administrative tasks, such as promoting a local repository to a global repository or
enabling version control. To apply changes, restart the PowerCenter Repository
Service.
Security Audit Trail
Tracks changes made to users, groups, privileges, and permissions. The Log Manager
tracks the changes. To apply changes, restart the PowerCenter Repository Service.
Global Repository
Creates a global repository. If the repository is a global repository, you cannot revert
back to a local repository. To promote a local repository to a global repository, the
PowerCenter Repository Service must be running in exclusive mode.
Version Control
Creates a versioned repository. After you enable a repository for version control, you
cannot disable the version control.
To enable a repository for version control, you must run the PowerCenter Repository
Service in exclusive mode. This property appears if you have the team-based
development option.
Database Properties
Database properties provide information about the database that stores the repository metadata. You specify
the database properties when you create the PowerCenter Repository Service. After you create a repository,
you may need to modify some of these properties. For example, you might need to change the database user
name and password, or you might want to adjust the database connection timeout.
The following table describes the database properties:
Property
Description
Database Type
Code Page
Connect String
Tablespace Name
Tablespace name for IBM DB2 and Sybase repositories. When you specify
the tablespace name, the PowerCenter Repository Service creates all
repository tables in the same tablespace. You cannot use spaces in the
tablespace name.
You cannot change the tablespace name in the repository database properties
after you create the service. If you create a PowerCenter Repository Service
with the wrong tablespace name, delete the PowerCenter Repository Service
and create a new one with the correct tablespace name.
To improve repository performance on IBM DB2 EEE repositories, specify a
tablespace name with one node.
To apply changes, restart the PowerCenter Repository Service.
Default is disabled.
Database Username
Account for the database containing the repository. Set up this account using
the appropriate database client tools. To apply changes, restart the
PowerCenter Repository Service.
Database Password
Advanced Properties
Advanced properties control the performance of the PowerCenter Repository Service and the repository
database.
Property
Description
Error Severity Level
When you specify a severity level, the log includes all errors at that level and
above. For example, if the severity level is Warning, fatal, error, and warning
messages are logged. Use Trace or Debug if Informatica Global Customer
Support instructs you to use that logging level for troubleshooting purposes.
Default is INFO.
Resilience Timeout
Number of objects that the cache can contain when repository agent caching
is enabled. You can increase the number of objects if there is available
memory on the machine where the PowerCenter Repository Service process
runs. The value must not be less than 100. Default is 10,000.
If you update the following properties, restart the PowerCenter Repository Service for the modifications to
take effect:
Make sure Metadata Manager is running. Create a Metadata Manager Service in the Administrator tool or
verify that an enabled Metadata Manager Service exists in the domain that contains the PowerCenter
Repository Service for the PowerCenter repository.
Load the PowerCenter repository metadata. Create a resource for the PowerCenter repository in
Metadata Manager and load the PowerCenter repository metadata into the Metadata Manager warehouse.
Property
Description
Metadata Manager
Service
Name of the Metadata Manager Service used to run data lineage. Select from the
available Metadata Manager Services in the domain.
Resource Name
Custom properties. Configure custom properties that are unique to specific environments.
Environment variables. Configure environment variables for each PowerCenter Repository Service
process.
To view and update properties, select a PowerCenter Repository Service in the Navigator and click the
Processes view.
Environment Variables
The database client path on a node is controlled by an environment variable.
Set the database client path environment variable for the PowerCenter Repository Service process if the
PowerCenter Repository Service process requires a different database client than another PowerCenter
Repository Service process that is running on the same node.
The database client code page on a node is usually controlled by an environment variable. For example,
Oracle uses NLS_LANG, and IBM DB2 uses DB2CODEPAGE. All PowerCenter Integration Services and
PowerCenter Repository Services that run on this node use the same environment variable. You can
configure a PowerCenter Repository Service process to use a different value for the database client code
page environment variable than the value set for the node.
You can configure the code page environment variable for a PowerCenter Repository Service process when
the PowerCenter Repository Service process requires a different database client code page than the
PowerCenter Integration Service process running on the same node.
For example, the PowerCenter Integration Service reads from and writes to databases using the UTF-8 code
page. The PowerCenter Integration Service requires that the code page environment variable be set to
UTF-8. However, you have a Shift-JIS repository that requires that the code page environment variable be
set to Shift-JIS. Set the environment variable on the node to UTF-8. Then add the environment variable to the
PowerCenter Repository Service process properties and set the value to Shift-JIS.
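For example, for an Oracle repository database, the settings might look like the following. The NLS_LANG values are typical Oracle settings shown only as an illustration; use the values that match your database character sets:
Node environment (used by the PowerCenter Integration Service): NLS_LANG=American_America.UTF8
PowerCenter Repository Service process environment variable: NLS_LANG=Japanese_Japan.JA16SJIS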
Resilience. The PowerCenter Repository Service is resilient to the temporary unavailability of other
services and the repository database. PowerCenter Repository Service clients are resilient to temporary
failures of the connection with the PowerCenter Repository Service.
Restart and failover. If the PowerCenter Repository Service fails, the Service Manager can restart the
service or fail it over to another node, based on node availability.
Recovery. After restart or failover, the PowerCenter Repository Service can recover operations from the
point of interruption.
Resilience
The PowerCenter Repository Service is resilient to temporary unavailability of PowerCenter Repository
Service clients and the PowerCenter Repository database.
An application service can be unavailable because of network failure or because a service process fails. You
can configure the resilience timeout for the connection between the PowerCenter Repository Service and the
following components:
PowerCenter Repository Service Clients
A PowerCenter Repository Service client can be a PowerCenter Client or a PowerCenter service that
depends on the PowerCenter Repository Service. For example, the PowerCenter Integration Service is a
PowerCenter Repository Service client because it depends on the PowerCenter Repository Service for a
connection to the repository.
The PowerCenter Repository Service resilience timeout period is based on the resilience properties that
you configure for the PowerCenter Repository Service, PowerCenter Repository Service clients, and the
domain.
Note: The Web Services Hub is not resilient to the PowerCenter Repository Service.
PowerCenter Repository Database
The PowerCenter repository database might become unavailable because of network failure or because
the repository database system becomes unavailable. If the repository database becomes unavailable,
the PowerCenter Repository Service tries to reconnect to the repository database within the period
specified by the database connection timeout configured in the PowerCenter Repository Service
properties.
Tip: If the repository database system has high availability features, set the database connection timeout
to allow the repository database system enough time to become available before the PowerCenter
Repository Service tries to reconnect to it. Test the database system features that you plan to use to
determine the optimum database connection timeout.
The PowerCenter Repository Service process fails and the primary node is not available.
After failover, PowerCenter Repository Service clients synchronize and connect to the PowerCenter
Repository Service process without loss of service.
You can disable a PowerCenter Repository Service process to shut down a node for maintenance. If you
disable a PowerCenter Repository Service process in complete or abort mode, the PowerCenter Repository
Service process fails over to another node.
Recovery
After a PowerCenter Repository Service restarts or fails over, it restores the state of operation from the
repository and recovers operations from the point of interruption.
The PowerCenter Repository Service maintains the state of operation in the repository. The state of
operations includes information about repository locks, requests in progress, and connected clients.
The PowerCenter Repository Service performs the following tasks to recover operations:
Reconnects to clients, such as the PowerCenter Designer and the PowerCenter Integration Service
Sends outstanding notifications about metadata changes, such as workflow schedule changes
CHAPTER 14
PowerCenter Repository
Management
This chapter includes the following topics:
Upgrade a repository.
Upgrade a PowerCenter Repository Service and its dependent services to the latest service version.
Assign privileges and roles to users and groups for the PowerCenter Repository Service.
Upgrade content.
Register plug-ins.
You must disable the PowerCenter Repository Service to run it in exclusive mode.
Note: Before you disable a PowerCenter Repository Service, verify that all users are disconnected from the
repository. You can send a repository notification to inform users that you are disabling the service.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
In the Disable Repository Service dialog box, select to abort all service processes immediately or allow service
processes to complete.
5.
Click OK.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service associated with the service process
you want to enable.
3.
4.
5.
In the Manage tab Actions menu, click Enable Process to enable the service process on the node.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service associated with the service process
you want to disable.
3.
4.
5.
6.
In the dialog box that appears, select to abort service processes immediately or allow service processes
to complete.
7.
Click OK.
Operating Mode
You can run the PowerCenter Repository Service in normal or exclusive operating mode. When you run the
PowerCenter Repository Service in normal mode, you allow multiple users to access the repository to update
content. When you run the PowerCenter Repository Service in exclusive mode, you allow only one user to
access the repository. Set the operating mode to exclusive to perform administrative tasks that require a
single user to access the repository and update the configuration. If a PowerCenter Repository Service has
no content associated with it or if a PowerCenter Repository Service has content that has not been upgraded,
the PowerCenter Repository Service runs in exclusive mode only.
When the PowerCenter Repository Service runs in exclusive mode, it accepts connection requests from the
Administrator tool and pmrep.
Run a PowerCenter Repository Service in exclusive mode to perform the following administrative tasks:
Delete repository content. Delete the repository database tables for the PowerCenter repository.
Enable version control. If you have the team-based development option, you can enable version control
for the repository. A versioned repository can store multiple versions of an object.
Promote a PowerCenter repository. Promote a local repository to a global repository to build a repository
domain.
Register a local repository. Register a local repository with a global repository to create a repository
domain.
Register a plug-in. Register or unregister a repository plug-in that extends PowerCenter functionality.
Before running a PowerCenter Repository Service in exclusive mode, verify that all users are disconnected
from the repository. You must stop and restart the PowerCenter Repository Service to change the operating
mode.
When you run a PowerCenter Repository Service in exclusive mode, repository agent caching is disabled,
and you cannot assign privileges and roles to users and groups for the PowerCenter Repository Service.
Note: You cannot use pmrep to log in to a new PowerCenter Repository Service running in exclusive mode if
the Service Manager has not synchronized the list of users and groups in the repository with the list in the
domain configuration database. To synchronize the list of users and groups, restart the PowerCenter
Repository Service.
1.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
5.
Click OK.
The Administrator tool prompts you to restart the PowerCenter Repository Service.
6.
Verify that you have notified users to disconnect from the repository, and click Yes if you want to log out
users who are still connected.
A warning message appears.
7.
Choose to allow processes to complete or abort all processes, and then click OK.
The PowerCenter Repository Service stops and then restarts. The service status at the top of the right
pane indicates when the service has restarted. The Disable button for the service appears when the
service is enabled and running.
Note: PowerCenter does not provide resilience for a repository client when the PowerCenter Repository
Service runs in exclusive mode.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
5.
Click OK.
The Administrator tool prompts you to restart the PowerCenter Repository Service.
Note: You can also use the infacmd UpdateRepositoryService command to change the operating mode.
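For example, a command along the following lines changes the operating mode to exclusive. The domain, user, and service names are placeholders, and the -so option with the OperatingMode value is an assumption, so verify the syntax in the Command Reference before you use it:
infacmd UpdateRepositoryService -dn MyDomain -un Administrator -pd <password> -sn PCRS_Dev -so OperatingMode=exclusive
Restart the PowerCenter Repository Service after the command completes so that the new operating mode takes effect.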
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select a PowerCenter Repository Service that has no content associated with
it.
3.
On the Manage tab Actions menu, select Repository Content > Create.
The page displays the options to create content.
4.
5.
You must have the team-based development option to enable version control. Enable version control if
you are certain you want to use a versioned repository. You can convert a non-versioned repository to a
versioned repository at any time, but you cannot convert a versioned repository to a non-versioned
repository.
6.
Click OK.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service from which you want to delete the
content.
3.
4.
On the Manage tab Actions menu, click Repository Content > Delete.
5.
6.
If the repository is a global repository, choose to unregister local repositories when you delete the
content.
The delete operation does not proceed if it cannot unregister the local repositories. For example, if a
Repository Service for one of the local repositories is running in exclusive mode, you may need to
unregister that repository before you delete the global repository.
7.
Click OK.
The activity log displays the results of the delete operation.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service for the repository you want to
upgrade.
3.
On the Manage tab Actions menu, click Repository Contents > Upgrade.
4.
5.
Click OK.
The activity log displays the results of the upgrade operation.
2.
In the Administrator tool, click the Manage tab > Services and Nodes view.
3.
4.
5.
6.
7.
8.
Click OK.
The Repository Authentication dialog box appears.
9.
10.
Promote metadata from a local repository to a global repository, making it accessible to all local
repositories in the repository domain.
Network connections between the PowerCenter Repository Services and PowerCenter Integration
Services.
Create a repository and configure it as a global repository. You can specify that a repository is the global
repository when you create the PowerCenter Repository Service. Alternatively, you can promote an
existing local repository to a global repository.
2.
Register local repositories with the global repository. After a local repository is registered, you can
connect to the global repository from the local repository and you can connect to the local repository
from the global repository.
3.
Create user accounts for users performing cross-repository work. A user who needs to connect to
multiple repositories must have privileges for each PowerCenter Repository Service.
When the global and local repositories exist in different Informatica domains, the user must have an
identical user name, password, and security domain in each Informatica domain. Although the user
name, password, and security domain must be the same, the user can be a member of different user
groups and can have a different set of privileges for each PowerCenter Repository Service.
4.
Configure the user account used to access the repository associated with the PowerCenter Integration
Service. To run a session that uses a global shortcut, the PowerCenter Integration Service must access
the repository in which the mapping is saved and the global repository with the shortcut information. You
enable this behavior by configuring the user account used to access the repository associated with the
PowerCenter Integration Service. This user account must have privileges for the following services:
The local PowerCenter Repository Service associated with the PowerCenter Integration Service
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service for the repository you want to
promote.
3.
If the PowerCenter Repository Service is running in normal mode, change the operating mode to
exclusive.
4.
5.
6.
7.
8.
Click OK.
After you promote a local repository, the value of the GlobalRepository property is true in the general
properties for the PowerCenter Repository Service.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service associated with the local
repository.
3.
If the PowerCenter Repository Service is running in normal mode, change the operating mode to
exclusive.
4.
5.
To register a local repository, on the Manage tab Actions menu, click Repository Domain > Register
Local Repository. Continue to the next step. To unregister a local repository, on the Manage tab Actions
menu, click Repository Domain > Unregister Local Repository. Skip to step 11.
6.
Select the Informatica domain of the PowerCenter Repository Service for the global repository.
If the PowerCenter Repository Service is in a domain that does not appear in the list of Informatica
domains, click Manage Domain List to update the list.
The Manage List of Domains dialog box appears.
7.
8.
Property
Description
Domain Name
Host Name
Machine hosting the master gateway node for the linked domain. The machine hosting the
master gateway for the local Informatica Domain must have a network connection to this
machine.
Host Port
Click Add to add more than one domain to the list, and repeat step 7 for each domain.
To edit the connection information for a linked domain, go to the section for the domain you want to
update and click Edit.
To remove a linked domain from the list, go to the section for the domain you want to remove and click
Delete.
9.
10.
11.
Enter the user name, password, and security domain for the user who manages the global PowerCenter
Repository Service.
The Security Domain field appears when the Informatica Domain contains an LDAP security domain.
12.
Enter the user name, password, and security domain for the user who manages the local PowerCenter
Repository Service.
13.
Click OK.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service that manages the local or global
repository.
3.
On the Manage tab Actions menu, click Repository Domain > View Registered Repositories.
For a global repository, a list of local repositories appears.
For a local repository, the name of the global repository appears.
Note: The Administrator tool displays a message if a local repository is not registered with a global
repository or if a global repository has no registered local repositories.
Unregister the local repositories. For each local repository, follow the procedure to unregister a local
repository from a global repository. To move a global repository to another Informatica domain,
unregister all local repositories associated with the global repository.
2.
Create the PowerCenter Repository Services using existing content. For each repository in the target
domain, follow the procedure to create a PowerCenter Repository Service using the existing repository
content in the source Informatica domain.
Verify that users and groups with privileges for the source PowerCenter Repository Service exist in the
target domain. The Service Manager periodically synchronizes the list of users and groups in the
repository with the users and groups in the domain configuration database. During synchronization,
users and groups that do not exist in the target domain are deleted from the repository.
You can use infacmd to export users and groups from the source domain and import them into the target
domain.
3.
Register the local repositories. For each local repository in the target Informatica domain, follow the
procedure to register a local repository with a global repository.
View locks. View object locks and lock type. The PowerCenter repository locks repository objects and
folders by user. The repository uses locks to prevent users from duplicating or overwriting work. The
repository creates different types of locks depending on the task.
Close connections and release locks. Terminate residual connections and locks. When you close a
connection, you release all locks associated with that connection.
Viewing Locks
You can view locks and identify residual locks in the Administrator tool.
1.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service with the locks that you want to
view.
3.
4.
Description
Server Thread ID
Folder
Object Type
Object Name
Lock Type
Lock Name
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service with the locks that you want to
view.
3.
4.
Property
Description
Connection ID
Status
Connection status.
Username
Security Domain
Application
Service
Host Name
Host Address
Host Port
Port number of the machine hosting the repository client used to communicate with the
repository.
Property
Description
Process ID
Login Time
Time of the last metadata transaction between the repository client and the repository.
A residual repository connection also retains all repository locks associated with the connection. If an object
or folder is locked when one of these events occurs, the repository does not release the lock. This lock is
called a residual lock.
If a system or network problem causes a repository client to lose connectivity to the repository, the
PowerCenter Repository Service detects and closes the residual connection. When the PowerCenter
Repository Service closes the connection, it also releases all repository locks associated with the connection.
A PowerCenter Integration Service may have multiple connections open to the repository. If you close one
PowerCenter Integration Service connection to the repository, you close all connections for that service.
Important: Closing an active connection can cause repository inconsistencies. Close residual connections
only.
To close a connection and release locks:
1.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service with the connection you want to
close.
3.
4.
5.
6.
7.
Click OK.
The PowerCenter Repository Service closes connections and releases all locks associated with the
connections.
2.
3.
4.
Click OK.
The PowerCenter Repository Service sends the notification message to the PowerCenter Client users. A
message box informs users that the notification was received. The message text appears on the
Notifications tab of the PowerCenter Client Output window.
1.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service for the repository you want to back
up.
3.
On the Manage tab Actions menu, select Repository Contents > Back Up.
4.
5.
Enter a file name and description for the repository backup file.
Use an easily distinguishable name for the file. For example, if the name of the repository is
DEVELOPMENT, and the backup occurs on May 7, you might name the file DEVELOPMENTMay07.rep.
If you do not include the .rep extension, the PowerCenter Repository Service appends that extension to
the file name.
6.
If you use the same file name that you used for a previous backup file, select whether or not to replace
the existing file with the new backup file.
To overwrite an existing repository backup file, select Replace Existing File. If you specify a file name
that already exists in the repository backup directory and you do not choose to replace the existing file,
the PowerCenter Repository Service does not back up the repository.
7.
Choose to skip or back up workflow and session logs, deployment group history, and MX data. You
might want to skip these operations to increase performance when you restore the repository.
8.
Click OK.
The results of the backup operation appear in the activity log.
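If you script repository backups instead of using the Administrator tool, the pmrep Backup command performs the same operation. The following is a minimal sketch that assumes you have already connected to the repository with pmrep connect; the -o option names the backup file, and the options that skip workflow and session logs, deployment group history, and MX data are listed in the Command Reference:
pmrep backup -o DEVELOPMENTMay07.rep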
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service for a repository that has been
backed up.
3.
On the Manage tab Actions menu, select Repository Contents > View Backup Files.
The list of the backup files shows the repository version and the options skipped during the backup.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service that manages the repository
content you want to restore.
3.
On the Manage tab Actions menu, click Repository Contents > Restore.
The Restore Repository Contents options appear.
4.
5.
6.
Optionally, choose to skip restoring the workflow and session logs, deployment group history, and
Metadata Exchange (MX) data to improve performance.
7.
Click OK.
The activity log indicates whether the restore operation succeeded or failed.
Note: When you restore a global repository, the repository becomes a standalone repository. After
restoring the repository, you need to promote it to a global repository.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the PowerCenter Repository Service to which you want to add copied
content.
You cannot copy content to a repository that has content. If necessary, back up and delete existing
repository content before copying in the new content.
3.
On the Manage tab Actions menu, click Repository Contents > Copy From.
The dialog box displays the options for the Copy From operation.
4.
5.
Enter a user name, password, and security domain for the user who manages the repository from which
you want to copy content.
The Security Domain field appears when the Informatica domain contains an LDAP security domain.
6.
To skip copying the workflow and session logs, deployment group history, and Metadata Exchange (MX)
data, select the check boxes in the advanced options. Skipping this data can increase performance.
7.
Click OK.
The activity log displays the results of the copy operation.
2.
In the Administrator tool, click the Manage tab > Services and Nodes view.
3.
In the Domain Navigator, select the PowerCenter Repository Service to which you want to add the plug-in.
4.
5.
6.
On the Register Plugin page, click the Browse button to locate the plug-in file.
7.
If the plug-in was registered previously and you want to overwrite the registration, select the check box
to update the existing plug-in registration. For example, you can select this option when you upgrade a
plug-in to the latest version.
8.
9.
Click OK.
The PowerCenter Repository Service registers the plug-in with the repository. The results of the
registration operation appear in the activity log.
10.
2.
In the Administrator tool, click the Manage tab > Services and Nodes view.
3.
In the Domain Navigator, select the PowerCenter Repository Service from which you want to remove the
plug-in.
4.
5.
6.
7.
Click OK.
8.
Audit Trails
You can track changes to users, groups, and permissions on repository objects by selecting the
SecurityAuditTrail configuration option in the PowerCenter Repository Service properties in the Administrator
tool. When you enable the audit trail, the PowerCenter Repository Service logs security changes to the
PowerCenter Repository Service log. The audit trail logs the following operations:
Repository Statistics
Almost all PowerCenter repository tables use at least one index to speed up queries. Most databases keep
and use column distribution statistics to determine which index to use to execute SQL queries optimally.
Database servers do not update these statistics continuously.
In frequently used repositories, these statistics can quickly become outdated, and SQL query optimizers
might not choose the best query plan. In large repositories, choosing a sub-optimal query plan can have a
negative impact on performance. Over time, repository operations gradually become slower.
Informatica identifies and updates the statistics of all repository tables and indexes when you copy, upgrade,
and restore repositories. You can also update statistics using the pmrep UpdateStatistics command.
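For example, the following pmrep sketch connects to the repository and then updates the statistics. The repository, domain, and user names are placeholders:
pmrep connect -r PCRS_Dev -d MyDomain -n Administrator -x <password>
pmrep updatestatistics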
By skipping this information, you reduce the time it takes to copy, back up, or restore a repository.
You can also skip this information when you use the pmrep commands.
CHAPTER 15
PowerExchange Listener Service
Create a service.
You can also use the infacmd pwx commands to perform many of these tasks.
Before you create a Listener Service, install PowerExchange and configure a PowerExchange Listener on the
node where you want to create the Listener Service. When you create a Listener Service, the Service
Manager associates it with the PowerExchange Listener on the node. When you start or stop the Listener
Service, the PowerExchange Listener also starts or stops.
If you created the application service through Informatica Administrator, the node name value that
you specified in the Start Parameters property.
If you created the application service through the infacmd pwx CreateListenerService command, the
node name value that you specified for the -StartParameters option on the command.
Use the same port number that you specify for the SVCNODE Port Number configuration property for the
service.
Define the following DBMOVER statement on each node where an Informatica client tool or integration
service that connects to the Listener runs:
NODE
Configures the Informatica client tool or integration service to connect to the PowerExchange Listener at
the specified IP address or host name or to locate the Listener Service in the domain.
To configure the client tool or integration service to locate the Listener Service in the domain, include the
optional service_name parameter in the NODE statement. The service_name parameter identifies the
node, and the port parameter in the NODE statement identifies the port number.
Note: If the NODE statement does not include the service_name parameter, the Informatica client tool or
integration service connects directly to the Listener at the specified IP address or host name. It does not
locate the Listener Service in the domain.
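For example, a minimal NODE statement that connects a client directly to a Listener on the default PowerExchange port might look like the following; the node name and host name are placeholders:
NODE=(node1,TCPIP,pwxhost.example.com,2480)
To locate the Listener Service in the domain instead, add the optional service_name parameter in its documented position in the NODE statement. On the machine where the PowerExchange Listener runs, the SVCNODE statement ties the service name to the SVCNODE port, for example SVCNODE=(PWXListenerService,6090); treat these forms as sketches and confirm the exact parameter positions in the PowerExchange Reference Manual.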
For more information about customizing the DBMOVER configuration file for bulk data movement or CDC
sessions, see the following guides:
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
Enter the general properties for the service, and click Next.
For more information, see PowerExchange Listener Service General Properties on page 315.
4.
5.
Click OK.
6.
To enable the Listener Service, select the service in the Domain Navigator and click Enable the
Service.
Property
Description
Name
Name of the service. The name is not case sensitive and must be unique within the
domain. It cannot exceed 128 characters or begin with @. It also cannot contain
spaces or the following special characters:
`~%^*+={}\;:'"/?.,<>|!()][
You cannot change the name of the service after you create it.
Description
Location
Domain and folder where the service is created. Click Browse to choose a different
folder. You can move the service after you create it.
Node
License
Backup Nodes
If your license includes high availability, nodes on which the service can run if the
primary node is unavailable.
Property
Description
Service Process
Read only. Type of PowerExchange process that the service manages. For the
Listener Service, the service process is named Listener.
Start Parameters
Parameters to include when you start the Listener Service. Separate the
parameters with the space character.
You can include the following parameters:
- service_name
Required. Name that identifies the Listener Service. This name must match the name
in the LISTENER statement in the DBMOVER configuration file on the machine where
the PowerExchange Listener runs.
- config=directory
Optional. Specifies the full path and file name for a DBMOVER configuration file that
overrides the default dbmover.cfg file in the installation directory.
This override file takes precedence over any other override configuration file that you
optionally specify with the PWX_CONFIG environment variable.
- license=directory/license_key_file
Optional. Specifies the full path and file name for any license key file that you want to
use instead of the default license.key file in the installation directory. This override
license key file must have a file name or path that is different from that of the default
file.
This override file takes precedence over any other override license key file that you
optionally specify with the PWX_LICENSE environment variable.
Note: In the config and license parameters, you must provide the full path only if the
file does not reside in the installation directory. Include double quotation marks
around any path and file name that contains spaces.
SVC NODE Port Number
Specifies the port on which the Listener Service connects to the PowerExchange
Listener.
Use the same port number that is specified in the SVCNODE statement of the
DBMOVER file.
Property
Description
Environment Variables
Environment variables that are defined for the Listener Service process.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
5.
Click OK.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
Select the service in the Domain Navigator, and click Disable the Service.
2.
3.
Complete. Allows all Listener subtasks to run to completion before shutting down the service and the
Listener Service process. Corresponds to the PowerExchange Listener CLOSE command.
Stop. Waits up to 30 seconds for subtasks to complete, and then shuts down the service and the
Listener Service process. Corresponds to the PowerExchange Listener CLOSE FORCE command.
Abort. Stops all processes immediately and shuts down the service.
Click OK.
For more information about the CLOSE and CLOSE FORCE commands, see the PowerExchange Command
Reference.
Note: After you select an option and click OK, the Administrator tool displays a busy icon until the service
stops. If you select the Complete option but then want to disable the service more quickly with the Stop or
Abort option, you must issue the infacmd isp disableService command.
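For example, a command along the following lines disables the Listener Service from the command line. The domain, user, and service names are placeholders, and the option names follow the common infacmd pattern; check the infacmd isp DisableService syntax in the Command Reference for the option that selects the Complete, Stop, or Abort mode:
infacmd isp DisableService -dn MyDomain -un Administrator -pd <password> -sn PWX_ListenerService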
In the Logs tab, select the Domain view. You can filter on any of the columns.
In the Logs tab, click the Service view. In the Service Type column, select PowerExchange Listener
Service. In the Service Name list, optionally select the name of the service.
On the Manage tab, click the Domain view. Click the Listener Service Actions menu, and then select
View Logs.
Messages appear by default in time stamp order, with the most recent messages on top.
CHAPTER 16
PowerExchange Logger Service
Create a service.
You can use the Administrator tool or the infacmd command line program to administer the Logger Service.
Before you create a Logger Service, install PowerExchange and configure a PowerExchange Logger on the
node where you want to create the Logger Service. When you create a Logger Service, the Service Manager
associates it with the PowerExchange Logger that you specify. When you start or stop the Logger Service,
you also start or stop the Logger Service process.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
Click OK.
5.
To enable the Logger Service, select the service in the Navigator and click Enable the Service.
Property
Description
Name
Name of the service. The name is not case sensitive and must be unique within
the domain. It cannot exceed 128 characters or begin with @. It also cannot
contain spaces or the following special characters:
`~%^*+={}\;:'"/?.,<>|!()][
You cannot change the name of the service after you create it.
Description
Location
Domain and folder where the service is created. Click Browse to choose a
different folder. You can move the service after you create it.
Node
License
Backup Nodes
If your license includes high availability, nodes on which the service can run if the
primary node is unavailable.
coldstart={Y|N}
Indicates whether to cold start or warm start the Logger Service. Enter Y to cold start the Logger
Service. If the CDCT file contains log entries, the Logger Service deletes these entries. Enter N to
warm start the Logger Service from the restart point that is specified in the CDCT file. If no restart
information exists in the CDCT file, the Logger Service ends with an error.
Default is N.
config=directory/pwx_config_file
Specifies the full path and file name for a dbmover configuration file that overrides the default
dbmover.cfg file. The override file must have a path or file name that is different from that of the
default file. This override file takes precedence over any configuration file that you optionally specify
in the PWX_CONFIG environment variable.
cs=directory/pwxlogger_config_file
Specifies the full path and file name for a Logger Service configuration file that overrides the default
pwxccl.cfg configuration file. The override file must have a path or file name that is different from that
of the default file.
encryptepwd=encrypted_password
A password in encrypted format for enabling the encryption of PowerExchange Logger log files. With
this password, the PowerExchange Logger can generate a unique encryption key for each Logger log
file. The password is stored in the CDCT file in encrypted format. For security purposes, the
password is not stored in CDCT backup files and is not displayed in the CDCT reports that you can
generate with the PowerExchange PWXUCDCT utility.
If you specify this parameter, you must also specify coldstart=Y.
If you specify this parameter and also specify the ENCRYPTEPWD parameter in the PowerExchange
Logger configuration file, pwxccl.cfg, the parameter in the configuration file takes precedence. If you
specify this parameter and also specify the ENCRYPTPWD parameter in the PowerExchange Logger
configuration file, an error occurs.
You can set the AES algorithm to use for log file encryption in the ENCRYPTOPT parameter of the
pwxccl.cfg file. The default is AES128.
Tip: For optimal security, Informatica recommends that you specify the encryption password when
cold starting the PowerExchange Logger rather than in the pwxccl.cfg configuration file. This practice
can reduce the risk of malicious access to the encryption password for the following reasons: 1) The
encryption password is not stored in the pwxccl.cfg file, and 2) You can remove the password from
the command line after a successful cold start. If you specify the encryption password for a cold start
and then need to restore the CDCT file later, you must enter the same encryption password in the
RESTORE_CDCT command of the PWXUCDCT utility.
To not encrypt PowerExchange Logger log files, do not enter an encryption password.
license=directory/license_key_file
Specifies the full path and file name for a license key file that overrides the default license.key file.
The override file must have a path or file name that is different from that of the default file. This
override file takes precedence over any license key file that you optionally specify in the
PWX_LICENSE environment variable.
specialstart={Y|N}
Indicates whether to perform a special start of the PowerExchange Logger. A special start begins
PowerExchange capture processing from the point in the change stream that you specify in the
pwxccl.cfg file. This start point overrides the restart point from the CDCT file for the PowerExchange
Logger run. A special start does not delete any content from the CDCT file.
Use this parameter to skip beyond problematic parts in the source logs without losing captured data.
For example, use a special start in the following situations:
- You do not want the PowerExchange Logger to capture an upgrade of an Oracle catalog. In this
case, stop the PowerExchange Logger before the upgrade. After the upgrade is complete, generate
new sequence and restart tokens for the PowerExchange Logger based on the post-upgrade SCN.
Enter these token values in the SEQUENCE_TOKEN and RESTART_TOKEN parameters in the
pwxccl.cfg, and then special start the PowerExchange Logger.
- You do not want the PowerExchange Logger to reprocess old, unavailable logs that were caused by
outstanding UOWs that are not of CDC interest. In this case, stop the PowerExchange Logger. Edit
the RESTART_TOKEN value to reflect the SCN of the earliest available log, and then perform a
special start. If any of the outstanding UOWs that started before this restart point are of CDC
interest, data might be lost.
Valid values:
- Y. Perform a special start of the PowerExchange Logger from the point in the change stream that is
specified in the pwxccl.cfg configuration file.
- N. Do not perform a special start.
Default is N.
Note: In the config, cs, and license parameters, provide the full path only if the file does not reside in the
PowerExchange installation directory. Include quotes around any path and file name that contains
spaces.
SVC NODE Port Number
Specifies the port on which the Logger Service connects to the PowerExchange Logger.
Use the same port number that is in the SVCNODE statement of the DBMOVER file.
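For example, a Start Parameters value for a warm start that points the Logger Service at override configuration files might look like the following, with the parameters separated by spaces; the file paths are placeholders:
coldstart=N config=/opt/informatica/pwx/dbmover_logger.cfg cs=/opt/informatica/pwx/pwxccl_orders.cfg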
2.
3.
4.
Click OK.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
Property
Description
Environment Variables
Select the service in the Domain Navigator, and click Disable the Service.
2.
3.
Complete. Initiates a controlled shutdown of all processes and shuts down the service. Corresponds
to the PowerExchange SHUTDOWN command.
Abort. Stops all processes immediately and shuts down the service.
Click OK.
In the Logs tab, select the Domain view. You can filter on any of the columns.
In the Logs tab, click the Service view. In the Service Type column, select PowerExchange Logger
Service. In the Service Name list, optionally select the name of the service.
On the Manage tab, click the Domain view. Click the Logger Service Actions menu, and then select
View Logs.
Messages appear by default in time stamp order, with the most recent messages on top.
CHAPTER 17
Reporting Service
This chapter includes the following topics:
PowerCenter repository. Choose the associated PowerCenter Repository Service and specify the
PowerCenter repository details to run PowerCenter Repository Reports.
Metadata Manager warehouse. Choose the associated Metadata Manager Service and specify the
Metadata Manager warehouse details to run Metadata Manager Reports.
Data Profiling warehouse. Choose the Data Profiling option and specify the data profiling warehouse
details to run Data Profiling Reports.
Other reporting sources. Choose the Other Reporting Sources option and specify the data warehouse
details to run custom reports.
Data Analyzer stores metadata for schemas, metrics and attributes, queries, reports, user profiles, and other
objects in the Data Analyzer repository. When you create a Reporting Service, specify the Data Analyzer
repository details. The Reporting Service configures the Data Analyzer repository with the metadata
corresponding to the selected data source.
You can create multiple Reporting Services on the same node. Specify a data source for each Reporting
Service. To use multiple data sources with a single Reporting Service, create additional data sources in Data
Analyzer. After you create the data sources, follow the instructions in the Data Analyzer Schema Designer
Guide to import table definitions and create metrics and attributes for the reports.
When you enable the Reporting Service, the Administrator tool starts Data Analyzer. Click the URL in the
Properties view to access Data Analyzer.
The name of the Reporting Service is the name of the Data Analyzer instance and the context path for the
Data Analyzer URL. The Data Analyzer context path can include only alphanumeric characters, hyphens (-),
and underscores (_). If the name of the Reporting Service includes any other character, PowerCenter
replaces the invalid characters with an underscore and the Unicode value of the character. For example, if
the name of the Reporting Service is ReportingService#3, the context path of the Data Analyzer URL is the
Reporting Service name with the # character replaced with _35. For example:
http://<HostName>:<PortNumber>/ReportingService_353
Source and target metadata. Includes shortcuts, descriptions, and corresponding database names and
field-level attributes.
Transformation metadata in mappings and mapplets. Includes port-level details for each transformation.
Mapping and mapplet metadata. Includes the targets, transformations, and dependencies for each
mapping.
Workflow and worklet metadata. Includes schedules, instances, events, and variables.
Session metadata. Includes session execution details and metadata extensions defined for each session.
Change management metadata. Includes versions of sources, targets, labels, and label properties.
Composite reports. Display a set of sub-reports and the associated metadata. The sub-reports can be
multiple report types in Data Analyzer.
Metadata reports. Display basic metadata about a data profile. The Metadata reports provide the source-level and column-level functions in a data profile, and historic statistics on previous runs of the same data
profile.
Summary reports. Display data profile results for source-level and column-level functions in a data profile.
Create the Data Analyzer repository. Create a database for the Data Analyzer repository. If you create a
Reporting Service for an existing Data Analyzer repository, you can use the existing database. When you
enable a Reporting Service that uses an existing Data Analyzer repository, PowerCenter does not import
the metadata for the prepackaged reports.
Create PowerCenter Repository Services and Metadata Manager Services. To create a Reporting Service
for the PowerCenter Repository Service or Metadata Manager Service, create the application service in
the domain.
1.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
Property
Description
Name
Name of the Reporting Service. The name is not case sensitive and must be unique
within the domain. It cannot exceed 128 characters or begin with @. It also cannot
contain spaces or the following special characters:
`~%^*+={}\;:'"/?.,<>|!()][
Description
Description of the Reporting Service. The description cannot exceed 765 characters.
Location
Domain and folder where the service is created. Click Browse to choose a different
folder. You can move the Reporting Service after you create it.
License
License that allows the use of the service. Select from the list of licenses available in the
domain.
Primary Node
Node on which the service process runs. Because the Reporting Service is not highly
available, it runs on only one node.
Property
Description
Enable HTTP
on port
The TCP port that the Reporting Service uses. Enter a value between 1 and 65535.
Enable HTTPS
on port
The SSL port that the Reporting Service uses for secure connections. You can edit the
value if you have configured the HTTPS port for the node where you create the Reporting
Service. Enter a value between 1 and 65535 and ensure that it is not the same as the
HTTP port. If the node where you create the Reporting Service is not configured for the
HTTPS port, you cannot configure HTTPS for the Reporting Service.
Data Source Advanced Mode
Edit mode that determines where you can edit Datasource properties.
When enabled, the edit mode is advanced, and the value is true. In advanced edit mode,
you can edit Datasource and Dataconnector properties in the Administrator tool and the
Data Analyzer instance.
When disabled, the edit mode is basic, and the value is false. In basic edit mode, you can
edit Datasource properties in the Administrator tool.
Note: After you enable the Reporting Service in advanced edit mode, you cannot change
it back to basic edit mode.
4.
Click Next.
5.
Property
Description
Database Type
Repository Host
Repository Port
The port number on which you configure the database server listener service.
Repository Name
SID/Service Name
For database type Oracle only. Indicates whether to use the SID or service name
in the JDBC connection string. For Oracle RAC databases, select from Oracle
SID or Oracle Service Name. For other Oracle databases, select Oracle SID.
Repository Username
Account for the Data Analyzer repository database. Set up this account from the
appropriate database client tools.
Repository Password
Tablespace Name
Tablespace name for DB2 repositories. When you specify the tablespace name,
the Reporting Service creates all repository tables in the same tablespace.
Required if you choose DB2 as the Database Type.
Note: Data Analyzer does not support DB2 partitioned tablespaces for the
repository.
Additional JDBC
Parameters
6.
Click Next.
7.
Property
Description
Reporting Source
Source of data for the reports. Choose from one of the following options:
- Data Profiling
- PowerCenter Repository Services
- Metadata Manager Services
- Other Reporting Sources
Data Source
Driver
Data Source
JDBC URL
Displays the JDBC URL based on the database driver you select. For example, if you
select the Oracle driver as your data source driver, the data source JDBC URL
displays the following: jdbc:informatica:oracle://[host]:1521;SID=[sid];.
Enter the database host name and the database service name.
For an Oracle data source driver, specify the SID or service name of the Oracle
instance to which you want to connect. To indicate the service name, modify the JDBC
URL to use the ServiceName parameter:
jdbc:informatica:oracle://[host]:1521;ServiceName=[Service Name];
To configure Oracle RAC as a data source, specify the following URL:
jdbc:informatica:oracle://[hostname]:1521;ServiceName=[Service Name];
AlternateServers=(server2:1521);LoadBalancing=true
8.
Data Source
Password
Displays the table name used to test the connection to the data source. The table
name depends on the data source driver you select.
Enter the PowerCenter repository user name, the Metadata Manager repository user
name, or the data warehouse user name based on the service you want to report on.
Click Finish.
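For illustration only, filled-in versions of the data source JDBC URL formats described in the previous steps might look like the following. The host names, SID, and service name are hypothetical placeholders; substitute the values for your environment:
jdbc:informatica:oracle://dbhost01:1521;SID=orcl;
jdbc:informatica:oracle://dbhost01:1521;ServiceName=sales.example.com;
jdbc:informatica:oracle://rachost01:1521;ServiceName=sales.example.com;AlternateServers=(rachost02:1521);LoadBalancing=true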
Note: You must disable the Reporting Service in the Administrator tool to perform tasks related to repository
content.
Function
Basic Mode
Advanced Mode
Datasource
No
Yes
Datasource
Enable/disable
Yes
Yes
Dataconnector
Activate/deactivate
Yes
Yes
Dataconnector
No
Yes
Dataconnector
No
Yes
Dataconnector
Yes
Yes
Dataconnector
No
Yes
Basic Mode
When you set the Data Source Advanced Mode property to false, the edit mode is basic and you can manage
the Datasource in the Administrator tool. Datasource and Dataconnector properties are read-only in the Data
Analyzer instance. You can edit the Primary Time Dimension property of the data source. By default, the edit
mode is basic.
Advanced Mode
When you set the Data Source Advanced Mode property to true, the edit mode is advanced and you can
manage the Datasource and Dataconnector in the Administrator tool and the Data Analyzer instance. You
cannot return to the basic edit mode after you select the advanced edit mode. A Dataconnector has a primary
data source that you can configure to use a JDBC, Web Service, or XML data source type.
When you enable a Reporting Service, the Administrator tool starts Data Analyzer on the node designated to
run the service. Click the URL in the Properties view to open Data Analyzer in a browser window and run the
reports.
You can also launch Data Analyzer from the PowerCenter Client tools, from Metadata Manager, or by
accessing the Data Analyzer URL from a browser.
To enable the service, select the service in the Navigator and click Actions > Enable.
To disable the service, select the service in the Navigator and click Actions > Disable.
Note: Before you disable a Reporting Service, ensure that all users are disconnected from Data Analyzer.
To recycle the service, select the service in the Navigator and click Actions > Recycle.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the Reporting Service that manages the repository for which you want to
create content.
3.
4.
Select the user assigned the Administrator role for the domain.
5.
Click OK.
The activity log indicates the status of the content creation action.
6.
Enable the Reporting Service after you create the repository content.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the Reporting Service that manages the repository content you want to
back up.
3.
4.
Or you can enter a full directory path with the backup file name to copy the backup file to a different
location.
5.
6.
Click OK.
The activity log indicates the results of the backup action.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the Reporting Service that manages the repository content you want to
restore.
3.
4.
Select a repository backup file, or select other and provide the full path to the backup file.
5.
Click OK.
The activity log indicates the status of the restore operation.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the Reporting Service that manages the repository content you want to
delete.
3.
4.
Verify that you backed up the repository before you delete the contents.
5.
Click OK.
The activity log indicates the status of the delete operation.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the Reporting Service for which you want to view the last activity log.
3.
General Properties. Include the Data Analyzer license key used and the name of the node where the
service runs.
Reporting Service Properties. Include the TCP port where the Reporting Service runs, the SSL port if you
have specified it, and the Data Source edit mode.
Data Source Properties. Include the data source driver, the JDBC URL, and the data source database
user account and password.
Repository Properties. Include the Data Analyzer repository database user account and password.
To view and update properties, select the Reporting Service in the Navigator. In the Properties view, click
Edit in the properties section that you want to edit. If you update any of the properties, restart the Reporting
Service for the modifications to take effect.
General Properties
You can view and edit the general properties after you create the Reporting Service.
Click Edit in the General Properties section to edit the general properties.
The following table describes the general properties for the service:
Property
Description
Name
Name of the service. The name is not case sensitive and must be unique within the domain. It
cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following
special characters:
`~%^*+={}\;:'"/?.,<>|!()][
You cannot change the name of the service after you create it.
Description
License
License object that allows use of the service. To apply changes, restart the Reporting Service.
Node
Node on which the service runs. You can move a Reporting Service to another node in the
domain. Informatica disables the Reporting Service on the original node and enables it on the
new node. You can see the Reporting Service on both nodes, but it runs only on the new
node.
If you move the Reporting Service to another node, you must reapply the custom color
schemes to the Reporting Service. Informatica does not copy the color schemes to the
Reporting Service on the new node, but retains them on the original node.
Description
HTTP Port
The TCP port that the Reporting Service uses. You can change this value. To apply changes,
restart the Reporting Service.
HTTPS Port
The SSL port that the Reporting Service uses for secure connections. You can edit the value if
you have configured the HTTPS port for the node where you create the Reporting Service. If
the node where you create the Reporting Service is not configured for the HTTPS port, you
cannot configure HTTPS for the Reporting Service. To apply changes, restart the Reporting
Service.
Data Source
Advanced
Mode
Edit mode that determines where you can edit Datasource properties.
When enabled, the edit mode is advanced, and the value is true. In advanced edit mode, you
can edit Datasource and Dataconnector properties in the Data Analyzer instance.
When disabled, the edit mode is basic, and the value is false. In basic edit mode, you can edit
Datasource properties in the Administrator tool.
Note: After you enable the Reporting Service in advanced edit mode, you cannot change it
back to basic edit mode.
Note: If multiple Reporting Services run on the same node, you need to stop all the Reporting Services on
that node to update the port configuration.
Use the Administrator tool to manage the data source and data connector for the reporting source. To view or
edit the Datasource or Dataconnector in the advanced mode, click the data source or data connector link in
the Administrator tool.
You can create multiple data sources in Data Analyzer. You manage the data sources you create in Data
Analyzer within Data Analyzer. Changes you make to data sources created in Data Analyzer will not be lost
when you restart the Reporting Service.
The following table describes the data source properties that you can edit:
Property
Description
Reporting Source
The service which the Reporting Service uses as the data source.
The driver that the Reporting Service uses to connect to the data source.
Note: The Reporting Service uses the DataDirect drivers included with the
Informatica installation. Informatica does not support the use of any other
database driver.
The JDBC connect string that the Reporting Service uses to connect to the data
source.
The test table that the Reporting Service uses to verify the connection to the data
source.
Repository Properties
Repository properties provide information about the database that stores the Data Analyzer repository
metadata. Specify the database properties when you create the Reporting Service. After you create a
Reporting Service, you can modify some of these properties.
Note: If you edit a repository property or restart the system that hosts the repository database, you need to
restart the Reporting Service.
Click Edit in the Repository Properties section to edit the properties.
The following table describes the repository properties that you can edit:
Property
Description
Database Driver
The JDBC driver that the Reporting Service uses to connect to the Data Analyzer repository
database. To apply changes, restart the Reporting Service.
Repository Host
Name of the machine that hosts the database server. To apply changes, restart the
Reporting Service.
Repository Port
The port number on which you have configured the database server listener service. To
apply changes, restart the Reporting Service.
Repository Name
The name of the database service. To apply changes, restart the Reporting Service.
SID/Service Name
For repository type Oracle only. Indicates whether to use the SID or service name in the
JDBC connection string. For Oracle RAC databases, select from Oracle SID or Oracle
Service Name. For other Oracle databases, select Oracle SID.
Repository User
Account for the Data Analyzer repository database. To apply changes, restart the Reporting
Service.
Repository
Password
Data Analyzer repository database password corresponding to the database user. To apply
changes, restart the Reporting Service.
Tablespace Name
Tablespace name for DB2 repositories. When you specify the tablespace name, the
Reporting Service creates all repository tables in the same tablespace. To apply changes,
restart the Reporting Service.
Additional JDBC
Parameters
User accounts. Create users in the Informatica domain. Use the Security tab of the Administrator tool to
create users.
Privileges and roles. You assign privileges and roles to users and groups for a Reporting Service. Use the
Security tab of the Administrator tool to assign privileges and roles to a user.
CHAPTER 18
Reports
JasperReports Overview
JasperReports is an open source reporting library that users can embed into any Java application.
Jaspersoft iReports Designer is an application that you can use with JasperReports Server to design reports.
You can run Jaspersoft iReports Designer from the shortcut menu after you install the PowerCenter Client.
Informatica does not support creating custom reports or modifying reports that Informatica provides in
Jaspersoft iReports Designer. For more information about the Jaspersoft iReports Designer, visit the
Jaspersoft community.
Configuration Prerequisites
Before you configure the Reporting and Dashboards Service, you must configure the Jaspersoft repository
based on your environment.
The repository database type can be IBM DB2, Microsoft SQL Server, or Oracle.
Description
Name
Name of the service. The name is not case sensitive and must be unique within the
domain. It cannot exceed 128 characters or begin with @. It also cannot contain
spaces or the following special characters:
`~%^*+={}\;:'"/?.,<>|!()][
You cannot change the name of the service after you create it.
Description
Location
Domain and folder where the service is created. Click Browse to choose a different
folder. You can move the service after you create it.
License
License object that allows use of the service. To apply changes, restart the
Reporting and Dashboards Service.
Node
Description
HTTP Port
Unique HTTP port number for the Reporting and Dashboards Service.
HTTPS Port
HTTPS port number for the Reporting and Dashboards Service when you enable the TLS
protocol. Use a different port number than the HTTP port number.
Keystore File
Path and file name of the keystore file that contains the private or public key pairs and
associated certificates. Required if you enable TLS and use HTTPS connections for the
Reporting and Dashboards Service.
You can create a keystore file with keytool. keytool is a utility that generates and stores private
or public key pairs and associated certificates in a keystore file. When you generate a public or
private key pair, keytool wraps the public key into a self-signed certificate. You can use the self-signed certificate or use a certificate signed by a certificate authority.
Keystore
Password
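As an illustration, you can generate a keystore file that contains a self-signed certificate by running keytool from the JDK. The alias, key size, file name, and validity period shown here are hypothetical values, not requirements:
keytool -genkeypair -alias rds_service -keyalg RSA -keysize 2048 -keystore rds_keystore.jks -validity 365
keytool prompts for a keystore password and the certificate details. Enter the keystore password in the Keystore Password property and the keystore file path in the Keystore File property.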
Description
Database Type
Database type for the Jaspersoft repository database. Select one of the following values
based on the database type:
- oracle
- db2
- sqlserver
Database Password
Connection String
Connection string for the Jaspersoft repository database. Use one of the following formats based on the database type:
- IBM DB2. jdbc:db2://<hostname>:<port>/<databaseName>:driverType=4;fullyMaterializeLobData=true;fullyMaterializeInputStreams=true;progressiveStreaming=2;progressiveLocators=2;currentSchema=<databaseName>;
- Microsoft SQL Server. jdbc:sqlserver://<hostname>:<port>;databaseName=<databaseName>;SelectMethod=cursor
Note: When you use an instance name for Microsoft SQL Server, use the following connection string:
jdbc:sqlserver://<hostname>;instanceName=<dbInstance>;databaseName=<databaseName>;SelectMethod=cursor
- Oracle. jdbc:oracle:thin:@<hostname>:<port>:<SID>
Note: When you use a service name for Oracle, use the following connection string:
jdbc:oracle:thin:@<hostname>:<port>/<ServiceName>
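For example, with hypothetical host, port, and database names, the connection strings might look like the following:
jdbc:db2://dbhost01:50000/jasperdb:driverType=4;fullyMaterializeLobData=true;fullyMaterializeInputStreams=true;progressiveStreaming=2;progressiveLocators=2;currentSchema=jasperdb;
jdbc:sqlserver://dbhost01:1433;databaseName=jasperdb;SelectMethod=cursor
jdbc:oracle:thin:@dbhost01:1521:ORCL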
Description
Maximum Heap
Size
Amount of RAM allocated to the Java Virtual Machine (JVM) that runs the Service. Use this
property to increase the performance. Append one of the following letters to the value to
specify the units:
- b for bytes
- k for kilobytes
- m for megabytes
- g for gigabytes
Java Virtual Machine (JVM) command line options to run Java-based programs. When you
configure the JVM options, you must set the Java SDK classpath, Java SDK minimum
memory, and Java SDK maximum memory properties.
Description
Name
Value
1.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
Specify the general properties of the Reporting and Dashboards Service, and click Next.
4.
Specify the security properties for the Reporting and Dashboards Service, and click Next.
5.
Specify the database properties for the Reporting and Dashboards Service.
6.
Click Test Connection to verify that the database connection configuration is correct.
7.
To use existing content, select Do Not Create New Content. You can create the Reporting and
Dashboards Service with the repository content that exists in the database. Select this option if the
specified database already contains Jasper repository content. This is the default.
To create new content, select Create New Content. You can create Jasper repository content if no
content exists in the database. Select this option to create Jasper repository content in the specified
database.
When you create repository content, the Informatica service platform creates database schema that the
Reporting and Dashboards Service needs. If the specified database already contains the database
schema of an existing Reporting and Dashboards Service, you can use the database without creating
new content.
8.
Click Finish.
After you create a Reporting and Dashboards Service, you can edit the advanced properties for the service in
the Processes tab.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the Reporting and Dashboards Service for the repository that you want to
upgrade.
3.
On the Manage tab Actions menu, click Repository Contents > Upgrade Jasper Repository
Contents.
Reports
You can run the PowerCenter and Metadata Manager reports from JasperReports Server. You can also run
the reports from the PowerCenter Client and Metadata Manager to view them in JasperReports Server.
Reporting Source
To run reports associated with a service, you must add a reporting source for the Reporting and Dashboards
Service.
When you add a reporting source, choose the data source to report against. To run the reports against the
PowerCenter repository, select the associated PowerCenter Repository Service and specify the PowerCenter
repository details. To run the Metadata Manager reports, select the associated Metadata Manager Service
and specify the repository details.
The database type of the reporting source can be IBM DB2, Oracle, Microsoft SQL Server, or Sybase ASE.
Based on the database type, specify the database driver, JDBC URL, and database user credentials. For the
JDBC connect string, specify the host name and the port number. Additionally, specify the SID for Oracle and
specify the database name for IBM DB2, Microsoft SQL Server, and Sybase ASE.
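As an illustration, JDBC connect strings for reporting sources might look like the following. The host names, ports, and database names are hypothetical placeholders, and the exact URL format depends on the database driver that you select:
jdbc:informatica:oracle://dbhost01:1521;SID=orcl
jdbc:informatica:db2://dbhost01:50000;DatabaseName=pcrepo
jdbc:informatica:sqlserver://dbhost01:1433;DatabaseName=pcrepo
jdbc:informatica:sybase://dbhost01:5000;DatabaseName=pcrepo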
For an instance of the Reporting and Dashboards Service, you can create multiple reporting data sources.
For example, to one Reporting and Dashboards Service, you can add a PowerCenter data source and a
Metadata Manager data source.
Select the Reporting and Dashboards Service in the Navigator and click Action > Add Reporting
Source.
2.
Select the PowerCenter Reporting Service or Metadata Manager Service that you want to use as the
data source.
3.
4.
Specify the database driver that the Reporting and Dashboards Service uses to connect to the data
source.
5.
Specify the JDBC connect string based on the database driver you select.
6.
7.
8.
Running Reports
After you create a Reporting and Dashboards Service, add a reporting source to run reports against the data
in the data source.
All reports available for the specified reporting source are available in Jaspersoft Server. Click View >
Repository > Service Name to view the reports.
2.
3.
Repeat the process for all the report folders that you want to export.
2.
3.
Repeat the process for all the report folders that you want to export.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
Modify values for the Reporting and Dashboards Service general properties.
Note: You cannot enable the Reporting and Dashboards Service if you change the node.
4.
5.
Click Edit to create repository contents or to modify the security properties, the database properties, the
advanced properties, and the environment variables.
CHAPTER 19
SAP BW Service
This chapter includes the following topics:
Load Balancing for the SAP BW System and the SAP BW Service
Use the Administrator tool to complete the following SAP BW Service tasks:
View messages that the SAP BW Service sends to the Log Manager.
2.
3.
4.
To create an SAP BW Service for PowerCenter, click Actions > New > PowerCenter SAP BW
Service. The New PowerCenter SAP BW Service window appears.
To create an SAP BW Service for the Developer tool, click Actions > New > SAP BW Service. The
New SAP BW Service window appears.
Description
Name
Description
Location
Name of the domain and folder in which the Administrator tool must create the SAP
BW Service. By default, the Administrator tool creates the SAP BW Service in the
domain where you are connected. Click Browse to select a new folder in the domain.
License
License file.
Node
SAP Destination
R Type
DEST entry defined in the sapnwrfc.ini file to connect to the SAP BW Service.
Associated
Integration
Service
The PowerCenter Integration Service that you want to associate with the SAP BW
Service.
Repository User
Name
Repository
Password
Security Domain
Note: If secure communication is enabled for the domain, you do not need to specify
the repository password.
Security domain for the user. Appears when the Informatica domain contains an LDAP
security domain.
The following table describes the information that you must enter when you create an SAP BW Service
for the Developer tool:
Property
Description
Name
Description
Location
Name of the domain and folder in which the Administrator tool must create the SAP
BW Service. By default, the Administrator tool creates the SAP BW Service in the
domain where you are connected. Click Browse to select a new folder in the domain.
License
License file.
Node
Program ID
Program ID for the logical system that you create in SAP BW for the SAP BW Service.
The Program ID in SAP BW must match this parameter, including case.
Gateway Host
Gateway Server
SAP Connection
Trace
Use this option to track the JCo calls that the SAP system makes. SAP stores the
information about the JCo calls in a trace file.
Specify one of the following values:
- 0. Off
- 1. Full
Default is 0.
You can access the trace files from the following directory on the machine where you
installed the Informatica services:
<Informatica installation directory>/tomcat/bin
Other Connection
Parameters
Associated Data
Integration
Service
The Data Integration Service that you want to associate with the SAP BW Service.
Repository User
Name
Repository
Password
5.
Click OK.
The SAP BW Service is created.
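For reference, the DEST entry that the SAP Destination R Type property refers to registers the SAP BW Service as an RFC server program with the SAP gateway. The following sketch of a sapnwrfc.ini entry uses hypothetical values for the destination name, program ID, gateway host, and gateway server; the PROGRAM_ID value must match the Program ID of the logical system that you create in SAP BW, including case:
DEST=PMSAPBW
TYPE=R
PROGRAM_ID=INFA_BW_SERVICE
GWHOST=sapgateway01.example.com
GWSERV=sapgw00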
When you disable the SAP BW Service, select one of the following options:
Complete. Disables the SAP BW Service after all service processes complete.
Abort. Aborts all processes immediately and then disables the SAP BW Service. You might choose abort if
a service process stops responding.
In the Domain Navigator of the Administrator tool, select the SAP BW Service.
2.
In the Domain Navigator of the Administrator tool, select the SAP BW Service.
2.
3.
2.
In the Properties tab, click Edit corresponding to the category of properties that you want to update.
3.
Update the property values and restart the SAP BW Service for the changes to take effect.
General Properties
The following table describes the general properties for the service:
Property
Description
Name
Name of the service. The name is not case sensitive and must be unique within the domain. It
cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following
special characters:
`~%^*+={}\;:'"/?.,<>|!()][
You cannot change the name of the service after you create it.
Description
License
Node
Property
Description
SAP Destination R Type
DEST entry defined in the sapnwrfc.ini file for a connection to an RFC server program.
Edit this property if you have created a different DEST entry in the sapnwrfc.ini file for
the SAP BW Service.
Retry Period
Number of seconds the SAP BW Service waits before trying to connect to the SAP BW
system if a previous connection failed. The SAP BW Service tries to connect five times.
Between connection attempts, it waits the number of seconds you specify. After five
unsuccessful attempts, the SAP BW Service shuts down.
Default is 5 seconds.
The following table describes the SAP BW Service properties for the Developer tool:
Property
Description
Program ID
Program ID for the logical system you create in SAP BW for the SAP BW Service.
The Program ID in SAP BW must match this parameter, including case.
Gateway Host
Gateway Server
SAP Connection
SAP connection.
Specify a connection to a specific SAP application server or an SAP load balancing
connection.
Trace
Use this option to track the JCo calls that the SAP system makes. SAP stores the information
about the JCo calls in a trace file.
Specify one of the following values:
- 0. Off
- 1. Full
Default is 0.
You can access the trace files from the following directory on the machine where you
installed the Informatica services:
<Informatica installation directory>/tomcat/bin
Other
Connection
Parameters
Retry Period
2.
3.
4.
To configure an SAP BW Service for the Developer tool, click Associated Data Integration Service.
Description
Associated Integration
Service
or
Associated Data
Integration Service
Repository User Name
Repository Password
Security Domain
5.
Security domain for the user. Appears when the Informatica domain contains an
LDAP security domain.
2.
3.
Click Processes.
4.
Click Edit.
5.
Description
ParamFileDir
Temporary parameter file directory. The SAP BW Service stores SAP BW data selection
entries in the parameter file when you filter data to load into SAP BW.
The directory must exist on the node where the SAP BW Service runs. Verify that the
directory you specify has read and write permissions enabled.
The default directory is <Informatica installation directory>/services/
shared/BWParam.
Administrator tool. On the Logs tab, enter search criteria to find log events that the SAP BW Service
captures when extracting from or loading into SAP NetWeaver BI.
SAP BW Monitor. In the Monitor - Administrator Workbench window, you can view log events that the SAP
BW Service captures for an InfoPackage that is included in a process chain to load data into SAP BW.
SAP BW pulls the messages from the SAP BW Service and displays them in the monitor. The SAP BW
Service must be running to view the messages in the SAP BW Monitor.
To view log events about how the Integration Service processes an SAP BW workflow, view the session log
or workflow log.
CHAPTER 20
Search Service
This chapter includes the following topics:
When you create the Search Service, you specify the associated Model Repository Service. The Search
Service determines the associated Data Integration Service based on the Model Repository Service.
To enable search across multiple repositories, the Search Service builds a search index based on content in
one Model repository and one profiling warehouse. To enable search on multiple Model repositories or
multiple profiling warehouses, create multiple Search Services.
The Search Service extracts content, including business glossary terms, from the Model repository
associated with the Model Repository Service. The Search Service extracts column profile results and
domain discovery results from the profiling warehouse associated with the Data Integration Service. The
Search Service also extracts permission information to ensure that the user who submits a search request
has permission to view each object returned in the search results. The Search Service stores the permission
information in a permission cache.
Users can perform a search in the Analyst tool or Business Glossary Desktop. When a user performs a
search in the Analyst tool, the Analyst Service submits the request to the Search Service. When a user
performs a search in Business Glossary Desktop, Business Glossary Desktop submits the request to the
Search Service. The Search Service returns results from the search index based on permissions in the
permission cache.
Search Index
The Search Service performs each search on a search index instead of the Model repository or profiling
warehouse. The search index enables faster searches and searches on content from the Model repository
and profiling warehouse.
The Search Service generates the search index based on content in the Model repository and profiling
warehouse. The Search Service contains extractors to extract content from each repository.
The Search Service contains the following extractors:
Model Repository extractor
Extracts content from a Model repository.
Business Glossary extractor
Extracts business glossary terms from the Model repository.
Profiling Warehouse extractor
Extracts column profiling results and domain discovery results from a profiling warehouse.
The Search Service indexes all content that it extracts. The Search Service maintains one search index for all
extracted content. If a search index does not exist when the Search Service starts, the Search Service
generates the search index.
During the initial extraction, the Search Service extracts and indexes all content. After the first extraction, the
Search Service updates the search index based on content that has been added to, changed in, or removed
from the Model repository and profiling warehouse since the previous extraction. You can configure the
interval at which the Search Service generates the search index.
The Search Service extracts and indexes batches of objects. If it fails to extract or index an object, it retries
the operation. After the third attempt, the Search Service ignores the object, writes an error message to the Search
Service log, and then processes the next object.
The Search Service stores the search index in files in the extraction directory that you specify when you
create the Search Service.
Extraction Interval
The Search Service extracts content based on the interval that you configure. You can configure the interval
when you create the Search Service or update the service properties.
The extraction interval is the number of seconds between each extraction.
The Search Service returns search results from the search index. The search results depend on the
extraction interval. For example, if you set the extraction interval to 360 seconds, a user may have to wait up
to 360 seconds before an object appears in the search results.
A user enters search criteria in the Analyst tool or Business Glossary Desktop.
2.
For a search in the Analyst tool, the corresponding Analyst Service sends the search request to the
Search Service. For a search in Business Glossary Desktop, Business Glossary Desktop sends the
search request to the Search Service.
3.
The Search Service retrieves the search results from the search index based on the search criteria.
4.
The Search Service verifies permissions on each search result and returns objects on which the user
has read permission.
Note: The domain administrator must start the Search Service before the Search Service can return any
search results. If the Search Service is not running when a user performs a search, an error appears.
General properties
Logging options
Search options
Custom properties
If you update any of the properties, recycle the Search Service for the modifications to take effect.
License
License object that allows use of the service.
Node
Node on which the service runs.
Error. Writes ERROR code messages to the log. ERROR messages include connection failures, failures
to save or retrieve metadata, service errors.
Warning. Writes WARNING and ERROR code messages to the log. WARNING errors include recoverable
system failures or warnings.
Info. Writes INFO, WARNING, and ERROR code messages to the log. INFO messages include system
and service change messages.
Tracing. Writes TRACE, INFO, WARNING, and ERROR code messages to the log. TRACE messages log
user request failures such as SQL request failures, mapping run request failures, and deployment failures.
Debug. Writes DEBUG, TRACE, INFO, WARNING, and ERROR code messages to the log. DEBUG
messages are user request logs.
Default is INFO.
Password
An encrypted version of the user password to access the Model repository. Not available for a domain
with Kerberos authentication.
Modify Password
Select to specify a different password than the one associated with the Model repository user. Select this
option if the password changes for a user. Not available for a domain with Kerberos authentication.
Security Domain
LDAP security domain for the Model repository user. The field appears when the Informatica domain
contains an LDAP security domain. Not available for a domain with Kerberos authentication.
Advanced properties
Environment variables
Custom properties
If you update any of the process properties, restart the Search Service for the modifications to take effect.
b for bytes.
k for kilobytes.
m for megabytes.
g for gigabytes.
Default is 768 megabytes. Specify 1 gigabyte if you run the Search Service on a 64-bit machine.
JVM Command Line Options
Java Virtual Machine (JVM) command line options to run Java-based programs.
You must set the following JVM command line options:
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
Optionally, click Browse in the Location field to select the location in the Navigator where you want to
the service to appear.
The Select Folder dialog box appears.
5.
6.
Click OK.
The Select Folder dialog box closes.
7.
Click Next.
The New Search Service - Step 2 of 2 window appears.
8.
9.
Click Finish.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator of the Administrator tool, select the Search Service.
3.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator of the Administrator tool, select the Search Service.
3.
Click the Disable the Service button or the Recycle the Service button.
The Disable Service or Recycle Service dialog box appears.
4.
Stop. Waits up to 30 seconds to complete jobs that are running before disabling or recycling the
service.
Abort. Tries to stop all jobs before aborting them and disabling or recycling the service.
CHAPTER 21
System Services
This chapter includes the following topics:
By default, system services are disabled and are assigned to run on the master gateway node. You can
change the node assignment and enable the service to use the functionality that the service provides.
The domain includes the following system services:
Email Service
The Email Service emails notifications for business glossaries and workflows. Enable the Email Service
to allow users to configure email notifications.
Email Service
The Email Service emails notifications for business glossaries and workflows. Enable the Email Service to
allow users to configure email notifications.
The Email Service emails the following notifications:
Workflow notifications. Workflow notifications include emails sent from Human tasks and Notification tasks
in workflows that the Data Integration Service runs.
The Email Service is associated with a Model Repository Service. The Model repository stores metadata for
the email notifications that users configure. Both the Model Repository Service and the Email Service must
be available for the Email Service to send email notifications.
The Email Service is highly available. High availability enables the Service Manager and the Email Service to
react to network failures and failures of the Email Service. The Email Service has the restart and failover high
availability feature. If an Email Service becomes unavailable, the Service Manager can restart the service on
the same node or on a back-up node.
If the domain uses Kerberos authentication and you set the service principal level at the process level,
create a keytab file for the service. For more information about creating the service principal names and
keytab files, see the Informatica Security Guide.
click Edit in the Properties view. You can change the properties while the service is running, but you must
recycle the service for the changed properties to take effect.
General Properties
The following table describes the general properties for the service:
Property
Description
Name
Name of the service. You cannot change the name of the Email Service.
Description
Node
Backup Nodes
Nodes on which the service can run if the primary node is unavailable.
Description
Model Repository
Service
Username
User name of an administrator user in the Informatica domain. Not available for a domain
with Kerberos authentication.
Password
Password of the administrator user in the Informatica domain. Not available for a domain
with Kerberos authentication.
Workflow notifications. Workflow notifications include emails sent from Human tasks and Notification tasks
in workflows that the Data Integration Service runs.
The following table describes the email server properties for the service:
Property
Description
The SMTP outbound mail server host name. For example, enter the Microsoft
Exchange Server for Microsoft Outlook.
Default is localhost.
Port number used by the outbound SMTP mail server. Valid values are from 1 to
65535. Default is 25.
User name for authentication upon sending, if required by the outbound SMTP mail
server.
Password for authentication upon sending, if required by the outbound SMTP mail
server.
SMTP Authentication
Enabled
Indicates that the SMTP server is enabled for authentication. If true, the outbound mail
server requires a user name and password.
Default is false.
Indicates that the SMTP server uses the TLS protocol. If true, enter the TLS port
number for the SMTP server port property.
Default is false.
Indicates that the SMTP server uses the SSL protocol. If true, enter the SSL port
number for the SMTP server port property.
Default is false.
Email address that the Email Service uses in the From field when sending notification
emails from a workflow. Default is [email protected].
Process Configuration. The state of the process configured to run on the node. The state can be Enabled
or Disabled.
Process State. The state of the service process running on the node. The state can be Enabled or
Disabled.
Node Role. Indicates whether the node has the service role, the compute role, or both roles.
Node Status. The state of the node that the process is running on. The state can be Enabled or Disabled.
Email Service, a service process starts on the node designated to run the service. The service is available to
send emails based on the notification properties that users configure.
You might disable the Email Service if you need to perform maintenance. You might recycle the Email
Service if you connect to a different Model Repository Service.
When you recycle or disable an Email Service, you must choose a mode to recycle or disable it in. You can
choose one of the following options:
Optionally, you can choose to specify whether the action was planned or unplanned, and enter comments
about the action. If you complete these options, the information appears in the Events and History panels in
the Domain view on the Manage tab.
To enable the service, select the service in the Domain Navigator and click Enable the Service.
To disable the service, select the service in the Domain Navigator and click Disable the Service.
To recycle the service, select the service in the Domain Navigator and click Recycle the Service. When you
recycle the service, the Service Manager restarts the service. You must recycle the Email Service whenever
you change a property for the service.
When you enable a Data Integration Service that runs on the grid, the Data Integration Service designates
one node with the compute role as the master compute node. The Service Manager on the master compute
node communicates with the Resource Manager Service to find an available worker compute node to run job
requests.
General Properties
In the general properties, configure the primary and back-up nodes for the Resource Manager Service.
The following table describes the general properties for the service:
Property
Description
Name
Name of the service. You cannot change the name of the Resource Manager Service.
Description
Node
Backup Nodes
Nodes on which the service can run if the primary node is unavailable.
Logging Options
The following table describes the log level property for the Resource Manager Service:
Property
Description
Log Level
Determines the default severity level for the service logs. Choose one of the following options:
- Fatal. Writes FATAL messages to the log. FATAL messages include nonrecoverable system failures that
cause the service to shut down or become unavailable.
- Error. Writes FATAL and ERROR code messages to the log. ERROR messages include connection
failures, failures to save or retrieve metadata, service errors.
- Warning. Writes FATAL, WARNING, and ERROR messages to the log. WARNING errors include
recoverable system failures or warnings.
- Info. Writes FATAL, INFO, WARNING, and ERROR messages to the log. INFO messages include
system and service change messages.
- Trace. Writes FATAL, TRACE, INFO, WARNING, and ERROR code messages to the log. TRACE
messages log user request failures.
- Debug. Writes FATAL, DEBUG, TRACE, INFO, WARNING, and ERROR messages to the log. DEBUG
messages are user request logs.
Environment Variables
You can configure environment variables for the Resource Manager Service process.
The following table describes the environment variables:
Property
Description
Environment Variable
Advanced Options
The following table describes the advanced options:
Property
Description
Amount of RAM allocated to the Java Virtual Machine (JVM) that runs the service
process. Use this property to increase the performance. Append one of the
following letters to the value to specify the units:
- b for bytes.
- k for kilobytes.
- m for megabytes.
- g for gigabytes.
Java Virtual Machine (JVM) command line options to run Java-based programs.
When you configure the JVM options, you must set the Java SDK classpath, Java
SDK minimum memory, and Java SDK maximum memory properties.
You must set the following JVM command line options:
- Xms. Minimum heap size. Default value is 256 m.
- MaxPermSize. Maximum permanent generation size. Default is 128 m.
- Dfile.encoding. File encoding. Default is UTF-8.
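As a sketch of how these options are commonly written, a JVM Command Line Options value that keeps the documented defaults might look like the following. The -XX: prefix for MaxPermSize and the -D prefix for file.encoding reflect standard JVM syntax and are assumptions here:
-Xms256m -XX:MaxPermSize=128m -Dfile.encoding=UTF-8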
When you disable a Resource Manager Service, you must choose the mode to disable it in. You can choose
one of the following options:
Optionally, you can choose to specify whether the action was planned or unplanned, and enter comments
about the action. If you complete these options, the information appears in the Events and Command
History panels in the Domain view on the Manage tab.
To enable the service, select the service in the Domain Navigator and click Enable the Service.
To disable the service, select the service in the Domain Navigator and click Disable the Service.
To recycle the service, select the service in the Domain Navigator and click Recycle the Service.
Note: If the Resource Manager Service is configured to run on primary and back-up nodes, you can enable or
disable a Resource Manager Service process on the Processes view. Disabling a service process does not
disable the service. Disabling a service process that is running causes the service to fail over to another
node.
Scheduler Service
The Scheduler Service manages schedules for deployed mappings and workflows that the Data Integration
Service runs.
Use schedules to run deployed mappings and workflows at a specified time. You can schedule the objects to
run one time, or on an interval. Enable the Scheduler Service to create, manage, and run schedules.
The Scheduler Service is associated with a Model Repository Service. The Model repository stores metadata
for the schedules that users configure. Both the Model Repository Service and the Scheduler Service must
be available for scheduled objects to run.
The Scheduler Service is highly available. High availability enables the Service Manager and the Scheduler
Service to react to network failures and failures of the Scheduler Service. The Scheduler Service has the
restart and failover high availability feature. If a Scheduler Service becomes unavailable, the Service
Manager can restart the service on the same node or on a back-up node.
If the domain uses Kerberos authentication and you set the service principal level at the process level,
create a keytab file for the service. For more information about creating the service principal names and
keytab files, see the Informatica Security Guide.
Edit in the Properties view. You can change the properties while the service is running, but you must recycle
the service for the modifications to take effect.
General Properties
The following table describes the general properties for the service:
Property
Description
Name
Name of the service. You cannot change the name of the Scheduler Service.
Description
Node
Backup Nodes
Nodes on which the service can run if the primary node is unavailable.
Logging Options
Configure the Logging Level property to determine the level of error messages that are written to the
Scheduler Service log.
The following table describes the logging level properties for the service:
Property
Description
Logging
Level
Determines the default severity level for the service logs. Choose one of the following options:
- Fatal. Writes FATAL messages to the log. FATAL messages include nonrecoverable system failures
that cause the service to shut down or become unavailable.
- Error. Writes FATAL and ERROR code messages to the log. ERROR messages include connection
failures, failures to save or retrieve metadata, service errors.
- Warning. Writes FATAL, WARNING, and ERROR messages to the log. WARNING errors include
recoverable system failures or warnings.
- Info. Writes FATAL, INFO, WARNING, and ERROR messages to the log. INFO messages include
system and service change messages.
- Trace. Writes FATAL, TRACE, INFO, WARNING, and ERROR code messages to the log. TRACE
messages log user request failures.
- Debug. Writes FATAL, DEBUG, TRACE, INFO, WARNING, and ERROR messages to the log. DEBUG
messages are user request logs.
The following table describes the Model repository options for the service:
Property
Description
Model Repository
Service
Username
User name of an administrator user in the Informatica domain. Not available for a domain
with Kerberos authentication.
Password
Password of the administrator user in the Informatica domain. Not available for a domain
with Kerberos authentication.
Security Domain
LDAP security domain for the user who manages the Scheduler Service. The security
domain field does not appear for users with Native or Kerberos authentication.
Storage Properties
Configure a temporary file location when you configure the Scheduler Service to run on multiple nodes. Use
the temporary file location to store parameter files for deployed mappings and workflows. The file location
must be a directory that all of the nodes can access.
The following table describes the Temporary File Location property:
Property
Description
Path to the directory where parameter files are read from and written to.
Security Properties
When you set the HTTP protocol type for the Scheduler Service to HTTPS or both, you enable the Transport
Layer Security (TLS) protocol for the service. Depending on the HTTP protocol type of the service, you define
the HTTP port, the HTTPS port, or both ports for the service process.
Description
HTTP Port
Unique HTTP port number for the Scheduler Service process when the service uses the HTTP
protocol.
Default is 6211.
HTTPS Port
Unique HTTPS port number for the Scheduler Service process when the service uses the HTTPS
protocol.
When you set an HTTPS port number, you must also define the keystore file that contains the
required keys and certificates.
Keystore File
Path and file name of the keystore file that contains the keys and certificates. Required if
you use HTTPS connections for the service. You can create a keystore file with keytool.
Keytool is a utility that generates and stores private or public key pairs and associated
certificates in a keystore file. You can use the self-signed certificate or use a certificate
signed by a certificate authority.
Keystore
Password
Truststore File
Path and file name of the truststore file that contains authentication certificates trusted by
the service.
Truststore
Password
SSL Protocol
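As an illustration, you can build the truststore file by importing a trusted certificate with keytool. The alias, certificate file, and truststore file name shown here are hypothetical values:
keytool -importcert -alias scheduler_ca -file ca_certificate.cer -keystore scheduler_truststore.jks
keytool prompts for a password for the truststore file. Enter the same password in the Truststore Password property.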
Advanced Options
You can configure maximum heap size and JVM command line options for the Scheduler Service.
The following table describes the advanced options:
Property
Description
Amount of RAM allocated to the Java Virtual Machine (JVM) that runs the service
process. Use this property to increase the performance. Append one of the
following letters to the value to specify the units:
- b for bytes.
- k for kilobytes.
- m for megabytes.
- g for gigabytes.
Java Virtual Machine (JVM) command line options to run Java-based programs.
When you configure the JVM options, you must set the Java SDK classpath, Java
SDK minimum memory, and Java SDK maximum memory properties.
You must set the following JVM command line options:
-
Environment Variables
You can configure environment variables for the Scheduler Service process.
The following table describes the environment variables:
Property
Description
Environment Variable
Optionally, you can choose to specify whether the action is planned or unplanned, and enter comments about
the action. If you complete these options, then the information appears in the service Events and Command
History panels in the Domain view on the Manage tab.
To enable the service, select the service in the Domain Navigator and click Enable the Service.
To disable the service, select the service in the Domain Navigator and click Disable the Service.
To recycle the service, select the service in the Domain Navigator and click Recycle the Service. When you
recycle the service, the Service Manager restarts the service. You must recycle the Scheduler Service
whenever you change a property for the service.
CHAPTER 22
General properties
Service properties
Advanced properties
If you update a property, restart the Test Data Manager Service to apply the update.
General Properties
The following table describes the general properties for the service:
Property
Description
Name
Name of the service. The name is not case sensitive and must be unique within the domain. It
cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following
special characters:
`~%^*+={}\;:'"/?.,<>|!()][
You cannot change the name of the service after you create it.
Description
Location
Domain and folder where the service is created. Click Browse to choose a different folder. You can
move the service after you create it.
License
Node
Service Properties
The following table describes the service properties that you configure for the Test Data Manager service:
Property
Description
PowerCenter
Repository Service
PowerCenter Repository Service that the Test Data Manager service uses to load
metadata into the Test Data Manager repository.
PowerCenter
Integration Service
PowerCenter Integration service that runs the workflows that you generate in Test Data
Manager for data subset and data masking operations.
Model Repository
Service
Name of the Model Repository Service that you want to associate with the Test Data
Manager service.
User Name
Password
The password of the user name to access the Model Repository Service.
Security Domain
Name of the security domain that the user belongs to. Select the security domain from
the list.
Data Integration
Service
Name of the Data Integration Service that performs data discovery operations. If you
have enabled profiling, or if you use Hadoop connections, you must select the Data
Integration Service in the domain.
Analyst Service
Name of the Analyst Service that TDM uses for asset linking. Required if you want to
link TDM global objects to the Business Glossary assets.
Property
Description
Required if you use the TDM setup for data discovery or profiling. Select True to enable
data profiling or False to disable data profiling.
Required if you want to configure a test data warehouse. Select this option to allow you
to configure the test data repository and test data mart from Test Data Manager.
Description
Database Type
Type of the TDM repository database. Select one of the following options:
- Oracle
- Microsoft SQL Server
- DB2
- Custom. Select this option to use custom database drivers instead of the Informatica database
drivers.
If you select Custom, you must save the JDBC driver JAR to the following locations:
- <INFA_HOME>/tomcat/endorsed. If the endorsed folder does not exist, create the
folder. Restart the domain after you copy the JAR.
- <INFA_HOME>/TDM/lib.
- <INFA_HOME>/TDM/offline/lib.
- <INFA_HOME>/services/TDMService.
Use Trusted
Connection
Available for Microsoft SQL Server. Select this if you want to log in using Windows login
credentials.
Custom Driver
Class
Custom JDBC parameters. Required if you select Custom database type. Enter the custom
JDBC driver parameters.
Username
User account for the TDM repository database. Set up this account using the appropriate
database client tools. To apply changes, restart the Test Data Manager Service.
Password
Password for the TDM repository database user. Must be in 7-bit ASCII. To apply changes,
restart the Test Data Manager Service.
JDBC URL
Connection
String
Native connect string to the TDM repository database. The Test Data Manager Service uses
the connect string to create a connection object to the TDM repository and the PowerCenter
repository. To apply changes, restart the Test Data Manager Service.
Schema Name
Available for Microsoft SQL Server. Name of the schema for the domain configuration tables. If
not selected, the service creates the tables in the default schema.
Tablespace
Name
Available for DB2. Name of the tablespace in which to create the tables. You must define the
tablespace on a single node and the page size must be 32 KB. In a multipartition database, you
must select this option. In a single-partition database, if you do not select this option, the
installer creates the tables in the default tablespace.
Creation
options for the
New Test Data
Manager
Service
Options to create content, or use existing content, and upgrade existing content.
- Do not create new content. Creates the repository without creating content. Select this option if the
database content exists. If the content is of a previous version, the service prompts you to upgrade
the content to the current version.
- Previous Test Data Manager Service Name: Enter the name of the previous Test Data Manager
Service. Required if you create the service with a different name.
Note: If you create the Test Data Manager Service with a different name, the source and target
connections do not appear in Test Data Manager. Import the connections again if the
connections do not appear in Test Data Manager.
- Upgrade TDM Repository Contents. Upgrades the content to the current version.
- Create new content. Creates repository content.
Property
Description
HTTP Port
Port number that the TDM application runs on. The default is 6605.
Enable Transport
Layer Security (TLS)
Secures communication between the Test Data Manager Service and the domain.
HTTPS Port
Keystore File
Path and file name of the keystore file. The keystore file contains the keys and
certificates required if you use the SSL security protocol with the Test Data Manager
application. Required if you select Enable Secured Socket Layer.
Keystore Password
Password for the keystore file. Required if you select Enable Secured Socket Layer.
SSL Protocol
Advanced Properties
The following table describes the advanced properties that you can configure for the Test Data Manager
Service:
Property
Description
JVM Params
Connection Pool
Size
JMX Port
Shutdown Port
Port number that controls the server shutdown for TDM. The TDM Server listens for shutdown
commands on this port. Default is 6607.
The syntax of the native connect string depends on the database type:
- IBM DB2: dbname (for example, mydatabase)
- Microsoft SQL Server: servername@dbname (for example, sqlserver@mydatabase)
- Oracle: dbname.world (for example, oracle.world)
Set up the TDM repository database. You enter the database information when you create the Test Data
Manager Service.
2.
Create a PowerCenter Repository Service, PowerCenter Integration Service, and Model Repository
Service.
3.
Optional. Create a Data Integration Service. Required if you use the data profiling feature or if you use
Hadoop connections in TDM.
4.
Optional. Create an Analyst Service. Required if you use the asset linking feature. The Analyst Service
license must support Business Glossary.
5.
Create the Test Data Manager Service and configure the service properties.
6.
2.
3.
4.
5.
6.
Enter the repository configuration properties and test the connection. The repository connection
information must be valid for the service to work.
a.
If no content exists, select Create new content. You cannot select this option if the database has
content.
b.
If the database content exists, select Do not create new content. If you entered a different name
for the Test Data Manager Service, you are prompted to enter the name of the previous Test Data
Manager Service. The application checks the version of the content. If the content is of a previous
version, an option to upgrade the repository content appears. Upgrade the repository content.
Creating the service without upgrading the content to the current version generates a warning.
7.
Choose to enable the Test Data Manager Service, and click Next.
8.
Enter values for the server configuration properties, and click Next.
9.
382
You can enable, disable, and recycle the Test Data Manager Service from the service Actions menu in the
Administrator tool. You can also use the tdm command line program to enable and disable the service.
2.
Select the TDM Service in the Domain Navigator to open the service properties.
Warning messages appear if the repository content is of an older version or if the content does not exist.
3.
Click Actions > Create Contents to create content, or click Actions > Upgrade Contents to upgrade
repository content.
2.
3.
Select a different node for the Node property, and then click OK.
4.
If the Test Data Manager Service is running in HTTPS security mode, change the Keystore File Location
to the path on the new node. Click Edit in the Server Configuration Properties section and update the
Keystore File location, and click OK.
5.
2.
3.
4.
383
5.
Select the Test Data Manager Service from the Assigned Services list and click Remove to unassign it.
6.
7.
8.
9.
Select the Test Data Manager Service from the Unassigned Services list and click Add to assign it.
10.
Click OK.
11.
Select the Test Data Manager from the Domain navigator in the Administrator tool.
2.
Disable the Test Data Manager Service by clicking Actions > Disable Service.
3.
384
CHAPTER 23
Create a Web Services Hub. You can create multiple Web Services Hub Services in a domain.
Enable or disable the Web Services Hub. You must enable the Web Services Hub to run web service
workflows. You can disable the Web Services Hub to prevent external clients from accessing the web
services while performing maintenance on the machine or modifying the repository.
Configure the Web Services Hub properties. You can configure Web Services Hub properties such as the
length of time a session can remain idle before time out and the character encoding to use for the service.
Configure the associated repository. You must associate a repository with a Web Services Hub. The Web
Services Hub exposes the web-enabled workflows in the associated repository.
View the logs for the Web Services Hub. You can view the event logs for the Web Services Hub in the Log
Viewer.
Remove a Web Services Hub. You can remove a Web Services Hub if it becomes obsolete.
385
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
On the Domain Navigator Actions menu, click New > Web Services Hub.
The New Web Services Hub Service window appears.
3.
Description
Name
Name of the Web Services Hub. The characters must be compatible with the code
page of the associated repository. The name is not case sensitive and must be
unique within the domain. It cannot exceed 128 characters or begin with @. It also
cannot contain spaces or the following special characters:
`~%^*+={}\;:'"/?.,<>|!()][
Description
Description of the Web Services Hub. The description cannot exceed 765
characters.
Location
Domain folder in which the Web Services Hub is created. Click Browse to select
the folder in the domain where you want to create the Web Services Hub.
License
License to assign to the Web Services Hub. If you do not select a license now,
you can assign a license to the service later. Required before you can enable the
Web Services Hub.
Node
Node on which the Web Services Hub runs. A Web Services Hub runs on a single
node. A node can run more than one Web Services Hub.
Associated Repository
Service
PowerCenter Repository Service to which the Web Services Hub connects. The
repository must be enabled before you can associate it with a Web Services Hub.
Repository Password
Security Domain
Security domain for the user. Appears when the Informatica domain contains an
LDAP security domain.
Property
Description
URLScheme
Indicates the security protocol that you configure for the Web Services Hub:
- HTTP. Run the Web Services Hub on HTTP only.
- HTTPS. Run the Web Services Hub on HTTPS only.
- HTTP and HTTPS. Run the Web Services Hub in HTTP and HTTPS modes.
HubHostName
HubPortNumber (http)
Optional. Port number for the Web Services Hub on HTTP. Default is 7333.
HubPortNumber
(https)
Port number for the Web Services Hub on HTTPS. Appears when the URL
scheme selected includes HTTPS. Required if you choose to run the Web
Services Hub on HTTPS. Default is 7343.
KeystoreFile
Path and file name of the keystore file that contains the keys and certificates
required if you use the SSL security protocol with the Web Services Hub.
Required if you run the Web Services Hub on HTTPS.
Keystore Password
Password for the keystore file. The value of this property must match the
password you set for the keystore file. If this property is empty, the Web Services
Hub assumes that the password for the keystore file is the default password
changeit.
InternalHostName
Host name on which the Web Services Hub listens for connections from the
PowerCenter Integration Service. If not specified, the default is the Web Services
Hub host name.
Note: If the host machine has more than one network card that results in multiple
IP addresses for the host machine, set the value of InternalHostName to the
internal IP address.
InternalPortNumber
Port number on which the Web Services Hub listens for connections from the
PowerCenter Integration Service. Default is 15555.
4.
Click Create.
After you create the Web Services Hub, the Administrator tool displays the URL for the Web Services Hub
Console. If you run the Web Services Hub on HTTP and HTTPS, the Administrator tool displays the URL for
both.
If you configure a logical URL for an external load balancer to route requests to the Web Services Hub, the
Administrator tool also displays the URL.
Click the service URL to start the Web Services Hub Console from the Administrator tool. If the Web Services
Hub is not enabled, you cannot connect to the Web Services Hub Console.
Services, at least one of the PowerCenter Repository Services must be running before you enable the Web
Services Hub.
If you enable the service but it fails to start, review the logs for the Web Services Hub to determine the
reason for the failure. After you resolve the problem, you must disable and then enable the Web Services
Hub to start it again.
When you disable a Web Services Hub, you must choose the mode to disable it in. You can choose one of
the following modes:
Stop. Stops all web-enabled workflows and disables the Web Services Hub.
Abort. Aborts all web-enabled workflows immediately and disables the Web Services Hub.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
5.
6.
To disable the Web Services Hub with the default disable mode and then immediately enable the
service, click the Restart the Service button.
By default, when you restart a Web Services Hub, the disable mode is Stop.
Service properties. Configure service properties such as host name and port number.
Advanced properties. Configure advanced properties such as the level of errors written to the Web
Services Hub logs.
Custom properties. Configure custom properties that are unique to specific environments.
1.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
3.
4.
To edit the properties of the service, click Edit for the category of properties you want to update.
The Edit Web Services Hub Service window displays the properties in the category.
5.
388
General Properties
Select the node on which to run the Web Services Hub. You can run multiple Web Services Hub on the same
node.
Disable the Web Services Hub before you assign it to another node. To edit the node assignment, select the
Web Services Hub in the Navigator, click the Properties tab, and then click Edit in the Node Assignments
section. Select a new node.
When you change the node assignment for a Web Services Hub, the host name for the web services running
on the Web Services Hub changes. You must update the host name and port number of the Web Services
Hub to match the new node. Update the following properties of the Web Services Hub:
HubHostName
InternalHostName
To access the Web Services Hub on a new node, you must update the client application to use the new host
name. For example, you must regenerate the WSDL for the web service to update the host name in the
endpoint URL. You must also regenerate the client proxy classes to update the host name.
The following table describes the general properties for the service:
Property
Description
Name
Name of the service. The name is not case sensitive and must be unique within the
domain. It cannot exceed 128 characters or begin with @. It also cannot contain
spaces or the following special characters:
`~%^*+={}\;:'"/?.,<>|!()][
You cannot change the name of the service after you create it.
Description
License
Node
Service Properties
You must restart the Web Services Hub before changes to the service properties can take effect.
The following table describes the service properties for a Web Services Hub:
Property
Description
HubHostName
Name of the machine hosting the Web Services Hub. Default is the name of the
machine where the Web Services Hub is running. If you change the node on which the
Web Services Hub runs, update this property to match the host name of the new node.
To apply changes, restart the Web Services Hub.
HubPortNumber (http)
Port number for the Web Services Hub running on HTTP. Required if you run the Web
Services Hub on HTTP. Default is 7333. To apply changes, restart the Web Services
Hub.
HubPortNumber (https)
Port number for the Web Services Hub running on HTTPS. Required if you run the
Web Services Hub on HTTPS. Default is 7343. To apply changes, restart the Web
Services Hub.
CharacterEncoding
Character encoding for the Web Services Hub. Default is UTF-8. To apply changes,
restart the Web Services Hub.
URLScheme
Indicates the security protocol that you configure for the Web Services Hub:
- HTTP. Run the Web Services Hub on HTTP only.
- HTTPS. Run the Web Services Hub on HTTPS only.
- HTTP and HTTPS. Run the Web Services Hub in HTTP and HTTPS modes.
If you run the Web Services Hub on HTTPS, you must provide information on the
keystore file. To apply changes, restart the Web Services Hub.
InternalHostName
Host name on which the Web Services Hub listens for connections from the Integration
Service. If you change the node assignment of the Web Services Hub, update the
internal host name to match the host name of the new node. To apply changes, restart
the Web Services Hub.
InternalPortNumber
Port number on which the Web Services Hub listens for connections from the
Integration Service. Default is 15555. To apply changes, restart the Web Services Hub.
KeystoreFile
Path and file name of the keystore file that contains the keys and certificates required if
you use the SSL security protocol with the Web Services Hub. Required if you run the
Web Services Hub on HTTPS.
KeystorePass
Password for the keystore file. The value of this property must match the password you
set for the keystore file.
Advanced Properties
The following table describes the advanced properties for a Web Services Hub:
Property
Description
HubLogicalAddress
URL for the third party load balancer that manages the Web Services Hub. This
URL is published in the WSDL for all web services that run on a Web Services Hub
managed by the load balancer.
DTMTimeout
Length of time, in seconds, that the Web Services Hub tries to connect or reconnect
to the DTM to run a session. Default is 60 seconds.
SessionExpiryPeriod
Number of seconds that a session can remain idle before the session times out and
the session ID becomes invalid. The Web Services Hub resets the start of the
timeout period every time a client application sends a request with a valid session
ID. If a request takes longer to complete than the amount of time set in the
SessionExpiryPeriod property, the session can time out during the operation. To
avoid timing out, set the SessionExpiryPeriod property to a higher value. The Web
Services Hub returns a fault response to any request with an invalid session ID.
Default is 3600 seconds. You can set the SessionExpiryPeriod between 1 and
2,592,000 seconds.
MaxISConnections
Log Level
Configure the Log Level property to set the logging level. The following values are
valid:
- Fatal. Writes FATAL messages to the log. FATAL messages include nonrecoverable
system failures that cause the service to shut down or become unavailable.
- Error. Writes FATAL and ERROR code messages to the log. ERROR messages include connection failures, failures to save or retrieve metadata, and service errors.
- Warning. Writes FATAL, WARNING, and ERROR messages to the log. WARNING errors include recoverable system failures or warnings.
- Info. Writes FATAL, INFO, WARNING, and ERROR messages to the log. INFO messages include system and service change messages.
- Trace. Writes FATAL, TRACE, INFO, WARNING, and ERROR code messages to the log. TRACE messages log user request failures.
- Debug. Writes FATAL, DEBUG, TRACE, INFO, WARNING, and ERROR messages to the log. DEBUG messages are user request logs.
MaxQueueLength
Maximum queue length for incoming connection requests when all possible request
processing threads are in use. Any request received when the queue is full is
rejected. Default is 5000.
MaxStatsHistory
Number of days that Informatica keeps statistical information in the history file.
Informatica keeps a history file that contains information regarding the Web
Services Hub activities. The number of days you set in this property determines the
number of days available for which you can display historical statistics in the Web
Services Report page of the Administrator tool.
Amount of RAM allocated to the Java Virtual Machine (JVM) that runs the Web
Services Hub. Use this property to increase the performance. Append one of the
following letters to the value to specify the units:
- b for bytes.
- k for kilobytes.
- m for megabytes.
- g for gigabytes.
Java Virtual Machine (JVM) command line options to run Java-based programs.
When you configure the JVM options, you must set the Java SDK classpath, Java
SDK minimum memory, and Java SDK maximum memory properties.
You must set the following JVM command line option:
- -Dfile.encoding. File encoding. Default is UTF-8.
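For example, a JVM Params value might look like the following line. Only the -Dfile.encoding option is required; the time zone option is an illustrative addition:
-Dfile.encoding=UTF-8 -Duser.timezone=UTC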
Use the MaxConcurrentRequests property to set the number of client requests the Web Services Hub can
process at one time and the MaxQueueLength property to set the number of client connection requests that
can wait in the queue when all request processing threads are in use.
You can change the parameter values based on the number of clients you expect to connect to the Web
Services Hub. In a test environment, set the parameters to smaller values. In a production environment, set
the parameters to larger values. If you increase the values, more clients can connect to the Web Services
Hub, but the connections use more system resources.
391
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
On the Domain Navigator of the Administrator tool, select the Web Services Hub.
3.
4.
Click Add.
The Select Repository section appears.
392
5.
6.
Description
Associated Repository
Service
Name of the PowerCenter Repository Service to which the Web Services Hub
connects. To apply changes, restart the Web Services Hub.
User name to access the repository. Not available for a domain with Kerberos
authentication.
Repository Password
Password for the user. Not available for a domain with Kerberos authentication.
Security Domain
Security domain for the user. Appears when the Informatica domain contains an
LDAP security domain.
In the Administrator tool, click the Manage tab > Services and Nodes view.
2.
In the Domain Navigator, select the Web Services Hub for which you want to change an associated
repository.
3.
4.
In the section for the repository you want to edit, click Edit.
The Edit associated repository window appears.
5.
6.
Description
Associated Repository
Service
Name of the PowerCenter Repository Service to which the Web Services Hub
connects. To apply changes, restart the Web Services Hub.
Repository User
Name
User name to access the repository. Not available for a domain with Kerberos
authentication.
Repository Password
Password for the user. Not available for a domain with Kerberos authentication.
Security Domain
Security domain for the user. Appears when the Informatica domain contains an
LDAP security domain.
393
CHAPTER 24
394
Create, Edit, and Delete Projects privilege for the Model Repository Service and write permission on
projects.
To upgrade the Model Repository Service from the Actions menu or from the command line, a user must
have the following credentials:
Manage Services privilege for the domain and permission on the Model Repository Service.
Create, Edit, and Delete Projects privilege for the Model Repository Service and write permission on
projects.
2.
3.
4.
5.
Note: When you upgrade all other application services, the upgrade process upgrades the database contents
of the databases associated with the service.
395
services and associated databases that require an upgrade. You can also save the current or previous
upgrade report.
Note: The Metadata Manager Service must be disabled before the upgrade. All other services must be
enabled before the upgrade.
1.
2.
3.
4.
Click Next.
5.
If dependency errors exist, the Dependency Errors dialog box appears. Review the dependency errors
and click OK. Then, resolve dependency errors and click Next.
6.
7.
Click Next.
The service upgrade wizard upgrades each application service and associated database and displays
the status and processing details.
8.
When the upgrade completes, the Summary section displays the list of application services and their
upgrade status. Click each service to view the upgrade details in the Service Details section.
9.
10.
Click Close.
11.
If you did not choose to automatically recycle application services after the upgrade, restart the
upgraded services.
You can view the upgrade report and save the upgrade report. The second time you run the service upgrade
wizard, the Save Previous Report option appears in the service upgrade wizard. If you did not save the
upgrade report after upgrading services, you can select this option to view or save the previous upgrade
report.
396
If the upgrade process encounters a fatal error while rebuilding the object dependency graph, the service
upgrade still succeeds. However, you cannot view object dependencies in the Developer tool until you
rebuild the object dependency graph.
After you upgrade the Model Repository Service, verify that the Model Repository Service log includes the
following message:
MRS_50431 "Finished rebuilding the object dependency graph for project group '<project
group>'."
If the message does not exist in the log, run the infacmd mrs rebuildDependencyGraph command to rebuild
the object dependency graph. Users must not access Model repository objects until the rebuild process
completes, or the object dependency graph might not be accurate. Ask the users to log out of the Model
Repository Service before service upgrade.
The infacmd mrs rebuildDependencyGraph command uses the following syntax:
rebuildDependencyGraph
<-DomainName|-dn> domain_name
[<-SecurityDomain|-sdn> security_domain]
<-UserName|-un> user_name
<-Password|-pd> password
<-ServiceName|-sn> service_name
[<-ResilienceTimeout|-re> timeout_period_in_seconds]
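For example, the following command rebuilds the object dependency graph for a Model Repository Service named MRS_Service in a domain named Domain_Primary. The service name, domain name, and credentials are illustrative values:
infacmd mrs rebuildDependencyGraph -dn Domain_Primary -un Administrator -pd Administrator_Password -sn MRS_Service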
397
APPENDIX A
398
Workflow repository
Jaspersoft repository
Model repository
PowerCenter repository
Profiling warehouse
To prepare the databases, verify the database requirements and set up the database. The database
requirements depend on the application services that you create in the domain and the number of data
integration objects that you build and store in the repositories.
The database user account must have permissions to create and drop tables, indexes, and views, and to
select, insert, update, and delete data from tables.
To prevent database errors in one repository from affecting any other repository, create each repository in
a separate database schema with a different database user account. Do not create a repository in the
same database schema as the domain configuration repository or any other repository in the domain.
If you create more than one domain, each domain configuration repository must have a separate user
account.
Oracle
Sybase ASE
Informatica does not support IBM DB2 table aliases for repository tables. Verify that table aliases have not
been created for any tables in the database.
399
If you create the repository in Microsoft SQL Server 2005, Microsoft SQL Server must be installed with
case-insensitive collation.
If you create the repository in Microsoft SQL Server 2005, the repository database must have a database
compatibility level of 80 or earlier. Data Analyzer uses non-ANSI SQL statements that Microsoft SQL
Server supports only on a database with a compatibility level of 80 or earlier.
To set the database compatibility level to 80, run the following query against the database:
sp_dbcmptlevel <DatabaseName>, 80
Or open the Microsoft SQL Server Enterprise Manager, right-click the database, and select Properties >
Options. Set the compatibility level to 80 and click OK.
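For example, if the repository database is named DA_Repository, a hypothetical name, run the following query:
sp_dbcmptlevel DA_Repository, 80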
Set the storage size for the tablespace to a small number to prevent the repository from using an
excessive amount of space. Also verify that the default tablespace for the user that owns the repository
tables is set to a small size.
The following example shows how to set the recommended storage parameter for a tablespace named
REPOSITORY:
ALTER TABLESPACE "REPOSITORY" DEFAULT STORAGE ( INITIAL 10K NEXT 10K MAXEXTENTS
UNLIMITED PCTINCREASE 50 );
Verify or change the storage parameter for a tablespace before you create the repository.
Verify that the database user has the CONNECT, RESOURCE, and CREATE VIEW privileges.
Informatica does not support Oracle public synonyms for repository tables. Verify that public synonyms
have not been created for any tables in the database.
Set the database server page size to 8K or higher. This is a one-time configuration and cannot be
changed afterwards.
The database for the Data Analyzer repository requires a page size of at least 8 KB. If you set up a Data
Analyzer database on a Sybase ASE instance with a page size smaller than 8 KB, Data Analyzer can
generate errors when you run reports. Sybase ASE relaxes the row size restriction when you increase the
page size.
Data Analyzer includes a GROUP BY clause in the SQL query for the report. When you run the report,
Sybase ASE stores all GROUP BY and aggregate columns in a temporary worktable. The maximum index
row size of the worktable is limited by the database page size. For example, if Sybase ASE is installed
with the default page size of 2 KB, the index row size cannot exceed 600 bytes. However, the GROUP BY
clause in the SQL query for most Data Analyzer reports generates an index row size larger than 600
bytes.
400
Verify the database user has CREATE TABLE and CREATE VIEW privileges.
Enable the Distributed Transaction Management (DTM) option on the database server.
Create a DTM user account and grant the dtm_tm_role to the user.
The following table lists the DTM configuration setting for the dtm_tm_role value:
DTM Configuration - Value
Distributed Transaction Management privilege - sp_role "grant", dtm_tm_role, username
Oracle
Verify that the database user account has CREATETAB and CONNECT privileges.
Informatica does not support IBM DB2 table aliases for repository tables. Verify that table aliases have not
been created for any tables in the database.
Set the NPAGES parameter to at least 5000. The NPAGES parameter determines the number of pages in
the tablespace.
Verify that the database user account has CONNECT and CREATE TABLE privileges.
Verify that the database user has the CONNECT, RESOURCE, and CREATE VIEW privileges.
Informatica does not support Oracle public synonyms for repository tables. Verify that public synonyms
have not been created for any tables in the database.
401
Oracle
Verify that the database user account has CREATETAB and CONNECT privileges.
Informatica does not support IBM DB2 table aliases for repository tables. Verify that table aliases have not
been created for any tables in the database.
Verify that the database user account has CONNECT and CREATE TABLE privileges.
Verify that the database user account has CONNECT and RESOURCE privileges.
Informatica does not support Oracle public synonyms for repository tables. Verify that public synonyms
have not been created for any tables in the database.
Oracle
402
For more information about configuring the database, see the documentation for your database system.
The database user account that creates the repository must have privileges to perform the following
operations:
ALTER TABLE
CREATE FUNCTION
CREATE INDEX
CREATE PROCEDURE
CREATE TABLE
CREATE VIEW
DROP PROCEDURE
DROP TABLE
INSERT INTO
The database user that creates the repository must be able to create tablespaces with page sizes of 32
KB.
Set up system temporary tablespaces larger than the default page size of 4 KB and update the heap
sizes.
Queries running against tables in tablespaces defined with a page size larger than 4 KB require system
temporary tablespaces with a page size larger than 4 KB. If there are no system temporary table spaces
defined with a larger page size, the queries can fail. The server displays the following error:
SQL 1585N A system temporary table space with sufficient page size does not exist.
SQLSTATE=54048
Create system temporary tablespaces with page sizes of 8 KB, 16 KB, and 32 KB. Run the following SQL
statements on each database to configure the system temporary tablespaces and update the heap sizes:
CREATE Bufferpool RBF IMMEDIATE SIZE 1000 PAGESIZE 32 K EXTENDED STORAGE ;
CREATE Bufferpool STBF IMMEDIATE SIZE 2000 PAGESIZE 32 K EXTENDED STORAGE ;
CREATE REGULAR TABLESPACE REGTS32 PAGESIZE 32 K MANAGED BY SYSTEM USING ('C:\DB2\NODE0000\reg32') EXTENTSIZE 16 OVERHEAD 10.5 PREFETCHSIZE 16 TRANSFERRATE 0.33 BUFFERPOOL RBF;
CREATE SYSTEM TEMPORARY TABLESPACE TEMP32 PAGESIZE 32 K MANAGED BY SYSTEM USING ('C:\DB2\NODE0000\temp32') EXTENTSIZE 16 OVERHEAD 10.5 PREFETCHSIZE 16 TRANSFERRATE 0.33 BUFFERPOOL STBF;
GRANT USE OF TABLESPACE REGTS32 TO USER <USERNAME>;
UPDATE DB CFG FOR <DB NAME> USING APP_CTL_HEAP_SZ 16384
UPDATE DB CFG FOR <DB NAME> USING APPLHEAPSZ 16384
UPDATE DBM CFG USING QUERY_HEAP_SZ 8000
UPDATE DB CFG FOR <DB NAME> USING LOGPRIMARY 100
UPDATE DB CFG FOR <DB NAME> USING LOGFILSIZ 2000
UPDATE DB CFG FOR <DB NAME> USING LOCKLIST 1000
UPDATE DB CFG FOR <DB NAME> USING DBHEAP 2400
"FORCE APPLICATIONS ALL"
DB2STOP
DB2START
Set the locking parameters to avoid deadlocks when you load metadata into a Metadata Manager
repository on IBM DB2.
403
The following table lists the locking parameters you can configure:
Parameter Name - Value
LOCKLIST - 8192
MAXLOCKS - 10
LOCKTIMEOUT - 300
DLCHKTIME - 10000
Also, for IBM DB2 9.7 and earlier, set the DB2_RR_TO_RS parameter to YES to change the read policy
from Repeatable Read to Read Stability.
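For example, you might apply these settings from the DB2 command line processor as follows, where <dbname> is the repository database. This is a sketch; confirm the values against your environment before you run the commands, and restart the instance so that the DB2_RR_TO_RS setting takes effect:
db2 UPDATE DB CFG FOR <dbname> USING LOCKLIST 8192 MAXLOCKS 10
db2 UPDATE DB CFG FOR <dbname> USING LOCKTIMEOUT 300 DLCHKTIME 10000
db2set DB2_RR_TO_RS=YES
db2stop
db2start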
Informatica does not support IBM DB2 table aliases for repository tables. Verify that table aliases have not
been created for any tables in the database.
Note: If you use IBM DB2 as a metadata source, the source database has the same configuration
requirements.
The database user account that creates the repository must have privileges to perform the following
operations:
ALTER TABLE
CREATE CLUSTERED INDEX
CREATE INDEX
CREATE PROCEDURE
CREATE TABLE
CREATE VIEW
DROP PROCEDURE
DROP TABLE
INSERT INTO
404
If the repository must store metadata in a multibyte language, set the database collation to that multibyte
language when you install Microsoft SQL Server. For example, if the repository must store metadata in
Japanese, set the database collation to a Japanese collation when you install Microsoft SQL Server. This
is a one-time configuration and cannot be changed.
The database user account that creates the repository must have privileges to perform the following
operations:
ALTER TABLE
CREATE CLUSTER
CREATE INDEX
CREATE OR REPLACE FORCE VIEW
CREATE OR REPLACE PROCEDURE
CREATE OR REPLACE VIEW
CREATE TABLE
DROP TABLE
INSERT INTO TABLE
If the repository must store metadata in a multibyte language, set the NLS_LENGTH_SEMANTICS
parameter to CHAR on the database instance. Default is BYTE.
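For example, a database administrator might set the parameter from SQL*Plus as follows. This is a sketch that assumes SYSDBA access and a server parameter file; the change applies to sessions created after it takes effect:
sqlplus / as sysdba <<EOF
ALTER SYSTEM SET NLS_LENGTH_SEMANTICS=CHAR SCOPE=SPFILE;
EOF
Restart the database instance for the new setting to take effect.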
Informatica does not support Oracle public synonyms for repository tables. Verify that public synonyms
have not been created for any tables in the database.
405
Oracle
Allow 3 GB of disk space for DB2. Allow 200 MB of disk space for all other database types.
For more information about configuring the database, see the documentation for your database system.
If the repository is in an IBM DB2 9.7 database, verify that IBM DB2 Version 9.7 Fix Pack 7 or a later fix
pack is installed.
On the IBM DB2 instance where you create the database, set the following parameters to ON:
- DB2_SKIPINSERTED
- DB2_EVALUNCOMMITTED
- DB2_SKIPDELETED
- AUTO_RUNSTATS
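For example, one way to set these parameters is from the DB2 command line processor, where <dbname> is the repository database. The first three are registry variables and AUTO_RUNSTATS is a database configuration parameter; this is a sketch, so verify the commands for your DB2 version:
db2set DB2_SKIPINSERTED=ON
db2set DB2_EVALUNCOMMITTED=ON
db2set DB2_SKIPDELETED=ON
db2 UPDATE DB CFG FOR <dbname> USING AUTO_RUNSTATS ON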
Set the following database configuration parameters to the recommended values:
Parameter - Value
applheapsz - 8192
appl_ctl_heap_sz - 8192 (For IBM DB2 9.5 only.)
logfilsiz - 8000
maxlocks - 98
locklist - 50000
auto_stmt_stats - ON
406
Set the NPAGES parameter to at least 5000. The NPAGES parameter determines the number of pages in
the tablespace.
Verify that the database user has CREATETAB, CONNECT, and BINDADD privileges.
Informatica does not support IBM DB2 table aliases for repository tables. Verify that table aliases have not
been created for any tables in the database.
In the DataDirect Connect for JDBC utility, update the DynamicSections parameter to 3000.
The default value for DynamicSections is too low for the Informatica repositories. Informatica requires a
larger DB2 package than the default. When you set up the DB2 database for the domain configuration
repository or a Model repository, you must set the DynamicSections parameter to at least 3000. If the
DynamicSections parameter is set to a lower number, you can encounter problems when you install or run
Informatica services.
For more information about updating the DynamicSections parameter, see Appendix D, Updating the
DynamicSections Parameter of a DB2 Database on page 447.
The database user account must have the CONNECT, CREATE TABLE, and CREATE VIEW privileges.
Verify that the database user has the CONNECT, RESOURCE, and CREATE VIEW privileges.
Informatica does not support Oracle public synonyms for repository tables. Verify that public synonyms
have not been created for any tables in the database.
Oracle
Sybase ASE
407
Note: Ensure that you install the database client on the machine on which you want to run the PowerCenter
Repository Service.
For more information about configuring the database, see the documentation for your database system.
To optimize repository performance, set up the database with the tablespace on a single node. When the
tablespace is on one node, PowerCenter Client and PowerCenter Integration Service access the
repository faster than if the repository tables exist on different database nodes.
Specify the single-node tablespace name when you create, copy, or restore a repository. If you do not
specify the tablespace name, DB2 uses the default tablespace.
Informatica does not support IBM DB2 table aliases for repository tables. Verify that table aliases have not
been created for any tables in the database.
Set the database server page size to 8K or higher. This is a one-time configuration and cannot be
changed afterwards.
Verify that the database user account has the CONNECT, CREATE TABLE, and CREATE VIEW
privileges.
Set the storage size for the tablespace to a small number to prevent the repository from using an
excessive amount of space. Also verify that the default tablespace for the user that owns the repository
tables is set to a small size.
The following example shows how to set the recommended storage parameter for a tablespace named
REPOSITORY:
ALTER TABLESPACE "REPOSITORY" DEFAULT STORAGE ( INITIAL 10K NEXT 10K MAXEXTENTS
UNLIMITED PCTINCREASE 50 );
Verify or change the storage parameter for a tablespace before you create the repository.
Verify that the database user has the CONNECT, RESOURCE, and CREATE VIEW privileges.
Informatica does not support Oracle public synonyms for repository tables. Verify that public synonyms
have not been created for any tables in the database.
408
Set the database server page size to 8K or higher. This is a one-time configuration and cannot be
changed afterwards.
Verify the database user has CREATE TABLE and CREATE VIEW privileges.
Value
5000
5000
8000
Number of locks
100000
Oracle
The database user account must have the CREATETAB, CONNECT, CREATE VIEW, and CREATE
FUNCTION privileges.
Informatica does not support IBM DB2 table aliases for repository tables. Verify that table aliases have not
been created for any tables in the database.
Set the NPAGES parameter to at least 5000. The NPAGES parameter determines the number of pages in
the tablespace.
The database user account must have the CONNECT, CREATE TABLE, CREATE VIEW, and CREATE
FUNCTION privileges.
409
Verify that the database user has CONNECT, RESOURCE, CREATE VIEW, CREATE PROCEDURE, and
CREATE FUNCTION privileges.
Informatica does not support Oracle public synonyms for repository tables. Verify that public synonyms
have not been created for any tables in the database.
Oracle
Verify that the database user account has CREATETAB and CONNECT privileges.
Verify that the database user has SELECT privileges on the SYSCAT.DBAUTH and
SYSCAT.DBTABAUTH tables.
Informatica does not support IBM DB2 table aliases for repository tables. Verify that table aliases have not
been created for any tables in the database.
Set the NPAGES parameter to at least 5000. The NPAGES parameter determines the number of pages in
the tablespace.
410
Verify that the database user account has CONNECT and CREATE TABLE privileges.
Verify that the database user account has CONNECT and RESOURCE privileges.
Informatica does not support Oracle public synonyms for repository tables. Verify that public synonyms
have not been created for any tables in the database.
Oracle
Verify that the database user account has CREATETAB and CONNECT privileges.
Informatica does not support IBM DB2 table aliases for repository tables. Verify that table aliases have not
been created for any tables in the database.
Set the NPAGES parameter to at least 5000. The NPAGES parameter determines the number of pages in
the tablespace.
Value
128
120 seconds
411
Verify that the database user account has CONNECT and CREATE TABLE privileges.
Value
128
120 seconds
Verify that the database user has the CONNECT, RESOURCE, and CREATE VIEW privileges.
Informatica does not support Oracle public synonyms for repository tables. Verify that public synonyms
have not been created for any tables in the database.
Parameter
Value
128
120 seconds
Optionally, configure the database for Oracle Advanced Security Option (ASO). You can activate Oracle
ASO for the database if the Informatica installation supports Oracle ASO.
For information about preparing the Informatica installation for Oracle ASO, consult the following
Informatica Knowledge Base article:
Can Oracle Advanced Security Option (ASO) be used with Informatica Data Quality Services? (KB
152376)
412
Source and target databases. Reads data from source databases and writes data to target
databases.
Profiling source databases. Reads from relational source databases to run profiles against the
sources.
Reference tables. Runs mappings to transfer data between the reference tables and the external data
sources.
When the Data Integration Service runs on a single node or on primary and back-up nodes, install
database client software and configure connectivity on the machines where the Data Integration Service
runs.
When the Data Integration Service runs on a grid, install database client software and configure
connectivity on each machine that represents a node with the compute role or a node with both the
service and compute roles.
PowerCenter Repository Service
The PowerCenter Repository Service uses native database drivers to connect to the PowerCenter
repository database.
Install database client software and configure connectivity on the machines where the PowerCenter
Repository Service and the PowerCenter Repository Service processes run.
PowerCenter Integration Service
The PowerCenter Integration Service uses native database drivers to connect to the following
databases:
Source and target databases. Reads from the source databases and writes to the target databases.
Metadata Manager source databases. Loads the relational data sources in Metadata Manager.
Install database client software associated with the relational data sources and the repository databases
on the machines where the PowerCenter Integration Service runs.
413
Install the following database client software based on the type of database that the application service
accesses:
IBM DB2 Client Application Enabler (CAE)
Configure connectivity on the required machines by logging in to the machine as the user who starts
Informatica services.
Microsoft SQL Server 2012 Native Client
Download the client from the following Microsoft website:
http://www.microsoft.com/en-in/download/details.aspx?id=29065.
Oracle client
Install compatible versions of the Oracle client and Oracle database server. You must also install the
same version of the Oracle client on all machines that require it. To verify compatibility, contact Oracle.
Sybase Open Client (OCS)
Install an Open Client version that is compatible with the Sybase ASE database server. You must also
install the same version of Open Client on the machines hosting the Sybase ASE database and
Informatica. To verify compatibility, contact Sybase.
The following table lists the utility and the environment variable names to verify connectivity for each database:
Database - Utility - Environment Variable Names
Oracle - sqlplus - ORACLE_HOME, PATH
IBM DB2 - db2connect - DB2DIR, DB2INSTANCE, PATH (Add: <DatabasePath>/bin)
Sybase ASE - isql - SYBASE15, SYBASE_ASE, SYBASE_OCS, PATH (Add: ${SYBASE_ASE}/bin:${SYBASE_OCS}/bin:$PATH)
APPENDIX B
415
Reporting Service
Analyst Service
Verify that the following environment variable settings have been established by IBM DB2 Client
Application Enabler (CAE):
DB2HOME=C:\IBM\SQLLIB
DB2INSTANCE=DB2
DB2CODEPAGE=1208 (Sometimes required. Use only if you encounter problems. Depending on
the locale, you might need a different value.)
2.
Verify that the PATH environment variable includes the IBM DB2 bin directory. For example:
PATH=C:\WINNT\SYSTEM32;C:\SQLLIB\BIN;...
3.
4.
Configure the IBM DB2 client to connect to the database that you want to access. To configure the IBM
DB2 client:
a.
b.
c.
Run the following command in the IBM DB2 Command Line Processor to verify that you can connect to
the IBM DB2 database:
CONNECT TO <dbalias> USER <username> USING <password>
5.
If the connection is successful, run the TERMINATE command to disconnect from the database. If the
connection fails, see the database documentation.
416
Create an ODBC data source using the DataDirect ODBC Wire Protocol driver for Informix provided by
Informatica.
2.
Verify that you can connect to the Informix database using the ODBC data source.
2.
To avoid using empty string or nulls, use the reserved words PmNullUser for the user name and
PmNullPasswd for the password when you create a database connection.
417
If you choose the OLEDB provider type, you must install the Microsoft SQL Server 2012 Native Client to
configure native connectivity to the Microsoft SQL Server database. If you cannot connect to the database,
verify that you correctly entered all of the connectivity information.
You can download the Microsoft SQL Server 2012 Native Client from the following Microsoft website:
http://www.microsoft.com/en-in/download/details.aspx?id=29065.
After you upgrade, the Microsoft SQL Server connection is set to the OLEDB provider type by default. It is
recommended that you upgrade all your Microsoft SQL Server connections to use the ODBC provider type.
You can upgrade all your Microsoft SQL Server connections to the ODBC provider type by using the following
commands:
If you are using PowerCenter, run the following command: pmrep upgradeSqlServerConnection
If you are using the Informatica platform, run the following command: infacmd.sh isp
upgradeSQLSConnection
If you want to use a Microsoft SQL Server connection without using a Data Source Name (DSN less
connection), you must configure the odbcinst.ini environment variable.
If you are using a DSN connection, you must add the entry "EnableQuotedIdentifiers=1" to the ODBC
DSN. If you do not add the entry, data preview and mapping run fail.
You can use the Microsoft SQL Server NTLM authentication on a DSN less Microsoft SQL Server
connection on the Microsoft Windows platform.
If the Microsoft SQL Server table contains a UUID data type and if you are reading data from an SQL
table and writing data to a flat file, the data format might not be consistent between the OLE DB and
ODBC connection types.
You cannot use SSL connection on a DSN less connection. If you want to use SSL, you must use the
DSN connection. Enable the Use DSN option and configure the SSL options in the odbc.ini file.
If the Microsoft SQL Server uses Kerberos authentication, you must set the GSSClient property to point to
the Informatica Kerberos libraries. Use the following path and filename: <Informatica installation
directory>/server/bin/libgssapi_krb5.so.2. Create an entry for the GSSClient property in the DSN
entries section in odbc.ini for a DSN connection or in the SQL Server wire protocol section in
odbcinst.ini for a connection that does not use a DSN.
2.
3.
4.
Change the value of the Default Buffer Block size to 5 MB. You can also use the following command:
$INFA_HOME/server/bin/./pmrep massupdate -t session_config_property -n "Default buffer
block size" -v "5MB" -f $<folderName>
To get optimum throughput for a row size of 1 KB, you must set the Buffer Block size to 5 MB.
5.
418
6.
Change the Commit Interval to 100000 if the session contains a relational target.
7.
Set the DTM Buffer Size. The optimum DTM Buffer Size is ((10 x Block Buffer size) x number of
partitions).
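For example, with the suggested 5 MB buffer block size and a session that uses four partitions, an illustrative partition count, the DTM Buffer Size would be 10 x 5 MB x 4 = 200 MB.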
PowerCenter Integration Service. Install the Netezza ODBC driver on the machine where the
PowerCenter Integration Service process runs. Use the Microsoft ODBC Data Source Administrator to
configure ODBC connectivity.
PowerCenter Client. Install the Netezza ODBC driver on each PowerCenter Client machine that
accesses the Netezza database. Use the Microsoft ODBC Data Source Administrator to configure ODBC
connectivity. Use the Workflow Manager to create a database connection object for the Netezza database.
Create an ODBC data source for each Netezza database that you want to access.
To create the ODBC data source, use the driver provided by Netezza.
Create a System DSN if you start the Informatica service with a Local System account logon. Create a
User DSN if you select the This account log in option to start the Informatica service.
After you create the data source, configure the properties of the data source.
2.
3.
Enter the IP address/host name and port number for the Netezza server.
4.
Enter the name of the Netezza schema where you plan to create database objects.
5.
Configure the path and file name for the ODBC log file.
6.
419
2.
Verify that the PATH environment variable includes the Oracle bin directory.
For example, if you install Net8, the path might include the following entry:
PATH=C:\ORANT\BIN;
3.
Configure the Oracle client to connect to the database that you want to access.
Launch SQL*Net Easy Configuration Utility, or copy an existing tnsnames.ora file to the home directory
and modify it.
Note: By default, the tnsnames.ora file is stored in the following directory: <OracleInstallationDir>
\network\admin.
Enter the correct syntax for the Oracle connect string, typically databasename.world. Make sure the SID
entered here matches the database server instance ID defined on the Oracle server.
Here is a sample tnsnames.ora file. Enter the information for the database.
mydatabase.world =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS =
        (COMMUNITY = mycompany.world)
        (PROTOCOL = TCP)
        (Host = mymachine)
        (Port = 1521)
      )
    )
    (CONNECT_DATA =
      (SID = MYORA7)
      (GLOBAL_NAMES = mydatabase.world)
    )
  )
4.
Set the NLS_LANG environment variable to the locale, including language, territory, and character set,
you want the database client and server to use with the login.
The value of this variable depends on the configuration. For example, if the value is
american_america.UTF8, you must set the variable as follows:
NLS_LANG=american_america.UTF8;
To determine the value of this variable, contact the database administrator.
5.
To set the default session time zone when the Data Integration Service reads or writes the Timestamp
with Local Time Zone data, specify the ORA_SDTZ environment variable.
You can set the ORA_SDTZ environment variable to any of the following values: the operating system local
time zone ('OS_TZ'), the database time zone ('DB_TZ'), an absolute offset such as '-05:00', or a named time
zone region such as 'America/New_York'.
You can set the environment variable on the machine where the Informatica server runs.
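For example, to set the session time zone to a named region before you start the Informatica services, where the region name is an illustrative value:
Using a Bourne shell:
$ ORA_SDTZ='America/New_York'; export ORA_SDTZ
Using a C shell:
$ setenv ORA_SDTZ 'America/New_York'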
6.
420
If the tnsnames.ora file is not in the same location as the Oracle client installation location, set the
TNS_ADMIN environment variable to the directory where the tnsnames.ora file resides.
For example, if the tnsnames.ora file is in the C:\oracle\files directory, set the variable as follows:
TNS_ADMIN=C:\oracle\files
7.
Verify that the SYBASE environment variable refers to the Sybase ASE directory.
For example:
SYBASE=C:\SYBASE
2.
Verify that the PATH environment variable includes the Sybase OCS directory.
For example:
PATH=C:\SYBASE\OCS-15_0\BIN;C:\SYBASE\OCS-15_0\DLL
3.
Configure Sybase Open Client to connect to the database that you want to access.
Use SQLEDIT to configure the Sybase client, or copy an existing SQL.INI file (located in the %SYBASE
%\INI directory) and make any necessary changes.
Select NLWNSCK as the Net-Library driver and include the Sybase ASE server name.
Enter the host name and port number for the Sybase ASE server. If you do not know the host name and
port number, check with the system administrator.
4.
421
Integration Service. Install the Teradata client, the Teradata ODBC driver, and any other Teradata client
software that you might need on the machine where the Data Integration Service and PowerCenter
Integration Service run. You must also configure ODBC connectivity.
Informatica Developer. Install the Teradata client, the Teradata ODBC driver, and any other Teradata
client software that you might need on each machine that hosts a Developer tool that accesses Teradata.
You must also configure ODBC connectivity.
PowerCenter Client. Install the Teradata client, the Teradata ODBC driver, and any other Teradata client
software that you might need on each PowerCenter Client machine that accesses Teradata. Use the
Workflow Manager to create a database connection object for the Teradata database.
Note: Based on a recommendation from Teradata, Informatica uses ODBC to connect to Teradata. ODBC is
a native interface for Teradata.
Create an ODBC data source for each Teradata database that you want to access.
To create the ODBC data source, use the driver provided by Teradata.
Create a System DSN if you start the Informatica service with a Local System account logon. Create a
User DSN if you select the This account log in option to start the Informatica service.
2.
Enter the name for the new ODBC data source and the name of the Teradata server or its IP address.
To configure a connection to a single Teradata database, enter the DefaultDatabase name. To create a
single connection to the default database, enter the user name and password. To connect to multiple
databases, using the same ODBC data source, leave the DefaultDatabase field and the user name and
password fields empty.
3.
4.
5.
422
APPENDIX C
Use native drivers to connect to IBM DB2, Oracle, or Sybase ASE databases.
423
To configure connectivity on the machine where the Data Integration Service, PowerCenter Integration
Service, or PowerCenter Repository Service process runs, log in to the machine as a user who can start
a service process.
2.
3.
Set the shared library variable to include the DB2 lib directory.
The IBM DB2 client software contains a number of shared library components that the Data Integration
Service, PowerCenter Integration Service, and PowerCenter Repository Service processes load
dynamically. Set the shared library environment variable so that the services can find the shared libraries
at run time.
The shared library path must also include the Informatica installation directory (server_dir).
Set the shared library environment variable based on the operating system.
The following table describes the shared library variables for each operating system:
Operating System
Variable
Solaris
LD_LIBRARY_PATH
Linux
LD_LIBRARY_PATH
AIX
LIBPATH
HP-UX
SHLIB_PATH
For example, use the following syntax for Solaris and Linux:
Using a C shell:
$ setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:$HOME/server_dir:$DB2DIR/lib
For AIX:
Using a C shell:
$ setenv LIBPATH ${LIBPATH}:$HOME/server_dir:$DB2DIR/lib
For HP-UX:
Using a C shell:
$ setenv SHLIB_PATH ${SHLIB_PATH}:$HOME/server_dir:$DB2DIR/lib
4.
Edit the .cshrc or .profile to include the complete set of shell commands. Save the file and either log out
and log in again or run the source command.
Using a Bourne shell:
$ source .profile
Using a C shell:
$ source .cshrc
5.
If the DB2 database resides on the same machine on which the Data Integration Service, PowerCenter
Integration Service, or PowerCenter Repository Service process runs, configure the DB2 instance as a
remote instance.
Run the following command to verify if there is a remote entry for the database:
DB2 LIST DATABASE DIRECTORY
The command lists all the databases that the DB2 client can access and their configuration properties. If
this command lists an entry for Directory entry type of Remote, skip to step 6.
If the database is not configured as remote, run the following command to verify whether a TCP/IP node
is cataloged for the host:
DB2 LIST NODE DIRECTORY
If the node name is empty, you can create one when you set up a remote database. Use the following
command to set up a remote database and, if needed, create a node:
db2 CATALOG TCPIP NODE <nodename> REMOTE <hostname_or_address> SERVER <port number>
Run the following command to catalog the database:
db2 CATALOG DATABASE <dbname> as <dbalias> at NODE <nodename>
For more information about these commands, see the database documentation.
6.
Verify that you can connect to the DB2 database. Run the DB2 Command Line Processor and run the
command:
CONNECT TO <dbalias> USER <username> USING <password>
If the connection is successful, clean up with the CONNECT RESET or TERMINATE command.
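For example, the following sequence catalogs a remote database and verifies the connection. The node name, host name, port, database name, alias, and credentials are illustrative values only:
db2 CATALOG TCPIP NODE db2node REMOTE db2host.example.com SERVER 50000
db2 CATALOG DATABASE proddb AS proddb01 AT NODE db2node
db2 CONNECT TO proddb01 USER infa_user USING mypassword
db2 TERMINATE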
Set the ODBCHOME environment variable to the ODBC installation directory. For example:
Using a Bourne shell:
$ ODBCHOME=<Informatica server home>/ODBC7.1; export ODBCHOME
Using a C shell:
$ setenv ODBCHOME <Informatica server home>/ODBC7.1
2.
Set the ODBCINI environment variable to the location of the odbc.ini file. For example, if the odbc.ini file
is in the $ODBCHOME directory:
Using a Bourne shell:
ODBCINI=$ODBCHOME/odbc.ini; export ODBCINI
Using a C shell:
$ setenv ODBCINI $ODBCHOME/odbc.ini
3.
Edit the existing odbc.ini file in the $ODBCHOME directory or copy this odbc.ini file to the UNIX home
directory and edit it.
$ cp $ODBCHOME/odbc.ini $HOME/.odbc.ini
4.
Add an entry for the Informix data source under the section [ODBC Data Sources] and configure the data
source. For example:
[Informix Wire Protocol]
Driver=/export/home/Informatica/10.0.0/ODBC7.1/lib/DWifcl27.so
Description=DataDirect 7.1 Informix Wire Protocol
AlternateServers=
ApplicationUsingThreads=1
CancelDetectInterval=0
ConnectionRetryCount=0
ConnectionRetryDelay=3
Database=<database_name>
HostName=<Informix_host>
LoadBalancing=0
LogonID=
Password=
PortNumber=<Informix_server_port>
ReportCodePageConversionErrors=0
ServerName=<Informix_server>
TrimBlankFromIndexName=1
5.
Set the PATH and shared library environment variables by executing the script odbc.sh or odbc.csh in
the $ODBCHOME directory.
Using a Bourne shell:
sh odbc.sh
Using a C shell:
source odbc.csh
6.
Verify that you can connect to the Informix database using the ODBC data source. If the connection fails,
see the database documentation.
If you are using PowerCenter, run the following command: pmrep upgradeSqlServerConnection
If you are using the Informatica platform, run the following command: infacmd.sh isp
upgradeSQLSConnection
After you run the upgrade command, you must set the environment variable on each machine that hosts the
Developer tool and on the machine that hosts Informatica services in the following format:
ODBCINST=<INFA_HOME>/ODBC7.1/odbcinst.ini
After you set the environment variable, you must restart the node that hosts the Informatica services.
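For example, if Informatica is installed in /opt/informatica, an assumed location, set the variable as follows:
Using a Bourne shell:
$ ODBCINST=/opt/informatica/ODBC7.1/odbcinst.ini; export ODBCINST
Using a C shell:
$ setenv ODBCINST /opt/informatica/ODBC7.1/odbcinst.ini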
For specific connectivity instructions, see the database documentation.
427
If you want to use a Microsoft SQL Server connection without using a Data Source Name (DSN less
connection), you must configure the odbcinst.ini environment variable.
If you are using a DSN connection, you must add the entry "EnableQuotedIdentifiers=1" to the ODBC
DSN. If you do not add the entry, data preview and mapping run fail.
You can use the Microsoft SQL Server NTLM authentication on a DSN less Microsoft SQL Server
connection on the Microsoft Windows platform.
If the Microsoft SQL Server table contains a UUID data type and if you are reading data from an SQL
table and writing data to a flat file, the data format might not be consistent between the OLE DB and
ODBC connection types.
You cannot use SSL connection on a DSN less connection. If you want to use SSL, you must use the
DSN connection. Enable the Use DSN option and configure the SSL options in the odbc.ini file.
If the Microsoft SQL Server uses Kerberos authentication, you must set the GSSClient property to point to
the Informatica Kerberos libraries. Use the following path and filename: <Informatica installation
directory>/server/bin/libgssapi_krb5.so.2. Create an entry for the GSSClient property in the DSN
entries section in odbc.ini for a DSN connection or in the SQL Server wire protocol section in
odbcinst.ini for a connection that does not use a DSN.
Open the odbc.ini file and add an entry for the ODBC data source and DataDirect New SQL Server Wire
Protocol driver under the section [ODBC Data Sources].
2.
Attribute
Description
EncryptionMethod
The method that the driver uses to encrypt the data sent between the driver and
the database server. Set the value to 1 to encrypt data using SSL.
ValidateServerCertificate
Determines whether the driver validates the certificate sent by the database
server when SSL encryption is enabled. Set the value to 1 for the driver to
validate the server certificate.
TrustStore
The location and name of the trust store file. The trust store file contains a list of
Certificate Authorities (CAs) that the driver uses for SSL server authentication.
TrustStorePassword
HostNameInCertificate
Optional. The host name that is established by the SSL administrator for the
driver to validate the host name contained in the certificate.
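For example, an odbc.ini entry that enables SSL encryption might look like the following sketch. The driver file name, paths, host, port, database, and trust store values are assumptions; use the values for your installation:
[SQLServer Wire Protocol]
Driver=/opt/informatica/ODBC7.1/lib/DWsqls27.so
Description=DataDirect 7.1 New SQL Server Wire Protocol
HostName=sqlserverhost
PortNumber=1433
Database=mydatabase
EncryptionMethod=1
ValidateServerCertificate=1
TrustStore=/opt/certs/truststore.pem
TrustStorePassword=trustpassword
HostNameInCertificate=sqlserverhost.example.com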
2.
3.
4.
Change the value of the Default Buffer Block size to 5 MB. You can also use the following command:
$INFA_HOME/server/bin/./pmrep massupdate -t session_config_property -n "Default buffer
block size" -v "5MB" -f $<folderName>
To get optimum throughput for a row size of 1 KB, you must set the Buffer Block size to 5 MB.
5.
6.
Change the Commit Interval to 100000 if the session contains a relational target.
7.
Set the DTM Buffer Size. The optimum DTM Buffer Size is ((10 x Block Buffer size) x number of
partitions).
To configure connectivity for the integration service process, log in to the machine as a user who can
start a service process.
2.
Variable
Solaris
LD_LIBRARY_PATH
Linux
LD_LIBRARY_PATH
AIX
LIBPATH
HP-UX
SHLIB_PATH
For example, use the following syntax for Solaris and Linux:
Using a C shell:
$ setenv LD_LIBRARY_PATH "${LD_LIBRARY_PATH}:$HOME/server_dir:$ODBCHOME/
lib:<NetezzaInstallationDir>/lib64"
For AIX
Using a C shell:
$ setenv LIBPATH ${LIBPATH}:$HOME/server_dir:$ODBCHOME/
lib:<NetezzaInstallationDir>/lib64
For HP-UX
Using a C shell:
$ setenv SHLIB_PATH ${SHLIB_PATH}:$HOME/server_dir:$ODBCHOME/
lib:<NetezzaInstallationDir>/lib64
4.
Edit the existing odbc.ini file or copy the odbc.ini file to the home directory and edit it.
This file exists in $ODBCHOME directory.
$ cp $ODBCHOME/odbc.ini $HOME/.odbc.ini
Add an entry for the Netezza data source under the section [ODBC Data Sources] and configure the
data source.
For example:
[NZSQL]
Driver = /export/home/appsqa/thirdparty/netezza/lib64/libnzodbc.so
Description = NetezzaSQL ODBC
Servername = netezza1.informatica.com
Port = 5480
Database = infa
Username = admin
Password = password
Debuglogging = true
StripCRLF = false
PreFetch = 256
Protocol = 7.0
ReadOnly = false
ShowSystemTables = false
Socket = 16384
DateFormat = 1
TranslationDLL =
TranslationName =
TranslationOption =
NumericAsChar = false
For more information about Netezza connectivity, see the Netezza ODBC driver documentation.
5.
Verify that the last entry in the odbc.ini file is InstallDir and set it to the ODBC installation directory.
For example:
InstallDir=<Informatica install directory>/<ODBCHOME directory>
6.
Edit the .cshrc or .profile file to include the complete set of shell commands.
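For example, a minimal sketch of the lines that the .cshrc might contain, assuming a C shell, the Netezza paths used in the example above, and that the edited odbc.ini is saved as $HOME/.odbc.ini:
setenv ODBCHOME <Informatica installation directory>/ODBC<version>
setenv ODBCINI $HOME/.odbc.ini
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:$HOME/server_dir:$ODBCHOME/lib:<NetezzaInstallationDir>/lib64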
7.
To configure connectivity for the Data Integration Service, PowerCenter Integration Service, or
PowerCenter Repository Service process, log in to the machine as a user who can start the server
process.
2.
You can set the environment variables on the machine where the Informatica server runs.
TNS_ADMIN. If the tnsnames.ora file is not in the same location as the Oracle client installation location,
set the TNS_ADMIN environment variable to the directory where the tnsnames.ora file resides. For
example, if the file is in the /HOME2/oracle/files directory, set the variable as follows:
Using a Bourne shell:
$ TNS_ADMIN=/HOME2/oracle/files; export TNS_ADMIN
Using a C shell:
$ setenv TNS_ADMIN /HOME2/oracle/files
Note: By default, the tnsnames.ora file is stored in the following directory: $ORACLE_HOME/network/
admin.
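For example, a minimal sketch of a tnsnames.ora entry; the net service name ORCL and the host, port, and service name values are placeholders:
ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = <Oracle_host>)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = <Oracle_service_name>))
  )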
PATH. To run the Oracle command line programs, set the variable to include the Oracle bin directory.
Using a Bourne shell:
$ PATH=${PATH}:$ORACLE_HOME/bin; export PATH
Using a C shell:
$ setenv PATH ${PATH}:$ORACLE_HOME/bin
3.
Using a C shell:
$ setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:$HOME/server_dir:$ORACLE_HOME/lib
4.
Edit the .cshrc or .profile to include the complete set of shell commands. Save the file and either log out
and log in again, or run the source command.
Using a Bourne shell:
$ source .profile
Using a C shell:
$ source .cshrc
5.
6.
2.
3.
Operating System   Variable
Solaris            LD_LIBRARY_PATH
Linux              LD_LIBRARY_PATH
AIX                LIBPATH
HP-UX              SHLIB_PATH
For example, use the following syntax for Solaris and Linux:
Using a C shell:
$ setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:$HOME/server_dir:$SYBASE/OCS-15_0/lib:$SYBASE/OCS-15_0/lib3p:$SYBASE/OCS-15_0/lib3p64
For AIX
Using a C shell:
$ setenv LIBPATH ${LIBPATH}:$HOME/server_dir:$SYBASE/OCS-15_0/lib:$SYBASE/OCS-15_0/lib3p:$SYBASE/OCS-15_0/lib3p64
For HP-UX
Using a C shell:
$ setenv SHLIB_PATH ${SHLIB_PATH}:$HOME/server_dir:$SYBASE/OCS-15_0/lib:$SYBASE/OCS-15_0/lib3p:$SYBASE/OCS-15_0/lib3p64
4.
Edit the .cshrc or .profile to include the complete set of shell commands. Save the file and either log out
and log in again, or run the source command.
Using a Bourne shell:
$ source .profile
Using a C shell:
$ source .cshrc
5.
Verify the Sybase ASE server name in the Sybase interfaces file stored in the $SYBASE directory.
6.
To configure connectivity for the integration service process, log in to the machine as a user who can
start a service process.
2.
3.
Operating System   Variable
Solaris            LD_LIBRARY_PATH
Linux              LD_LIBRARY_PATH
AIX                LIBPATH
HP-UX              SHLIB_PATH
For example, use the following syntax for Solaris and Linux:
Using a C shell:
$ setenv LD_LIBRARY_PATH "${LD_LIBRARY_PATH}:$HOME/server_dir:$ODBCHOME/lib:$TERADATA_HOME/lib64:$TERADATA_HOME/odbc_64/lib"
For AIX
Using a C shell:
$ setenv LIBPATH ${LIBPATH}:$HOME/server_dir:$ODBCHOME/lib:$TERADATA_HOME/lib64:$TERADATA_HOME/odbc_64/lib
For HP-UX
Using a C shell:
$ setenv SHLIB_PATH ${SHLIB_PATH}:$HOME/server_dir:$ODBCHOME/lib:$TERADATA_HOME/lib64:$TERADATA_HOME/odbc_64/lib
4.
Edit the existing odbc.ini file or copy the odbc.ini file to the home directory and edit it.
This file exists in the $ODBCHOME directory.
$ cp $ODBCHOME/odbc.ini $HOME/.odbc.ini
Add an entry for the Teradata data source under the section [ODBC Data Sources] and configure the
data source.
For example:
MY_TERADATA_SOURCE=Teradata Driver
[MY_TERADATA_SOURCE]
Driver=/u01/app/teradata/td-tuf611/odbc/drivers/tdata.so
Description=NCR 3600 running Teradata V1R5.2
DBCName=208.199.59.208
DateTimeFormat=AAA
SessionMode=ANSI
DefaultDatabase=
Username=
Password=
5.
6.
Optionally, set the SessionMode to ANSI. When you use ANSI session mode, Teradata does not roll
back the transaction when it encounters a row error.
If you choose Teradata session mode, Teradata rolls back the transaction when it encounters a row
error. In Teradata mode, the integration service process cannot detect the rollback, and does not report
this in the session log.
7.
To configure a connection to a single Teradata database, enter the DefaultDatabase name. To create a
single connection to the default database, enter the user name and password. To connect to multiple
databases using the same ODBC DSN, leave the DefaultDatabase field empty.
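For example, a minimal sketch of the two variants in the DSN entry, with placeholder values:
DefaultDatabase=<default_database_name>
Username=<username>
Password=<password>
To reuse the same DSN for multiple databases, leave the value empty:
DefaultDatabase=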
For more information about Teradata connectivity, see the Teradata ODBC driver documentation.
8.
Verify that the last entry in the odbc.ini file is InstallDir and set it to the ODBC installation directory.
For example:
InstallDir=<Informatica installation directory>/ODBC<version>
9.
10.
Edit the .cshrc or .profile to include the complete set of shell commands.
Save the file and either log out and log in again, or run the source command.
Using a Bourne shell:
$ source .profile
Using a C shell:
$ source .cshrc
11.
For each data source you use, make a note of the file name in the Driver= parameter of the data source
entry in odbc.ini. Use the ddtestlib utility to verify that the DataDirect ODBC driver manager can load the
driver file.
For example, if you have the driver entry:
Driver=/u01/app/teradata/td-tuf611/odbc/drivers/tdata.so
run the following command:
ddtestlib /u01/app/teradata/td-tuf611/odbc/drivers/tdata.so
12.
On the machine where the application service runs, log in as a user who can start a service process.
2.
Operating System   Variable
Solaris            LD_LIBRARY_PATH
Linux              LD_LIBRARY_PATH
AIX                LIBPATH
HP-UX              SHLIB_PATH
For example, use the following syntax for Solaris and Linux:
Using a C shell:
$ setenv LD_LIBRARY_PATH $HOME/server_dir:$ODBCHOME:${LD_LIBRARY_PATH}
For AIX
Using a C shell:
$ setenv LIBPATH ${LIBPATH}:$HOME/server_dir:$ODBCHOME/lib
For HP-UX
Using a C shell:
$ setenv SHLIB_PATH ${SHLIB_PATH}:$HOME/server_dir:$ODBCHOME/lib
Edit the existing odbc.ini file or copy the odbc.ini file to the home directory and edit it.
This file exists in the $ODBCHOME directory.
$ cp $ODBCHOME/odbc.ini $HOME/.odbc.ini
Add an entry for the ODBC data source under the section [ODBC Data Sources] and configure the data
source.
For example:
MY_MSSQLSERVER_ODBC_SOURCE=<Driver name or data source description>
[MY_MSSQLSERVER_ODBC_SOURCE]
Driver=<path to ODBC drivers>
Description=DataDirect 7.1 SQL Server Wire Protocol
Database=<SQLServer_database_name>
LogonID=<username>
Password=<password>
Address=<TCP/IP address>,<port number>
QuoteId=No
AnsiNPW=No
ApplicationUsingThreads=1
This file might already exist if you have configured one or more ODBC data sources.
5.
Verify that the last entry in the odbc.ini file is InstallDir and set it to the ODBC installation directory.
For example:
InstallDir=/export/home/Informatica/10.0.0/ODBC7.1
6.
If you use the odbc.ini file in the home directory, set the ODBCINI environment variable.
Using a Bourne shell:
$ ODBCINI=/$HOME/.odbc.ini; export ODBCINI
Using a C shell:
$ setenv ODBCINI $HOME/.odbc.ini
7.
Edit the .cshrc or .profile to include the complete set of shell commands. Save the file and either log out
and log in again, or run the source command.
Using a Bourne shell:
$ source .profile
Using a C shell:
$ source .cshrc
8.
Use the ddtestlib utility to verify that the DataDirect ODBC driver manager can load the driver file you
specified for the data source in the odbc.ini file.
For example, if you have the driver entry:
Driver = /export/home/Informatica/10.0.0/ODBC7.1/lib/DWxxxxnn.so
run the following command:
ddtestlib /export/home/Informatica/10.0.0/ODBC7.1/lib/DWxxxxnn.so
9.
Install and configure any underlying client access software needed by the ODBC driver.
Note: While some ODBC drivers are self-contained and have all information inside the .odbc.ini file,
most are not. For example, if you want to use an ODBC driver to access Sybase IQ, you must install the
Sybase IQ network client software and set the appropriate environment variables.
To use the Informatica ODBC drivers (DWxxxxnn.so), manually set the PATH and shared library path
environment variables. Alternatively, run the odbc.sh or odbc.csh script in the $ODBCHOME folder. This
script will set the required PATH and shared library path environment variables for the ODBC drivers
provided by Informatica.
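For example, a minimal sketch of running the scripts, assuming that you source them so that the variables they set persist in the current shell:
Using a Bourne shell:
$ . $ODBCHOME/odbc.sh
Using a C shell:
$ source $ODBCHOME/odbc.csh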
ProgramID=
QueryTimeout=0
ReportCodePageConversionErrors=0
TcpPort=50000
TrustStore=
TrustStorePassword=
UseCurrentSchema=0
ValidateServerCertificate=1
WithHold=1
XMLDescribeType=-10
[Informix Wire Protocol]
Driver=/<Informatica installation directory>/ODBC7.1/lib/DWifcl27.so
Description=DataDirect 7.1 Informix Wire Protocol
AlternateServers=
ApplicationUsingThreads=1
CancelDetectInterval=0
ConnectionRetryCount=0
ConnectionRetryDelay=3
Database=<database_name>
HostName=<Informix_host>
LoadBalancing=0
LogonID=
Password=
PortNumber=<Informix_server_port>
ServerName=<Informix_server>
TrimBlankFromIndexName=1
UseDelimitedIdentifiers=0
[Oracle Wire Protocol]
Driver=/<Informatica installation directory>/ODBC7.1/lib/DWora27.so
Description=DataDirect 7.1 Oracle Wire Protocol
AlternateServers=
ApplicationUsingThreads=1
AccountingInfo=
Action=
ApplicationName=
ArraySize=60000
AuthenticationMethod=1
BulkBinaryThreshold=32
BulkCharacterThreshold=-1
BulkLoadBatchSize=1024
BulkLoadFieldDelimiter=
BulkLoadRecordDelimiter=
CachedCursorLimit=32
CachedDescLimit=0
CatalogIncludesSynonyms=1
CatalogOptions=0
ClientHostName=
ClientID=
ClientUser=
ConnectionReset=0
ConnectionRetryCount=0
ConnectionRetryDelay=3
DataIntegrityLevel=0
DataIntegrityTypes=MD5,SHA1
DefaultLongDataBuffLen=1024
DescribeAtPrepare=0
EditionName=
EnableBulkLoad=0
EnableDescribeParam=0
EnableNcharSupport=0
EnableScrollableCursors=1
EnableStaticCursorsForLongData=0
EnableTimestampWithTimeZone=0
EncryptionLevel=0
EncryptionMethod=0
EncryptionTypes=AES128,AES192,AES256,DES,3DES112,3DES168,RC4_40,RC4_56,RC4_128,
RC4_256
FailoverGranularity=0
FailoverMode=0
FailoverPreconnect=0
FetchTSWTZasTimestamp=0
GSSClient=native
HostName=<Oracle_server>
HostNameInCertificate=
InitializationString=
KeyPassword=
KeyStore=
KeyStorePassword=
LoadBalanceTimeout=0
LoadBalancing=0
LocalTimeZoneOffset=
LockTimeOut=-1
LoginTimeout=15
LogonID=
MaxPoolSize=100
MinPoolSize=0
Module=
Password=
Pooling=0
PortNumber=<Oracle_server_port>
ProcedureRetResults=0
ProgramID=
QueryTimeout=0
ReportCodePageConversionErrors=0
ReportRecycleBin=0
ServerName=<server_name in tnsnames.ora>
ServerType=0
ServiceName=
SID=<Oracle_System_Identifier>
TimestampeEscapeMapping=0
TNSNamesFile=<tnsnames.ora_filename>
TrustStore=
TrustStorePassword=
UseCurrentSchema=1
ValidateServerCertificate=1
WireProtocolMode=2
[Sybase Wire Protocol]
Driver=/<Informatica installation directory>/ODBC7.1/lib/DWase27.so
Description=DataDirect 7.1 Sybase Wire Protocol
AlternateServers=
ApplicationName=
ApplicationUsingThreads=1
ArraySize=50
AuthenticationMethod=0
BulkBinaryThreshold=32
BulkCharacterThreshold=-1
BulkLoadBatchSize=1024
BulkLoadFieldDelimiter=
BulkLoadRecordDelimiter=
Charset=
ConnectionReset=0
ConnectionRetryCount=0
ConnectionRetryDelay=3
CursorCacheSize=1
Database=<database_name>
DefaultLongDataBuffLen=1024
EnableBulkLoad=0
EnableDescribeParam=0
EnableQuotedIdentifiers=0
EncryptionMethod=0
FailoverGranularity=0
FailoverMode=0
FailoverPreconnect=0
GSSClient=native
HostNameInCertificate=
InitializationString=
Language=
LoadBalancing=0
LoadBalanceTimeout=0
LoginTimeout=15
LogonID=
MaxPoolSize=100
MinPoolSize=0
NetworkAddress=<Sybase_host,Sybase_server_port>
OptimizePrepare=1
PacketSize=0
Password=
Pooling=0
QueryTimeout=0
RaiseErrorPositionBehavior=0
ReportCodePageConversionErrors=0
SelectMethod=0
ServicePrincipalName=
TruncateTimeTypeFractions=0
TrustStore=
TrustStorePassword=
ValidateServerCertificate=1
WorkStationID=
[SQL Server Wire Protocol]
Driver=/<Informatica installation directory>/ODBC7.1/lib/DWsqls27.so
Description=DataDirect 7.1 SQL Server Wire Protocol
AlternateServers=
AlwaysReportTriggerResults=0
AnsiNPW=1
ApplicationName=
ApplicationUsingThreads=1
AuthenticationMethod=1
BulkBinaryThreshold=32
BulkCharacterThreshold=-1
BulkLoadBatchSize=1024
BulkLoadOptions=2
ConnectionReset=0
ConnectionRetryCount=0
ConnectionRetryDelay=3
Database=<database_name>
EnableBulkLoad=0
EnableQuotedIdentifiers=0
EncryptionMethod=0
FailoverGranularity=0
FailoverMode=0
FailoverPreconnect=0
FetchTSWTZasTimestamp=0
FetchTWFSasTime=1
GSSClient=native
HostName=<SQL_Server_host>
HostNameInCertificate=
InitializationString=
Language=
LoadBalanceTimeout=0
LoadBalancing=0
LoginTimeout=15
LogonID=
MaxPoolSize=100
MinPoolSize=0
PacketSize=-1
Password=
Pooling=0
PortNumber=<SQL_Server_server_port>
QueryTimeout=0
ReportCodePageConversionErrors=0
SnapshotSerializable=0
TrustStore=
TrustStorePassword=
ValidateServerCertificate=1
WorkStationID=
XML Describe Type=-10
[MySQL Wire Protocol]
Driver=/<Informatica installation directory>/ODBC7.1/lib/DWmysql27.so
ReportCodepageConversionErrors=0
TransactionErrorBehavior=1
TrustStore=
TrustStorePassword=
ValidateServerCertificate=1
XMLDescribeType=-10
[Greenplum Wire Protocol]
Driver=/<Informatica installation directory>/ODBC7.1/lib/DWgplm27.so
Description=DataDirect 7.1 Greenplum Wire Protocol
AlternateServers=
ApplicationUsingThreads=1
ConnectionReset=0
ConnectionRetryCount=0
ConnectionRetryDelay=3
Database=<database_name>
DefaultLongDataBuffLen=2048
EnableDescribeParam=0
EnableKeysetCursors=0
EncryptionMethod=0
ExtendedColumnMetadata=0
FailoverGranularity=0
FailoverMode=0
FailoverPreconnect=0
FetchTSWTZasTimestamp=0
FetchTWFSasTime=0
HostName=<Greenplum_host>
InitializationString=
KeyPassword=
KeysetCursorOptions=0
KeyStore=
KeyStorePassword=
LoadBalanceTimeout=0
LoadBalancing=0
LoginTimeout=15
LogonID=
MaxPoolSize=100
MinPoolSize=0
Password=
Pooling=0
PortNumber=<Greenplum_server_port>
QueryTimeout=0
ReportCodepageConversionErrors=0
TransactionErrorBehavior=1
XMLDescribeType=-10
[SQL Server Legacy Wire Protocol]
Driver=/<Informatica installation directory>/ODBC7.1/lib/DWmsss27.so
Description=DataDirect 7.1 SQL Server Legacy Wire Protocol
Address=<SQLServer_host, SQLServer_server_port>
AlternateServers=
AnsiNPW=Yes
ConnectionRetryCount=0
ConnectionRetryDelay=3
Database=<database_name>
FetchTSWTZasTimestamp=0
FetchTWFSasTime=0
LoadBalancing=0
LogonID=
Password=
QuotedId=No
ReportCodepageConversionErrors=0
SnapshotSerializable=0
APPENDIX D
2.
Choose the Connect for JDBC driver for an IBM DB2 data source.
3.
4.
Download the utility to a machine that has access to the DB2 database server.
5.
6.
In the directory where you extracted the file, run the installer.
The installation program creates a folder named testforjdbc in the installation directory.
In the DB2 database, set up a system administrator user account with the BINDADD authority.
2.
In the directory where you installed the DataDirect Connect for JDBC Utility, run the Test for JDBC tool.
On Windows, run testforjdbc.bat. On UNIX, run testforjdbc.sh.
3.
On the Test for JDBC Tool window, click Press Here to Continue.
4.
5.
6.
In the User Name and Password fields, enter the system administrator user name and password you use
to connect to the DB2 database.
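For reference, a connection URL for the DataDirect Connect for JDBC DB2 driver typically takes the following form; the host, port, and database name are placeholders and the exact connection options can vary by driver version:
jdbc:datadirect:db2://<DB2_host>:<port>;databaseName=<database_name>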
7.
Index
A
Abort
option to disable PowerCenter Integration Service 220
option to disable PowerCenter Integration Service process 220
option to disable the Web Services Hub 387
adaptive dispatch mode
description 248
overview 258
Additional JDBC Parameters
description 181
address validation properties
configuring 44
Administrator tool
SAP BW Service, configuring 348
advanced profiling properties
configuring 64
advanced properties
Metadata Manager Service 185
PowerCenter Integration Service 228
PowerCenter Repository Service 287
Web Services Hub 388, 390
Agent Cache Capacity (property)
description 287
agent port
description 180
AggregateTreatNullsAsZero
option 230
option override 230
AggregateTreatRowsAsInsert
option 230
option override 230
Aggregator transformation
caches 267, 272
treating nulls as zero 230
treating rows as insert 230
Allow Writes With Agent Caching (property)
description 287
Analyst Service
Analyst Service security process properties 32
creating 34
custom service process properties 33
environment variables 33
Maximum Heap Size 32
node process properties 32
process properties 31
properties 29
application
backing up 156
changing the name 155
deploying 152
enabling 155
properties 153
refreshing 157
application service upgrade
privileges 394
application services
system 364
architecture
Data Integration Service 76
ASCII mode
ASCII data movement mode, setting 226
Data Integration Service 79
overview 268
associated PowerCenter Repository Service
PowerCenter Integration Service 218
associated repository
Web Services Hub, adding to 392
Web Services Hub, editing for 393
associated Repository Service
Web Services Hub 386, 392, 393
audit trails
creating 310
Authenticate MS-SQL User (property)
description 287
B
backing up
list of backup files 307
performance 311
repositories 306
backup directory
Model Repository Service 205
backup node
license requirement 225
node assignment, configuring 225
PowerCenter Integration Service 218
baseline system
CPU profile 251
basic dispatch mode
overview 258
blocking
description 264
blocking source data
PowerCenter Integration Service handling 264
buffer memory
buffer blocks 267
DTM process 267
C
Cache Connection
property 59
cache files
directory 239
overview 272
permissions 269
Cache Removal Time
property 59
caches
default directory 272
memory 267
memory usage 267
multiple directories 106
overview 269
transformation 272
certificate
keystore file 386, 389
character data sets
handling options for Microsoft SQL Server and PeopleSoft on
Oracle 230
character encoding
Web Services Hub 389
classpaths
Java SDK 239
ClientStore
option 228
Code Page (property)
PowerCenter Integration Service process 239
PowerCenter Repository Service 282
code pages
data movement modes 268
for PowerCenter Integration Service process 237
global repository 300
PowerCenter repository 282
repository 299
repository, Web Services Hub 386
validation for sources and targets 232
command line programs
team-based development, administering 215
compatibility properties
PowerCenter Integration Service 230
Complete
option to disable PowerCenter Integration Service 220
option to disable PowerCenter Integration Service process 220
compute node
overriding attributes 146
compute role
Data Integration Service node 80
Compute view
Data Integration Service 70
environment variables 71
execution options 70
concurrent jobs
Data Integration Service grid 148
configuration properties
Listener Service 316
Logger Service 321
PowerCenter Integration Service 232
configure and synchronize with version control system
how to 211
connect string
examples 176, 284
PowerCenter repository database 286
syntax 176, 284
connecting
Integration Service to IBM DB2 (Windows) 416, 424
Integration Service to Informix (UNIX) 426
Integration Service to Informix (Windows) 416
Integration Service to Microsoft Access 417
Integration Service to Microsoft SQL Server 417
Integration Service to ODBC data sources (UNIX) 438
Integration Service to Oracle (UNIX) 431
Integration Service to Oracle (Windows) 419
Integration Service to Sybase ASE (UNIX) 433
Integration Service to Sybase ASE (Windows) 421
Microsoft Excel to Integration Service 417
connecting (continued)
SQL data service 121
UNIX databases 423
Windows databases 415
Windows using JDBC 415
connection performance
optimizing 98
connection pooling
description 96
example 98
management 96
PowerExchange 98
properties 97
connection resources
assigning 245
connections
adding pass-through security 123
pass-through security 121
connectivity
connect string examples 176, 284
overview 254
Content Management Service
architecture 36
classifier model file path 48
creating 49
Data Integration Service grid 147
file transfer option 42
identity data properties 47
log events 42
Multi-Service Options 41
orphaned reference data 38
overview 35
probabilistic model file path 48
purge orphaned reference data 39
reference data storage location 38, 41
rule specifications 35, 36
staging directory for reference data 42
control file
overview 271
permissions 269
control files
Data Integration Service 93
CPU profile
computing 251
description 251
CPU usage
Integration Service 267
CreateIndicatorFiles
option 232
custom properties
configuring for Data Integration Service 66, 69
configuring for Metadata Manager 186
configuring for Web Services Hub 392
PowerCenter Integration Service process 241
PowerCenter Repository Service 290
PowerCenter Repository Service process 290
Web Services Hub 388
custom resources
defining 246
naming conventions 246
Custom transformation
directory for Java components 239
D
Data Analyzer
Data Profiling reports 327
database clients
configuring 414
environment variables 414
IBM DB2 client application enabler 413
Microsoft SQL Server native clients 413
Oracle clients 413
Sybase open clients 413
database connection timeout
description 286
database connections
PowerCenter Integration Service resilience 275
Database Hostname
description 181
Database Name
description 181
Database Pool Expiration Threshold (property)
description 287
Database Pool Expiration Timeout (property)
description 287
Database Pool Size (property)
description 286
Database Port
description 181
database preparation
repositories 398
database requirements
Data Analyzer 399
data object cache 401
Jaspersoft repository 402
Metadata Manager repository 402
Model repository 406
PowerCenter repository 407
profiling warehouse 409
reference data warehouse 410
workflow database 411
database resilience
repository 291
database statistics
IBM DB2 120
Microsoft SQL Server 120
Oracle 120
database user accounts
guidelines for setup 399
databases
connecting to (UNIX) 423
connecting to (Windows) 415
connecting to IBM DB2 416, 424
connecting to Informix 416, 426
connecting to Microsoft Access 417
connecting to Microsoft SQL Server 417
connecting to Netezza (UNIX) 429
connecting to Netezza (Windows) 419
connecting to Oracle 419, 431
connecting to Sybase ASE 421, 433
connecting to Teradata (UNIX) 435
connecting to Teradata (Windows) 422
Data Analyzer repository 399
Metadata Manager repository 399
PowerCenter repository 399
testing connections 414
DateDisplayFormat
option 232
DateHandling40Compatibility
option 230
dates
default format for logs 232
dbs2 connect
testing database connections 414
deadlock retries
setting number 230
DeadlockSleep
option 230
Debug
error severity level 228, 390
Debugger
running 228
dependency graph
rebuilding 396
deployment
applications 152
directories
cache files 239
external procedure files 239
for Java components 239
lookup files 239
recovery files 239
reject files 239
root directory 239
session log files 239
source files 239
target files 239
temporary files 239
workflow log files 239
disabling
Metadata Manager Service 178
PowerCenter Integration Service 220
PowerCenter Integration Service process 220
Reporting Service 330, 331
Web Services Hub 387
dispatch mode
adaptive 248
configuring 248
Load Balancer 258
metric-based 248
round-robin 248
dispatch priority
configuring 249
dispatch queue
overview 256
service levels, creating 249
dispatch wait time
configuring 249
domain
associated repository for Web Services Hub 386
metadata, sharing 299
domain configuration repository
IBM DB2 database requirements 192, 406
Microsoft SQL Server database requirements 193
DTM (Data Transformation Manager)
buffer memory 267
distribution on PowerCenter grids 266
instance 80
master DTM 266
output files 81
preparer DTM 266
process 83, 259
processing threads 81
resource allocation policy 81
worker DTM 266
DTM instances
Data Integration Service 80
description 95
DTM process
environment variables 71
DTM processes
description 95
E
Email Service
properties 366
Enable Nested LDO Cache
property 59
enabling
Metadata Manager Service 178
PowerCenter Integration Service 220
PowerCenter Integration Service process 220
Reporting Service 330, 331
Web Services Hub 387
encoding
Web Services Hub 389
environment variables
compute node 71
database client 241, 290
database clients 414
DTM process 71
Listener Service process 316
Logger Service process 324
PowerCenter Integration Service process 241
PowerCenter Repository Service process 290
UNIX database clients 414
Error
severity level 228, 390
error logs
messages 270
Error Severity Level (property)
Metadata Manager Service 185
PowerCenter Integration Service 228
execution Data Transformation Manager
Data Integration Service 80
execution options
configuring 57
override for compute node 70
ExportSessionLogLibName
option 232
external procedure files
directory 239
F
failover
PowerCenter Integration Service 276
PowerCenter Repository Service 292
PowerExchange Listener Service 318
PowerExchange Logger Service 325
safe mode 224
file permissions
Data Integration Service 94
file/directory resources
defining 246
naming conventions 246
filtering data
SAP BW, parameter file location 353
flat files
output files 272
folders
operating system profile, assigning 306
FTP connections
PowerCenter Integration Service resilience 275
G
general properties
Listener Service 315
Logger Service 321
Metadata Manager Service 179
PowerCenter Integration Service 226
PowerCenter Integration Service process 239
PowerCenter Repository Service 285
SAP BW Service 351
Web Services Hub 388, 389
global repositories
code page 299, 300
creating 300
creating from local repositories 300
moving to another Informatica domain 303
grid
Data Integration Service file directories 92
troubleshooting for PowerCenter Integration Service 247
grid assignment properties
Data Integration Service 55
PowerCenter Integration Service 225
grids
assigning to a PowerCenter Integration Service 243
configuring for PowerCenter Integration Service 242
creating 242
Data Integration Service 124
description for PowerCenter Integration Service 265
DTM processes for PowerCenter 266
for PowerCenter Integration Service 218
license requirement 55
license requirement for PowerCenter Integration Service 225
operating system profile 243
PowerCenter Integration Service processes, distributing 265
troubleshooting for Data Integration Service 149
H
heartbeat interval
description 287
high availability
licensed option 225
Listener Service 318
Logger Service 325
PowerCenter Integration Service 274
PowerCenter Repository Service 291
PowerCenter Repository Service failover 292
PowerCenter Repository Service recovery 292
PowerCenter Repository Service resilience 291
PowerCenter Repository Service restart 292
high availability option
service processes, configuring 295
high availability persistence tables
PowerCenter Integration Service 279
host names
Web Services Hub 386, 389
host port number
Web Services Hub 386, 389
how to
configure and synchronize a Model repository with a version control
system 211
HTTP
Data Integration Service 75
I
IBM DB2
connect string example 176, 284
connecting to Integration Service (Windows) 416, 424
repository database schema, optimizing 286
setting DB2CODEPAGE 416
setting DB2INSTANCE 416
single-node tablespaces 408
IBM DB2 database requirements
Data Analyzer repository 399
data object cache 401
domain repository 192, 406
Jaspersoft repository 402
Metadata Manager repository 403
Model repository database 192, 406
PowerCenter repository 408
profiling warehouse 409
reference data warehouse 410
workflow repository 411
IgnoreResourceRequirements
option 228
incremental aggregation
files 272
index caches
memory usage 267
indicator files
description 272
session output 272
infacmd mrs
listing checked-out object 215
listing locked object 215
reassigning locked or checked-out object 215
undoing checked-out object 215
J
JasperReports
overview 338
Jaspersoft repository
database requirements 402
IBM DB2 database requirements 402
Microsoft SQL Server database requirements 402
Oracle database requirements 402
Java
configuring for JMS 239
configuring for PowerExchange for Web Services 239
configuring for webMethods 239
Java components
directories, managing 239
Java SDK
class path 239
maximum memory 239
minimum memory 239
Java SDK Class Path
option 239
Java SDK Maximum Memory
option 239
Java SDK Minimum Memory
option 239
Java transformation
directory for Java components 239
JCEProvider
option 228
JDBC
connecting to (Windows) 415
Data Integration Service 75
jobs
launch as separate processes 94
Joiner transformation
caches 267, 272
setting up for prior version compatibility 230
JoinerSourceOrder6xCompatibility
option 230
JVM Command Line Options
advanced Web Services Hub property 390
K
keystore file
Metadata Manager 183
Web Services Hub 386, 389
keystore password
Web Services Hub 386, 389
L
LDTM
Data Integration Service 79
license
for PowerCenter Integration Service 218
Web Services Hub 386, 389
licensed options
high availability 225
server grid 225
Limit on Resilience Timeouts (property)
description 287
linked domain
multiple domains 301
Linux
database client environment variables 414
listCheckedoutObjects (infacmd mrs) 215
Listener Service process
environment variables 316
listing
checked-out object 215
locked object 215
listLockedObjects (infacmd mrs) 215
Load Balancer
configuring to check resources 257
defining resource provision thresholds 251
dispatch mode 258
dispatching tasks in a grid 257
dispatching tasks on a single node 257
resource provision thresholds 258
resources 244, 257
Load Balancer for PowerCenter Integration Service
assigning priorities to tasks 249, 259
configuring to check resources 228, 250
CPU profile, computing 251
dispatch mode, configuring 248
dispatch queue 256
overview 256
service levels 259
service levels, creating 249
settings, configuring 247
load balancing
SAP BW Service 354
support for SAP BW system 354
LoadManagerAllowDebugging
option 228
local mode
Data Integration Service grid 132
local repositories
code page 299
moving to another Informatica domain 303
promoting 300
registering 301
locks
managing 303
viewing 303
log files
Data Integration Service 86, 93
Data Integration Service permissions 94
M
Manage List
linked domains, adding 301
mapping pipelines
description 102
mapping properties
configuring 158
mappings
Data Integration Service grid 132, 137
grids in local mode 134
grids in remote mode 141
maximum parallelism 102, 104
partition points 102
partitioned 104
pipelines 102
processing threads 102
master thread
description 260
Max Concurrent Resource Load
description, Metadata Manager Service 185
Max Heap Size
description, Metadata Manager Service 185
Max Lookup SP DB Connections
option 230
Max MSSQL Connections
option 230
Max Sybase Connections
option 230
MaxConcurrentRequests
advanced Web Services Hub property 390
description, Metadata Manager Service 183
Maximum Active Connections
description, Metadata Manager Service 184
SQL data service property 160
maximum active users
description 287
Maximum Catalog Child Objects
description 185
Maximum Concurrent Connections
configuring 69
Maximum Concurrent Refresh Requests
property 59
N
native drivers
Data Integration Service 75
Netezza
connecting from Informatica clients (Windows) 419
connecting from Integration Service (Windows) 419
connecting to Informatica clients (UNIX) 429
connecting to Integration Service (UNIX) 429
node assignment
Data Integration Service 55
PowerCenter Integration Service 225
Resource Manager Service 369
Web Services Hub 388, 389
node properties
maximum CPU run queue length 251
maximum memory percent 251
maximum processes 251
nodes
node assignment, configuring 225
Web Services Hub 386
normal mode
PowerCenter Integration Service 222
notifications
sending 306
null values
PowerCenter Integration Service, configuring 230
NumOfDeadlockRetries
option 230
O
object dependency graph
rebuilding 396
objects
filtering 214
ODBC
Data Integration Service 75
ODBC Connection Mode
description 185
ODBC data sources
connecting to (UNIX) 438
connecting to (Windows) 415
odbc.ini file
sample 440
operating mode
effect on resilience 296
normal mode for PowerCenter Integration Service 221
PowerCenter Integration Service 221
PowerCenter Repository Service 296
safe mode for PowerCenter Integration Service 221
operating system profile
configuration 235
folders, assigning to 306
overview 235
pmimpprocess 235
P
page size
minimum for optimizing repository database schema 286
partition points
description 102
partitioning
enabling 106
mappings 104
maximum parallelism 102, 104
pass-through pipeline
overview 260
pass-through security
adding to connections 123
connecting to SQL data service 121
enabling caching 122
properties 61
web service operation mappings 121
PeopleSoft on Oracle
setting Char handling options 230
performance
details 270
PowerCenter Integration Service 287
PowerCenter Repository Service 287
repository copy, backup, and restore 311
repository database schema, optimizing 286
performance detail files
permissions 269
permissions
output and log files 269
recovery files 269
persistent lookup cache
session output 273
pipeline partitioning
multiple CPUs 263
overview 263
symmetric processing platform 267
pipeline stages
description 102
plug-ins
registering 309
unregistering 309
$PMBadFileDir
option 239
$PMCacheDir
option 239
$PMExtProcDir
option 239
$PMFailureEmailUser
option 226
pmimpprocess
description 235
$PMLookupFileDir
option 239
$PMRootDir
description 238
option 239
required syntax 238
shared location 238
PMServer3XCompatibility
option 230
$PMSessionErrorThreshold
option 226
$PMSessionLogCount
option 226
$PMSessionLogDir
option 239
$PMSourceFileDir
option 239
$PMStorageDir
option 239
$PMSuccessEmailUser
option 226
$PMTargetFileDir
option 239
$PMTempDir
option 239
$PMWorkflowLogCount
option 226
$PMWorkflowLogDir
option 239
pooling
connection 96
DTM process 95
pools
connection 96
DTM process 95
port number
Metadata Manager Agent 180
Metadata Manager application 180
post-session email
Microsoft Exchange profile, configuring 232
overview 272
PowerCenter
repository reports 327
PowerCenter Integration Service
advanced properties 228
architecture 253
assign to grid 218, 243
assign to node 218
associated repository 236
R
Rank transformation
caches 267, 272
reassignCheckedOutObject (infacmd mrs) 215
reassigning
checked-out object 215
locked object 215
recovery
files, permissions 269
PowerCenter Integration Service 279
PowerCenter Repository Service 292
safe mode 224
recovery files
directory 239
reference data
purge orphaned data 39
reference data warehouse
database requirements 410
IBM DB2 database requirements 410
Microsoft SQL Server database requirements 410
Oracle database requirements 411
registering
local repositories 301
plug-ins 309
reject files
directory 239
overview 271
permissions 269
remote mode
Data Integration Service grid 137
logs 145
repagent caching
description 287
Reporting and Dashboards Service
advanced properties 342
creating 342
editing 345
environment variables 342
general properties 340
overview 338
security options 340
Reporting Service
configuring 334
creating 326, 328
data source properties 335
database 328
disabling 330, 331
enabling 330, 331
general properties 334
managing 330
options 328
properties 334
Reporting Service properties 335
repository properties 336
using with Metadata Manager 170
reporting source
adding 344
Reporting and Dashboards Service 344
reports
Data Profiling Reports 327
Metadata Manager Repository Reports 327
repositories
associated with PowerCenter Integration Service 236
backing up 306
code pages 299, 300
configuring native connectivity 413
content, creating 176, 297
content, deleting 176, 298
database preparation 398
database schema, optimizing 286
database, creating 282
installing database clients 413
Metadata Manager 169
moving 303
notifications 306
performance 311
persisting run-time statistics 228
restoring 307
security log file 310
Test Data Manager 377
repositories (continued)
version control 299
repository
Data Analyzer 328
repository agent cache capacity
description 287
repository agent caching
PowerCenter Repository Service 287
Repository Agent Caching (property)
description 287
repository domains
description 299
managing 299
moving to another Informatica domain 303
prerequisites 299
registered repositories, viewing 302
user accounts 300
repository locks
managing 303
releasing 305
viewing 303
repository notifications
sending 306
repository password
associated repository for Web Services Hub 392, 393
option 236
repository properties
PowerCenter Repository Service 285
Repository Service process
description 295
repository user name
associated repository for Web Services Hub 386, 392, 393
option 236
repository user password
associated repository for Web Services Hub 386
request timeout
SQL data services requests 160
Required Comments for Checkin (property)
description 287
resilience
in exclusive mode 296
period for PowerCenter Integration Service 228
PowerCenter Integration Service 274
PowerCenter Repository Service 291
repository database 291
Resilience Timeout (property)
description 287
option 228
Resource Manager Service
architecture 368
compute node attributes 146
disabling 370
enabling 370
log level 369
node assignment 369
overview 368
properties 369
recycling 370
Resource Manager Service process
properties 370
resource provision thresholds
defining 251
description 251
overview 258
resources
configuring 244
configuring Load Balancer to check 228, 250, 257
connection, assigning 245
resources (continued)
defining custom 246
defining file/directory 246
defining for nodes 244
Load Balancer 257
naming conventions 246
node 257
predefined 244
user-defined 244
restart
PowerCenter Integration Service 276
PowerCenter Repository Service 292
PowerExchange Listener Service 318
PowerExchange Logger Service 325
restoring
PowerCenter repository for Metadata Manager 177
repositories 307
result set cache
configuring 107
Data Integration Service properties 63, 68
purging 107
SQL data service properties 160
Result Set Cache Manager
description 79
result set caching
Result Set Cache Manager 79
virtual stored procedure properties 163
web service operation properties 166
reverting
checked-out object 215
revertObject (infacmd mrs) 215
root directory
process variable 239
round-robin dispatch mode
description 248
row error log files
permissions 269
rule specifications
Content Management Service 35, 36
run-time statistics
persisting to the repository 228
S
safe mode
configuring for PowerCenter Integration Service 224
PowerCenter Integration Service 222
samples
odbc.ini file 440
SAP BW Service
associated PowerCenter Integration Service 353
creating 348
disabling 350
enabling 350
general properties 351
log events, viewing 354
managing 347
properties 352
SAP Destination R Type (property) 348, 351
SAP BW Service log
viewing 354
SAP Destination R Type (property)
SAP BW Service 348, 351
SAP NetWeaver BI Monitor
log messages 354
saprfc.ini
DEST entry for SAP NetWeaver BI 348, 351
Scheduler Service
disabling 375
enabling 375
overview 371
properties 372
recycling 375
scorecards
purging results for 116
search analyzer
changing 207
custom 207
Model Repository Service 207
search index
Model Repository Service 207
updating 208
Search Service
creating 361
custom service process properties 361
disable 362
enable 362
environment variables 361
Maximum Heap Size 360
recycle 362
service process properties 360
service properties 358
security
audit trail, creating 310
web service security 120
SecurityAuditTrail
logging activities 310
server grid
licensed option 225
service levels
creating and editing 249
description 249
overview 259
service name
Web Services Hub 386
service process variables
list of 239
service role
Data Integration Service node 77
service variables
list of 226
services
system 364
session caches
description 269
session logs
directory 239
overview 270
permissions 269
session details 270
session output
cache files 272
control file 271
incremental aggregation files 272
indicator file 272
performance details 270
persistent lookup cache 273
post-session email 272
reject files 271
session logs 270
target output file 272
SessionExpiryPeriod (property)
Web Services Hub 390
sessions
caches 269
sessions (continued)
DTM buffer memory 267
output files 269
performance details 270
running on a grid 266
session details file 270
shared library
configuring the PowerCenter Integration Service 232
shared storage
PowerCenter Integration Service 238
state of operations 238
SID/Service Name
description 181
sort order
SQL data services 160
source data
blocking 264
source databases
connecting through ODBC (UNIX) 438
source files
Data Integration Service 91
directory 239
source pipeline
pass-through 260
reading 263
target load order groups 263
sources
reading 263
SQL data service
changing the service name 164
properties 160
SQL data services
Data Integration Service grid 126, 128
sqlplus
testing database connections 414
startup type
configuring applications 153
configuring SQL data services 160
state of operations
PowerCenter Integration Service 238, 279
PowerCenter Repository Service 292
shared location 238
Stop option
disable Integration Service process 220
disable PowerCenter Integration Service 220
disable the Web Services Hub 387
Sybase ASE
connecting to Integration Service (UNIX) 433
connecting to Integration Service (Windows) 421
Sybase ASE database requirements
Data Analyzer repository 400
PowerCenter repository 408
symmetric processing platform
pipeline partitioning 267
system parameters
Data Integration Service 91
defining values 91
system services
overview 364
Resource Manager Service 368
Scheduler Service 371
T
table owner name
description 286
tablespace name
for repository database 286, 336
tablespace recovery
IBM DB2 119
Microsoft SQL Server 120
Oracle 119
tablespaces
single nodes 408
target databases
connecting through ODBC (UNIX) 438
target files
directory 239
multiple directories 106
output files 272
target load order groups
mappings 263
targets
output files 272
session details, viewing 270
tasks
dispatch priorities, assigning 249
TCP/IP network protocol
Data Integration Service 75
team-based development
administering 214, 215
command line program administration 215
Objects view 214, 215
troubleshooting 214, 216
temporary files
directory 239
temporary tables
description 113
operations 114
rules and guidelines 115
Teradata
connecting to Informatica clients (UNIX) 435
connecting to Informatica clients (Windows) 422
connecting to Integration Service (UNIX) 435
connecting to Integration Service (Windows) 422
Test Data Manager
repository 381
Test Data Manager repository
creating 381
Test Data Manager Service
advanced properties 381
assign a new license 383
components 377
creating 382
description 377
general properties 378
properties 377
service properties 378
steps to create 381
TDM repository configuration properties 379
TDM server configuration properties 380
thread pool size
configuring maximum 63
threads
creation 260
mapping 260
master 260
post-session 260
pre-session 260
processing mappings 102
reader 260
transformation 260
types 261
writer 260
timeout
SQL data service connections 160
writer wait timeout 232
Timeout Interval (property)
description 185
Tracing
error severity level 228, 390
TreatCHARAsCHAROnRead
option 230
TreatDBPartitionAsPassThrough
option 232
TreatNullInComparisonOperatorsAs
option 232
troubleshooting
grid for Data Integration Service 149
grid for PowerCenter Integration Service 247
versioning 214, 216
TrustStore
option 228
U
undoing
checked-out object 215
Unicode mode
code pages 268
Data Integration Service 79
Unicode data movement mode, setting 226
UNIX
connecting to ODBC data sources 438
database client environment variables 414
database client variables 414
unlocking
locked object 215
UnlockObject (infacmd mrs) 215
unregistering
local repositories 301
plug-ins 309
upgrade error
Model Repository Service 396
URL scheme
Metadata Manager 183
Web Services Hub 386, 389
user connections
closing 305
managing 303
viewing 304
user-managed cache tables
configuring 111
description 111
users
notifications, sending 306
UTF-8
repository code page, Web Services Hub 386
writing logs 228
V
ValidateDataCodePages
option 232
validating
source and target code pages 232
version control
enabling 299
repositories 299
W
Warning
error severity level 228, 390
web service
changing the service name 166
enabling 166
operation properties 166
properties 164
security 120
web service security
authentication 120
authorization 120
HTTP client filter 120
HTTPS 120
message layer security 120
pass-through security 120
permissions 120
transport layer security 120
web services
Data Integration Service grid 126, 128
Web Services Hub
advanced properties 388, 390
associated PowerCenter repository 392
associated Repository Service 386, 392, 393
associated repository, adding 392
associated repository, editing 393
associating a PowerCenter repository Service 386
character encoding 389
creating 386
custom properties 388
disable with Abort option 387
disable with Stop option 387
disabling 387
domain for associated repository 386
DTM timeout 390
enabling 387
general properties 388, 389
host names 386, 389
host port number 386, 389
Hub Logical Address (property) 390
internal host name 386, 389
internal port number 386, 389
keystore file 386, 389
keystore password 386, 389
license 386, 389
location 386
MaxISConnections 390
node 386
node assignment 388, 389
password for administrator of associated repository 392, 393
properties, configuring 388
security domain for administrator of associated repository 392
service name 386
workflow schedules
safe mode 224
workflows
Data Integration Service grid 132, 137
database requirements 411
grids in local mode 134
grids in remote mode 141
running on a grid 265
Workflow Orchestration Service properties 65
writer wait timeout
configuring 232
WriterWaitTimeOut
option 232
X
XMLWarnDupRows
option 232
Z
ZPMSENDSTATUS
log messages 354