White Paper | Working with Informatica-Teradata Parallel Transporter
1. Introduction
Today's growing data warehouses demand fast, reliable tools that help acquire and
manage data, and the flexibility to load large volumes of data from any source at any
time. Challenges come from everywhere: more data sources, growing data volumes,
dynamically changing business requirements, and user demands for fresher data.
Teradata is a global technology leader in enterprise data warehousing, business
analytics, and data warehousing services. Teradata provides a powerful suite of
software that includes the Teradata Database, data access and management tools,
and data mining applications. PowerCenter works with the Teradata Database and
Teradata tools to provide a data integration solution that allows integrating data from
virtually any business system into Teradata as well as leveraging Teradata data for
use in other business systems.
PowerCenter uses the following techniques when extracting data from and loading
data to the Teradata database:
ETL
(Extract, transform, and load). This technique extracts data from the source systems,
transforms the data within PowerCenter, and loads it to target tables.
ELT
(Extract, load, and then transform). This technique extracts data from the source
systems, loads it to user-defined staging tables in the target database, and
transforms the data within the target system using generated SQL. The SQL queries
include a final insert into the target tables.
ETL-T
(ETL and ELT hybrid). This technique extracts data from the source systems,
transforms the data within PowerCenter, loads the data to user-defined staging tables
in the target database, and further transforms the data within the target system using
generated SQL. The SQL queries include a final insert into the target tables. The ETL-T
technique is optimized within PowerCenter, so that transformations that perform
better within the database system are performed there, while the Integration
Service performs the other transformations.
To perform ETL operations, configure PowerCenter sessions to use a Teradata
relational connection, a Teradata standalone load or unload utility, or Teradata Parallel
Transporter. To use ELT or ETL-T techniques, configure PowerCenter sessions to use
pushdown optimization.
2. Teradata standalone Load and Unload Utilities
Teradata standalone load and unload utilities are fast, reliable tools that help to
export large amounts of data from Teradata databases and load session target files
into Teradata databases. Use a standalone load or unload utility when PowerCenter
sessions extract or load large amounts of data. Standalone load and unload utilities
are faster than Teradata relational connections because they load or extract data
directly from a file or pipe rather than run SQL commands to load or extract the data.
PowerCenter works with the following Teradata standalone load and unload utilities:
2.1. Teradata FastLoad
Teradata FastLoad is a command-line utility that quickly loads large amounts
of data to empty tables in a Teradata database. Use FastLoad for a high-volume
initial load or for high-volume truncate and reload operations. FastLoad is the
fastest load utility, but it has the following limitations:
• FastLoad uses multiple sessions to load data, but it can load data to only one
table in a Teradata database per job.
• It locks tables while loading data, preventing other users and other FastLoad
instances from accessing the tables during data loading.
• FastLoad only works with empty tables that have no secondary indexes.
• It can only insert data.
2.2. Teradata MultiLoad
Teradata MultiLoad is a command-driven utility for fast, high-volume
maintenance on multiple tables and views of a Teradata database. Each
MultiLoad instance can perform multiple data insert, update, and delete
operations on up to five different tables or views. MultiLoad optimizes operations
that rapidly acquire, process, and apply data to Teradata tables. Use MultiLoad
for large volume, incremental data loads.
2.2.1. MultiLoad has the following advantages:
• MultiLoad is very fast. It can process millions of rows in a few minutes.
• MultiLoad supports inserts, updates, upserts, deletes, and data-driven
operations in PowerCenter.
• We can use variables and embed conditional logic in MultiLoad control
files.
• MultiLoad supports sophisticated error recovery. It allows load jobs to be
restarted without having to redo all of the prior work.
2.2.2. MultiLoad has the following limitations:
• MultiLoad is designed for the highest possible throughput, so it can be very
resource intensive.
• It locks tables while loading data, preventing other users and other MultiLoad
instances from accessing the tables during data loading.
• Because of its "phased" nature, there are potentially inconvenient windows
of time when MultiLoad cannot be stopped without losing access to target
tables.
2.3. Teradata TPump
Teradata TPump is a highly parallel utility that can continuously move data from
data sources into Teradata tables without locking the affected table. TPump
supports inserts, updates, deletes, and data-driven updates. TPump acquires row
hash locks on a database table instead of table-level locks, so multiple TPump
instances can load data simultaneously to the same table. TPump is often used to
"trickle-load" a database table. Use TPump for low volume, online data loads.
2.3.1. TPump has the following advantages:
• TPump can refresh database tables in near real-time.
• TPump continuously loads data into Teradata tables without locking the
affected tables, so users can run queries while TPump is running.
• TPump is less resource-intensive than MultiLoad because it does not write to
temporary tables.
• Users can control the rate at which statements are sent to the Teradata
database, limiting resource consumption.
• It supports parallel processing.
• TPump can always be stopped and all of its locks dropped with no ill effect.
• Note, however, that TPump is not as fast as the other standalone loaders for
large volume loads because it changes the same data block multiple times.
2.4. Teradata FastExport
Teradata FastExport is a command-driven utility that uses multiple sessions to
quickly transfer large amounts of data from Teradata sources to PowerCenter.
Use FastExport to quickly extract data from Teradata sources.
2.4.1. FastExport has the following advantages:
• It is faster than Teradata relational connections when extracting large
amounts of data.
• FastExport can be run in streaming mode, which avoids the need to stage
the data file.
• We can encrypt the data transfer between FastExport and the Teradata
server.
• FastExport is available for sources and pipeline lookups.
When we create a FastExport connection, verify the settings of the following
connection attributes:
• Data encryption.
Enable this attribute to encrypt the data transfer between FastExport and
the Teradata server so that unauthorized users cannot access the data being
transferred across the network.
• Fractional seconds.
This attribute specifies the precision of the decimal portion of timestamp
data. To avoid session failure or possible data corruption, make sure this
value matches the timestamp precision of the column in the Teradata
database. When selecting date columns from Teradata into Informatica, the
Teradata date format must also match the Informatica date format.
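As a hypothetical illustration of why the precision setting must match the column, the sketch below renders a timestamp string for a Teradata TIMESTAMP(n) column; the function name is ours, not part of FastExport:

```python
from datetime import datetime

def format_for_teradata(ts: datetime, fractional_seconds: int) -> str:
    """Render a timestamp string matching a Teradata TIMESTAMP(n) column.

    fractional_seconds mirrors the FastExport 'Fractional seconds'
    attribute: 0 drops the decimal portion entirely, 1-6 truncate
    the microseconds to that many digits.
    """
    base = ts.strftime("%Y-%m-%d %H:%M:%S")
    if fractional_seconds == 0:
        return base                        # TIMESTAMP(0): no decimal portion
    frac = f"{ts.microsecond:06d}"[:fractional_seconds]
    return f"{base}.{frac}"                # TIMESTAMP(n): n fractional digits

ts = datetime(2014, 1, 1, 12, 30, 45, 123456)
print(format_for_teradata(ts, 0))  # 2014-01-01 12:30:45
print(format_for_teradata(ts, 3))  # 2014-01-01 12:30:45.123
```

If the session writes a TIMESTAMP(6)-precision string into a TIMESTAMP(0) column (or vice versa), the mismatch can surface exactly as the session failures described above.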
To use FastExport in a session, configure the mapping to extract from a Teradata
source, configure the session to read from FastExport instead of a relational
database, and select the FastExport connection for the session. When a session
transfers data between Teradata and PowerCenter, the following files are
created:
• A staging file or pipe.
PowerCenter creates a staging file or named pipe for data transfer based on
how we configure the connection. Named pipes are generally faster than
staging files because data is transferred as soon as it appears in the pipe. If
we use a staging file, data is not transferred until all data appears in the file.
• A control file.
PowerCenter generates a control file that contains instructions for loading or
extracting data. PowerCenter creates the control file based on the loader or
FastExport attributes we configure for the connection and the session.
• A log file.
The load or unload utility creates a log file and writes error messages to it.
The PowerCenter session log indicates whether the session ran successfully,
but does not contain load or unload utility error messages. Use the log file to
debug problems that occur during data loading or extraction.
By default, loader staging, control, and log files are created in the target file
directory. The FastExport staging, control, and log files are created in the
PowerCenter temporary files directory.
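The staging-file-versus-named-pipe difference described above can be sketched as an event ordering. This is a toy model of when rows become readable, not the actual file I/O:

```python
def transfer_events(rows, use_pipe):
    """Model when rows become readable: a named pipe interleaves
    writes and reads, while a staging file defers all reads until
    the writer has finished."""
    events = []
    if use_pipe:
        for r in rows:
            events.append(("write", r))
            events.append(("read", r))   # readable immediately
    else:
        for r in rows:
            events.append(("write", r))  # the file must be complete...
        for r in rows:
            events.append(("read", r))   # ...before reading starts
    return events

pipe = transfer_events([1, 2, 3], use_pipe=True)
stage = transfer_events([1, 2, 3], use_pipe=False)
print(pipe.index(("read", 1)))   # 1: first row read after a single write
print(stage.index(("read", 1)))  # 3: reads begin only after all writes
```

The earlier first-read position in the pipe case is why named pipes generally outperform staging files for large transfers.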
All of these load and unload utilities are included in the Teradata Tools and Utilities
(TTU), available from Teradata. PowerCenter supports all of these standalone load
and unload utilities. Before we can configure a session to use a load or unload utility,
create a loader or FastExport (application) connection in the PowerCenter Workflow
Manager and enter a value for the TDPID in the connection attributes.
For more information about creating connection objects in PowerCenter, see
APPENDIX A, APPENDIX B, and the PowerCenter Workflow Basics Guide.
3. Teradata Parallel Transporter
Teradata Parallel Transporter (TPT) is a client application that provides scalable, high-
speed, parallel data extraction, loading, and updating. It uses and expands upon the
functions and capabilities of the standalone Teradata load and unload utilities.
Teradata PT supports a single scripting environment with different system operators
for extracting and loading data. It also supports massive parallel extraction and
loading, so if we partition a Teradata PT session, multiple Teradata PT instances can
extract or load large amounts of data in the same database tables at the same time.
To provide the functionality of the standalone load and unload utilities, Teradata PT
extracts or loads data using one of the following system operators:
• Export. Exports large data sets from Teradata tables or views and writes the
data to a file or an Informatica pipe using the FastExport protocol.
• Load. Bulk loads large volumes of data into empty Teradata database tables using
the FastLoad protocol.
• Update. Batch updates, inserts, upserts, and deletes data in Teradata database
tables using the MultiLoad protocol.
• Stream. Continuously updates, inserts, upserts, and deletes data in near real-time
using the TPump protocol.
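The operator-to-protocol pairing above can be captured in a small lookup table. A sketch only: the dictionary and helper are ours, while the pairings come straight from the list above:

```python
# Teradata PT system operators and the standalone-utility protocols
# they implement, per the list above.
TPT_OPERATORS = {
    "Export": "FastExport",  # bulk extract from tables or views
    "Load":   "FastLoad",    # bulk load into empty tables
    "Update": "MultiLoad",   # batch insert/update/upsert/delete
    "Stream": "TPump",       # continuous near-real-time changes
}

def protocol_for(operator: str) -> str:
    """Return the underlying utility protocol for a TPT system operator."""
    return TPT_OPERATORS[operator]

print(protocol_for("Stream"))  # TPump
```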
Fig1. TPT operators for Informatica job
3.1.Teradata PT advantages:
Teradata PT is up to 20% faster than the standalone Teradata load and unload
utilities, even though it uses the underlying protocols from the standalone
utilities.
Teradata PT supports recovery for sessions that use the Stream operator when
the source data is repeatable. This feature is especially useful when running real-
time sessions and streaming the changes to Teradata.
Users can invoke Teradata PT through a set of open APIs that communicate with
the database directly, eliminating the need for a staging file or pipe and a control
file.
Teradata PT includes added features; some are enabled by users and some are
improvements made behind the scenes. They include:
3.1.1. Parallel Reading:
Allowing multiple instances of a single operator to read from the same input
file takes advantage of the file I/O cache monitored and maintained by the
operating system (OS). As long as all instances run at the same relative
speed, the OS will read a block from the file into memory and all instances
will be accessing the same memory. Dramatic improvements in performance
have been seen when using up to five or six instances reading from a single
(very large) file.
3.1.2. Exporting of Data:
The normal operating procedure for consumer operators is for the first
instance to get as much work as possible, provided it can keep up with the
rate at which data arrives through the data streams. The default behavior is
to only write to a single output file.
3.1.3. Parallel Writing:
If multiple instances are specified for the DataConnector operator (as a file
writer) and indicated that each instance should write to a different file, then
there can be some parallelism in data output. However, when exporting data
to flat files, this results in an uneven distribution of data across output files.
3.1.4. Round-Robin Data Distribution:
This parallel writing feature allows file writing to switch to a round-robin
fashion, which can evenly distribute data across multiple
files, thereby improving the performance of writing to disk even more. That
performance improvement can be further enhanced by specifying each
output file on a separate disk.
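A minimal sketch of round-robin distribution, with in-memory lists standing in for the output files:

```python
from itertools import cycle

def round_robin_write(rows, n_files):
    """Distribute rows evenly across n output 'files' (lists here),
    cycling through the targets one row at a time."""
    files = [[] for _ in range(n_files)]
    targets = cycle(files)
    for row in rows:
        next(targets).append(row)
    return files

files = round_robin_write(range(10), 3)
print([len(f) for f in files])  # [4, 3, 3]
```

Because consecutive rows land in different files, no single output file becomes a bottleneck, which is the uneven-distribution problem the per-instance file writing above suffers from.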
PowerCenter communicates with Teradata PT using PowerExchange for Teradata
Parallel Transporter, which is available through the Informatica-Teradata Enterprise
Data Warehousing Solution. PowerExchange for Teradata Parallel Transporter provides
integration between PowerCenter and Teradata databases for data extraction and
loading.
PowerExchange for Teradata Parallel Transporter executes Teradata PT operators
directly through API calls [Fig2]. This improves performance by eliminating the
staging file or named pipe. It also improves security by eliminating the control file, so
there is no need to overwrite or store passwords in the control file. PowerExchange for
Teradata Parallel Transporter supports session and workflow recovery. It also captures
Teradata PT error messages and displays them in the session log, so we do not need to
check the utility log file when errors occur. Before we can configure a session to use
Teradata PT, we must create a Teradata PT (relational) connection in the Workflow
Manager and enter a value for the TDPID in the connection attributes.
For more information about creating Teradata PT connection objects in PowerCenter,
see APPENDIX A and APPENDIX B.
Fig2. Informatica communicates with Teradata using TPT API Interface
3.2. Selecting the load utility based on record counts
For loading data into a table, we use different Teradata load utilities based
on data volume. The table below shows the best load utility to use in
Informatica based on data volume.
| Task | First Choice | Second Choice |
|---|---|---|
| Insert a small number of rows into an empty table | TPT-Load, FastLoad | TPump, TPT-Stream, or Relational from Informatica |
| Insert a large number of rows into an empty table | TPT-Load, FastLoad | TPump, TPT-Stream, or Relational from Informatica |
| Insert a small number of rows into a small populated table | TPT-Update, TPT-Stream, Relational, MultiLoad, or TPump | |
| Insert a small number of rows into a large populated table | TPump, TPT-Stream, or Relational | |
| Insert a moderate number of rows into a large populated table | TPT-Update, MultiLoad | TPT-Update, TPT-Stream, Relational, MultiLoad, TPump |
| Insert a large number of rows into multiple populated tables | TPT-Update, MultiLoad | TPump, TPT-Stream, or Relational |
| Insert a large number of rows into a large populated table | TPT-Update, MultiLoad | |
| Update/Delete a large number of rows | TPT-Update, MultiLoad | Relational |
| Update/Delete a small number of rows | Relational | TPump, TPT-Stream, or Relational |
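One way to encode the first-choice column of the table above is a small helper. This is a simplification; the category names (`volume`, `table_state`) are ours:

```python
def first_choice(volume, table_state, operation="insert"):
    """First-choice loader per the selection table (simplified).

    volume: 'small', 'moderate', or 'large'
    table_state: 'empty', 'small_populated', or 'large_populated'
    operation: 'insert', 'update', or 'delete'
    """
    if operation in ("update", "delete"):
        # Relational for small changes, block-level utilities for large ones
        return "Relational" if volume == "small" else "TPT-Update/MultiLoad"
    if table_state == "empty":
        return "TPT-Load/FastLoad"            # FastLoad needs an empty table
    if volume == "small" and table_state == "small_populated":
        return "TPT-Update/TPT-Stream/Relational/MultiLoad/TPump"
    if volume == "small":
        return "TPump/TPT-Stream/Relational"  # trickle-load a big table
    return "TPT-Update/MultiLoad"             # bulk changes to populated tables

print(first_choice("large", "empty"))  # TPT-Load/FastLoad
```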
3.3. Using a TPT connection with the Load and Update operators in a session:
To configure a session to load data, configure the associated mapping to write
to Teradata, change the writer type for the session to Teradata Parallel
Transporter Writer, and select the Teradata PT connection.
1. In 'Writers', select 'Teradata Parallel Transporter Writer'.
2. Once we select this option, two types of connections appear under
'Connections'.
i. The first connection defines the connection to Teradata PT API i.e., TPT
connection. The PowerCenter Integration Service uses Teradata PT API
connection to extract from or to load data to Teradata.
ii. The second connection defines an optional ODBC connection to the target
database. The PowerCenter Integration Service uses the target ODBC
connection to drop log, error, and work tables, truncate target tables, and
create recovery tables in the target database. The PowerCenter Integration
Service does not use the ODBC connection to extract from or load data to
Teradata.
3. Select appropriate TPT and ODBC connections for the two types as shown
below
4. Make sure the following session properties are set when using TPT
connections:
i. Specify the database and table name for the work table.
ii. The Truncate Table option can be used with the Load, Stream, and Update
system operators.
5. We can select any of the following values for the 'Mark Missing Rows' option,
as required, which specifies how Teradata PT handles rows that are not
present in the target table.
• Mark Missing Rows:
i. None - If Teradata PT API receives a row marked for update or delete but it
is missing in the target table, it does not mark the row in the error table.
ii. For Update - If Teradata PT API receives a row marked for update but it is
missing in the target table, it marks the row as an error row.
iii. For Delete - If Teradata PT API receives a row marked for delete but it is
missing in the target table, it marks the row as an error row.
iv. Both - If Teradata PT API receives a row marked for update or delete but it
is missing in the target table, it marks the row as an error row.
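The four 'Mark Missing Rows' values reduce to a simple predicate. A sketch, with a function name of our own choosing; the same shape applies to 'Mark Duplicate Rows':

```python
def send_to_error_table(op: str, mark_missing: str) -> bool:
    """Decide whether a row marked for update/delete that is missing
    from the target goes to the error table, following the
    'Mark Missing Rows' values: None, For Update, For Delete, Both."""
    return mark_missing == "Both" or mark_missing == f"For {op.capitalize()}"

print(send_to_error_table("update", "For Update"))  # True
print(send_to_error_table("delete", "For Update"))  # False
print(send_to_error_table("delete", "None"))        # False
```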
6. Similarly, we can select one of the following values for the 'Mark Duplicate
Rows' option.
Fig3. Selecting TPT connection type and connection name
• Mark Duplicate Rows:
i. None - If Teradata PT API receives a row marked for insert or update that
causes a duplicate row in the target table, it does not mark the row in the
error table.
ii. For Insert - If Teradata PT API receives a row marked for insert but it exists
in the target table, it marks the row as an error row.
iii. For Update - If Teradata PT API receives a row marked for update that
causes a duplicate row in the target table, Teradata PT API marks the row as
an error row.
iv. Both - If Teradata PT API receives a row marked for insert or update that
causes a duplicate row in the target table, Teradata PT API marks the row as
an error row.
7. Specify the database name and the table name for the log table to be created.
8. Specify the database name and the table name for the error tables to be created.
9. A Drop Work/Log/Error Tables option is available. Check this option to drop
the intermediate tables that are created during the session run.
10. The Pack, Pack Minimum, and Buffers options are used with the Stream
system operator.
11. Tracing options can be selected as needed. Usually the default values
are used.
Fig4. Selecting TPT (Load or update) connection Options
3.4. TPT connection with the Export operator:
A connection that uses the Export operator can be used only as a source
connection. The image below shows how to select the TPT Export operator for
Teradata source tables in an Informatica session.
3.5. TPT connection using the Stream system operator:
The Stream operator uses macros to modify tables, so a macro database must be
specified in the session properties when using this type of connection.
3.5.1. Macro Database:
Name of the database that stores the macros Teradata PT API creates when
we select the Stream system operator. The Stream system operator uses
macros to modify tables. It creates macros before Teradata PT API begins
loading data and removes them from the database after Teradata PT API
loads all rows to the target.
Fig5. Selecting TPT Export connection for Source
3.6. Restart in case of failure
3.6.1. Restart for the Update, Export, and Stream system operators
If a session that uses the Update, Export, or Stream operator fails, the
session can be rerun successfully once all the intermediate tables are
dropped.
3.6.2. Restart for the Load system operator
If a session that uses the Load system operator fails, follow these steps to
rerun the session successfully:
There is no need to drop and recreate the target table for the Load operator.
Instead, check out the session and uncheck the following options in the
session properties to rerun the failed TPT job:
• Drop Error/Work/Log Tables
• Truncate Table
Also, there is no need to drop the intermediate tables, and no manual
intervention is required on the database side. Once the session reruns
successfully after the failure, keep in mind that the session has to be
checked in ('Undo Checkout') for future runs.
3.7. Comparison of loading a Teradata table using Informatica with a
Teradata relational connection and a Teradata TPT connection
The Teradata RDBMS offers several tools to load external data into the database.
The Teradata load utilities can be divided into two main groups:
The first group of utilities makes use of the transient journal, so loading is
slower, while the second group is designed to bring data into the target tables as
fast as possible by bypassing the transient journal.
The first group includes BTEQ, TPump, and relational/native connections. These
insert data record by record into the target table, allowing full transaction
handling and rollback functionality.
These are still quite useful for loading small amounts of data, but they lack
some important features, such as the use of several parallel load sessions.
Fig6 shows Informatica Workflow Monitor load statistics where the source and
target are Teradata tables and relational connections are used. The relational
connection uses the transient journal and loads data record by record into the
target table, so the target-loading throughput is low: it took around 5 hours to
load 1.7 million records.
The second group includes the Teradata standalone load and unload utilities.
These are the fastest way of getting external data into Teradata. They bypass
transaction handling, and data is loaded in big chunks, namely blocks of 64
kilobytes.
Bypassing the transaction log (transient journal) imposes several restrictions:
• These utilities do not support target tables with unique secondary indexes (USI),
join indexes (JI), referential integrity (RI), or triggers. FastLoad even goes a step
further and requires the target table to be empty.
• In the case of a SET target table, a duplicate row check would be required.
FastLoad removes record-level duplicates by default.
• MultiLoad, the second tool in the set of fast loading utilities, can take on
additional tasks such as UPDATE, DELETE, and UPSERT, again all at the block
level. MultiLoad uses temporary work tables for this purpose and does not reach
FastLoad's load performance, but it is still a good choice for getting huge
amounts of data in very fast.
TPT integrates with this infrastructure in a plug-in fashion, providing the
functionality of the Teradata standalone load and unload utilities.
Fig6. Session load statistics with Teradata relational connection
Fig7 shows Informatica Workflow Monitor load statistics where the source and
targets are Teradata tables and a TPT connection is used. The TPT operators load
data at the block level, so the target-loading throughput is high: it took only a
few seconds to load the 1.7 million records.
Each group has its own advantages and disadvantages depending on load
requirements. The decision of which tool to use for loading depends on the
number of data records and the number of bytes each record consumes.
3.8. Loading the Teradata target table with Informatica update-else-insert
logic using a TPT connection
In an Informatica session, we can update the target table against the source
table data. Generally, two flows are used for an insert-else-update operation:
one flow for bulk insert and another for update.
The TPT Update operator (MultiLoad) locks the target table to load the data.
When we use two or more target instances in a single session, TPT locks the
target table while loading data through one target instance; when the session
then tries to load data through another target instance, the second MultiLoad
job tries to lock the same target table, which is already locked by the insert
instance, and the session fails with the TPT error:
"Error Code 2652: Operation not allowed - Table is being mloaded"
To avoid this kind of error, use the Informatica Update else Insert option in the
session properties and treat source rows as update. We update the record based
on key columns and leave some fields, such as Create_Timestamp, unchanged. If
we want to use a single instance to insert and update records, we need to update
the record with the previous (target) values for the fields we do not want to
overwrite with new values, e.g., Create_Timestamp.
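A toy sketch of the single-instance approach, with a dict standing in for the target table; the column names follow the example above, and the helper is ours:

```python
def upsert(target: dict, row: dict, key="Primary_key_field",
           preserve=("Create_Timestamp",)):
    """Update-else-insert: on the update path, carry the existing
    values of the preserved columns forward so they are not
    overwritten (e.g. Create_Timestamp keeps its original value)."""
    existing = target.get(row[key])
    merged = dict(row)
    if existing is not None:
        for col in preserve:
            merged[col] = existing[col]   # keep the original value
    target[row[key]] = merged

table = {}
upsert(table, {"Primary_key_field": 1, "Val": "a", "Create_Timestamp": "t0"})
upsert(table, {"Primary_key_field": 1, "Val": "b", "Create_Timestamp": "t1"})
print(table[1]["Val"], table[1]["Create_Timestamp"])  # b t0
```

Because every row goes through one code path, only one target instance (and therefore one table lock) is needed, sidestepping error 2652.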
Fig7. Session load statistics with Teradata TPT connection
3.9. Loading a PPI-indexed Teradata table with Informatica using a
TPT connection
Whenever we need to load and select incremental data from a huge table, we can
use MultiLoad jobs to load the data and create an index to select the data. But when
we create an index such as a secondary index or a join index, the Teradata bulk load
utilities do not work. We can overcome this issue by using a partitioned primary
index (PPI) on the table: Teradata MultiLoad jobs do work on partitioned primary
indexed tables.
For a MultiLoad job, the table should contain a primary index to update the table
records, and we need another index on the table for fast selection of incremental
data. We can use the PI for the MultiLoad job and the PPI for selecting incremental
data. For partitioned primary index tables, Teradata MultiLoad tasks require all
values of the primary index column set and all values of the partitioning column set
for deletes and updates. The table creation statement is:
CREATE TABLE <TABLE>
(
  Primary_key_field INTEGER,
  Field_1 INTEGER,
  Field_2 VARCHAR(10),
  CREATE_DT TIMESTAMP(0),
  UPDATE_DT TIMESTAMP(0))
PRIMARY INDEX (Primary_key_field, UPDATE_DT)
PARTITION BY (
  RANGE_N (UPDATE_DT BETWEEN DATE '2014-01-01' AND DATE '2020-12-31'
  EACH INTERVAL '1' DAY, NO RANGE, UNKNOWN));
We load inserted records with CREATE_DT and UPDATE_DT set to the current
date; for updated records, UPDATE_DT is set to the current date.
We then select the incremental data using:
SELECT * FROM <TABLE>
WHERE UPDATE_DT = current_date
This selects both the inserted and updated records for the day.
4. Issues Affecting Loading to and Unloading from Teradata
This section describes issues we might encounter when moving data between
PowerCenter and Teradata.
4.1. Increasing Lookup Performance
Applies to: Teradata relational connections, FastExport
Sessions that perform lookups on Teradata tables must use Teradata relational
connections. If we experience performance problems when running a session that
performs lookups against a Teradata database, we might be able to increase
performance in the following ways:
• Use FastExport to extract data to a flat file and perform the lookup on the flat file.
• Enable or disable the lookup cache.
4.2. Using FastExport to Extract Lookup Data
If a session performs a lookup on a large, static Teradata table, we might be able
to increase performance by using FastExport to extract the data to a flat file and
configuring the session to look up data in the flat file. To do this, redesign the
mapping as follows:
1. Create a simple, pass-through mapping to pass the lookup data to a flat file.
Configure the session to extract data to the flat file using FastExport.
2. Configure the original mapping to perform the lookup on the flat file.
Note: If we redesign the mapping using this procedure, we can further increase
performance by specifying an ORDER BY clause in the FastExport SQL and
enabling the Sorted Input property for the lookup file. This prevents
PowerCenter from having to sort the file before populating the lookup cache.
4.3. Restarting a Failed MultiLoad Job Manually
Applies to: MultiLoad
When loading data, MultiLoad puts the target table into the MultiLoad state and
creates a log table for the target table. After successfully loading the data, it
returns the target table to the normal (non-MultiLoad) state and deletes the log
table. When we load data using MultiLoad and the MultiLoad job fails for any
reason, MultiLoad reports an error and leaves the target table in the MultiLoad
state. Additionally, MultiLoad queries the log table to check for errors. If a target
table is in the MultiLoad state or if a log table exists for the target table, we
cannot restart the job.
To recover from a failed MultiLoad job, we must release the target table from the
MultiLoad state and drop the MultiLoad log, error, and work tables.
To release the target table from the MultiLoad state:
RELEASE MLOAD <target table>;
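The recovery steps can be sketched as a generator of the cleanup statements. The table names passed in below are placeholders for illustration, not the names the utility actually generates:

```python
def multiload_cleanup_sql(target, log_table, error_tables, work_table):
    """Statements to recover a failed MultiLoad job: release the target
    table from the MultiLoad state, then drop the log, error, and
    work tables so the job can be restarted."""
    stmts = [f"RELEASE MLOAD {target};", f"DROP TABLE {log_table};"]
    stmts += [f"DROP TABLE {t};" for t in error_tables]
    stmts.append(f"DROP TABLE {work_table};")
    return stmts

for s in multiload_cleanup_sql("sales", "sales_log",
                               ["INFA_ET1_1", "INFA_ET2_1"], "sales_wt"):
    print(s)
```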
4.4. Configuring Sessions that Load to the Same Table
Applies to: MultiLoad
While Teradata MultiLoad loads data to a database table, it locks the table.
MultiLoad requires that all instances handle wait events so they do not try to
access the same table simultaneously.
If we have multiple PowerCenter sessions that load to the same Teradata table
using MultiLoad, set the Tenacity attribute for the session to a value that is
greater than the expected run time of the session. The Tenacity attribute controls
the amount of time a MultiLoad instance waits for the table to become available.
Also configure each session to use unique log file names.
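The Tenacity behavior — wait and retry until the table frees up or the window expires — can be sketched as a bounded retry loop. This is a toy model of the utility's internal wait, not MultiLoad code:

```python
import time

def acquire_with_tenacity(try_lock, tenacity_seconds, sleep_seconds=0.01):
    """Retry a lock attempt until it succeeds or the tenacity window
    runs out, mirroring how the Tenacity attribute bounds how long a
    MultiLoad instance waits for the table to become available."""
    deadline = time.monotonic() + tenacity_seconds
    while True:
        if try_lock():
            return True                 # table became available in time
        if time.monotonic() >= deadline:
            return False                # tenacity window expired
        time.sleep(sleep_seconds)       # wait before the next attempt

attempts = iter([False, False, True])   # table frees up on the third try
print(acquire_with_tenacity(lambda: next(attempts), tenacity_seconds=1.0))
```

This is why Tenacity must exceed the expected run time of the competing session: if the window is shorter than the other session's hold on the table, the loop gives up before the lock is ever released.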
For more information about the Tenacity attribute, see the PowerCenter Advanced
Workflow Guide.
4.5. Loading from Partitioned Sessions
Applies to: FastLoad, MultiLoad
When we configure multiple partitions in a session that uses staging files, the
Integration Service creates a separate flat file for each partition. Since FastLoad
and MultiLoad cannot load data from multiple files, use round-robin partitioning
to route the data to a single file. When we do this, the Integration Service writes
all data to the first partition and starts only one instance of FastLoad or
MultiLoad. It writes the following message in the session log:
MAPPING> DBG_21684 Target [TD_INVENTORY] does not support multiple partitions. All
data will be routed to the first partition.
If we do not route the data to a single file, the session fails with the following
error:
WRITER_1_*_1> WRT_8240 Error: The external loader [Teradata Mload Loader] does not
support partitioned sessions.
WRITER_1_*_1> Thu Jun 16 11:58:21 2005
WRITER_1_*_1> WRT_8068 Writer initialization failed. Writer terminating.
For more information about loading from partitioned sessions, see the
PowerCenter Advanced Workflow Guide.
4.6. Using Error Tables to Identify Problems during Loading
Applies to: FastLoad, MultiLoad, TPump
When problems occur while loading data, the Teradata standalone load utilities
generate error tables. (FastExport generates an error log file.) The load
utilities generate different errors during the different phases of loading.
FastLoad jobs run in two main phases: loading and end loading. During the
loading phase, FastLoad initiates the job, locks the target table, and loads the
data. During the end loading phase, the Teradata database distributes the rows
of data to the target table and unlocks it. FastLoad requires an exclusive lock on
the target table during the loading phase.
MultiLoad also loads data during two main phases: acquisition and application.
In the acquisition phase, MultiLoad reads the input data and writes it to a
temporary work table. In the application phase, MultiLoad writes the data from
the work table to the actual target table. MultiLoad requires an exclusive lock on
the target table during the application phase.
TPump loads data in a single phase. It converts the SQL in the control file into a
database macro and applies the macro to the input data. TPump uses standard
SQL and standard table locking.
The following table lists the error tables we can check to troubleshoot load
and unload utility errors:
Utility   | Data Loading Phase | Default Error Table Name | Error Types
FastLoad  | Loading            | INFA_ET1_<SEQ_no>        | Constraint violations, conversion errors, unavailable AMP conditions
FastLoad  | End Loading        | INFA_ET2_<SEQ_no>        | Unique primary index violations
MultiLoad | Acquisition        | INFA_ET1_<SEQ_no>        | All acquisition phase errors, and application phase errors if the Teradata database cannot build a valid primary index
MultiLoad | Application        | INFA_ET2_<SEQ_no>        | Uniqueness violations, field overflow on columns other than primary index fields, constraint errors
TPump     | Single phase       | INFA_ET_<SEQ_no>         | All TPump errors
When a load fails, check the “ET1_” error table first for specific information.
The Error Field or Error Field Name column indicates the column in the target
table that could not be loaded. The Error Code field explains why the column
could not be loaded. Common error codes include:
2689: Trying to load a null value into a non-null field
2665: Invalid date format
In the MultiLoad “ET2_” error table, we can also check the DBC Error Field
column and DBC Error Code field. The DBC Error Field column is not initialized in
the case of primary key uniqueness violations. The DBC Error Code that
corresponds to a primary key uniqueness violation is 2794.
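As a sketch of this troubleshooting step, the snippet below builds a summary query over an ET1 table and looks up the two error codes mentioned above. It is illustrative only: the column names (ErrorCode, ErrorFieldName) are assumed to follow the default FastLoad error-table layout, and the helper names are our own.

```python
# Error codes taken from the discussion above; anything else is deferred
# to the Teradata Messages manual.
KNOWN_CODES = {
    2689: "Trying to load a null value into a non-null field",
    2665: "Invalid date format",
}

def error_summary_sql(seq_no):
    """Build a SQL statement that groups rows in the ET1 error table by
    error code and failing column, most frequent first. Column names are
    assumed defaults; adjust to your table definition."""
    table = f"INFA_ET1_{seq_no}"
    return (
        f"SELECT ErrorCode, ErrorFieldName, COUNT(*) AS error_rows\n"
        f"FROM {table}\n"
        f"GROUP BY ErrorCode, ErrorFieldName\n"
        f"ORDER BY error_rows DESC;"
    )

def describe(code):
    """Translate a known error code into plain language."""
    return KNOWN_CODES.get(code, f"See Teradata Messages manual for code {code}")

print(error_summary_sql(42))
print(describe(2689))
```

Running the generated SQL in BTEQ or SQL Assistant shows at a glance which column and which error dominate the failed load.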
5. Conclusion
Teradata Parallel Transporter provides flexible, high-performance load and unload
facilities that enable population of the data warehouse. It supports many traditional
functions of Teradata utilities, as well as many that are new and advanced. Through its
parallel execution design, Teradata Parallel Transporter improves ETL performance
while providing a uniform interface. It also facilitates active data warehousing by
allowing continuous and scalable access to data sources and targets, keeping
data fresh while sustaining maximum throughput. In all, Teradata Parallel Transporter drives
business value by enabling real-time decision making as required by enterprise-wide
business intelligence (BI) applications.
APPENDIX-A
Before we run sessions that move data between PowerCenter and Teradata, we might
want to install Teradata client tools. We also need to locate the Teradata TDPID.
1. Teradata Client Tools
Teradata client tools help us communicate with the Teradata database and debug
problems that occur when a session loads data to or extracts data from it. We
can install the following Teradata client tools:
BTEQ.
A general-purpose, command-line utility (similar to Oracle SQL*Plus) that
enables us to communicate with one or more Teradata databases.
Teradata SQL Assistant.
A GUI-based tool that allows retrieving data from any ODBC-compliant database
server and manipulating and storing the data in desktop applications. Teradata
Queryman is the older version of this tool. Install BTEQ or Teradata SQL
Assistant to help debug problems that occur when loading to and extracting
from Teradata. Both tools are included in the Teradata Utility Pack, which is
available from Teradata.
2. TDPID
The Teradata TDPID indicates the name of the Teradata instance and defines the
name a client uses to connect to a server. When we use a Teradata Parallel
Transporter or a standalone load or unload utility with PowerCenter, we must specify
the TDPID in the connection properties. The Teradata TDPID appears in the hosts file
on the machines on which the Integration Service and PowerCenter Client run. By
default, the hosts file appears in the following location:
UNIX: /etc/hosts
Windows: %SystemRoot%\system32\drivers\etc\hosts*
* The actual location is defined in the Registry key
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\DataBasePath
The hosts file contains client configuration information for Teradata. In a hosts file
entry, the TDPID precedes the string “cop1.”
For example, the hosts file contains the following entries:
127.0.0.1 Localhost demo1099cop1
192.168.80.113 td_1 custcop1
192.168.80.114 td_2 custcop2
192.168.80.115 td_3 custcop3
192.168.80.116 td_4 custcop4
The first entry has the TDPID “demo1099.” This entry tells the Teradata database that
when a client tool references the Teradata instance “demo1099,” it should direct requests
to “localhost” (IP address 127.0.0.1). The remaining entries share the same TDPID, “cust.”
Multiple hosts file entries with the same TDPID indicate the Teradata instance is
configured for load balancing among nodes. When a client tool attempts to reference
Teradata instance “cust,” the Teradata database directs requests to the first node in the
entry list, “td_1.” If the node takes too long to respond, the database redirects the
request to the second node, and so on. This process prevents the first node, “td_1,”
from becoming overloaded.
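The "TDPID precedes cop1" convention can be applied mechanically. The sketch below parses hosts-file text and recovers each TDPID and its nodes; it is an illustrative helper (the function name and sample text are ours), not a Teradata utility.

```python
import re

# Sample entries taken from the hosts file shown above.
SAMPLE_HOSTS = """\
127.0.0.1 Localhost demo1099cop1
192.168.80.113 td_1 custcop1
192.168.80.114 td_2 custcop2
192.168.80.115 td_3 custcop3
192.168.80.116 td_4 custcop4
"""

def tdpid_map(hosts_text):
    """Map each TDPID to the host names serving it, using the Teradata
    convention that a hosts-file alias has the form '<TDPID>cop<N>'."""
    mapping = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) < 2:
            continue
        host = fields[1]
        for alias in fields[1:]:
            m = re.fullmatch(r"(.+?)cop\d+", alias, re.IGNORECASE)
            if m:
                mapping.setdefault(m.group(1), []).append(host)
    return mapping

print(tdpid_map(SAMPLE_HOSTS))
# {'demo1099': ['Localhost'], 'cust': ['td_1', 'td_2', 'td_3', 'td_4']}
```

A TDPID mapped to multiple hosts, like "cust" here, indicates the load-balancing configuration described above.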
APPENDIX-B
a. Teradata Connections
Teradata relational connections use ODBC to connect to Teradata. PowerCenter
uses the ODBC Driver for Teradata to retrieve metadata and read and write to
Teradata. To establish ODBC connectivity between Teradata and PowerCenter,
install the ODBC Driver for Teradata on each PowerCenter machine that
communicates with Teradata. The ODBC Driver for Teradata is included in the
Teradata Tools and Utilities (TTU). We can download the driver from the Teradata
web site. PowerCenter works with the ODBC Driver for Teradata available in the
following TTU versions:
Informatica Version | Teradata Version(s)
9.5                 | TD 14.0, TD 13.10
9.1                 | TD 13.10, TD 13.0, TD 12.0
9.0.1               | TD 13.10, TD 13.0, TD 12.0
8.6.1               | TD 13.0, TD 12.0
8.6.0               | TD 12.0
8.1.1 SP1-SP5       | TD 12.0
To load or extract data using a Teradata relational connection on UNIX, we must
verify the configuration of environment variables and the odbc.ini file on the
machine on which the Integration Service runs. To verify the environment variable
configuration, ensure the Teradata ODBC path precedes the Data Direct driver
path information in the PATH and shared library path environment variables. Place
the Teradata path before the Data Direct path because both sets of ODBC software
use some of the same file names. To verify the odbc.ini file configuration, make
sure there is an entry for the Teradata ODBC driver in the “[ODBC Data Sources]”
section of odbc.ini. The following excerpt from an odbc.ini file shows a Teradata
ODBC driver (tdata.so) entry on Linux:
[ODBC Data Sources]
intdv14=tdata.so
[intdv14]
Driver=/usr/odbc/drivers/tdata.so
Description=running Teradata V14
DBCName=intdv14
SessionMode=Teradata
CharacterSet=UTF8
StCheckLevel=0
DateTimeFormat=AAA
LastUser=
Username=
Password=
Database=
DefaultDatabase=
ODBC is a native interface for Teradata. Teradata provides 32- and 64-bit ODBC
drivers for Windows and UNIX platforms. The driver bit mode must be compatible with
the bit mode of the platform on which the PowerCenter Integration Service runs. For
example, 32-bit PowerCenter only runs with 32-bit drivers.
For more information about configuring odbc.ini, see the PowerCenter Configuration
Guide and the ODBC Driver for Teradata User Guide.
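The two checks described above (a DSN declared under [ODBC Data Sources] plus a matching driver section) can be automated. This is an illustrative sketch using Python's standard configparser; the helper name and sample text are ours, and a real odbc.ini may carry more attributes than shown.

```python
import configparser

# Condensed from the odbc.ini excerpt above.
SAMPLE_ODBC_INI = """\
[ODBC Data Sources]
intdv14=tdata.so

[intdv14]
Driver=/usr/odbc/drivers/tdata.so
Description=running Teradata V14
DBCName=intdv14
SessionMode=Teradata
"""

def verify_dsn(odbc_ini_text, dsn):
    """Check that the DSN is declared under [ODBC Data Sources] and that
    its own section defines a Driver path."""
    cp = configparser.ConfigParser()
    cp.read_string(odbc_ini_text)
    declared = cp.has_option("ODBC Data Sources", dsn)
    has_driver = cp.has_section(dsn) and cp.has_option(dsn, "Driver")
    return declared and has_driver

print(verify_dsn(SAMPLE_ODBC_INI, "intdv14"))  # True
print(verify_dsn(SAMPLE_ODBC_INI, "missing"))  # False
```

If either check fails, sessions that use the Teradata relational connection cannot resolve the driver on that machine.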
b. Creating a Teradata Relational Connection
When we create a Teradata (relational) connection object in the Workflow Manager,
choose “Teradata,” not “ODBC,” as the connection type in the connection
properties.
When we choose Teradata as the connection type, the Integration Service still uses
Teradata ODBC to connect to Teradata. Although both ODBC and Teradata connection
types might work, the Integration Service communicates with the Teradata database
more efficiently when we choose the Teradata connection type.
c. Creating a Teradata Parallel Transporter Connection
When we create a Teradata PT connection object in the Workflow Manager,
choose “Teradata PT Connection” as the connection type. The connection
prompts for connection attributes.
1. In Connections → Relational, select Type as 'Teradata PT Connection' and
click New.
Fig8. Teradata relational connection creation
2. Configure the connection as per the requirement, as shown in the screenshot
below:
Here we can specify the TDPID, database name, and the system operator needed
for a specific connection.
3. The System Operator drop-down lists all four operators, as shown in the screenshot:
Fig9. Teradata TPT connection creation
Fig10. Selecting the TPT connection System operator
 
Data Science Courses in India iim skills
Data Science Courses in India iim skillsData Science Courses in India iim skills
Data Science Courses in India iim skills
dharnathakur29
 
DPR_Expert_Recruitment_notice_Revised.pdf
DPR_Expert_Recruitment_notice_Revised.pdfDPR_Expert_Recruitment_notice_Revised.pdf
DPR_Expert_Recruitment_notice_Revised.pdf
inmishra17121973
 
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptxPerencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
PareaRusan
 
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
gmuir1066
 
C++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptxC++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptx
aquibnoor22079
 
LLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bertLLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bert
ChadapornK
 
183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag
fardin123rahman07
 
Classification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptxClassification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptx
wencyjorda88
 
Ch3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendencyCh3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendency
ayeleasefa2
 
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.pptJust-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
ssuser5f8f49
 
Minions Want to eat presentacion muy linda
Minions Want to eat presentacion muy lindaMinions Want to eat presentacion muy linda
Minions Want to eat presentacion muy linda
CarlaAndradesSoler1
 
04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story
ccctableauusergroup
 
AI Competitor Analysis: How to Monitor and Outperform Your Competitors
AI Competitor Analysis: How to Monitor and Outperform Your CompetitorsAI Competitor Analysis: How to Monitor and Outperform Your Competitors
AI Competitor Analysis: How to Monitor and Outperform Your Competitors
Contify
 
Conic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptxConic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptx
taiwanesechetan
 
Cleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdfCleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdf
alcinialbob1234
 
Stack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptxStack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptx
binduraniha86
 
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptxmd-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
fatimalazaar2004
 
Medical Dataset including visualizations
Medical Dataset including visualizationsMedical Dataset including visualizations
Medical Dataset including visualizations
vishrut8750588758
 
Data Science Courses in India iim skills
Data Science Courses in India iim skillsData Science Courses in India iim skills
Data Science Courses in India iim skills
dharnathakur29
 

Working with Informatica-Teradata Parallel Transporter

White Paper | Working with Informatica-Teradata Parallel Transporter

1. Introduction

Today's growing data warehouses demand fast, reliable tools that help acquire and manage data, and the flexibility to load large volumes of data from any source at any time. Challenges come from everywhere: more data sources, growing data volumes, dynamically changing business requirements, and user demands for fresher data.

Teradata is a global technology leader in enterprise data warehousing, business analytics, and data warehousing services. Teradata provides a powerful suite of software that includes the Teradata Database, data access and management tools, and data mining applications. PowerCenter works with the Teradata Database and Teradata tools to provide a data integration solution that can integrate data from virtually any business system into Teradata, as well as leverage Teradata data for use in other business systems.

PowerCenter uses the following techniques when extracting data from and loading data to the Teradata database:

• ETL (extract, transform, and load). This technique extracts data from the source systems, transforms the data within PowerCenter, and loads it to the target tables.

• ELT (extract, load, and then transform). This technique extracts data from the source systems, loads it to user-defined staging tables in the target database, and transforms the data within the target system using generated SQL. The SQL queries include a final insert into the target tables.

• ETL-T (ETL and ELT hybrid). This technique extracts data from the source systems, transforms the data within PowerCenter, loads the data to user-defined staging tables in the target database, and further transforms the data within the target system using generated SQL. The SQL queries include a final insert into the target tables.

The ETL-T technique is optimized within PowerCenter so that the transformations that perform better within the database system can be performed there, while the Integration Service performs the other transformations. To perform ETL operations, configure PowerCenter sessions to use a Teradata relational connection, a Teradata standalone load or unload utility, or Teradata Parallel Transporter. To use the ELT or ETL-T techniques, configure PowerCenter sessions to use pushdown optimization.
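In the ELT and ETL-T techniques, the transformation work pushed into Teradata ends with a final insert into the target tables. A minimal sketch of what such generated SQL might look like follows; the staging/target table names, columns, and the transformation itself are hypothetical, since the real SQL is generated by pushdown optimization:

```sql
-- Hypothetical staging and target tables; the real statement is
-- generated by PowerCenter pushdown optimization.
INSERT INTO edw.CUSTOMER_TGT (customer_id, full_name, load_dt)
SELECT customer_id,
       TRIM(first_name) || ' ' || TRIM(last_name),  -- transformation done in-database
       CURRENT_DATE
FROM   edw_stage.CUSTOMER_STG;
```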
2. Teradata Standalone Load and Unload Utilities

Teradata standalone load and unload utilities are fast, reliable tools that help export large amounts of data from Teradata databases and load session target files into Teradata databases. Use a standalone load or unload utility when PowerCenter sessions extract or load large amounts of data. Standalone load and unload utilities are faster than Teradata relational connections because they load or extract data directly from a file or pipe rather than running SQL commands to load or extract the data. PowerCenter works with the following Teradata standalone load and unload utilities:

2.1. Teradata FastLoad

Teradata FastLoad is a command-line utility that quickly loads large amounts of data to empty tables in a Teradata database. Use FastLoad for a high-volume initial load or for high-volume truncate-and-reload operations. FastLoad is the fastest load utility, but it has the following limitations:

• FastLoad uses multiple sessions to load data, but it can load data to only one table in a Teradata database per job.
• It locks tables while loading data, preventing other users and other instances of FastLoad from accessing the tables during data loading.
• FastLoad works only with empty tables that have no secondary indexes.
• It can only insert data.

2.2. Teradata MultiLoad

Teradata MultiLoad is a command-driven utility for fast, high-volume maintenance on multiple tables and views of a Teradata database. Each MultiLoad instance can perform multiple data insert, update, and delete operations on up to five different tables or views. MultiLoad optimizes operations that rapidly acquire, process, and apply data to Teradata tables. Use MultiLoad for large-volume, incremental data loads.

2.2.1. MultiLoad has the following advantages:

• MultiLoad is very fast. It can process millions of rows in a few minutes.
• MultiLoad supports inserts, updates, upserts, deletes, and data-driven operations in PowerCenter.
• We can use variables and embed conditional logic in MultiLoad control files.
• MultiLoad supports sophisticated error recovery. It allows load jobs to be restarted without having to redo all of the prior work.
2.2.2. MultiLoad has the following limitations:

• MultiLoad is designed for the highest possible throughput, so it can be very resource intensive.
• It locks tables while loading data, preventing other users and other instances of MultiLoad from accessing the tables during data loading.
• Because of its "phased" nature, there are potentially inconvenient windows of time when MultiLoad cannot be stopped without losing access to the target tables.

2.3. Teradata TPump

Teradata TPump is a highly parallel utility that can continuously move data from data sources into Teradata tables without locking the affected tables. TPump supports inserts, updates, deletes, and data-driven updates. TPump acquires row-hash locks on a database table instead of table-level locks, so multiple TPump instances can load data simultaneously to the same table. TPump is often used to "trickle-load" a database table. Use TPump for low-volume, online data loads.

2.3.1. TPump has the following advantages:

• TPump can refresh database tables in near real time.
• TPump continuously loads data into Teradata tables without locking the affected tables, so users can run queries while TPump is running.
• TPump is less resource intensive than MultiLoad because it does not write to temporary tables.
• Users can control the rate at which statements are sent to the Teradata database, limiting resource consumption.
• It supports parallel processing.
• TPump can always be stopped, and all of its locks dropped, with no effect.

Note, however, that TPump is not as fast as the other standalone loaders for large-volume loads because it changes the same data block multiple times.

2.4. Teradata FastExport

Teradata FastExport is a command-driven utility that uses multiple sessions to quickly transfer large amounts of data from Teradata sources to PowerCenter. Use FastExport to quickly extract data from Teradata sources.

2.4.1. FastExport has the following advantages:

• It is faster than Teradata relational connections when extracting large amounts of data.
• FastExport can be run in streaming mode, which avoids the need to stage the data file.
• We can encrypt the data transfer between FastExport and the Teradata server.
• FastExport is available for sources and pipeline lookups.

When we create a FastExport connection, verify the settings of the following connection attributes:

• Data encryption. Enable this attribute to encrypt the data transfer between FastExport and the Teradata server so that unauthorized users cannot access the data being transferred across the network.

• Fractional seconds. This attribute specifies the precision of the fractional portion of timestamp data. To avoid session failure or possible data corruption, make sure this value matches the timestamp precision of the column in the Teradata database.

For date columns selected from Teradata into Informatica, we need to match the Teradata date format with the Informatica date format.

To use FastExport in a session, configure the mapping to extract from a Teradata source, configure the session to read from FastExport instead of a relational database, and select the FastExport connection for the session. When a session transfers data between Teradata and PowerCenter, the following files are created:

• A staging file or pipe. PowerCenter creates a staging file or named pipe for data transfer based on how we configure the connection. Named pipes are generally faster than staging files because data is transferred as soon as it appears in the pipe. If we use a staging file, data is not transferred until all of the data appears in the file.

• A control file. PowerCenter generates a control file that contains instructions for loading or extracting data. PowerCenter creates the control file based on the loader or FastExport attributes we configure for the connection and the session.

• A log file. The load or unload utility creates a log file and writes error messages to it. The PowerCenter session log indicates whether the session ran successfully, but it does not contain load or unload utility error messages. Use the log file to debug problems that occur during data loading or extraction.
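One way to handle the date-format matching mentioned earlier is to make the format explicit on the Teradata side with a FORMAT phrase in the extraction SQL, so the Informatica side can parse it with a matching date format string. A sketch with hypothetical table and column names:

```sql
-- Hypothetical table/column; forces a fixed YYYY-MM-DD text form that a
-- matching Informatica date format string can parse reliably.
SELECT CAST(CAST(order_dt AS FORMAT 'YYYY-MM-DD') AS CHAR(10)) AS order_dt
FROM   edw.ORDERS;
```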
By default, loader staging, control, and log files are created in the target file directory. The FastExport staging, control, and log files are created in the PowerCenter temporary files directory.

All of these load and unload utilities are included in the Teradata Tools and Utilities (TTU) package, available from Teradata. PowerCenter supports all of these standalone load and unload utilities. Before we can configure a session to use a load or unload utility, create a loader or FastExport (application) connection in the PowerCenter Workflow Manager and enter a value for the TDPID in the connection attributes. For more information about creating connection objects in PowerCenter, see APPENDIX A, APPENDIX B, and the PowerCenter Workflow Basics Guide.

3. Teradata Parallel Transporter

Teradata Parallel Transporter (Teradata PT, or TPT) is a client application that provides scalable, high-speed, parallel data extraction, loading, and updating. It uses and expands upon the functions and capabilities of the standalone Teradata load and unload utilities. Teradata PT supports a single scripting environment with different system operators for extracting and loading data. It also supports massively parallel extraction and loading, so if we partition a Teradata PT session, multiple Teradata PT instances can extract or load large amounts of data in the same database tables at the same time.

To provide the functionality of the standalone load and unload utilities, Teradata PT extracts or loads data using one of the following system operators:

• Export. Exports large data sets from Teradata tables or views and writes the data to a file or Informatica pipe using the FastExport protocol.

• Load. Bulk loads large volumes of data into empty Teradata database tables using the FastLoad protocol.

• Update. Batch updates, inserts, upserts, and deletes data in Teradata database tables using the MultiLoad protocol.

• Stream. Continuously updates, inserts, upserts, and deletes data in near real time using the TPump protocol.

Fig1. TPT operators for Informatica job
3.1. Teradata PT advantages

Teradata PT is up to 20% faster than the standalone Teradata load and unload utilities, even though it uses the underlying protocols from the standalone utilities.

Teradata PT supports recovery for sessions that use the Stream operator when the source data is repeatable. This feature is especially useful when running real-time sessions and streaming the changes to Teradata.

Users can invoke Teradata PT through a set of open APIs that communicate with the database directly, eliminating the need for a staging file or pipe and a control file.

Teradata PT includes added features; some are enabled by users and some are improvements made behind the scenes. They include:

3.1.1. Parallel Reading

Allowing multiple instances of a single operator to read from the same input file takes advantage of the file I/O cache monitored and maintained by the operating system (OS). As long as all instances run at the same relative speed, the OS will read a block from the file into memory and all instances will access the same memory. Dramatic improvements in performance have been seen when using up to five or six instances reading from a single (very large) file.

3.1.2. Exporting of Data

The normal operating procedure for consumer operators is for the first instance to get as much work as possible, provided it can keep up with the rate at which data arrives through the data streams. The default behavior is to write to only a single output file.

3.1.3. Parallel Writing

If multiple instances are specified for the DataConnector operator (as a file writer), and each instance is directed to write to a different file, then there can be some parallelism in data output. However, when exporting data to flat files, this results in an uneven distribution of data across the output files.

3.1.4. Round-Robin Data Distribution

This newer parallel writing feature allows switching to a round-robin fashion of file writing. This can evenly distribute data across multiple files, thereby improving the performance of writing to disk even more. That performance improvement can be further enhanced by placing each output file on a separate disk.

PowerCenter communicates with Teradata PT using PowerExchange for Teradata Parallel Transporter, which is available through the Informatica-Teradata Enterprise Data Warehousing Solution. PowerExchange for Teradata Parallel Transporter provides integration between PowerCenter and Teradata databases for data extraction and loading. PowerExchange for Teradata Parallel Transporter executes Teradata PT operators directly through API calls [Fig2]. This improves performance by eliminating the staging file or named pipe. It also improves security by eliminating the control file, so there is no need to overwrite or store passwords in a control file.

PowerExchange for Teradata Parallel Transporter supports session and workflow recovery. It also captures Teradata PT error messages and displays them in the session log, so we do not need to check the utility log file when errors occur.

Before we can configure a session to use Teradata PT, we must create a Teradata PT (relational) connection in the Workflow Manager and enter a value for the TDPID in the connection attributes. For more information about creating Teradata PT connection objects in PowerCenter, see APPENDIX A and APPENDIX B.

Fig2. Informatica communicates with Teradata using the TPT API interface
3.2. Selecting the load utility based on record counts

For loading data into a table, we use different Teradata load utilities based on data volume. The table below suggests the best load utility to use from Informatica for each task:

Task                                                          | First Choice          | Second Choice
Insert a small number of rows into an empty table             | TPT-Load, FastLoad    | TPump, TPT-Stream, or Relational
Insert a large number of rows into an empty table             | TPT-Load, FastLoad    | TPump, TPT-Stream, or Relational
Insert a small number of rows into a small populated table    | TPT-Update, TPT-Stream, Relational, MultiLoad, or TPump |
Insert a small number of rows into a large populated table    | TPump, TPT-Stream, or Relational |
Insert a moderate number of rows into a large populated table | TPT-Update, MultiLoad | TPT-Stream, Relational, TPump
Insert a large number of rows into multiple populated tables  | TPT-Update, MultiLoad | TPump, TPT-Stream, or Relational
Insert a large number of rows into a large populated table    | TPT-Update, MultiLoad |
Update/delete a large number of rows                          | TPT-Update, MultiLoad | Relational
Update/delete a small number of rows                          | Relational            | TPump or TPT-Stream

3.3. Using a TPT connection with the Load and Update operators in a session

To configure a session to extract data, configure the associated mapping to read from Teradata, change the reader type for the session to Teradata Parallel Transporter Reader, and select the Teradata PT connection. To configure a session to load data:

1. Under 'Writers', select 'Teradata Parallel Transporter Writer'.

2. Once we select the above option, two types of connections appear under 'Connections':

   i. The first connection defines the connection to the Teradata PT API, i.e., the TPT connection. The PowerCenter Integration Service uses the Teradata PT API connection to extract data from or load data to Teradata.

   ii. The second connection defines an optional ODBC connection to the target database. The PowerCenter Integration Service uses the target ODBC connection to drop log, error, and work tables, truncate target tables, and create recovery tables in the target database. The PowerCenter Integration Service does not use the ODBC connection to extract data from or load data to Teradata.
3. Select the appropriate TPT and ODBC connections for the two connection types [Fig3].

4. Make sure the following session properties are set when using TPT connections:

   i. Specify the database and table name for the work table.
   ii. The truncate table option can be used with the Load, Stream, and Update system operators.

5. Select one of the following values for the 'Mark Missing Rows' option, which specifies how Teradata PT API handles rows that are not present in the target table:

   i. None - If Teradata PT API receives a row marked for update or delete that is missing in the target table, it does not mark the row in the error table.
   ii. For Update - If Teradata PT API receives a row marked for update that is missing in the target table, it marks the row as an error row.
   iii. For Delete - If Teradata PT API receives a row marked for delete that is missing in the target table, it marks the row as an error row.
   iv. Both - If Teradata PT API receives a row marked for update or delete that is missing in the target table, it marks the row as an error row.

Fig3. Selecting TPT connection type and connection name

6. Similarly, select one of the following values for the 'Mark Duplicate Rows' option:

   i. None - If Teradata PT API receives a row marked for insert or update that causes a duplicate row in the target table, it does not mark the row in the error table.
   ii. For Insert - If Teradata PT API receives a row marked for insert that already exists in the target table, it marks the row as an error row.
   iii. For Update - If Teradata PT API receives a row marked for update that causes a duplicate row in the target table, it marks the row as an error row.
   iv. Both - If Teradata PT API receives a row marked for insert or update that causes a duplicate row in the target table, it marks the row as an error row.

7. Specify the database name and the table name for the log table to be created.

8. Specify the database name and the table name for the error tables to be created.

9. A drop work/log/error tables option is available. Check this option to drop the intermediate tables that are created during the session run.

10. The Pack, Pack Minimum, and Buffers options are used with the Stream operator.

11. Tracing options can be selected as needed; usually we use the default values.

Fig4. Selecting TPT (Load or Update) connection options
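When the drop option is left unchecked, the intermediate tables survive a failed run and can be inspected and cleaned up by hand. A sketch, assuming the conventional _ET/_UV/_WT suffixes and a hypothetical staging database configured in the session properties:

```sql
-- Hypothetical names: edw_stage holds the intermediate tables for target CUSTOMER.
SELECT COUNT(*) FROM edw_stage.CUSTOMER_ET;   -- rows marked as error rows

-- Manual cleanup, equivalent to checking 'Drop work/log/error tables':
DROP TABLE edw_stage.CUSTOMER_ET;
DROP TABLE edw_stage.CUSTOMER_UV;
DROP TABLE edw_stage.CUSTOMER_WT;
DROP TABLE edw_stage.CUSTOMER_LOG;
```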
3.4. TPT connection with the Export operator

A connection that uses the Export operator can be used only as a source connection. Fig5 shows how to select the TPT Export operator for Teradata source tables in an Informatica session.

3.5. TPT connection using the Stream system operator

The Stream operator uses macros to modify tables, so a macro database has to be specified in the session properties when using this type of connection.

3.5.1. Macro Database: the name of the database that stores the macros Teradata PT API creates when we select the Stream system operator. The Stream system operator uses macros to modify tables. It creates the macros before Teradata PT API begins loading data and removes them from the database after Teradata PT API loads all rows to the target.

Fig5. Selecting TPT Export connection for Source
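As an illustration of the macro mechanism the Stream operator relies on, a hand-written Teradata macro performing a parameterized update might look like the sketch below. The names are hypothetical; the macros Teradata PT API generates are created and dropped automatically, not written by hand:

```sql
-- Hypothetical macro in a dedicated macro database.
CREATE MACRO macro_db.upd_customer (p_id INTEGER, p_name VARCHAR(60)) AS (
    UPDATE edw.CUSTOMER
    SET    full_name = :p_name
    WHERE  customer_id = :p_id;
);

-- Executing the macro applies the update with the given parameters.
EXEC macro_db.upd_customer (42, 'Jane Doe');
```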
3.6. Restart in case of failure

3.6.1. Restart for the Update, Export, or Stream system operators

If a session that uses the Update, Export, or Stream operator fails, the session can be rerun successfully once all of the intermediate tables are dropped.

3.6.2. Restart for the Load system operator

If a session that uses the Load system operator fails, follow these steps to rerun the session successfully. There is no need to drop and recreate the target table for the Load operator; instead, a simple checkout of the session and unchecking the following options in the session properties is sufficient to rerun the failed TPT job:

• Drop Error/Work/Log Tables
• Truncate Table

There is also no need to drop the intermediate tables, and no manual intervention is required on the database side. Once the session reruns successfully after the failure, keep in mind that the session has to be 'Undo checkout' for future runs.

3.7. Comparison of loading a Teradata table using Informatica with a Teradata relational connection and a Teradata TPT connection

The Teradata RDBMS offers several tools to load external data into the database. The Teradata load utilities can be divided into two main groups: the first group makes use of the transient journal, so loading is slower, while the second group is designed to bring data into the target tables as fast as possible by bypassing the transient journal.

The first group includes BTEQ, TPump, and relational/native connections. These insert the data record by record into the target table, allowing full transaction handling and rollback functionality. They are still quite useful for loading small amounts of data, but they are missing some important features, such as the use of several parallel load sessions.
Fig6 shows Informatica Workflow Monitor load statistics where the source and target are Teradata tables and relational connections are used. The relational connection uses the transient journal and loads the data record by record into the target table, so the target loading throughput is low: it took around 5 hours to load 1.7 million records.

The second group of utilities includes the Teradata standalone load and unload utilities. These are the fastest way of getting external data into Teradata: they bypass transaction handling, and data is loaded in big chunks, namely blocks of 64 kilobytes. Bypassing the transaction log (transient journal) imposes several restrictions:

• The utilities do not support target tables with unique secondary indexes (USI), join indexes (JI), referential integrity (RI), or triggers. FastLoad goes a step further and requires the target table to be empty.

• In the case of a SET target table, a duplicate-row check would be required. FastLoad by default removes record-level duplicates.

• MultiLoad, the second tool in the set of fast loading utilities, can take over additional tasks such as UPDATE, DELETE, and UPSERT, again all at the block level. MultiLoad uses temporary work tables for this purpose and does not reach the load performance of FastLoad, but it is still a good choice for getting huge amounts of data in very fast.

TPT integrates with this infrastructure in a plug-in fashion to perform the work of the Teradata standalone load and unload utilities.

Fig6. Session load statistics with Teradata relational connection
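The SET-table duplicate-row check mentioned above can be made visible in plain DDL; the table names here are hypothetical:

```sql
-- A SET table enforces a duplicate-row check on every insert, which slows
-- block-level loading; a MULTISET table skips that check entirely.
CREATE SET TABLE edw.CUSTOMER_SET
    (customer_id INTEGER, full_name VARCHAR(60));

CREATE MULTISET TABLE edw.CUSTOMER_MSET
    (customer_id INTEGER, full_name VARCHAR(60));
```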
Fig7 shows Informatica Workflow Monitor load statistics where the source and targets are Teradata tables and a TPT connection is used. The TPT operators load data at the block level, so the target loading throughput is much higher: it took only a few seconds to load the same 1.7 million records.

Each group has its own advantages and disadvantages depending on the load requirements. The decision about which tool to use for loading depends on the number of data records and the number of bytes each data record consumes.

3.8. Loading a Teradata target table using Informatica update-else-insert logic with a TPT connection

In an Informatica session, we can update the target table against the source table data. Generally we use two flows for the insert-else-update operation: one flow for bulk insert and another for update. The TPT Update operator (MultiLoad) locks the target table to load the data. When we use two or more target instances in a single session, TPT locks the target table while loading data through one target instance; when the session then tries to load data through another target instance, the second MultiLoad job tries to lock the same target table, which is already locked by the insert instance, and the session fails with the TPT error:

"Error Code 2652: Operation not allowed - Table is being mloaded"

To avoid this error, use the Informatica 'Update else Insert' option in the session properties and treat source rows as update. We update the record based on key columns, and we do not update some fields, such as Create_Timestamp. If we want to use a single instance to insert and update records, then we need to update such fields with their previous (target) values, i.e., leave Create_Timestamp unchanged.

Fig7. Session load statistics with Teradata TPT connection
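The update-else-insert behavior described above corresponds to Teradata's atomic upsert form (UPDATE ... ELSE INSERT). A hand-written sketch with hypothetical table and column names, leaving Create_Timestamp untouched on the update branch as the note above suggests:

```sql
-- Hypothetical target table; Create_Timestamp is deliberately not updated.
UPDATE edw.CUSTOMER
SET    full_name = 'Jane Doe',
       Update_Timestamp = CURRENT_TIMESTAMP(0)
WHERE  customer_id = 42
ELSE INSERT INTO edw.CUSTOMER
       (customer_id, full_name, Create_Timestamp, Update_Timestamp)
VALUES (42, 'Jane Doe', CURRENT_TIMESTAMP(0), CURRENT_TIMESTAMP(0));
```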
3.9. Loading a PPI-indexed Teradata table with Informatica using a TPT connection

Whenever we need to load and then select incremental data from a huge table, we can use MultiLoad jobs to load the data and create an index to select the data. But when we create an index such as a secondary index or join index, the Teradata bulk load utilities do not work. We can overcome this issue by using a partitioned primary index (PPI) on the table: Teradata MultiLoad jobs do work on partitioned-primary-indexed tables. For a MultiLoad job, the table must have a primary index so that table records can be updated, and we need another index on the table for fast selection of incremental data. We can use the PI for the MultiLoad job and the PPI for selecting incremental data. For partitioned primary index tables, Teradata MultiLoad tasks require all values of the primary index column set and all values of the partitioning column set for deletes and updates.

The table creation statement is:

CREATE TABLE <TABLE> (
  Primary_key_field INTEGER,
  Field_1 INTEGER,
  Field_2 VARCHAR(10),
  CREATE_DT TIMESTAMP(0),
  UPDATE_DT TIMESTAMP(0))
PRIMARY INDEX (Primary_key_field, UPDATE_DT)
PARTITION BY (
  RANGE_N (UPDATE_DT BETWEEN DATE '2014-01-01' AND DATE '2020-12-31'
           EACH INTERVAL '1' DAY, NO RANGE, UNKNOWN));

We load data with the CREATE_DT and UPDATE_DT values set to current_date for inserted records; for updated records, UPDATE_DT is updated to current_date. We select the incremental data using:

SELECT * FROM <TABLE> WHERE UPDATE_DT = current_date;

This selects both the inserted and the updated records for the day.

4. Issues Affecting Loading to and Unloading from Teradata

This section describes issues we might encounter when moving data between PowerCenter and Teradata.

4.1. Increasing Lookup Performance

Applies to: Teradata relational connections, FastExport
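A hedged illustration of the PPI requirement stated above (the :bind placeholders are assumed names): a MultiLoad-style UPDATE or DELETE against this table must supply every primary index column and every partitioning column in the WHERE clause, otherwise Teradata cannot locate the target partition.

```sql
/* Illustrative only: on a PPI table, MultiLoad needs the full
   primary index column set AND the partitioning column set. */
UPDATE <TABLE>
SET    Field_1   = :new_field_1
WHERE  Primary_key_field = :key_value      /* primary index column  */
AND    UPDATE_DT         = :old_update_dt; /* partitioning column   */
```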
Sessions that perform lookups on Teradata tables must use Teradata relational connections. If we experience performance problems when running a session that performs lookups against a Teradata database, we may be able to increase performance in the following ways:

• Use FastExport to extract the data to a flat file and perform the lookup on the flat file.
• Enable or disable the lookup cache.

4.2. Using FastExport to Extract Lookup Data

If a session performs a lookup on a large, static Teradata table, we may be able to increase performance by using FastExport to extract the data to a flat file and configuring the session to look up data in the flat file. To do this, redesign the mapping as follows:

1. Create a simple pass-through mapping to pass the lookup data to a flat file. Configure the session to extract data to the flat file using FastExport.
2. Configure the original mapping to perform the lookup on the flat file.

Note: If we redesign the mapping using this procedure, we can further increase performance by specifying an ORDER BY clause in the FastExport SQL and enabling the Sorted Input property for the lookup file. This prevents PowerCenter from having to sort the file before populating the lookup cache.

4.3. Restarting a Failed MultiLoad Job Manually

Applies to: MultiLoad

When loading data, MultiLoad puts the target table into the "MultiLoad" state and creates a log table for the target table. After successfully loading the data, it returns the target table to the normal (non-MultiLoad) state and deletes the log table. When we load data using MultiLoad and the MultiLoad job fails for any reason, MultiLoad reports an error and leaves the target table in the MultiLoad state. Additionally, MultiLoad queries the log table to check for errors. If a target table is in the MultiLoad state or a log table exists for the target table, we cannot restart the job.
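A hedged sketch of the FastExport SELECT for step 1 of section 4.2 (LOOKUP_TBL and its columns are assumed names): sorting on the lookup key in the database lets the session enable Sorted Input on the flat-file lookup, so the cache can be populated without a separate sort.

```sql
/* Illustrative FastExport SQL: pre-sort the lookup data so the
   flat file arrives in lookup-key order. */
SELECT lookup_key,
       lookup_value
FROM   LOOKUP_TBL
ORDER BY lookup_key;
```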
To recover from a failed MultiLoad job, we must release the target table from the MultiLoad state and drop the MultiLoad log, error, and work tables. To release the target table from the MultiLoad state:

RELEASE MLOAD <target table>;

4.4. Configuring Sessions that Load to the Same Table

Applies to: MultiLoad
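A hedged sketch of the full manual recovery sequence. The error table names follow the default INFA_ET naming shown later in this paper; the log and work table names here are illustrative assumptions, so substitute the names the failed job actually created.

```sql
/* Release the target table from the MultiLoad state. */
RELEASE MLOAD <target table>;

/* Drop the restart log, error, and work tables left behind
   by the failed job (names below are illustrative). */
DROP TABLE <target table>_LOG;      /* MultiLoad restart log table */
DROP TABLE INFA_ET1_<SEQ_no>;       /* acquisition-phase errors    */
DROP TABLE INFA_ET2_<SEQ_no>;       /* application-phase errors    */
DROP TABLE <target table>_WT;       /* temporary work table        */
```

Once the table is back in the normal state and these tables are gone, the load can be restarted.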
While Teradata MultiLoad loads data into a database table, it locks the table. MultiLoad requires that all instances handle wait events so they do not try to access the same table simultaneously. If we have multiple PowerCenter sessions that load to the same Teradata table using MultiLoad, we set the Tenacity attribute for each session to a value greater than the expected run time of the session. The Tenacity attribute controls the amount of time a MultiLoad instance waits for the table to become available. We also configure each session to use unique log file names. For more information about Tenacity, see the PowerCenter Advanced Workflow Guide.

4.5. Loading from Partitioned Sessions

Applies to: FastLoad, MultiLoad

When we configure multiple partitions in a session that uses staging files, the Integration Service creates a separate flat file for each partition. Since FastLoad and MultiLoad cannot load data from multiple files, we use round-robin partitioning to route the data to a single file. When we do this, the Integration Service writes all data to the first partition and starts only one instance of FastLoad or MultiLoad. It writes the following message in the session log:

MAPPING> DBG_21684 Target [TD_INVENTORY] does not support multiple partitions. All data will be routed to the first partition.

If we do not route the data to a single file, the session fails with the following error:

WRITER_1_*_1> WRT_8240 Error: The external loader [Teradata Mload Loader] does not support partitioned sessions.
WRITER_1_*_1> Thu Jun 16 11:58:21 2005
WRITER_1_*_1> WRT_8068 Writer initialization failed. Writer terminating.

For more information about loading from partitioned sessions, see the PowerCenter Advanced Workflow Guide.

4.6.
Using Error Tables to Identify Problems during Loading

Applies to: FastLoad, MultiLoad, TPump

When problems occur while loading data, the Teradata standalone load utilities generate error tables. (FastExport generates an error log file.) The load utilities generate different errors during the different phases of loading data.

FastLoad jobs run in two main phases: loading and end loading. During the loading phase, FastLoad initiates the job, locks the target table, and loads the data. During the end loading phase, the Teradata database distributes the rows
of data to the target table and unlocks it. FastLoad requires an exclusive lock on the target table during the loading phase.

MultiLoad also loads data during two main phases: acquisition and application. In the acquisition phase, MultiLoad reads the input data and writes it to a temporary work table. In the application phase, MultiLoad writes the data from the work table to the actual target table. MultiLoad requires an exclusive lock on the target table during the application phase.

TPump loads data in a single phase. It converts the SQL in the control file into a database macro and applies the macro to the input data. TPump uses standard SQL and standard table locking.

The following table lists the error tables we can check to troubleshoot load or unload utility errors:

Utility   | Data Loading Phase | Default Error Table Name | Error Types
----------|--------------------|--------------------------|------------------------------------------------------------
FastLoad  | Loading            | INFA_ET1_<SEQ_no>        | Constraint violations, conversion errors, unavailable AMP conditions
FastLoad  | End Loading        | INFA_ET2_<SEQ_no>        | Unique primary index violations
MultiLoad | Acquisition        | INFA_ET1_<SEQ_no>        | All acquisition phase errors, plus application phase errors if the Teradata database cannot build a valid primary index
MultiLoad | Application        | INFA_ET2_<SEQ_no>        | Uniqueness violations, field overflow on columns other than primary index fields, constraint errors
TPump     | Single phase       | INFA_ET_<SEQ_no>         | All TPump errors

When a load fails, check the "ET1_" error table first for specific information. The Error Field or Error Field Name column indicates the column in the target table that could not be loaded. The Error Code field provides details that explain why the column could not be loaded, for example:

• 2689: Trying to load a null value into a non-null field
• 2665: Invalid date format

In the MultiLoad "ET2_" error table, we can also check the DBC Error Field column and DBC Error Code field.
The DBC Error Field column is not initialized in the case of primary key uniqueness violations. The DBC Error Code that corresponds to a primary key uniqueness violation is 2794.
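A hedged example of inspecting the first error table after a failed load. The sequence number 1234 is illustrative, and the exact error-column names vary by utility (the paper refers to both "Error Field" and "Error Field Name"), so adjust to the columns present in your table.

```sql
/* Illustrative Teradata SQL: summarize why rows were rejected,
   one line per (error code, offending column) combination. */
SELECT ErrorCode,
       ErrorFieldName,
       COUNT(*) AS rejected_rows
FROM   INFA_ET1_1234
GROUP BY ErrorCode, ErrorFieldName;
```

An ErrorCode of 2689 here would indicate a null loaded into a non-null field, and 2665 an invalid date format, per the list above.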
5. Conclusion

Teradata Parallel Transporter provides flexible, high-performance load and unload facilities that enable population of the data warehouse. It supports many traditional functions of the Teradata utilities, as well as many that are new and advanced. Through its parallel execution design, Teradata Parallel Transporter improves ETL performance while providing a uniform interface. It also facilitates active data warehousing by allowing continuous and scalable access to data sources and targets, keeping the data fresh at maximum throughput. In all, Teradata Parallel Transporter drives business value by enabling the real-time decision making required by enterprise-wide business intelligence (BI) applications.

APPENDIX-A

Before we run sessions that move data between PowerCenter and Teradata, we might want to install the Teradata client tools. We also need to locate the Teradata TDPID.

1. Teradata Client Tools

Teradata client tools help us communicate with the Teradata database and debug problems that occur when a session loads data to or extracts data from the Teradata database. We can install the following Teradata client tools:

• BTEQ. A general-purpose command-line utility (similar to Oracle SQL*Plus) that enables us to communicate with one or more Teradata databases.
• Teradata SQL Assistant. A GUI-based tool for retrieving data from any ODBC-compliant database server and manipulating and storing the data in desktop applications. Teradata Queryman is the older version of this tool.

Install BTEQ or Teradata SQL Assistant to help debug problems that occur when loading to and extracting from Teradata. Both tools are included in the Teradata Utility Pack, which is available from Teradata.

2. TDPID

The Teradata TDPID indicates the name of the Teradata instance and defines the name a client uses to connect to a server.
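As a hedged BTEQ sketch of how a client tool uses a TDPID (the instance name demo1099 and the credentials are placeholder assumptions): the name before the slash in .LOGON is the TDPID, which the client resolves through the hosts file described below.

```sql
.LOGON demo1099/dbuser,dbpassword;  /* demo1099 is the TDPID */
SELECT SESSION;                     /* confirm the session is established */
.LOGOFF;
.QUIT;
```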
When we use Teradata Parallel Transporter or a standalone load or unload utility with PowerCenter, we must specify the TDPID in the connection properties. The Teradata TDPID appears in the hosts file
on the machines on which the Integration Service and PowerCenter Client run. By default, the hosts file appears in the following location:

UNIX: /etc/hosts
Windows: %SystemRoot%\system32\drivers\etc\hosts*

* The actual location is defined in the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\DataBasePath

The hosts file contains client configuration information for Teradata. In a hosts file entry, the TDPID precedes the string "cop1." For example, suppose the hosts file contains the following entries:

127.0.0.1      localhost demo1099cop1
192.168.80.113 td_1 custcop1
192.168.80.114 td_2 custcop2
192.168.80.115 td_3 custcop3
192.168.80.116 td_4 custcop4

The first entry has the TDPID "demo1099." This entry tells the Teradata database that when a client tool references the Teradata instance "demo1099," it should direct requests to "localhost" (IP address 127.0.0.1). The following entries all have the same TDPID, "cust." Multiple hosts file entries with the same TDPID indicate that the Teradata instance is configured for load balancing among nodes. When a client tool references the Teradata instance "cust," the Teradata database directs requests to the first node in the entry list, "td_1." If the node takes too long to respond, the database redirects the request to the second node, and so on. This prevents the first node, "td_1," from becoming overloaded.

APPENDIX-B

a. Teradata Connections

Teradata relational connections use ODBC to connect to Teradata. PowerCenter uses the ODBC Driver for Teradata to retrieve metadata and to read from and write to Teradata. To establish ODBC connectivity between Teradata and PowerCenter, install the ODBC Driver for Teradata on each PowerCenter machine that communicates with Teradata. The ODBC Driver for Teradata is included in the Teradata Tools and Utilities (TTU). We can download the driver from the Teradata
web site. PowerCenter works with the ODBC Driver for Teradata available in the following TTU versions:

Informatica Version | Teradata Version(s)
--------------------|----------------------------
9.5                 | TD 14.0, TD 13.10
9.1                 | TD 13.10, TD 13.0, TD 12.0
9.0.1               | TD 13.10, TD 13.0, TD 12.0
8.6.1               | TD 13.0, TD 12.0
8.6.0               | TD 12.0
8.1.1 SP1-SP5       | TD 12.0

To load or extract data using a Teradata relational connection on UNIX, we must verify the configuration of the environment variables and the odbc.ini file on the machine on which the Integration Service runs.

To verify the environment variable configuration, ensure that the Teradata ODBC path precedes the DataDirect driver path in the PATH and shared library path environment variables. Place the Teradata path before the DataDirect path because both sets of ODBC software use some of the same file names.

To verify the odbc.ini file configuration, make sure there is an entry for the Teradata ODBC driver in the "[ODBC Data Sources]" section of odbc.ini. The following excerpt from an odbc.ini file shows a Teradata ODBC driver (tdata.so) entry on Linux:

[ODBC Data Sources]
intdv14=tdata.so

[intdv14]
Driver=/usr/odbc/drivers/tdata.so
Description=running Teradata V14
DBCName=intdv14
SessionMode=Teradata
CharacterSet=UTF8
StCheckLevel=0
DateTimeFormat=AAA
LastUser=
Username=
Password=
Database=
DefaultDatabase=
ODBC is a native interface for Teradata. Teradata provides 32- and 64-bit ODBC drivers for Windows and UNIX platforms. The driver bit mode must be compatible with the bit mode of the platform on which the PowerCenter Integration Service runs. For example, 32-bit PowerCenter runs only with 32-bit drivers. For more information about configuring odbc.ini, see the PowerCenter Configuration Guide and the ODBC Driver for Teradata User Guide.

b. Creating a Teradata Relational Connection

When we create a Teradata (relational) connection object in the Workflow Manager, we choose "Teradata," not "ODBC," as the connection type in the connection properties. When we choose Teradata as the connection type, the Integration Service still uses Teradata ODBC to connect to Teradata. Although both ODBC and Teradata connection types might work, the Integration Service communicates with the Teradata database more efficiently when we choose the Teradata connection type.

c. Creating a Teradata Parallel Transporter Connection

When we create a Teradata (TPT) connection object in the Workflow Manager, we choose "Teradata PT Connection". The Teradata PT connection asks for connection attributes.

1. In Connections → Relational → select Type as 'Teradata PT Connection' → click New.

Fig8. Teradata relational connection creation

2. Configure the connection as required, as shown in the screenshot below. Here we can specify the TDPID, the database name, and the system operator needed for a specific connection.

3. The System Operator drop-down lists all four operators, as shown in the screenshot.

Fig9. Teradata TPT connection creation
Fig10. Selecting the TPT connection System operator
  • 23. 24White Paper|Working with Informatica-Teradata Parallel Transporter 2. Configure the connection as per the requirement as shown in the screen shot below: Here we can specify the TDPID, Database name and the system operator that we need for a specific connection. 3. System operator drop down has all 4 operators as shown in the screen shot: Fig9. Teradata TPT connection creation Fig10. Selecting the TPT connection System operator