
TPT Teradata - The Teradata Parallel Transporter

TPT (Teradata Parallel Transporter) merges the functionalities of Fastload, Multiload, TPUMP, BTEQ and Fastexport into one utility. It offers a uniform syntax for loading, updating, deleting and extracting data from Teradata. TPT introduces concepts of data streams and operators - producer operators read data from sources and write to streams, consumer operators read from streams and write to targets. This allows parallelism not possible with traditional sequential utilities. TPT replaces standalone utilities with corresponding operators that perform similar functions.

Uploaded by

mohnish

TPT Teradata – The Teradata Parallel Transporter

TPT Teradata – Introduction
Those of us who have worked with Teradata and its utilities for many years know
that the Teradata Parallel Transporter utility (TPT) merges the functionality of
Fastload, Multiload, TPUMP, BTEQ and Fastexport into one utility.

Teradata had tried to create a common tool before, called the “Teradata Warehouse
Builder”, but I assume many of us never even noticed it, because it failed to catch on.

I can’t remember anybody ever using it. Nevertheless, scripts written for
Teradata Warehouse Builder can be executed with TPT without any changes.

While the syntax of the standalone utilities is not consistent, TPT offers a
uniform syntax for all load, update, delete and extract tasks (and some other
tasks, such as DDL statements or the execution of Linux shell scripts).

TPT is built on the concepts of data streams and operators.

Data streams are not directly accessible from your scripts. They are the pipelines
between operators and they are kept in memory. No data is written to disk.

Operators read data from a source (which could be a data stream, or any other valid
source like a flat file or ODBC connection) or write data to a target (which again can
be a data stream or any valid target like a table or a flat file). Some operators take
on additional tasks, such as dropping and creating tables. We will describe each type
of operator in detail later in this article.

Without TPT, we would probably use Linux named pipes to build an in-memory pipeline
between the files and the load utilities.

For example, one shell script could be writing a flat file into a named pipe, while at
the same time a Fastload script would be reading from this named pipe:

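mkfifo named_pipe
cat thefile > named_pipe &
# load_script.fl is a placeholder FastLoad script whose DEFINE ... FILE= clause names named_pipe
fastload < load_script.fl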

Although TPT uses naming conventions and concepts different from the standalone
tools, we can usually find an equivalent TPT operator for each traditional tool.
TPT combines all standalone tools and adds more features.

TPT Teradata – Operators Overview
TPT operators are grouped into producer operators (read operators), filter operators
and consumer operators (write operators).

Producer operators read data from various data sources (flat files, ODBC sources,
SQL SELECT statements, export SQL) and make it available in a data stream for
consumer operators.

Consumer operators read data from a data stream and write it to a target table, or a
flat file.

Producer operators and consumer operators can use access modules. Access modules
are software modules used to read from data stores such as CDs, DVDs and tape
drives.

The following access modules are available: Named pipes (for reading from Unix
named pipes), WebSphere MQ (for reading from IBM message queues) and JMS. The
user can implement additional access modules.

For a Teradata beginner, it’s quite difficult to distinguish between consumers and
producers. Maybe the easiest way to tell them apart is to memorize the following:

Producer operators never write into the target, only into a data stream. Consumer
operators never write into a data stream, only directly into a target.
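
The rule can be illustrated with ordinary shell pipes. This is only an analogy and not TPT syntax: the pipe stands in for the data stream, and the function names `producer` and `consumer` as well as the file names are made up for this sketch.

```shell
#!/bin/sh
# Analogy only: a shell pipe plays the role of the TPT data stream.
workdir=$(mktemp -d)
printf '1,alice\n2,bob\n' > "$workdir/source.txt"

# Producer: reads the source and writes only to the stream (stdout).
producer() {
    cat "$workdir/source.txt"
}

# Consumer: reads only from the stream (stdin) and writes to the target.
consumer() {
    cat > "$workdir/target.txt"
}

producer | consumer

# Count the rows that arrived in the target (tr strips padding some wc versions add).
rows=$(wc -l < "$workdir/target.txt" | tr -d ' ')
echo "rows written to target: $rows"
```

Neither function knows about the other; they only share the pipe, which is roughly how TPT operators are decoupled by the data stream.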

The data streams connect operators with each other (we will cover standalone
operators later):

As you can see in the picture above, TPT covers the complete ETL chain.

The table below shows how TPT replaces the most commonly used standalone utilities:

TPT operator            Standalone utility      Task
DDL operator            BTEQ                    Executes DDL, DCL, and self-contained DML SQL statements
Export operator         FastExport              Exports data from Teradata
Load operator           FastLoad                Loads an empty table in block mode
ODBC operator           OLE DB Access Module    Exports data from ODBC data source
OS Command operator     .OS command in BTEQ     Executes Linux commands
SQL Inserter operator   BTEQ                    Transactionally inserts data into a Teradata table
SQL Selector operator   BTEQ                    SQL SELECT from Teradata
Stream operator         Tpump                   Transactionally loads Teradata tables
Update operator         MultiLoad               Updates, inserts, and deletes rows

Probably the most significant difference between the standalone utilities and TPT is
the level of parallelism. Traditionally, the utilities have been strictly used in a
sequential way. TPT offers parallelism by sharing the data streams, and the possibility
to run several operators in parallel:
In the example above, two producer operators are running in parallel, writing into a
shared data stream. At the same time, two consumer operators are reading in parallel
from the data stream and writing into the Teradata database table. Such a setup
would need a lot of programming (Linux shell scripts etc.) if implemented with the
standalone utilities.
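
To give a feeling for that effort, here is a rough hand-built sketch in plain shell. It is an analogy under several assumptions, not TPT and not a real load: the `produce` function and the row values are invented, a text file stands in for the target table, and `xargs -P` (available in GNU and BSD xargs) stands in for the parallel consumers.

```shell
#!/bin/sh
# Hand-built analogy: two producers feed one shared stream (a pipe),
# and up to two consumer processes drain it in parallel.
workdir=$(mktemp -d)
target="$workdir/target.txt"
: > "$target"

# Hypothetical producer: emits three short rows tagged with its name.
produce() {
    for i in 1 2 3; do
        printf '%s-%s\n' "$1" "$i"
    done
}

# Both producers run in the background and share the same stdout pipe.
# xargs -P 2 starts up to two consumer processes at a time; each one
# appends the single row it received ($1) to the target file ($0).
{ produce A & produce B & wait; } |
    xargs -P 2 -n 1 sh -c 'printf "%s\n" "$1" >> "$0"' "$target"

rows=$(wc -l < "$target" | tr -d ' ')
echo "rows loaded: $rows"
```

The order of the rows in the target is not deterministic, and the sketch has no restart logic, error tables or checkpoints, which is part of what makes a hand-built version expensive.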

We will now go into more detail by showing which operators exist and how the
standalone tools were turned into TPT operators:

The Producer Operators


Producer operators read data from a valid data source and make it available for
consumer operators in a data stream.

The Data Connector Operator (DATACONNECTOR PRODUCER):

The Data Connector Operator is a two-way operator, either used as producer operator
or as consumer operator.

When the type is DATACONNECTOR PRODUCER, it’s a producer operator and used to
read data from flat files or from an access module, pushing the data into a data
stream.

It can read from a single flat file (similar to the FILE=filename clause in a
Fastload script). Furthermore, all files in a directory that match a wildcard
pattern can be read at once (i.e. treated as a single input file).
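
The wildcard behaviour resembles what the shell does when several files are concatenated into one pipe. The sketch below is an analogy only; the file names and the `sales_*.txt` pattern are made up, and `wc -l` stands in for a consumer operator.

```shell
#!/bin/sh
# Analogy only: the shell expands the wildcard and cat feeds all matching
# files into the pipe as if they were a single input file.
workdir=$(mktemp -d)
printf '1,alice\n2,bob\n' > "$workdir/sales_01.txt"
printf '3,carol\n'        > "$workdir/sales_02.txt"
printf 'ignored\n'        > "$workdir/other.txt"

# Stand-in for a consumer operator: it just counts the rows it receives.
rows=$(cat "$workdir"/sales_*.txt | wc -l | tr -d ' ')
echo "rows read from matching files: $rows"
```

The file `other.txt` does not match the pattern, so its row never reaches the stream.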

Apart from directly reading from flat files, INMOD adapters can be used to push the
data to the consumer operator (Fastload INMOD and Multiload INMOD).

The Export Operator (EXPORT):

This operator replaces the Fastexport utility. It reads data from a Teradata table
(using a SQL SELECT statement) and pushes the data into a data stream. It’s a
producer operator because it puts the data into a data stream, not directly into a flat file!

The SQL Selector Operator (SELECTOR):


This operator produces data by executing a SQL SELECT statement. The data are
written into a data stream. This operator is comparable to the BTEQ export.

The ODBC Operator (ODBC):

The ODBC operator produces data by reading from an ODBC data source and writing it
into a data stream.

The Consumer Operators


Consumer operators read data from a data stream and write it into a target, which is
either a table or a flat file. Access modules are also valid targets.

Most consumer operators correspond to one of the standalone utilities.

The Data Connector Operator (DATACONNECTOR CONSUMER):

The Data Connector Operator, when defined as consumer type, is used to write into a
flat file. Access modules can also be used as the target.

The Load Operator (LOAD):

This operator offers the block level load functionality we can find in a Fastload.

The Update Operator (UPDATE):

This operator provides the enhanced block level update functionality we can find in a
Multiload.

The Stream Operator (STREAM):


This operator implements the TPUMP functionality.

The SQL Inserter Operator (INSERTER):


This operator performs the transactional BTEQ INSERT functionality.

The Fastexport OUTMOD (FASTEXPORT OUTMOD):


This operator allows the use of the Fastexport OUTMOD adapter.

Filter Operators
Filter operators apply filtering to the data stream.

TPT scripts can invoke user-written filters (written in C or C++), and they can use
WHERE clauses and CASE DML expressions in APPLY statements.
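
As a loose analogy in shell terms (not TPT syntax), a filter is a step that sits between producer and consumer and decides which rows pass through. The sample rows and the condition on the third field are invented for this sketch.

```shell
#!/bin/sh
# Analogy only: awk sits between producer and consumer and drops rows,
# much like a WHERE condition applied to the data stream.
rows='1,alice,DE
2,bob,US
3,carol,DE'

# Keep only rows whose third field is DE.
filtered=$(printf '%s\n' "$rows" | awk -F, '$3 == "DE"')
printf '%s\n' "$filtered"
```

Only the rows for alice and carol reach the consumer side of the pipe.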

Standalone Operators
The OS Command Operator (OS COMMAND):

We use this operator to execute Linux commands. It replaces the functionality BTEQ
offers with the .OS command.

The DDL Operator (DDL):

We use this operator to execute DDL statements. It’s useful for tasks such as
dropping or creating tables and indexes before the real data load takes place.

The name of this operator is somewhat misleading, as it accepts any SQL
statement that doesn’t return a result set.

For example, we can use statements like INSERT…SELECT, UPDATE and DELETE.

The Update Operator (UPDATE):


When used for deletion, it’s the replacement for an optimized Multiload DELETE.

TPT is tightly coupled to the standalone utilities, offering more functionality on top of
the other tools Fastload, Multiload and BTEQ.
