Odbcref
This document contains proprietary information of IBM. It is provided under a license agreement and Copyright
law protects it. The information contained in this publication does not include any product warranties, and any
statements provided in this manual should not be interpreted as such.
You can order IBM publications online or through your local IBM representative:
• To order publications online, go to the IBM Publications Center at www.ibm.com/shop/publications/order.
• To find your local IBM representative, go to the IBM Directory of Worldwide Contacts at
www.ibm.com/planetwide.
When you send information to IBM, you grant IBM a nonexclusive right to use or distribute the information in any
way it believes appropriate without incurring any obligation to you.
© Copyright International Business Machines Corporation 2006. All rights reserved.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
Audience
This guide is intended for:
• DataStage Enterprise Edition designers who use the Orchestrate shell (OSH) to create or modify jobs that use the ODBC enterprise stage.
• DataStage administrators who install or upgrade DataStage Enterprise Edition.
For information about accessing external data sets using the ODBC enterprise stage, see "Accessing an External Datasource from DataStage" on page 1-8.
Related Documentation
To learn more about documentation from other Ascential products and third-party documentation as they relate to the ODBC enterprise stage, refer to the following guide.

Guide                                  Description
DataStage Install and Upgrade Guide    Instructions for installing or upgrading DataStage Enterprise Edition, and information about environment variables.
Conventions
Convention              Used for…
bold                    Field names, button names, menu items, and keystrokes. Also used to indicate filenames, and window and dialog box names.
user input              Information that you need to enter as is.
code                    Code examples.
variable or <variable>  Placeholders for information that you need to enter. Do not type the greater-than/less-than brackets as part of the variable.
odbcread

[Figure: data flow showing the odbcread operator producing an output dataset.]

Properties

Table 1-1 odbcread Operator Properties

Property                    Value
Number of input datasets    0
Composite Stage             No
The odbcread operator performs basic reads from a datasource. Because it relies on ODBC interfaces to connect to the datasource and to import data from it, it is not tied to any particular database.
-user user_name
Optionally specify the user name used to connect to the
datasource.
-password password
Optionally specify the password used to connect to the
datasource.
-tablename table_name
Specify the table to be read from. The table name may be fully qualified. This option is mutually exclusive with the -query option. It has two suboptions:
-filter where_predicate: Optionally specify the rows
of the table to exclude from the read operation. This
predicate will be appended to the where clause of the
SQL statement to be executed.
-selectlist select_predicate: Optionally specify the
list of column names that will appear in the select
clause of the SQL statement to be executed.
-open open_command
Optionally specify an SQL statement to be executed before the read operation is performed. The statements are executed only once on the conductor node.
-close close_command
Optionally specify an SQL statement to be executed after the read operation completes. You cannot commit work using this option. The statements are executed only once on the conductor node.
-query sql_query
Specify an SQL query to read from one or more tables.
The -query option is mutually exclusive with the -table
option.
-fetcharraysize n
Specify the number of rows to retrieve during each fetch operation. The default number of rows is 1.
-db_cs code_page
Optionally specify the ICU code page that represents the database character set in use. The default is ISO-8859-1. This option has the following suboption:
-use_strings: If this option is set, strings (instead of ustrings) are generated in the DataStage schema.
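For example, the following option fragment (the table and column names are hypothetical) reads two columns, restricts the rows through the where clause, and generates strings rather than ustrings:

    -tablename listings -filter "price > 100.00" -selectlist "itemNum, price"
    -db_cs ISO-8859-1 -use_strings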
Operator Action
Here are the chief characteristics of the odbcread operator:
• You can direct it to run in specific node pools.
• It translates the query’s result set (a two-dimensional array) row by row to a DataStage dataset.
• Its output is a DataStage dataset that you can use as input to a subsequent DataStage Stage.
• Its translation includes the conversion of external datasource datatypes to DataStage datatypes.
• The size of external datasource rows can be greater than that of DataStage records.
• The Stage specifies either an external datasource table to read or an SQL query to perform.
• It optionally specifies commands to be run before the read operation is performed and after it has completed.
• You can perform a join operation between a DataStage dataset and external datasource data (one or more tables).
Datatype Conversion
The odbcread operator converts external datasource datatypes to
OSH datatypes, as shown in the following table:
External datasource datatype    DataStage datatype
SQL_VARCHAR                     string[max=n]
SQL_WCHAR                       ustring(n)
SQL_WVARCHAR                    ustring(max=n)
SQL_DECIMAL                     decimal(p,s)
SQL_NUMERIC                     decimal(p,s)
SQL_SMALLINT                    int16
SQL_INTEGER                     int32
SQL_REAL                        decimal(p,s)
SQL_FLOAT                       decimal(p,s)
SQL_DOUBLE                      decimal(p,s)
SQL_BIT                         int8 (0 or 1)
SQL_TINYINT                     int8
SQL_BIGINT                      int64
SQL_VARBINARY                   raw(max=n)
SQL_TYPE_DATE                   date
SQL_TYPE_TIME                   time[p]
SQL_TYPE_TIMESTAMP              timestamp[p]
SQL_GUID                        string[36]
Note Datatypes that are not listed in the table above generate an
error.
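For example, an external column declared as DECIMAL(6,2) arrives in the DataStage schema as decimal[6,2], and an INTEGER column arrives as int32.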
The filter specifies the rows of the table to exclude from the read
operation. By default, DataStage reads all rows.
You can optionally specify -open and -close commands. These commands are executed by ODBC on the external datasource before the table is opened and after it is closed.
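A minimal sketch of such commands (the statements and table names are hypothetical and database-specific):

    -open "lock table listings in share mode"
    -close "update read_audit set last_read_time = current_timestamp"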
Join Operations
You can perform a join operation between DataStage datasets and
external datasource data. First use the ODBC Stage, and then the
lookup Stage or a join Stage. See the Parallel Job Developer’s Guide
for information about these stages.
You must specify either the -query or -table option. You must also specify the data source, user, and password. The remaining options are optional; for example:

    [-open open_command]
    [-use_strings]
    [-array_size n]
    [-isolation_level read_committed]
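Putting the required and optional settings together, here is a minimal sketch of an odbcread invocation (the data source, credentials, table, and dataset names are hypothetical, and -data_source is assumed as the data-source option name):

    $ osh " odbcread -data_source SQLServer -user user101 -password test
    -tablename listings -fetcharraysize 5
    > outDS.ds "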
odbcwrite

[Figure: data flow showing an input dataset written by the odbcwrite operator to an output table.]

Properties

Property           Value
Composite Stage    No
Stage Action
Below are the chief characteristics of the odbcwrite operator:
• Translation includes the conversion of DataStage datatypes to external datasource datatypes.
• The Stage appends records to an existing table, unless you set another mode of writing.
Datatype Conversion
The odbcwrite operator converts DataStage datatypes to external datasource datatypes, as shown in the following table:

DataStage datatype    External datasource datatype
string[max=n]         SQL_VARCHAR
ustring(n)            SQL_WCHAR
ustring(max=n)        SQL_WVARCHAR
decimal(p,s)          SQL_DECIMAL
decimal(p,s)          SQL_NUMERIC
int16                 SQL_SMALLINT
int32                 SQL_INTEGER
decimal(p,s)          SQL_REAL
decimal(p,s)          SQL_FLOAT
decimal(p,s)          SQL_DOUBLE
int8 (0 or 1)         SQL_BIT
int8                  SQL_TINYINT
int64                 SQL_BIGINT
raw(n)                SQL_BINARY
raw(max=n)            SQL_VARBINARY
date                  SQL_TYPE_DATE
time[p]               SQL_TYPE_TIME
timestamp[p]          SQL_TYPE_TIMESTAMP
string[36]            SQL_GUID
Write Modes
The write mode of the operator determines how the records of the
dataset are inserted into the destination table. The write mode can
have one of the following values:
• append: This is the default mode. The table must exist and the record schema of the dataset must be compatible with the table. The write operator appends new rows to the table. The schema of the existing table determines the input interface of the Stage.
• create: The operator creates a new table. If a table exists with the same name as the one being created, the step that contains the operator terminates with an error. The schema of the DataStage dataset determines the schema of the new table. The table is created with simple default properties. To create a table that is partitioned, indexed, in a non-default table space, or in some other non-standard way, use the -createstmt option with your own create table statement.
• replace: The operator drops the existing table and creates a new one in its place. If a table exists with the same name as the one you want to create, the existing table is overwritten. The schema of the DataStage dataset determines the schema of the new table.
• truncate: The operator retains the table attributes but discards existing records and appends new ones. The schema of the existing table determines the input interface of the Stage.

Each mode requires the specific user privileges shown in the table below.
Note If a previous write operation failed, you can retry. Specify
the replace write mode to delete any information in the
output table that may have been written by the previous
attempt to run your program.
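For instance, a minimal sketch of an odbcwrite job that replaces the target table (the data source, credentials, dataset, and table names are hypothetical, and -data_source and -mode are assumed option names):

    $ osh " odbcwrite -data_source SQLServer -user user101 -password test
    -tablename target -mode replace
    < input.ds "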
-password password
Optionally specify the password used to connect to the
datasource.
-tablename table_name
Specify the table to which to write. The table name may be fully qualified.
-createstmt create_statement
Optionally specify the create statement to be used for
creating the table when -mode create is specified.
-truncateLength n
Specify the length to which to truncate column names.
-open open_command
Optionally specify an SQL statement to be executed
before the insert array is processed. The statements are
executed only once on the conductor node.
-close close_command
Optionally specify an SQL statement to be executed
after the insert array is processed. You cannot commit
work using this option. The statements are executed
only once on the conductor node.
-insertarraysize n
Optionally specify the size of the insert array. The
default size is 2000 records.
-rowCommitInterval n
Optionally specify the number of records that should be committed before starting a new transaction. This option can only be specified if arraysize = 1. Otherwise rowCommitInterval = arraysize. This is because of the rollback/retry logic that occurs when an array execution fails.
-db_cs code_page_name
Optionally specify the ICU code page that represents the database character set in use. The default is ISO-8859-1.
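For example, to commit every 100 rows, the insert array must be reduced to a single record per the rule above (the values are illustrative):

    -insertarraysize 1 -rowCommitInterval 100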
Example

The record schema of the DataStage dataset and the row schema of the external datasource table correspond to one another, and field and column names are identical. Following are the input DataStage record schema and output external datasource row schema:

    itemNum:int32;
    price:decimal[6,2];
    storeID:int16;

[Figure: data flow showing a DataStage dataset with the record schema above written through the odbcwrite operator to a corresponding Oracle table. A related figure shows an ODBC table with columns age (number[5,0]), zip (char[5]), and income (number).]
Other Features
Quoted Identifiers
All operators that accept SQL statements as arguments support quoted identifiers in those arguments. The quotes should be escaped with the ‘\’ (backslash) character.
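For example, a read query might escape quoted column and table names like this (the names are hypothetical):

    -query "select \"itemNum\", \"unit price\" from \"order details\""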
Stored Procedures
The ODBC stage will not support stored procedures. The user should
use the Stored Procedure Stage for stored procedure support.
Transaction Rollback
Because native transactions cannot span multiple processes, rollback will not be possible in this release.
Unit of Work
The unit of work Stage will not be modified to support ODBC in this
release.
odbcupsert

[Figure: data flow showing an input dataset processed by the odbcupsert operator, with an optional reject dataset as output.]

Properties

Property                               Value
Number of output datasets by default   None; 1 when you select the -reject option
Operator Action
Here are the main characteristics of odbcupsert:
• If an -insert statement is included, the insert is executed first. Any records that fail to be inserted because of a unique-constraint violation are then used in the execution of the update statement.
• DataStage uses host-array processing by default to enhance the performance of insert array processing. Each insert array is executed with a single SQL statement. Updated records are processed individually.
• Use the -insertArraySize option to specify the size of the insert array, for example: -insertArraySize 250.
-reject filename
Optionally specify that records that fail to be inserted or updated are written to the reject dataset given by filename.
-password password
Optionally specify the password used to connect to the
datasource.
Statement options
The user must specify at least one of the following options and no more than two. An error is generated if the user does not specify a statement option or specifies more than two.
-update update_statement
Optionally specify the update statement to be executed.
-delete delete_statement
Optionally specify the delete statement to be executed.
-open open_command
Optionally specify an SQL statement to be executed
before the insert array is processed. The statements are
executed only once on the conductor node.
-close close_command
Optionally specify an SQL statement to be executed after the insert array is processed. You cannot commit work using this option. The statements are executed only once on the conductor node.
-insertarraysize n
Optionally specify the size of the insert/update array.
The default size is 2000 records.
-rowCommitInterval n
Optionally specify the number of records that should be
committed before starting a new transaction. This
option can only be specified if arraysize = 1. Otherwise
rowCommitInterval = arraysize. This is because of the
rollback logic/retry logic that occurs when an array
execution fails.
Example
This example shows updating of an Oracle table that has two columns
acct_id and acct_balance, where acct_id is the primary key. Two of
the records cannot be inserted because of unique key constraints.
Instead, they are used to update existing records. One record is transferred to the reject dataset because its acct_id generates an error.
Summarized below are the states of the Oracle table before the data
flow is run, the contents of the input file, and the action that
DataStage performs for each record in the input file.
Table 1-11 Table before the dataflow is run

acct_id    acct_balance
873092     67.23
566678     2008.56
865544     8569.23
678888     7888.23
995666     75.72
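A minimal sketch of an odbcupsert invocation for this kind of job (the data source, credentials, and file names are hypothetical; the ORCHESTRATE.fieldname placeholder syntax and the -data_source option name are assumptions based on other DataStage operators):

    $ osh " odbcupsert -data_source APT81 -user user101 -password test
    -insert 'insert into accounts values (ORCHESTRATE.acct_id, ORCHESTRATE.acct_balance)'
    -update 'update accounts set acct_balance = ORCHESTRATE.acct_balance
    where acct_id = ORCHESTRATE.acct_id'
    -reject '/user/home/reject/reject.ds'
    < input.ds "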
odbclookup

[Figure: data flow showing an input dataset and an external datasource table feeding the odbclookup operator, which produces an output dataset and an optional reject dataset.]

Properties

Property          Value
Composite stage   No
You must specify either the -query option or one or more -table
options with one or more -key fields.
-password password
Optionally specify the password used to connect to the
datasource.
-tablename table_name
Specify a table and key fields to be used to generate a lookup query. This option is mutually exclusive with the -query option. The -table option has three suboptions:
-filter where_predicate: Specify the rows of the table
to exclude from the read operation. This predicate will
be appended to the where clause of the SQL statement
to be executed.
-selectlist select_predicate: Specify the list of column
names that will appear in the select clause of the SQL
statement to be executed.
-key field: Specify a lookup key. A lookup key is a field
in the table that will be used to join with a field of the
same name in the DataStage dataset. The -key option
can be specified more than once to specify more than
one key field.
-query sql_query
Specify a lookup query to be executed. This option is
mutually exclusive with the -table option.
-open open_command
Optionally specify an SQL statement to be executed before the lookup query is executed. The statements are executed only once on the conductor node.
-fetcharraysize n
Specify the number of rows to retrieve during each
fetch operation. The default number of rows is 1.
Example
Suppose you want to connect to the APT81 server as user user101,
with the password test. You want to perform a lookup between a
DataStage dataset and a table called target, on the key fields lname,
fname, and DOB. You can configure odbclookup in either of two
ways to accomplish this.
Here is the OSH command using the -table and -key options:
$ osh " odbclookup -data_source APT81 -user user101 -password test
-table target -key lname -key fname -key DOB
< data1.ds > data2.ds "
DataStage prints the lname, fname, and DOB column names and values from the DataStage input dataset, and the lname, fname, and DOB column names and values from the external datasource table.
If a column name in the external datasource table has the same name
as a DataStage output dataset schema fieldname, the printed output
shows the column in the external datasource table renamed using this
format:
APT_integer_fieldname
Notices
Some states do not allow disclaimer of express or implied warranties
in certain transactions; therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical
errors. Changes are periodically made to the information herein; these
changes will be incorporated in new editions of the publication. IBM
may make improvements and/or changes in the product(s) and/or the
program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided
for convenience only and do not in any manner serve as an
endorsement of those Web sites. The materials at those Web sites are
not part of the materials for this IBM product, and use of those Web
sites is at your own risk.
IBM may use or distribute any of the information you supply in any
way it believes appropriate without incurring any obligation to you.
Licensees of this program who wish to have information about it for
the purpose of enabling: (i) the exchange of information between
independently created programs and other programs (including this
one) and (ii) the mutual use of the information that has been
exchanged, should contact:
IBM Corporation
J46A/G4
555 Bailey Avenue
San Jose, CA 95141-1003
U.S.A.
Such information may be available, subject to appropriate terms and
conditions, including in some cases payment of a fee.
The licensed program described in this document and all licensed
material available for it are provided by IBM under terms of the IBM
Customer Agreement, IBM International Program License Agreement,
or any equivalent agreement between us.
Any performance data contained herein was determined in a
controlled environment. Therefore, the results obtained in other
operating environments may vary significantly. Some measurements
may have been made on development-level systems, and there is no
guarantee that these measurements will be the same on generally
available systems. Furthermore, some measurements may have been
estimated through extrapolation. Actual results may vary. Users of
this document should verify the applicable data for their specific
environment.
Information concerning non-IBM products was obtained from the
suppliers of those products, their published announcements, or other
publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility, or any other claims related to non-IBM products. Questions about the capabilities of non-IBM products should be addressed to the suppliers of those products.
The following terms are trademarks of International Business Machines Corporation in the United States, other countries, or both:

Ascential
Ascential QualityStage
Ascential Software
DataStage
DB2
IBM
OS/2