Partition Types
Steps when a mapping takes a long time:
As per best practices, design the mapping with the minimum number of transformations.
As far as possible, filter out unwanted data in the Source Qualifier itself.
If a mapping takes a long time, get the session statistics in the Workflow Monitor
from Get Run Properties.
If the source is reading very few records (a throughput as low as 5, 10 or 100),
look at the session log.
Check the busy percentages of the reader, writer and transformation threads.
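The busy-percentage check can be sketched as below. This is only an illustration: it assumes that a thread's busy percentage is the share of its run time that was not idle, and the timings are hypothetical values, not taken from a real session log.

```python
def busy_percentage(total_run_time, total_idle_time):
    """Share of the run time the thread spent working, in percent."""
    if total_run_time <= 0:
        return 0.0
    return 100.0 * (total_run_time - total_idle_time) / total_run_time


def find_bottleneck(threads):
    """Return the (name, busy %) pair of the busiest thread."""
    busy = {name: busy_percentage(run, idle) for name, (run, idle) in threads.items()}
    name = max(busy, key=busy.get)
    return name, busy[name]


# Hypothetical (run time, idle time) pairs in seconds for each thread type.
threads = {
    "reader": (100.0, 90.0),
    "transformation": (100.0, 5.0),
    "writer": (100.0, 60.0),
}
bottleneck = find_bottleneck(threads)
print(bottleneck)  # ('transformation', 95.0)
```

The thread with the highest busy percentage is the first candidate for tuning; here that would be the transformation thread.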
Partitioning Sessions
Performance can be improved by processing data in parallel in a single session by
creating multiple partitions of the pipeline.
By default a session has one partition, a pass-through partition, which creates
1 reader thread and 1 writer thread.
Rather than processing large volumes of data through a single reader and a single
writer, we share the load across multiple readers and multiple writers using partitions.
Increasing the number of partitions allows the Integration Service to create multiple
connections to sources and process partitions of source data concurrently.
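The idea of processing partitions concurrently can be sketched in plain Python. This is not how the Integration Service is implemented; `process_partition` and the sample data are hypothetical stand-ins for the read, transform and write work done on one partition.

```python
from concurrent.futures import ThreadPoolExecutor


def process_partition(rows):
    """Stand-in for the read -> transform -> write work done on one partition."""
    return sum(rows)


# Hypothetical source data already split into three partitions.
partitions = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# One worker per partition, mirroring one reader/writer pair per partition.
with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
    results = list(pool.map(process_partition, partitions))

print(results)  # [6, 15, 24]
```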
We have 4 types of partitions at session level:
Key range
Hash
Pass through
Round robin
If we create a key range partition, we need to specify range values for each partition
based on the key column.
If it is pass through, we need to specify the SQ queries at session level for each
partition.
Round robin partitioning is used when we want to distribute rows of data evenly to all
partitions.
Hash auto-keys: The Integration Service uses a hash function to group rows of data among
partitions. The Integration Service groups the data based on a partition key.
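For the pass-through case, the per-partition SQ queries usually differ only in their filter. A minimal sketch of generating such queries follows; the key column `EMPLOYEE_ID` and the boundary values are assumptions made for illustration, and the generated text would still have to be entered for each partition in the session.

```python
def partition_queries(table, key, boundaries):
    """Build one SELECT per partition from sorted boundary values.

    boundaries [b0, b1, ..., bn] yields n queries covering [b0, b1), [b1, b2), ...
    """
    queries = []
    for low, high in zip(boundaries, boundaries[1:]):
        queries.append(
            f"SELECT * FROM {table} WHERE {key} >= {low} AND {key} < {high}"
        )
    return queries


# Hypothetical key column and boundaries giving three non-overlapping partitions.
queries = partition_queries("EMPLOYEES", "EMPLOYEE_ID", [1, 1000, 2000, 3000])
for query in queries:
    print(query)
```

Keeping the filters non-overlapping ensures that no row is read by two partitions.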
Informatica PowerCenter
Session Partitioning
Types of Informatica Partitions
After tuning all the performance bottlenecks, we can further improve performance by
adding partitions.
Partition Point:
There can be one or more pipelines inside a mapping.
Adding a partition point will divide this pipeline into many pipeline stages.
Informatica will create one partition by default for every pipeline stage.
Increasing the number of partition points increases the number of threads.
Informatica has mainly three types of threads: reader, writer and transformation
threads.
The partition type controls how the Integration Service distributes data among
partitions at partition points.
The Integration Service creates a default partition type at each partition point.
Database Partitioning
For source database partitioning, Informatica checks the database system for partition
information, if any, and fetches data from the corresponding nodes in the database into
the session partitions.
When you use target database partitioning, the Integration Service loads data into the
corresponding database partition nodes.
Use database partitioning for Oracle and IBM DB2 sources and IBM DB2 targets.
Pass through
A pass-through partition does not affect the distribution of data across partitions;
the data continues through the same pipeline. This is the default for all sessions.
The Integration Service processes data without redistributing rows among partitions.
Hence all rows in a single partition stay in that partition after crossing a
pass-through partition point.
Key range
Used when we want to partition the data based on upper and lower limits.
The Integration Service distributes the rows of data based on a port or set of ports
that we define as the partition key. For each port, we define a range of values with a
lower and an upper limit for each partition.
Based on the ranges that we define, the rows are sent to different partitions.
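Key range routing can be sketched as follows. This is an illustration only: the `salary` key, the sample ranges and the choice to treat the upper limit as exclusive are all assumptions of the sketch, not a statement of how the Integration Service handles boundaries.

```python
def key_range_partition(rows, key, ranges):
    """Route each row to the partition whose [start, end) range holds its key.

    Rows that fit no range are collected separately instead of being dropped.
    """
    partitions = [[] for _ in ranges]
    unmatched = []
    for row in rows:
        for index, (start, end) in enumerate(ranges):
            if start <= row[key] < end:
                partitions[index].append(row)
                break
        else:
            # No range matched this row's key value.
            unmatched.append(row)
    return partitions, unmatched


# Hypothetical rows and two ranges on the salary port.
rows = [{"salary": 2500}, {"salary": 4000}, {"salary": 3200}, {"salary": 9000}]
ranges = [(0, 3000), (3000, 5000)]
partitions, unmatched = key_range_partition(rows, "salary", ranges)
print(partitions)
print(unmatched)
```

Checking the unmatched rows is a quick way to spot gaps between the defined ranges.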
Round robin
Used when we want to distribute rows of data evenly among all partitions.
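A minimal sketch of round robin distribution: each row goes to the next partition in turn, so partition sizes differ by at most one row. The function name and sample rows are hypothetical.

```python
def round_robin_partition(rows, partition_count):
    """Assign row i to partition i modulo partition_count."""
    partitions = [[] for _ in range(partition_count)]
    for index, row in enumerate(rows):
        partitions[index % partition_count].append(row)
    return partitions


partitions = round_robin_partition(list(range(10)), 3)
print(partitions)  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```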
Hash auto-keys: The Integration Service uses a hash function to group rows of data among
partitions. The Integration Service groups the data based on a partition key.
Hash user keys: The Integration Service uses a hash function to group rows of data among
partitions. We define the number of ports to generate the partition key.
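Hash partitioning on user-defined key ports can be sketched as below. The hash function used by the Integration Service is not stated in these notes, so `zlib.crc32` is swapped in purely as a deterministic stand-in, and the `dept` port is a hypothetical key.

```python
import zlib


def hash_partition(rows, key_ports, partition_count):
    """Group rows so that equal key values always land in the same partition."""
    partitions = [[] for _ in range(partition_count)]
    for row in rows:
        # Build a compound key from the chosen ports.
        key = "|".join(str(row[port]) for port in key_ports)
        index = zlib.crc32(key.encode("utf-8")) % partition_count
        partitions[index].append(row)
    return partitions


# Hypothetical rows hashed on the dept port.
rows = [
    {"dept": "HR", "id": 1},
    {"dept": "IT", "id": 2},
    {"dept": "HR", "id": 3},
    {"dept": "IT", "id": 4},
]
partitions = hash_partition(rows, ["dept"], 3)
print(partitions)
```

Unlike round robin, the partitions may be uneven, but all rows sharing a key stay together, which matters for transformations that work on groups of rows.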
CASE ( ) or DECODE ( )
Case( ) : CASE is similar to DECODE but easier to read when going through the code.
Example:
SQL> SELECT Salary,
CASE Salary
WHEN 2500 THEN 'Low'
WHEN 4000 THEN 'High'
ELSE 'Medium'
END AS GRADE
FROM EMPLOYEES;
Decode( ) :
Example:
SQL> SELECT Salary,
DECODE(Salary, 2500,'Low',
4000,'High',
'Medium') AS GRADE
FROM EMPLOYEES;
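Both queries map a salary to the same grade. The lookup logic can be sketched in Python as below; the `decode` helper is hypothetical and only mimics the search/result/default argument order of DECODE, it is not part of any database library.

```python
def decode(value, *args):
    """Mimic DECODE(value, search1, result1, ..., [default])."""
    pairs, default = args, None
    if len(args) % 2 == 1:
        # An odd number of arguments means the last one is the default.
        pairs, default = args[:-1], args[-1]
    for search, result in zip(pairs[0::2], pairs[1::2]):
        if value == search:
            return result
    return default


grades = [decode(s, 2500, "Low", 4000, "High", "Medium") for s in (2500, 4000, 3000)]
print(grades)  # ['Low', 'High', 'Medium']
```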