Partition Types

The document discusses performance tuning in Informatica mappings that are running slowly. Some key points discussed include: 1. Partitioning sessions into multiple threads to process data in parallel which can improve performance. There are different types of partitioning including key range, hash, round robin, and pass through. 2. Steps for performance tuning include designing mappings with minimal transformations, filtering unwanted data in the source, and checking session statistics and logs. 3. Increasing the number of partitions allows more concurrent processing but can overload the system if too many partitions are used. Partition points control how data is distributed among partitions.

Uploaded by malleswari Ch

Requirement: How to do performance tuning in Informatica when a mapping takes a long time

Symptoms: long-running sessions, timeouts, session failures, high CPU consumption.

One cause is a large volume of data; another is a large number of columns combined with complex logic.
In these scenarios we do performance tuning in Informatica.
PARTITION: Parallel processing, in the sense that we increase the number of threads. Instead of one thread we go for a number of parallel threads.
1. DB-level partition (at the DB level and not in Informatica).
2. Range: if we have data ranges such as 1 to 10 and 10 to 20, each range of data goes to its own partition.
3. Round robin: based on thread availability.
4. Hash: based on a hash algorithm. Transformations such as Sorter, Aggregator, Lookup and Rank use hash partitioning.

Steps:
• As per best practices, design the mapping with the minimum number of transformations.
• As far as possible, filter out unwanted data in the Source Qualifier itself.
• If a mapping is taking a long time, get the session statistics in the Monitor through Get Run Properties.
• If the source is reading very few records (a throughput of 5, 10 or 100, for example), look at the session log.
• Check the busy percentages of the reader, writer and transformation threads.
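The last step can be sketched in a few lines of Python. This is only an illustration: the function name `find_bottleneck` and the sample figures are hypothetical, and the busy percentages are assumed to have been copied by hand from the session log.

```python
def find_bottleneck(busy_percentages):
    """Return the name of the thread with the highest busy percentage."""
    if not busy_percentages:
        raise ValueError("no thread statistics supplied")
    # The busiest thread is the most likely bottleneck.
    return max(busy_percentages, key=busy_percentages.get)


# Hypothetical figures, not taken from a real session log.
stats = {"reader": 12.5, "transformation": 96.0, "writer": 31.0}
print(find_bottleneck(stats))
```

With these sample figures the transformation thread is the busiest, so tuning would start with the transformations rather than the source or target.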
Partitioning Sessions
• Performance can be improved by processing data in parallel within a single session, by creating multiple partitions of the pipeline.
• By default a session has one partition, of type pass-through, which creates 1 reader thread and 1 writer thread.
• Rather than processing a large volume of data through a single reader and a single writer, partitions share the work across multiple readers and multiple writers.
• Increasing the number of partitions allows the Integration Service to create multiple connections to sources and process partitions of source data concurrently.
• We have 4 types of partitions at session level:
   - Key range
   - Hash
   - Pass-through
   - Round-robin
• For key range, we need to specify range values for each partition based on the key column.
• For pass-through, we need to specify the SQ queries at session level for each partition.
Round-robin partitioning is used when we want to distribute rows of data evenly to all partitions.
Hash auto-keys: The Integration Service uses a hash function to group rows of data among partitions. The Integration Service groups the data based on a partition key.
Informatica PowerCenter
Session Partitioning
Types of Informatica Partitions

After tuning all the performance bottlenecks we can further improve performance by adding partitions.

We can go for either:

Dynamic partitioning (the number of partitions is passed as a parameter)
or non-dynamic partitioning (the number of partitions is fixed while coding).
Apart from optimizing the session, Informatica partitions are useful when we need to load a huge volume of data, or when we are using a source that already has partitions defined and using those partitions will improve session performance.
The partition attributes include the partition point, the number of partitions, and the partition types.

Partition Point:
There can be one or more pipelines inside a mapping.
Adding a partition point divides a pipeline into more pipeline stages.
Informatica creates one partition by default for every pipeline stage.
As we increase the number of partition points, the number of threads increases.
Informatica has three main types of threads: reader, writer and transformation threads.

The number of partitions can be set at any partition point.

We can define up to 64 partitions at any partition point in a pipeline.
When you increase the number of partitions, you increase the number of processing threads, which can improve session performance. However, if you create a large number of partitions or partition points in a session that processes large amounts of data, you can overload the system.
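To see why many partitions and partition points can overload the system, here is a rough Python estimate. It rests on a simplifying assumption that is not stated in these notes: every pipeline stage runs one thread per partition. The function name `estimate_threads` is hypothetical, and the real thread count depends on the Integration Service.

```python
MAX_PARTITIONS = 64  # limit per partition point, as stated in the notes


def estimate_threads(stages, partitions):
    """Estimate threads, assuming one thread per partition per stage."""
    if stages < 1:
        raise ValueError("a pipeline has at least one stage")
    if not 1 <= partitions <= MAX_PARTITIONS:
        raise ValueError("partitions must be between 1 and 64")
    return stages * partitions


# Reader, transformation and writer stages with 4 partitions each.
print(estimate_threads(3, 4))
```

Under this assumption the thread count grows with the product of stages and partitions, so adding both partition points and partitions multiplies the load.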

You cannot create partition points for the following transformations:


• Source definition
• Sequence Generator
• XML Parser
• XML target
• Unconnected transformations

The partition type controls how the Integration Service distributes data among partitions at partition points.
The Integration Service creates a default partition type at each partition point.

The types of partitions are:

1. Database partitioning
2. Hash auto-keys
3. Hash user keys
4. Key range
5. Pass-through
6. Round-robin

Database Partitioning
For source database partitioning, Informatica checks the database system for partition information, if any, and fetches data from the corresponding nodes in the database into the session partitions.
When you use target database partitioning, the Integration Service loads data into the corresponding database partition nodes.
Use database partitioning for Oracle and IBM DB2 sources and IBM DB2 targets.

Pass-through
A pass-through partition does not affect the distribution of data across partitions. With the default single partition the data runs through a single pipeline. Pass-through is the default for all sessions.
The Integration Service processes data without redistributing rows among partitions.
Hence all rows in a single partition stay in that partition after crossing a pass-through partition point.

Key range
Used when we want to partition the data based on upper and lower limits.
The Integration Service distributes the rows of data based on a port or set of ports that we define as the partition key. For each port, we define a range of values.
Based on the ranges that we define, the rows are sent to different partitions.
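The key range idea can be sketched in Python as follows. This is an illustration, not Informatica's implementation: the function name `key_range_partition` is hypothetical, and treating the start of each range as inclusive and the end as exclusive is an assumption made here so that ranges such as 1 to 10 and 10 to 20 do not overlap.

```python
def key_range_partition(rows, key, ranges):
    """Send each row to the partition whose range contains row[key].

    ranges is a list of (start, end) pairs; start is inclusive and
    end is exclusive (an assumption made for this sketch).
    """
    partitions = [[] for _ in ranges]
    for row in rows:
        for index, (start, end) in enumerate(ranges):
            if start <= row[key] < end:
                partitions[index].append(row)
                break
        else:
            # No range matched; fail loudly instead of dropping the row.
            raise ValueError(f"no range covers key value {row[key]!r}")
    return partitions


rows = [{"id": 1}, {"id": 9}, {"id": 10}, {"id": 15}]
parts = key_range_partition(rows, "id", [(1, 10), (10, 20)])
print([len(p) for p in parts])
```

Rows with similar key values end up together, so the partitions are only balanced when the chosen ranges hold roughly equal numbers of rows.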

Round-robin partitioning is used when we want to distribute rows of data evenly among all partitions.
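A minimal Python sketch of round-robin distribution, assuming rows are simply dealt out to the partitions in turn. The function name `round_robin_partition` is hypothetical.

```python
def round_robin_partition(rows, partition_count):
    """Deal rows out to the partitions in turn, like dealing cards."""
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    partitions = [[] for _ in range(partition_count)]
    for position, row in enumerate(rows):
        partitions[position % partition_count].append(row)
    return partitions


parts = round_robin_partition(list(range(7)), 3)
print([len(p) for p in parts])
```

The partition sizes differ by at most one row, whatever the rows contain, which is why this type suits cases where the data does not need to be grouped.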

Hash auto-keys: The Integration Service uses a hash function to group rows of data among partitions. The Integration Service groups the data based on a partition key.

Hash user keys: The Integration Service uses a hash function to group rows of data among partitions. We define the ports that generate the partition key.
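Hash partitioning on user-defined keys can be sketched in Python like this. It is only an illustration: the function name `hash_partition` is hypothetical, and CRC32 stands in for whatever hash function the Integration Service actually uses.

```python
import zlib


def hash_partition(rows, key_ports, partition_count):
    """Group rows so that equal partition keys share a partition."""
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    partitions = [[] for _ in range(partition_count)]
    for row in rows:
        # Build the partition key from the chosen ports.
        key = "|".join(str(row[port]) for port in key_ports)
        # CRC32 is a stand-in hash, chosen because it is deterministic.
        index = zlib.crc32(key.encode("utf-8")) % partition_count
        partitions[index].append(row)
    return partitions


rows = [
    {"dept": "A", "amount": 10},
    {"dept": "B", "amount": 20},
    {"dept": "A", "amount": 30},
]
parts = hash_partition(rows, ["dept"], 4)
print([len(p) for p in parts])
```

Because rows with the same key always land in the same partition, transformations that work on groups, such as Aggregator or Rank, see each group whole.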


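CREATE TABLE Sales_Range
( salesman_id   NUMBER(5),
  salesman_name VARCHAR2(30),
  sales_amount  NUMBER(10),
  sales_date    DATE
) PARTITION BY RANGE(sales_date)
(
  -- Upper bound is 1 February 2000, so this partition holds the January 2000 rows.
  PARTITION sales_jan2000 VALUES LESS THAN (TO_DATE('02/01/2000','MM/DD/YYYY'))
  -- Partitions for later months would follow here.
);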

CASE ( ) or DECODE ( )
Case( ) : CASE is similar to DECODE but easier to understand when reading the code.
Example:
SQL> SELECT Salary,
       CASE Salary
         WHEN 2500 THEN 'Low'
         WHEN 4000 THEN 'High'
         ELSE 'Medium'
       END AS GRADE
     FROM EMPLOYEES;

Decode( ) :
Example:
SQL> SELECT Salary,
       DECODE(Salary, 2500, 'Low',
                      4000, 'High',
                      'Medium') AS GRADE
     FROM EMPLOYEES;
