Dynamic Partitioning in Informatica 8.X
High Availability
Grid Computing
Dynamic Partitioning
PowerCenter Domains
The first time you install Informatica Services, you create a domain and add a
node to the domain.
• Resilience.
• Failover.
• Recovery.
• Partition types:
• Database partitioning.
• Hash auto-keys.
• Hash user keys.
• Key range.
• Pass-through.
• Round-robin.
• Update partitioning information using the Partitions view on the Mapping tab
of session properties.
• You can configure the following information when you edit or add a partition point:
• Specify the partition type at the partition point.
• Add and delete partitions.
• Enter a description for each partition.
• The Integration Service uses a hash function to group rows of data among partitions.
• Hash partitioning can improve session performance. The hash function usually processes
numerical data more quickly than string data.
• For hash user keys partitioning, specify the port or ports to use as the hash key.
• We created a sample mapping (m_orders_scd3). Without partitioning, the session
takes 37 seconds to complete.
• With hash user keys partitioning, the session completes in 22 seconds, as shown
in the figure below.
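The hash grouping described above can be sketched in Python. This is only an illustrative model, not the Integration Service's implementation: the function name `hash_partition`, the use of `zlib.crc32` as the hash, and the sample rows are all assumptions made for the example.

```python
import zlib
from collections import defaultdict


def hash_partition(rows, key_fields, num_partitions):
    """Group rows into partitions by hashing the chosen key fields."""
    partitions = defaultdict(list)
    for row in rows:
        # Build one key string from the user-selected fields.
        key = "|".join(str(row[field]) for field in key_fields)
        # crc32 is a stand-in hash (assumption); it is stable across runs.
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append(row)
    return dict(partitions)


# Hypothetical order rows, not taken from m_orders_scd3.
orders = [
    {"order_id": 1, "customer_id": 101},
    {"order_id": 2, "customer_id": 102},
    {"order_id": 3, "customer_id": 101},
    {"order_id": 4, "customer_id": 103},
    {"order_id": 5, "customer_id": 102},
]

result = hash_partition(orders, ["customer_id"], 3)
for index, part in sorted(result.items()):
    print(index, [row["order_id"] for row in part])
```

Rows that share a key value always land in the same partition, which is the property that matters when a downstream transformation has to see every row of a group together.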
• With key range partitioning, the Integration Service distributes rows of data based on a port
that you define as the partition key.
• For each partition, you define a range of values for that port.
• With key range partitioning, the session completes in 33 seconds, as shown in the
figure below.
• Source/target statistics
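Key range partitioning can be sketched in the same spirit. The code below is a minimal illustration under stated assumptions: the function name `key_range_partition`, the half-open `[start, end)` ranges, the sample values, and the separate `unmatched` list for rows outside every range are choices made for this example and are not taken from the slides.

```python
def key_range_partition(rows, key_field, ranges):
    """Route each row to the partition whose range contains its key value.

    ranges is a list of (start, end) pairs; start is inclusive and end is
    exclusive (an assumption of this sketch).
    """
    partitions = [[] for _ in ranges]
    unmatched = []  # rows whose key falls outside every range (sketch only)
    for row in rows:
        value = row[key_field]
        for index, (start, end) in enumerate(ranges):
            if start <= value < end:
                partitions[index].append(row)
                break
        else:
            unmatched.append(row)
    return partitions, unmatched


# Hypothetical rows and ranges for illustration.
items = [{"item_id": n} for n in (5, 1500, 2999, 3000, 7200, 12000)]
ranges = [(1, 3000), (3000, 6000), (6000, 9000)]

parts, unmatched = key_range_partition(items, "item_id", ranges)
print([[row["item_id"] for row in part] for part in parts])
print([row["item_id"] for row in unmatched])
```

Unlike hash partitioning, the balance here depends entirely on how well the chosen ranges match the real distribution of key values.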
• In round-robin partitioning, the Integration Service distributes rows of data evenly to all partitions.
• The session based on this mapping reads item information from three flat files of different sizes:
• Source file 1: 80,000 rows
• Source file 2: 5,000 rows
• Source file 3: 15,000 rows
• When the Integration Service reads the source data, the first partition begins processing 80% of the
data, the second partition processes 5% of the data, and the third partition processes 15% of the
data.
• To distribute the workload more evenly, set a partition point at the Filter transformation and set the
partition type to round-robin. The Integration Service distributes the data so that each partition
processes approximately one-third of the data.
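The rebalancing in this example can be checked with a short Python sketch. It uses the three file sizes from the slide; the function name `round_robin` and the way rows are represented are assumptions made for illustration only.

```python
from itertools import chain


def round_robin(rows, num_partitions):
    """Deal rows out to partitions one at a time, in turn."""
    partitions = [[] for _ in range(num_partitions)]
    for position, row in enumerate(rows):
        partitions[position % num_partitions].append(row)
    return partitions


# Row counts of the three flat files from the example above.
source_sizes = [80_000, 5_000, 15_000]

# Each row is modelled as (source_file_number, row_number).
rows = chain.from_iterable(
    ((source, n) for n in range(size))
    for source, size in enumerate(source_sizes, start=1)
)

balanced = round_robin(rows, 3)
sizes = [len(part) for part in balanced]
print(sizes)  # [33334, 33333, 33333]
```

Before the partition point the split is 80,000 / 5,000 / 15,000 rows; after round-robin redistribution each partition holds about one-third of the 100,000 rows.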
• If the volume of data grows or you add more CPUs, you might need to adjust
partitioning so the session run time does not increase.
• When you use dynamic partitioning, you can configure the partition information so
the Integration Service determines the number of partitions to create at run time.
• The Integration Service scales the number of session partitions at run time based on
factors such as source database partitions or the number of nodes in a grid.
• Disabled. Do not use dynamic partitioning. Defines the number of partitions on the
Mapping tab.
• Based on number of partitions. Sets the partitions to a number that you define in
the Number of Partitions attribute. Use the $DynamicPartitionCount session
parameter, or enter a number greater than 1.
• Based on number of nodes in grid. Sets the partitions to the number of nodes in the
grid running the session. If you configure this option for sessions that do not run on a
grid, the session runs in one partition and logs a message in the session log.
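How the partition count follows from the chosen setting can be summarised in a small Python sketch. It is a hypothetical model of the rules listed above, not an Informatica API: the function `resolve_partition_count`, its parameter names, and the mode strings are invented for the example, and the fallback to one partition when no source partition count is supplied is an assumption of the sketch.

```python
def resolve_partition_count(mode, mapping_tab_count=1, number_of_partitions=None,
                            grid_nodes=None, source_db_partitions=None):
    """Return the number of session partitions for a dynamic partitioning mode."""
    if mode == "disabled":
        # Partitions are whatever is defined on the Mapping tab.
        return mapping_tab_count
    if mode == "based_on_number_of_partitions":
        # The slide asks for a number greater than 1.
        if number_of_partitions is None or number_of_partitions <= 1:
            raise ValueError("number_of_partitions must be greater than 1")
        return number_of_partitions
    if mode == "based_on_number_of_nodes_in_grid":
        # A session that does not run on a grid runs in one partition.
        return grid_nodes if grid_nodes else 1
    if mode == "based_on_source_partitioning":
        # Fallback to 1 is an assumption of this sketch.
        return source_db_partitions if source_db_partitions else 1
    raise ValueError(f"unknown mode: {mode}")


print(resolve_partition_count("disabled", mapping_tab_count=2))
print(resolve_partition_count("based_on_number_of_partitions", number_of_partitions=3))
print(resolve_partition_count("based_on_number_of_nodes_in_grid", grid_nodes=4))
print(resolve_partition_count("based_on_number_of_nodes_in_grid"))
```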
• Edit the session task and go to the Config Object tab. Set Dynamic Partitioning to
"Based on number of partitions" and set Number of Partitions to 3.
• With dynamic partitioning based on the number of partitions, the session completes
in 32 seconds, as shown in the figure below.
• Source/target statistics
• Edit the session task and go to the Config Object tab. Set Dynamic Partitioning to
"Based on number of nodes in grid".
• With dynamic partitioning based on the number of nodes in the grid, the session
completes in 25 seconds, as shown in the figure below.
• Edit the session task and go to the Config Object tab. Set Dynamic Partitioning to
"Based on source partitioning".
• Session run time does not increase when the volume of data grows or you add
more CPUs.
• Even if a system in the grid fails, the session still completes (grid computing).