Implementing Parallel Processing in PeopleSoft
You may have seen batch jobs that create or update millions of rows and take a long time
to finish. Apart from bad design or poorly written SQL, the sheer amount of data is usually
the reason. Parallel processing is a method that can bring such a job down to a matter of
minutes: we divide the data into smaller logical sets and run the job for each set at the
same time. PeopleSoft supports parallel processing in Application Engine programs by means
of temporary tables and a built-in mechanism for launching multiple instances.
Suppose I run an update process for all employees without parallel processing. The job has
to update every employee’s record, and say it finishes in 10 minutes. What if I divide my
employees by business unit (BU) and run one instance of the same program per BU
simultaneously? Each individual process will take about 2 minutes, and since all the
processes run at the same time, the whole set of records is finished in about 2 minutes. So
I have reduced my processing time from 10 minutes to 2 minutes (in a real scenario it may
take longer than 2 minutes, but it will still be better than the initial 10 minutes). The
following diagram illustrates processes running in parallel.
There will be a delay between the actual processing start of the first instance and that of
the second instance (and likewise for every later instance). This is because each process
locks the data it is going to process in the base table. While one process is updating the
base table, the others have to wait until the database frees it. Similarly, there will be a
delay when we write the updated data back to the base table. But this delay is negligible
compared to the overall gain in performance.
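The arithmetic behind the example can be sketched in a few lines of Python. This is only a back-of-the-envelope model, not a measurement: it assumes the work splits evenly across five business units (the count is inferred from the 10-minute and 2-minute figures above), and `stagger_minutes` is a made-up value standing in for the wait on the base-table lock.

```python
def parallel_wall_clock(total_minutes, instances, stagger_minutes=0.0):
    """Estimate wall-clock minutes when work is split evenly across instances.

    stagger_minutes is a hypothetical per-instance start delay that models
    each instance waiting for the previous one to release the base table.
    """
    per_instance = total_minutes / instances
    # The last instance starts after (instances - 1) staggered delays.
    last_start = stagger_minutes * (instances - 1)
    return last_start + per_instance


serial = 10.0
ideal = parallel_wall_clock(serial, instances=5)
realistic = parallel_wall_clock(serial, instances=5, stagger_minutes=0.25)
print(f"serial={serial} ideal={ideal} with_delay={realistic}")
# prints: serial=10.0 ideal=2.0 with_delay=3.0
```

Even with the assumed delay, the parallel run stays well under the serial 10 minutes, which is the point made above.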
After reading the previous post, you might have wondered how splitting the process improves
performance. If I have only one update statement and I already use set-based processing for
it, where exactly is the improvement? It is not always a good idea to make a process run in
parallel. Sometimes it can actually hurt performance. In the example above, it would be an
overhead for the server to send 5 SQL statements instead of one. So when do I need to make
the process parallel? Below are some scenarios in which you can think of introducing
parallel processing.
2. There is a possibility that multiple users will run my process at the same time for the
same transaction. This can happen if the same process is available in both batch and online
mode. In this case, if one person runs it in batch mode and another runs it online for the
same transaction, one of the processes may error out or update the tables with wrong data.
There is also a chance that two users run the same process with the same run control
parameters at the same time.
4. You are doing row-by-row processing in your Application Engine program. There can be
scenarios where you cannot do all your processing in a set-based manner. In such cases,
implementing parallel processing is the best option. Since the processing time is directly
tied to the number of rows, the more rows you have to process, the more time it is going to
take. So divide your data into logical sets and run the process in parallel. This reduces
the number of rows for each individual process instance and thereby the processing time.
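To make "divide your data into logical sets" concrete, here is a minimal Python sketch of the idea. It is an illustration of the pattern only, not PeopleSoft code: the employee rows, the `bu` key, and the 10% raise are all invented for the example, and worker threads stand in for separate Application Engine instances.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_by_key(rows, key):
    """Group rows into logical sets, one set per distinct key value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def process_set(rows):
    """Stand-in for row-by-row logic: apply a 10% raise to each row."""
    return [{**row, "salary": round(row["salary"] * 1.1, 2)} for row in rows]


# Hypothetical sample data; a real run would read these from the driver table.
employees = [
    {"emplid": "E1", "bu": "US001", "salary": 1000.0},
    {"emplid": "E2", "bu": "US001", "salary": 2000.0},
    {"emplid": "E3", "bu": "CAN01", "salary": 3000.0},
]

sets_by_bu = partition_by_key(employees, "bu")

# One worker per logical set, mirroring one program instance per business unit.
with ThreadPoolExecutor(max_workers=len(sets_by_bu)) as pool:
    results = dict(zip(sets_by_bu, pool.map(process_set, sets_by_bu.values())))
```

Each worker only ever sees the rows of its own set, which is what keeps the per-instance row count, and so the per-instance run time, small.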
Implementing parallel processing in PeopleSoft is a simple task if you follow all the steps
systematically. Below is a generic guideline for implementing parallel processing in
PeopleSoft.
4. Build the Temp Tables – Build the temp tables only after assigning them to the
Application Engine program. When you build the table, it generates as many copies of the
table as the number of instances you have specified (online + batch), the maximum being 99.
Each copy’s name ends with its instance number: TAO1, TAO2, etc.
5. Locking logic – Now that you are done with the configuration, you need to write the code
to suit parallel processing. The locking logic is an important part. If two processes are
initiated for the same transactions at the same time, both will process those transactions,
and at the end both will try to write the same data back to the tables. As a result, the
process errors out on insert, or in the case of an update ends up writing wrong data. To
overcome this, we use locking logic, which relies on the fields created in step 1. At the
beginning of the program, write an update statement as below to update the base table with
the process instance of the current process.
So when a second program tries to work on this transaction, it will see that it is in use
by another Application Engine process and skip it, thus avoiding duplicate processing and
the chance of errors. Now comes another scenario: if you are also running the process in
online mode, then your Process Instance will always be zero and the previous SQL will not
help you. In such cases we need to use the second field added in step 1. The SQL then needs
to be modified as below.
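The original statements for this step are not reproduced here, so the following is only a sketch of the locking pattern, written in Python against an in-memory SQLite database so that it can be run anywhere. The table `base_txn` and the column `lock_owner` are hypothetical stand-ins for the base table and for the second field from step 1. The idea is that a row is claimed only if nobody holds it yet, and the affected row count tells the caller whether the claim succeeded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE base_txn ("
    " txn_id TEXT PRIMARY KEY,"
    " process_instance INTEGER NOT NULL DEFAULT 0,"
    " lock_owner TEXT NOT NULL DEFAULT '')"  # hypothetical second lock field
)
conn.executemany("INSERT INTO base_txn (txn_id) VALUES (?)", [("T1",), ("T2",)])


def claim(conn, txn_id, process_instance, lock_owner=""):
    """Try to lock one transaction; return True only if this caller got it.

    A row is free when both lock columns are empty. Online runs pass
    process_instance=0 and rely on lock_owner instead.
    """
    cur = conn.execute(
        "UPDATE base_txn SET process_instance = ?, lock_owner = ?"
        " WHERE txn_id = ? AND process_instance = 0 AND lock_owner = ''",
        (process_instance, lock_owner, txn_id),
    )
    conn.commit()
    return cur.rowcount == 1


first = claim(conn, "T1", process_instance=1001)   # batch run gets the lock
second = claim(conn, "T1", process_instance=1002)  # second batch run is refused
online = claim(conn, "T2", process_instance=0, lock_owner="USER_A")
online_again = claim(conn, "T2", process_instance=0, lock_owner="USER_B")
print(first, second, online, online_again)
# prints: True False True False
```

Putting the "is it free?" check inside the WHERE clause of the UPDATE itself, rather than in a separate SELECT, is what prevents two runs from both believing they own the row.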
6. Load data into the temp table and do the processing – Now you can start loading the data
from the driver table into the temp tables based on the process instance, and then do the
processing.
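A rough sketch of this step, again in Python with SQLite standing in for the real database: the table names `driver` and `sal_tao1`, the columns, and the instance numbers are all assumptions made for illustration. Only the rows stamped with the current process instance are copied, and the processing then runs against the temp table alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE driver (emplid TEXT, sal REAL, process_instance INTEGER);
    CREATE TABLE sal_tao1 (process_instance INTEGER, emplid TEXT, sal REAL);
    INSERT INTO driver VALUES ('E1', 100.0, 1001);
    INSERT INTO driver VALUES ('E2', 200.0, 1001);
    INSERT INTO driver VALUES ('E3', 300.0, 1002);
    """
)

current_instance = 1001  # hypothetical process instance of this run

# Copy only the rows this run has locked into its own temp table instance.
conn.execute(
    "INSERT INTO sal_tao1 (process_instance, emplid, sal)"
    " SELECT process_instance, emplid, sal FROM driver"
    " WHERE process_instance = ?",
    (current_instance,),
)

# All further processing touches the temp table only, not the driver table.
conn.execute("UPDATE sal_tao1 SET sal = sal * 10")
loaded = conn.execute("SELECT emplid, sal FROM sal_tao1 ORDER BY emplid").fetchall()
print(loaded)
# prints: [('E1', 1000.0), ('E2', 2000.0)]
```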
7. Use %Table() meta-SQL – When you do all the processing with temp tables, always make
sure that you wrap your temporary record name in the %Table() meta-SQL. It automatically
resolves to the table name for the current instance. For example, if your current instance
is 6, then the SQL below resolves to UPDATE PS_SAL_TAO6 SET SAL=SAL*10
UPDATE %Table(SAL_TAO) SET SAL=SAL*10
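As a way to picture what the resolution does, here is a toy model in Python. It is not how PeopleTools implements %Table(); `resolve_table` is a hypothetical helper that only mimics the naming rule described above for temporary records (PS_ prefix plus the instance number as suffix).

```python
import re

MAX_INSTANCES = 99  # upper bound on temp table instances mentioned above


def resolve_table(sql, instance):
    """Toy model: replace %Table(REC) with PS_REC<instance>."""
    if not 1 <= instance <= MAX_INSTANCES:
        raise ValueError(f"instance must be between 1 and {MAX_INSTANCES}")
    return re.sub(
        r"%Table\(\s*(\w+)\s*\)",
        lambda m: f"PS_{m.group(1)}{instance}",
        sql,
    )


resolved = resolve_table("UPDATE %Table(SAL_TAO) SET SAL=SAL*10", 6)
print(resolved)
# prints: UPDATE PS_SAL_TAO6 SET SAL=SAL*10
```

Because the same source SQL resolves to a different physical table for each instance, the program text never has to hard-code an instance number.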
8. Unlock the base table when done with processing – Once you are done with the
processing, unlock the base table rows by setting PROCESS_INSTANCE back to 0. Otherwise,
those transactions will never be picked up by any process run at a later time. If your
business logic requires a transaction to be processed only once, you can skip this step.
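The unlock is the mirror image of the lock. The sketch below uses Python and SQLite purely for illustration; `base_txn` and the instance numbers are hypothetical. It resets only the rows held by the finishing run and leaves other runs' locks alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE base_txn (txn_id TEXT, process_instance INTEGER);
    INSERT INTO base_txn VALUES ('T1', 1001);
    INSERT INTO base_txn VALUES ('T2', 1001);
    INSERT INTO base_txn VALUES ('T3', 1002);
    """
)


def release(conn, process_instance):
    """Unlock every row held by this run and return how many were released."""
    cur = conn.execute(
        "UPDATE base_txn SET process_instance = 0 WHERE process_instance = ?",
        (process_instance,),
    )
    conn.commit()
    return cur.rowcount


released = release(conn, 1001)
still_locked = conn.execute(
    "SELECT txn_id FROM base_txn WHERE process_instance <> 0"
).fetchall()
print(released, still_locked)
# prints: 2 [('T3',)]
```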
9. Avoid truncating tables – Most people are used to truncating the temporary tables as the
last step of the processing to clear out the temporary data. You can now skip this step,
because in the latest versions of PeopleTools it is taken care of internally. Once your
program ends, Application Engine automatically issues the truncate statements.
I am sure that if you follow these steps carefully while creating your program, you can
easily build your process to work in parallel.