Assignment 1, CS633: Code Explanation
Directory - Assignment 1
Files
● src.c - source code
● Makefile
● run.py - helper code for execution and plotting
● Readme.pdf - this file
Code Explanation
src.c
● The run_stencil_computation function takes the local data array and the four
receive arrays (one per neighbour) and performs the stencil computation for one
time step.
● In main, we first compute the ranks of the four adjacent processes and store them
in the variables left, right, top and bot (-1 is stored if no such neighbour exists).
● Data initialisation is performed before each method starts.
● In method 1 we send and receive each element individually using MPI_Issend
and MPI_Irecv. We use MPI_Waitall to wait until all outstanding sends and
receives have completed, and then call run_stencil_computation to perform the
stencil computation.
● In method 2 we send packed data and receive the entire array in a single message.
The rest is the same as method 1.
● In method 3 we send data using a derived datatype constructed with
MPI_Type_vector. The rest is the same as method 1.
● The program prints its output in the format
“timeformethod1,timeformethod2,timeformethod3”, which run.py then processes
into the required format.
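As an illustration of the output format, here is a minimal Python sketch of how one such line could be parsed. The function name parse_timings and the assumption that each field is a plain floating-point number are ours; the actual run.py may do this differently.

```python
def parse_timings(line):
    """Split one 'm1,m2,m3' output line into a dict keyed by method number.

    Hypothetical helper: assumes exactly three comma-separated numeric fields.
    """
    fields = line.strip().split(",")
    if len(fields) != 3:
        raise ValueError(f"expected 3 comma-separated timings, got {len(fields)}")
    # Method numbers start at 1 to match "method 1/2/3" in the text above.
    return {method: float(value) for method, value in enumerate(fields, start=1)}
```

Rejecting lines that do not have exactly three fields makes a truncated or interleaved line from a failed run show up immediately instead of being plotted as data.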
Plots
Number of processes = 16
Number of processes = 36
Number of processes = 49
Number of processes = 64
Issues faced
● The make command sometimes reports clock skew, which occasionally interferes with
the subsequent mpiexec call.
● The code takes approximately 20-30 minutes from compilation to plotting.
● Generating a host file for every mpiexec call gave a faster end-to-end run than
generating one only for each distinct process count.
● [Important] Every mpiexec call with hostfile “hosts” is given a timeout of 25
seconds in case the servers are overloaded. If the timeout expires, a new mpiexec call
is issued with hostfile “hostsimproved” to ensure smooth execution. Please note that
there are 5*4*7=140 such calls, so the run may take a fairly long time (a timeout may
occur for 1-2 calls). After each call completes, the code prints “ok” on stdout to give
a sense of progress.
● In the job script, we observed that mpiexec throws a “not able to parse hostfile”
error after the “make” call. Running mpiexec with the generated hostfiles separately
in a terminal worked fine, and we were not able to determine the cause of the error.
We handle it with a try/except block in Python that sleeps for 5 seconds in the
‘except’ branch, after which the code works. Please therefore wait 15-20 seconds for
the first “ok” to be printed.
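The timeout and retry handling described in the last two points can be sketched as follows. This is a simplified illustration, not the code in run.py: the function name run_with_fallback and its parameters are hypothetical, and retrying the same command after the 5 second sleep is our assumption about what follows the ‘except’ branch.

```python
import subprocess
import time


def run_with_fallback(primary_cmd, fallback_cmd, timeout_s=25.0, sleep_s=5.0):
    """Run primary_cmd and return its stdout, recovering from two failure modes.

    Hypothetical sketch. primary_cmd and fallback_cmd are argument lists, e.g.
    the mpiexec invocations using hostfile "hosts" and "hostsimproved".
    """
    try:
        result = subprocess.run(primary_cmd, capture_output=True, text=True,
                                timeout=timeout_s, check=True)
    except subprocess.TimeoutExpired:
        # Servers look overloaded: switch to the command with the other hostfile.
        # (Running it without a timeout is an assumption of this sketch.)
        result = subprocess.run(fallback_cmd, capture_output=True, text=True,
                                check=True)
    except subprocess.CalledProcessError:
        # Non-zero exit (e.g. a hostfile parse error): pause, then try once more.
        time.sleep(sleep_s)
        result = subprocess.run(primary_cmd, capture_output=True, text=True,
                                timeout=timeout_s, check=True)
    print("ok", flush=True)  # progress marker after every completed call
    return result.stdout.strip()
```

subprocess.run kills the child process when the timeout expires, so a hung call does not keep holding the hosts while the fallback command runs.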