INF_3201_h24_Assignment_1
Assignment 1 - MPI
26 August - 16 September 2024
The function p generates a char * of a given size (first argument) and tests if
the char * given by you (second argument) is equal to the generated one. You
are not able to see what is generated by p. It returns 0 only when the two char
* are equal, 1 otherwise. It is provided by the crackme.o object file, described
in "crackme.h". Listing 1 shows one attempt to test if the value of the char * created
by p is equal to "I think this is the char* !".
As it is not the correct one, printf line 5 outputs 1 on stdout.
2 Tasks
You first have to automate the generation of the char * in a sequential way.
You then have to implement parallel versions that find the correct char *, using
MPI.
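The sequential task can be sketched as an exhaustive enumeration that treats the buffer as a counter with one digit per byte. This is only a minimal sketch under stated assumptions: the exact prototype of p is not reproduced here, so the code takes a hypothetical checker of type check_fn with an assumed (size, candidate) signature, and mock_check is a made-up stand-in that lets the loop be exercised without crackme.o. The names next_candidate and brute_force are illustrative as well.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical checker type, assumed to behave like p:
 * returns 0 when the candidate matches, 1 otherwise. */
typedef int (*check_fn)(int size, char *candidate);

/* Advance buf to the next candidate, treating it as a base-256 counter
 * with the last byte as the least significant digit.
 * Returns 0 once the counter has wrapped around to all zeros. */
int next_candidate(unsigned char *buf, int size)
{
    for (int i = size - 1; i >= 0; i--) {
        if (++buf[i] != 0)
            return 1;          /* no carry, a new candidate is ready */
        /* buf[i] wrapped to 0, so carry into the next byte */
    }
    return 0;
}

/* Try every byte string of length size. On success, buf holds the
 * matching candidate and 0 is returned; otherwise 1 is returned. */
int brute_force(check_fn check, char *buf, int size)
{
    assert(check != NULL && buf != NULL && size > 0);
    memset(buf, 0, (size_t)size);
    do {
        if (check(size, buf) == 0)
            return 0;
    } while (next_candidate((unsigned char *)buf, size));
    return 1;
}

/* Made-up stand-in for p, used only to exercise the loop. */
int mock_check(int size, char *candidate)
{
    static const unsigned char secret[] = { 'h', 0xFE };
    return (size == 2 && memcmp(candidate, secret, 2) == 0) ? 0 : 1;
}
```

Calling brute_force(mock_check, buf, 2) with a 2-byte buffer visits at most 256 * 256 candidates. The same loop would presumably accept p in place of mock_check if its prototype matches the assumed one.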
For parallel versions, we ask you to start with a work stealing approach,
where a worker (when being idle) is able to steal tasks allocated to other workers.
When found, the result is sent back to the process that initially distributes tasks
to workers, which writes it at the end of “solution.txt”.
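One possible way to represent stealable work, shown here without any MPI calls, is a half-open range of candidate indices that a victim can split on request. This is a sketch under assumptions, not the expected design: the range_t type, the steal_half function and the idea of numbering candidates with a 64-bit index are all hypothetical. In an MPI version, the stolen range would travel in a message between the victim and the idle worker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical task: a half-open range [begin, end) of candidate indices. */
typedef struct {
    uint64_t begin;
    uint64_t end;
} range_t;

/* Number of candidates still untested in r. */
uint64_t range_len(range_t r)
{
    return r.end - r.begin;
}

/* The victim keeps the lower (larger or equal) half of its remaining
 * range and hands the upper half to the thief. If fewer than two
 * candidates remain, nothing is stolen and an empty range is returned. */
range_t steal_half(range_t *victim)
{
    assert(victim != NULL && victim->begin <= victim->end);
    range_t stolen = { victim->end, victim->end };   /* empty by default */
    uint64_t len = range_len(*victim);
    if (len >= 2) {
        uint64_t mid = victim->begin + (len - len / 2);
        stolen.begin = mid;    /* thief gets [mid, end) */
        victim->end = mid;     /* victim keeps [begin, mid) */
    }
    return stolen;
}
```

Splitting in halves is only one policy. Stealing a fixed-size chunk, or choosing the victim at random, are alternatives that could be compared in the evaluation.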
Once this version is implemented, you should propose (at least) one
optimization to your code. Each optimization should be backed up by evaluations
against the initial parallel version. Each version of your code should be tagged,
saved and accessible for evaluation. We expect at least 3 code versions:
Sequential, Parallel, Optimization_i.
To evaluate the performance of your implementations, you need to run a
set of measurements. We will focus on time measurements. The sequential
version will be a baseline for comparison against your parallel version(s), running
on multiple nodes. Your solution should be able to scale when size and the
number of processes vary. Use concepts discussed during lectures to make your
performance evaluations.
Hint: In this assignment, we expect and use char * from the C language.
Nowadays, on most systems, a char has a numerical value between -128 and 127 (both
included). A char is a 1-byte value, i.e. 8 bits, thus 256 possible values.
Reminder: (just in case) A char * is a pointer that can be used to point to
the first element of an array of char...
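Since each char can take one of 256 bit patterns, an array of size bytes can be read as a number written in base 256. The sketch below maps a linear index to the corresponding array, which is one way to let different processes work on disjoint index intervals. The function names are hypothetical, and the use of a 64-bit index is an assumption that limits the sketch to small sizes (at most 8 bytes for the mapping, at most 7 for the count).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the candidate with linear index idx into buf (size bytes),
 * most significant byte first, i.e. idx written in base 256.
 * Assumes size <= 8 so that every byte comes from a 64-bit index. */
void index_to_candidate(uint64_t idx, char *buf, int size)
{
    assert(buf != NULL && size > 0 && size <= 8);
    for (int i = size - 1; i >= 0; i--) {
        buf[i] = (char)(unsigned char)(idx & 0xFF);  /* lowest 8 bits */
        idx >>= 8;
    }
}

/* Total number of candidates of a given size: 256^size.
 * Only valid for size < 8, where the result fits in 64 bits. */
uint64_t candidate_count(int size)
{
    assert(size > 0 && size < 8);
    return (uint64_t)1 << (8 * size);
}
```

Whether a byte is printed as a value in -128..127 or 0..255 depends on whether it is read as a signed or an unsigned char. The bit pattern stored in the array is the same in both cases.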
3 Requirements
There are many ways to optimize the workload among the processes with such
a problem. Choices made for this distribution will impact the scalability and
performance of your solution. You should be able to explain why you decided
to go with the chosen distribution(s).
3.1 General
• Plagiarism is taken seriously. Always cite the resources you use.
Using others' code or code generators is not allowed.
• Your solution cannot use any shortcut that reduces the functionality of
the program or the difficulty of the assignment. For example, do not use
any mechanism available on the cluster that would allow you to avoid
MPI and still provide a solution (e.g. a shared file system). If in
doubt, ask the TA during colloquiums.
3.2 Time performance analysis
Evaluate the time performance of your parallel version. For that, you should
study two dimensions: the number of processes and the size of the input.
Provide a graph for each of the two studies (number of processes
or size of input on the x axis, time to solution on the y axis). These two
separate graphs help you answer the following questions: how well does your
solution scale with the number of processes (for the largest input size
studied)? How well does your solution scale with the size of the
input (for a fixed number of processes)?
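The scaling questions above are usually answered with derived metrics rather than raw times. The sketch below computes speedup and efficiency from measured times, plus the bound given by Amdahl's law. Using Amdahl's law as the "theoretical maximum" is an assumption on our part (it is one common model, and the lectures may use another), and the function names are hypothetical.

```c
#include <assert.h>

/* Speedup of a parallel run relative to the sequential baseline. */
double speedup(double t_seq, double t_par)
{
    assert(t_seq > 0.0 && t_par > 0.0);
    return t_seq / t_par;
}

/* Parallel efficiency: speedup divided by the number of processes. */
double efficiency(double t_seq, double t_par, int nprocs)
{
    assert(nprocs > 0);
    return speedup(t_seq, t_par) / nprocs;
}

/* Amdahl's law: upper bound on the speedup with nprocs processes when
 * a fraction serial_frac of the work cannot be parallelized. */
double amdahl_bound(double serial_frac, int nprocs)
{
    assert(serial_frac >= 0.0 && serial_frac <= 1.0 && nprocs > 0);
    return 1.0 / (serial_frac + (1.0 - serial_frac) / nprocs);
}
```

As a purely illustrative example with made-up numbers, a sequential time of 120 s and a parallel time of 20 s on 8 processes give a speedup of 6 and an efficiency of 0.75.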
3.3 Report
The report should have the following sections:
• Introduction - describe your understanding of the assignment
• Sequential solution - explain your sequential solution and how to run it.
• Parallel design solution - describe how you parallelized the code, how the
workload distribution is being done. Explain your optimized version(s).
Explain how to run each version.
• Time performance analysis - describe your experiments and results with
different problem sizes and numbers of processes, running on various numbers
of nodes.
• Discussion - discuss positive and negative points of your solutions.
Compare the theoretical maximum speedup with your results for different
sizes while using different numbers of processes, from 2 processes
to the maximum number of processes available in the cluster. This is also
the section to discuss the proposed optimizations done to your code.
• Summary - sum up and conclude your work
The report should be between 6 and 10 pages. Remember to:
• Explain everything in detail and answer questions from the assignment.
• Make your figures clear and understandable. Include references, captions
and axes descriptions.
• Take into consideration the hardware used (e.g. heterogeneity, shared
resources).
4 Archive
In this archive you can find a directory called code that contains:
• crackme.o - Object file that contains a compiled version of the crackme.
Needs to be linked at compilation to use p.
• solution.txt - file to store the numerical values found for p's char *, one per
size
runs mainMPI with a size of 10, asks for 8 processes, launched on the hosts
listed in hostfile.
5 Cluster
The Computer Science department has a cluster of nodes that can be used
to run your solution. You only need to copy your files to the frontend
(ificluster.ifi.uit.no) to access them from all other nodes.
To log in to the cluster, use (on Linux):
ssh <your UiT ID>@ificluster.ifi.uit.no
Make sure that you are familiar with the welcome message from the cluster,
which explains how to use the cluster correctly.
6 Hand-in
You will work alone for this assignment. GitHub Classroom is used as the
hand-in platform for the course. You can, and probably should, commit and
push as often as possible. All code and scripts needed to run your solution on
the cluster nodes (if anything has been added outside of the initial archive)
must be added and explained.
Remember to push all your changes before 16 September 2024,
23:59:59. Any changes pushed to your repository after that deadline will not
be evaluated.