Some Very Under Done Instructions For HPC 2013: Hpc@lists - Iitk.ac - in
These instructions are intended only for users with some experience. You need proficiency in Linux
and parallel programming. Some details of the HPC2013 cluster are in the other documents. In case of
problems, write to [email protected]. You will also be added to the list [email protected]. Read the
instructions for HPC2010 as well, as you may find them helpful. There may be some teething problems.
If you are used to the older cluster, please pay attention to I_MPI_FABRICS in the scripts provided.
workq is an interactive queue that places you on a node where you can run commands. You
cannot access /opt/software otherwise.
3. You can change your password on a CC machine but not on the cluster.
4. The changed password will be effective on the cluster within an hour.
5. We have created a home directory for you which will be initially empty. The path for this
directory is /home/<username>.
6. We have also created a /scratch<your-name> directory, which is a temporary directory.
/scratch is faster than /home, so you may prefer to write temporary results and carry out
computation there. Contents of /scratch can be deleted at any time, in particular if they are
not in use. All software is in /opt/software.
7. Please read the structure of the queues given below.
8. There is no backup, and you are responsible for taking regular backups of your area.
9. Please use the cluster in a sensible manner and follow the rules of engagement, otherwise you
may end up causing problems for others.
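Items 6 and 8 together mean that results left in /scratch, or anywhere else on the cluster, are your own responsibility. The following is only a minimal sketch of one way to archive a working directory, assuming tar is available. The function name backup_dir and the paths in the usage comment are hypothetical and not part of the cluster software.

```shell
# backup_dir SRC DEST_DIR
# Archive directory SRC into DEST_DIR as a dated, compressed tarball and
# print the archive path. Hypothetical helper, not provided by the cluster.
backup_dir() {
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "backup_dir: no such directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    archive="$dest/$(basename "$src")-$(date +%Y%m%d).tar.gz"
    # -C switches to the parent directory so the archive holds relative paths.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "$archive"
}

# Example (placeholder paths):
#   backup_dir "$HOME/results" "$HOME/backups"
```

An archive that stays on the cluster is not a backup in the sense of item 8, so copy it to a machine of your own afterwards.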
Currently you can test your programs with the Intel compiler. Here is how.
The setup file is /opt/software/intel/initpaths.
To use the Intel 32-bit compilers, type
source /opt/software/intel/initpaths ia32
To use the Intel 64-bit compilers, type
source /opt/software/intel/initpaths intel64
If you want to do special tuning for the trace analyzer, a special second argument is required,
but you will have to do your own research on this. The commands above just use the default
analyzer. Please read the Intel documentation for details. A common mistake is to run
programs compiled on one cluster directly on another cluster. You need to recompile your
programs when you change clusters.
After you have sourced the file, compile your programs using the relevant compiler wrappers,
such as mpiicc, mpicc, etc. Confusion in PATH settings is one of the main sources of
error.
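The setup and compile steps above can be sketched as a pair of small helpers. This is an illustration only: the function names and the source file hello.c are hypothetical, and the code assumes a bash session, since it uses the same source command as the instructions.

```shell
# Path of the setup file named in the instructions above.
INITPATHS=/opt/software/intel/initpaths

# Load the 64-bit Intel environment, or explain why that is not possible.
setup_intel64() {
    if [ -r "$INITPATHS" ]; then
        source "$INITPATHS" intel64
    else
        echo "cannot read $INITPATHS (are you on a node that sees /opt/software?)" >&2
        return 1
    fi
}

# Print the first MPI compiler wrapper that the current PATH resolves to.
# Useful because a confused PATH is a common source of error.
which_mpi_compiler() {
    for c in mpiicc mpicc; do
        if command -v "$c" >/dev/null 2>&1; then
            command -v "$c"
            return 0
        fi
    done
    echo "no MPI compiler wrapper found in PATH" >&2
    return 1
}

# Typical use on the cluster (hello.c is a placeholder):
#   setup_intel64 && which_mpi_compiler
#   mpiicc -O2 -o hello hello.c
```

Checking the resolved path before compiling makes it easier to notice when a wrapper from a different installation is being picked up.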
Use the following in a file, say “test”, for submitting programs. Remember to do a chmod 755 on the
file. Change this file according to the nodes and queue required and the job name; the queue name and
the job name are variables you can change in the example file given below. The number of nodes should
change with the queue. Always keep ppn at 20, except for the “hyperthread” queue, where ppn is 40, and
for a workq parallel job, where it is 4. Make ppn smaller only in exceptional circumstances and never
make it larger. Even then, stay within the node limit of the queue. The hyperthread queue is an
experimental queue and may give better results than the normal queues. If this is the case, please
inform us.
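The job file supplied by the administrators is not reproduced in this text, so the following is only a hedged sketch of what such a file might look like, assuming a PBS/Torque-style scheduler (suggested by qsub and ppn). The helper names, the job name, the executable ./a.out and the I_MPI_FABRICS value are all placeholders; take the real fabric setting from the scripts provided.

```shell
# ppn_for_queue QUEUE: processes per node, following the rule above
# (20 everywhere, except hyperthread = 40 and workq = 4).
ppn_for_queue() {
    case $1 in
        hyperthread) echo 40 ;;
        workq)       echo 4 ;;
        *)           echo 20 ;;
    esac
}

# make_job_script NAME QUEUE NODES: print an illustrative job file to stdout.
make_job_script() {
    ppn=$(ppn_for_queue "$2")
    cat <<EOF
#!/bin/bash
#PBS -N $1
#PBS -q $2
#PBS -l nodes=$3:ppn=$ppn
#PBS -j oe
cd \$PBS_O_WORKDIR
# Placeholder value; use the setting from the scripts provided.
export I_MPI_FABRICS=shm:dapl
mpirun -np $(( $3 * ppn )) -machinefile \$PBS_NODEFILE ./a.out
EOF
}

# Example: a 2-node job for the small queue.
#   make_job_script myjob small 2 > test
#   chmod 755 test
#   qsub test
```

Generating the file from the queue name keeps ppn and the total process count consistent, which is the part most easily broken when a job file is edited by hand.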
A short description of the queues is given below. workq is meant to be both an interactive queue and
a batch queue; the remaining queues are batch queues only. The actual behaviour may differ somewhat
from this description, since the number of days and similar limits are policy decisions.
queue        walltime                  max simultaneous jobs     min cores  max cores  min nodes  max nodes  total nodes
workq        24 hours, 2 hrs CPU time  2 (1 login + 1 testing)   1          6          1          1          4
small        5 days                    3 running, 1 waiting      20         40         1          2          96
medium       4 days                    3 running, 1 waiting      40         120        2          6          256
large        3 days                    2 running, 1 waiting      120        640        6          32         482
hyperthread  5 days                    1 running, 1 waiting      40         80         1          2          16
highmem      5 days                    1 running, 1 waiting      2          20         1          1          5
mini         2 hours                   1 running, 1 waiting      20         40         1          2          32
test         -                         -                         -          -          -          -          2
Notes: in hyperthread, each node behaves as if it has 40 cores; highmem is for large memory jobs;
mini is for jobs of small duration.
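The core limits in the table can be checked before submitting a job. The sketch below simply copies the min and max core values from the table into a lookup; the function names are hypothetical, and the numbers will go stale if the queue policy changes.

```shell
# core_limits QUEUE: print "MIN MAX" cores for a queue, copied from the table.
core_limits() {
    case $1 in
        workq)       echo "1 6" ;;
        small)       echo "20 40" ;;
        medium)      echo "40 120" ;;
        large)       echo "120 640" ;;
        hyperthread) echo "40 80" ;;
        highmem)     echo "2 20" ;;
        mini)        echo "20 40" ;;
        *)           return 1 ;;
    esac
}

# check_cores QUEUE CORES: succeed only if CORES lies within the queue's range.
check_cores() {
    limits=$(core_limits "$1") || { echo "unknown queue: $1" >&2; return 1; }
    min=${limits% *}
    max=${limits#* }
    if [ "$2" -ge "$min" ] && [ "$2" -le "$max" ]; then
        echo "ok: $2 cores fits $1 ($min to $max)"
    else
        echo "rejected: $1 allows $min to $max cores, not $2" >&2
        return 1
    fi
}

# Example:
#   check_cores medium 80 && qsub test
```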
If you do not see /opt/software when you log in, this is deliberate. You must use qsub -I to get an
interactive session on a node where it is visible.
The list of all possible scripts would be long, but as a user you are expected to be the domain expert
and should be able to work them out. You can install software in your own directories; in no case
will you need root privileges to install software.
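Installing into your own directory usually means choosing a prefix under /home and putting its bin directory on PATH. The following is a sketch under the assumption of an autotools-style package; the prefix $HOME/local and the helper prepend_path are hypothetical choices, not cluster conventions.

```shell
# Hypothetical per-user install prefix; any directory you own will do.
PREFIX="$HOME/local"

# prepend_path DIR: put DIR at the front of PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                          # already present, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Typical build into the prefix, run inside the unpacked source tree:
#   ./configure --prefix="$PREFIX"
#   make
#   make install
prepend_path "$PREFIX/bin"
```

Guarding against duplicates keeps PATH short when the snippet is sourced repeatedly, which helps avoid the PATH confusion mentioned earlier.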