This page is intended to supplement Job submission using PBSPro, which will be simplified to the basics.
The CHPC's Lengau cluster uses the PBSPro scheduler to manage both interactive and batch access to the compute resources (compute nodes).
Firstly, don't build your own MPI library.
Secondly, open a Helpdesk ticket and request that we build and install the MPI library for you.
Thirdly, if you ignore the above, you had better do it right:
Any MPI library you build must use the PBSPro momd services to launch its ranks on the allocated compute nodes. It must not use ssh or another transport mechanism.
The exact details differ and are specific to the MPI library you are building.
(1) Specify the rsh protocol.
For example, with the Intel MPI library you set the variable:
I_MPI_HYDRA_BOOTSTRAP=rsh
(2) Specify the /opt/pbs/bin/pbs_tmrsh remote shell command (instead of ssh) to launch the ranks.
For example, with Intel MPI:
I_MPI_HYDRA_BOOTSTRAP_EXEC=/opt/pbs/bin/pbs_tmrsh
The intention is to make sure that mpirun or mpiexec uses /opt/pbs/bin/pbs_tmrsh instead of ssh to launch each rank process on the allocated compute nodes.
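As a rough illustration of steps (1) and (2) for Intel MPI, the two settings could be exported near the top of a job script before mpirun is called. This is only a sketch: it uses the variable names from the Intel MPI Hydra documentation (I_MPI_HYDRA_BOOTSTRAP and I_MPI_HYDRA_BOOTSTRAP_EXEC), the function name use_pbs_tmrsh is made up for this example, and my_mpi_app is a placeholder program name.

```shell
# Sketch only: point Intel MPI's Hydra launcher at pbs_tmrsh.
# use_pbs_tmrsh is a hypothetical helper name; the default path is the
# pbs_tmrsh location given on this page and can be overridden with $1.
use_pbs_tmrsh() {
    # Step (1): select the rsh bootstrap protocol
    export I_MPI_HYDRA_BOOTSTRAP=rsh
    # Step (2): use pbs_tmrsh as the remote shell command instead of ssh
    export I_MPI_HYDRA_BOOTSTRAP_EXEC="${1:-/opt/pbs/bin/pbs_tmrsh}"
}

use_pbs_tmrsh

# Inside a PBSPro job script the usual launch line would follow,
# for example (placeholder program name):
#   mpirun ./my_mpi_app
```

Other MPI libraries use different settings, so treat this as a pattern and not as a recipe.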
If necessary, edit the source code of mpirun and mpiexec to make sure that pbs_tmrsh is being used. For most MPI libraries, these two are shell scripts that can be edited directly; for others, they are written in C or Fortran and have to be changed and recompiled.
The advantage of using pbs_tmrsh is that you don't have to specify a machine file to mpirun as it will retrieve the list of nodes from PBSPro directly.
Launch a test job that requests two nodes. Once the job starts, ssh into the second node (as listed in the machine file or by qstat -f <jobid>) and run pstree -p to check that the MPI program (the executable specified to be run by mpirun or mpiexec) is listed as a child process of momd.
If it is listed under sshd then you have failed. Exit the node, qdel the job and fix the MPI library source code for mpirun and mpiexec.
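The manual pstree check can also be scripted. The sketch below walks up the parent chain of one MPI rank and reports whether the nearest recognisable ancestor is sshd or the PBSPro mom daemon. Both function names are made up for this example, my_mpi_app is a placeholder, and the pattern *mom* is an assumption meant to match the daemon whether it appears as momd or pbs_mom in the process list.

```shell
# Print the command names of a PID's ancestors, nearest parent first.
# Hypothetical helper; relies on "ps -o ppid=" and "ps -o comm=".
ancestors() {
    pid=$1
    while [ -n "$pid" ] && [ "$pid" -gt 1 ]; do
        pid=$(ps -o ppid= -p "$pid" | tr -d ' ')
        if [ -n "$pid" ] && [ "$pid" -gt 0 ]; then
            ps -o comm= -p "$pid"
        fi
    done
}

# Read ancestor names (one per line) on stdin and report the launcher.
# Prints "ssh", "pbs" or "unknown" for the first name that matches.
classify_launcher() {
    while read -r name; do
        case "$name" in
            sshd*) echo "ssh"; return 0 ;;   # rank was started through ssh
            *mom*) echo "pbs"; return 0 ;;   # assumed name of the PBSPro mom daemon
        esac
    done
    echo "unknown"
}

# Usage on the second compute node (placeholder program name):
#   ancestors "$(pgrep -n my_mpi_app)" | classify_launcher
```

A result of "ssh" corresponds to the failure case described above; "pbs" means the rank was started by the scheduler's own daemon.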
Open MPI has full support for PBS Pro and should auto-detect it. If not, and you want to make sure your build of Open MPI includes full PBS Pro support, use the --with-tm option when you run the configure command to configure the build.
See the Open MPI FAQ for more information.
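For an Open MPI build, one way to confirm that PBS Pro (TM) support was compiled in is to look for the tm components in the output of ompi_info. The sketch below assumes that PBSPro is installed under /opt/pbs (inferred from the pbs_tmrsh path on this page) and uses a hypothetical install prefix; has_tm_support is a made-up helper name.

```shell
# Configure sketch (shown as comments, not run here):
#   ./configure --prefix="$HOME/opt/openmpi" --with-tm=/opt/pbs
#   make && make install
# The prefix is a hypothetical install location; /opt/pbs is assumed to be
# the PBSPro installation directory.

# Read ompi_info output on stdin and succeed if a tm launcher (plm) or
# resource allocation (ras) component is listed.
has_tm_support() {
    grep -Eq 'MCA (plm|ras): tm( |$)'
}

# Usage with the newly built Open MPI first in PATH:
#   ompi_info | has_tm_support && echo "TM support is built in"
```

If the check fails, rerun configure and inspect its output for messages about tm before rebuilding.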