FIXME

This page is intended to supplement Job submission using PBSPro, which will be simplified to cover only the basics.

PBSPro Scheduler

The CHPC's Lengau cluster uses the PBSPro scheduler to manage both interactive and batch access to the compute resources (compute nodes).

Building a New MPI Library

Firstly, don't do it.

Secondly, open a Helpdesk ticket and request that we build and install the MPI library for you.

Thirdly, if you ignore the above, you had better do it right, so read and follow the instructions below carefully.

Any MPI library you build must use the PBSPro momd services to launch its ranks on the allocated compute nodes. It must not use ssh or another transport mechanism.

The exact details depend on the MPI library you are building, but in every case there are two things to do.

(1) Specify the rsh protocol.

For example, with the Intel MPI library you set the variable

I_MPI_HYDRA_BOOTSTRAP=rsh

(2) Specify the /opt/pbs/bin/pbs_tmrsh remote shell command (instead of ssh) to launch the ranks.

For example, with Intel MPI

I_MPI_HYDRA_BOOTSTRAP_EXEC=/opt/pbs/bin/pbs_tmrsh

The intention is to make sure that mpirun or mpiexec uses /opt/pbs/bin/pbs_tmrsh instead of ssh to launch each rank process on the allocated compute nodes.
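As a minimal sketch of how the two settings might be applied in a job script, the helper below exports them only after confirming that the launcher exists. The function name configure_tm_launcher is hypothetical, and the variable names assume Intel MPI's Hydra process manager (I_MPI_HYDRA_BOOTSTRAP and I_MPI_HYDRA_BOOTSTRAP_EXEC); check the reference for your Intel MPI version.

```shell
# Sketch only: configure_tm_launcher is a hypothetical helper, not part of
# PBSPro or Intel MPI. It assumes the Hydra variable names listed above.
configure_tm_launcher() {
    # Default to the pbs_tmrsh path given on this page.
    launcher="${1:-/opt/pbs/bin/pbs_tmrsh}"
    if [ ! -x "$launcher" ]; then
        echo "launcher not found or not executable: $launcher" >&2
        return 1
    fi
    # (1) use the rsh protocol, (2) with pbs_tmrsh as the remote shell command
    export I_MPI_HYDRA_BOOTSTRAP=rsh
    export I_MPI_HYDRA_BOOTSTRAP_EXEC="$launcher"
}
```

Calling configure_tm_launcher before mpirun in the job script, and stopping the job if it returns non-zero, avoids a silent fall back to ssh.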

If necessary, edit the source code of mpirun and mpiexec to make sure that pbs_tmrsh is being used. For most MPI libraries, these two are shell scripts that can be edited directly; for others, they are written in C or Fortran and have to be changed and recompiled.

The advantage of using pbs_tmrsh is that you don't have to specify a machine file to mpirun as it will retrieve the list of nodes from PBSPro directly.

Now Test It!

Launch a test job that requests two nodes. Once the job starts, ssh into the second node (as listed in the machine file or by qstat -f <jobid>) and run

pstree -p

and check that the MPI program (the executable specified to be run by mpirun or mpiexec) is listed as a child process of momd.

If it is listed under sshd then you have failed. Exit the node, qdel the job and fix the MPI library source code for mpirun and mpiexec.
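The pstree check can also be scripted. The sketch below walks a process's parent chain through /proc (Linux only) and reports whether an ancestor has a given name. The function names proc_field and has_ancestor are hypothetical. The daemon name to look for is whatever pstree shows on the node: this page calls it momd, while PBSPro installations commonly name the process pbs_mom.

```shell
# Sketch only: proc_field and has_ancestor are hypothetical helpers.
# Both rely on the Linux /proc filesystem.

# Print one field (for example Name or PPid) from /proc/<pid>/status.
proc_field() {
    [ -r "/proc/$1/status" ] || return 1
    while read -r key value _; do
        if [ "$key" = "$2:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Return 0 if any ancestor of process $1 is named $2, otherwise 1.
has_ancestor() {
    pid=$(proc_field "$1" PPid) || return 1
    while [ "$pid" -gt 0 ]; do
        [ "$(proc_field "$pid" Name)" = "$2" ] && return 0
        pid=$(proc_field "$pid" PPid) || return 1
    done
    return 1
}
```

On the second node, given the PID of one MPI rank, `has_ancestor <pid> sshd` succeeding means the rank was launched through ssh and the build has failed the test. The same call with the MoM daemon's name should succeed for a correct build.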

Open MPI

Open MPI has full support for PBS Pro and should auto-detect it. If not, and you want to make sure your build of Open MPI includes full PBS Pro support, use the --with-tm option when you run the configure command to configure the build.
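One way to confirm that a finished build picked up PBS Pro support is to look for the tm components in the output of ompi_info. The sketch below assumes that ompi_info lists them on lines such as "MCA ras: tm (...)" and "MCA plm: tm (...)". The function name has_tm_support is hypothetical, and the /opt/pbs prefix in the comment is inferred from the pbs_tmrsh path above.

```shell
# Sketch only: has_tm_support is a hypothetical helper.
# A build might be configured along these lines (prefix paths are assumptions):
#   ./configure --prefix="$HOME/openmpi" --with-tm=/opt/pbs

# Read ompi_info output on stdin; succeed if a tm launcher (plm) or
# resource allocation (ras) component is listed.
has_tm_support() {
    grep -Eq 'MCA (plm|ras): tm( |$)'
}
```

A typical use would be `ompi_info | has_tm_support && echo "tm support present"`, run with the new build's bin directory first in PATH.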

See the Open MPI FAQ for more information.

/var/www/wiki/data/pages/guide/pbspro.txt · Last modified: 2020/07/10 15:50 by kevin