PBS-Pro Job Submission Examples

Job submission scripts are shell scripts with PBS directives embedded in comment lines.

PLEASE NOTE:

  1. All job script files must be plain text files, as created by Unix programs like nano or vi. If you wish to create them on your desktop and copy them over, ensure they contain no formatting and have Unix line endings. Microsoft Word, for example, would be a very bad choice of editor.
  2. The first line should be the hash-bang with the shell path, e.g. #!/bin/bash or #!/bin/sh, if you need to use shell variables and operations. It is not essential for a successful job submission.
  3. The output and error file paths (#PBS -o and #PBS -e directives) should be valid paths to files, preferably in your scratch directory; otherwise the job may be killed. There should be a symbolic link called “scratch” in your home directory that points to your workspace on the Lustre file system, i.e. /lustre/SCRATCH5/users/yourname. If it does not exist, you may create it. If you do not specify output and error file names, they will be created in your home directory using the job name and number, e.g. myjob.o765432 and myjob.e765432.
  4. Job names (#PBS -N ) longer than 15 characters do not work. If you do not give a job name, the submission script file name will be used as the job name.
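
The first and fourth points can be checked before submitting. The following is a minimal sketch only: check_job_script is a hypothetical helper, not a PBS command, and it assumes the job name is given on a single #PBS -N line.

```shell
#!/bin/sh
# Sketch: pre-submission checks for a job script.
# check_job_script is a hypothetical helper, not part of PBS.
check_job_script() {
    script="$1"
    status=0

    # Windows line endings show up as carriage returns (\r).
    cr=$(printf '\r')
    if grep -q "$cr" "$script"; then
        echo "$script: contains Windows (CRLF) line endings" >&2
        status=1
    fi

    # Take the text after the first "#PBS -N" and check its length.
    name=$(sed -n 's/^#PBS[[:space:]]*-N[[:space:]]*//p' "$script" | head -n 1)
    if [ "${#name}" -gt 15 ]; then
        echo "$script: job name '$name' is longer than 15 characters" >&2
        status=1
    fi

    return "$status"
}
```

If the check reports Windows line endings, a copy with Unix line endings can be produced with tr -d '\r' < job.pbs > job_unix.pbs (both file names are placeholders).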

Basic MPI Example

On a cluster system, run the hello MPI executable on 3 nodes, using all 8 cores per node with 8 MPI processes per node (24 processes in total), for up to 5 hours of wall time:

#!/bin/bash
#PBS -e /lustre/SCRATCH5/users/username/stderr.out
#PBS -o /lustre/SCRATCH5/users/username/stdout.out
#PBS -l walltime=5:00:00
#PBS -l select=3:ncpus=8:mpiprocs=8
#PBS -V

source /etc/profile.d/modules.sh
module add openmpi/openmpi-1.6.5_gcc-4.7.2

cd $HOME/scratch

mpirun -np 24  /opt/gridware/bioinformatics/bin/hello

The job submission script above limits wall time to 5 hours; jobs that run longer will be killed by the scheduler. The line #PBS -V passes the environment variables of the shell from which the job is submitted on to the job. The script also assumes that a symbolic link named scratch exists in the home directory, pointing to the user's area under /lustre/SCRATCH5/users, and changes to that directory before launching the program to ensure optimal file access speed.

If the link does not exist, create it by changing to your home directory and running:

ln -s /lustre/SCRATCH5/users/your_username scratch

To verify, the output of ls -al scratch should end with something similar to the following:

scratch -> /lustre/SCRATCH5/users/your_username

and you should further be able to change to that directory, create files, etc.
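
The create-and-verify steps above can also be scripted. This is a sketch under stated assumptions: ensure_scratch_link is a hypothetical helper, and the two paths passed to it are whatever applies to your account.

```shell
#!/bin/sh
# Sketch: create the "scratch" link only if it is missing, then verify it.
# ensure_scratch_link is a hypothetical helper; pass your own paths.
ensure_scratch_link() {
    target="$1"   # e.g. /lustre/SCRATCH5/users/your_username
    link="$2"     # e.g. $HOME/scratch

    # Create the link only when nothing exists under that name yet.
    if [ ! -e "$link" ] && [ ! -L "$link" ]; then
        ln -s "$target" "$link" || return 1
    fi

    # Succeed only if the name is a link to a directory we can write to.
    [ -L "$link" ] && [ -d "$link" ] && [ -w "$link" ]
}
```

A call such as ensure_scratch_link /lustre/SCRATCH5/users/your_username "$HOME/scratch" returns a non-zero status if the link is broken or if a regular directory named scratch already exists.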

Standard Options and Abilities

How to specify number of nodes and cores for a job

This is done using the #PBS -l select directive in the job submission script. Cluster partitions differ in the number of CPU cores per node:

  • Harpertown: 8 cores per node
  • Nehalem: 8 cores per node
  • Westmere: 12 cores per node
  • Dell: 12 cores per node

Selecting a cluster partition based on number of cores

To run a 24-core job on the Dell partition, for instance, only 2 nodes are needed:

#PBS -l select=2:ncpus=12:mpiprocs=12:jobtype=dell:group=nodetype

While 3 nodes are needed to request the same number of cpu cores on the Nehalem partition:

#PBS -l select=3:ncpus=8:mpiprocs=8:jobtype=nehalem:group=nodetype
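
The node counts in the two directives above follow from dividing the total core count by the cores per node and rounding up. A small sketch of that arithmetic, where make_select is a hypothetical helper and not a PBS command:

```shell
#!/bin/sh
# Sketch: build a select directive for a given total core count.
# make_select is a hypothetical helper, not a PBS command.
make_select() {
    total_cores="$1"
    cores_per_node="$2"
    jobtype="$3"
    # Ceiling division: round up so a partly used node is still requested.
    nodes=$(( (total_cores + cores_per_node - 1) / cores_per_node ))
    echo "#PBS -l select=${nodes}:ncpus=${cores_per_node}:mpiprocs=${cores_per_node}:jobtype=${jobtype}:group=nodetype"
}

make_select 24 12 dell      # 2 nodes of 12 cores
make_select 24 8 nehalem    # 3 nodes of 8 cores
```

When the total does not divide evenly, the request is rounded up to whole nodes, so asking for 20 cores on 8-core nodes yields 3 nodes (24 cores).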

Number of cluster nodes in a job

The number of nodes can be obtained by counting the distinct host names in the node file:

NNODES=$(sort -u "$PBS_NODEFILE" | wc -l)

Number of cores (CPUs) in a job

To determine the number of cores, include the following line in the submission script:

NP=$(wc -l < "$PBS_NODEFILE")

Determining the machinefile (list of nodes)

The mpirun command can use the PBS variable $PBS_NODEFILE, which holds the name of the file containing the list of nodes:

mpirun -np $NP -machinefile $PBS_NODEFILE ~/bin/myprogram
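
The node and core counts can be tried outside a job with a stand-in node file. In this sketch the host names node01 to node03 are made up, and the mock file assumes one line per MPI process, as produced by a request like select=3:ncpus=8:mpiprocs=8.

```shell
#!/bin/sh
# Sketch: count nodes and cores from a node file.
# Outside a job, a small mock file stands in for $PBS_NODEFILE;
# the host names node01..node03 are made up for illustration.
if [ -z "${PBS_NODEFILE:-}" ]; then
    PBS_NODEFILE=$(mktemp)
    for host in node01 node02 node03; do
        i=0
        while [ "$i" -lt 8 ]; do
            echo "$host"
            i=$((i + 1))
        done
    done > "$PBS_NODEFILE"
fi

# One line per MPI process, so the line count is the process count.
NP=$(wc -l < "$PBS_NODEFILE" | tr -d ' ')
# Distinct host names give the node count; sort -u also copes with
# node files in which a host's entries are not adjacent.
NNODES=$(sort -u "$PBS_NODEFILE" | wc -l | tr -d ' ')

echo "nodes=$NNODES cores=$NP"
```

With the mock file this prints nodes=3 cores=24; inside a real job the values reflect the resources actually allocated.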

Example: Join error and standard output, run a multithreaded program on 24 cores for up to 5 hours, retain environment

#!/bin/bash
#PBS -j oe
#PBS -l walltime=5:00:00
#PBS -l ncpus=24
#PBS -V

myexecutable arg1 arg2 arg3 -num_threads 24
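
To avoid repeating the thread count in two places, the body of the script can derive it from the job environment. This is a hedged sketch: it assumes that PBS exports NCPUS inside the job, falls back to 24 otherwise, and sets OMP_NUM_THREADS, which only matters if the program uses OpenMP.

```shell
#!/bin/sh
# Assumption: PBS exports NCPUS inside a job; fall back to 24 otherwise.
THREADS="${NCPUS:-24}"

# OMP_NUM_THREADS is only honoured by OpenMP programs.
export OMP_NUM_THREADS="$THREADS"

echo "running with $THREADS threads"
# The program from the example above would then be started as:
# myexecutable arg1 arg2 arg3 -num_threads "$THREADS"
```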

Using Modules

To use the module system, add the following line to your submission script after the #PBS lines:

source /etc/profile.d/modules.sh

Then you will be able to include your required module to set up an environment:

module add openmpi/openmpi-1.6.5_gcc-4.7.2

The MODULEPATH variable is normally set by default. To list the available modules, run the following from the command line:

module avail

Bioinformatics users can make the additional modules available by adding the following line to their .profile:

export MODULEPATH=/opt/gridware/bioinformatics/modules:$MODULEPATH
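
A job script can guard against a missing module system instead of failing later with a confusing error. A minimal sketch, where load_if_available is a hypothetical helper and the init script path defaults to the one used on this page:

```shell
#!/bin/sh
# Sketch: load a module only when the module system is available.
# load_if_available is a hypothetical helper.
load_if_available() {
    name="$1"
    init="${2:-/etc/profile.d/modules.sh}"

    # The init script defines the "module" shell function.
    if [ -r "$init" ]; then
        . "$init"
    fi

    if command -v module >/dev/null 2>&1; then
        module add "$name"
    else
        echo "module command not found; cannot load $name" >&2
        return 1
    fi
}
```

In a job script this would be called as load_if_available openmpi/openmpi-1.6.5_gcc-4.7.2 || exit 1, so the job stops early if the environment cannot be set up.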

How long can a job run?

PBS imposes no upper limit on how long a job can run. If no time limit is specified, a default of 12 hours is assigned. Remember to checkpoint long-running jobs.

To specify a time, include the following line in the submission script:

#PBS -l walltime=hhhh:mm:ss
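
Walltime strings are easy to misread when fields are left out. The sketch below converts a walltime to seconds, assuming the [[hours:]minutes:]seconds form without fractional seconds; walltime_to_seconds is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: convert a walltime string to seconds.
# Assumes the [[hours:]minutes:]seconds form without fractional seconds.
# walltime_to_seconds is a hypothetical helper.
walltime_to_seconds() {
    echo "$1" | awk -F: '{
        secs = 0
        # Each field is worth 60 times the field to its right.
        for (i = 1; i <= NF; i++) secs = secs * 60 + $i
        print secs
    }'
}

walltime_to_seconds 5:00:00    # 18000
walltime_to_seconds 48:00      # 2880, read as minutes:seconds
```

Under that assumption a two-field value such as 48:00 is 48 minutes, not 48 hours, so writing all three fields avoids surprises.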

Exclusive use of cluster nodes

On cluster systems, if a job does not request all the cores on a node, another job may share the same node. To prevent this, request exclusive use of the nodes:

#PBS -l place=excl

How to export your current session's environment to the job being submitted

To pass your login shell environment to the cluster job, either use:

qsub -V

Or add it via a #PBS directive to the submission script:

#PBS -V

Hyper-Threading

On the CHPC cluster, Hyper-Threading is disabled, except on the MIC nodes.

Job Arrays

Suppose I want to do a parameter study. How do I submit all these jobs, each with a different value of the parameter?

Use PBS Job Arrays. You can specify how many jobs to run by adding the directive

#PBS -J <range>

where range is X-Y:Z. X is the start, Y is the end of the range, and Z is the increment.

For example, 2-10:2 indicates all the even jobs from 2 to 10, i.e., 2,4,6,8 and 10.
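
To preview which indices a range produces before submitting, the notation can be expanded by hand. The sketch below only mimics the X-Y:Z notation described above; expand_range is a hypothetical helper, not a PBS command.

```shell
#!/bin/sh
# Sketch: list the indices described by a range of the form X-Y or X-Y:Z.
# expand_range is a hypothetical helper; it only mimics the notation.
expand_range() {
    range="$1"
    start="${range%%-*}"     # text before the first "-"
    rest="${range#*-}"       # text after the first "-"
    case "$rest" in
        *:*) end="${rest%%:*}"; step="${rest#*:}" ;;
        *)   end="$rest"; step=1 ;;
    esac
    i="$start"
    while [ "$i" -le "$end" ]; do
        printf '%s\n' "$i"
        i=$((i + step))
    done
}

expand_range 2-10:2    # prints 2, 4, 6, 8 and 10, one per line
```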

PBS defines two environment variables:

PBS_ARRAY_INDEX job array index
PBS_ARRAY_ID job array id

These variables are also defined as attributes:

array_index
array_id

Based on the job array index, different input files can be used for each job in the job array.
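
One common pattern is to keep one parameter per line in a text file and let each sub-job read the line matching its index. This is a sketch only: the parameter file and its values are made up, and the index defaults to 2 when PBS_ARRAY_INDEX is not set, so the snippet can be tried outside a job array.

```shell
#!/bin/sh
# Sketch: pick the parameter for this sub-job from a text file.
# The parameter file and its values are made up for illustration.
# Outside a job array PBS_ARRAY_INDEX is unset, so default to 2 here.
INDEX="${PBS_ARRAY_INDEX:-2}"

PARAMS=$(mktemp)
printf '%s\n' 0.1 0.5 1.0 > "$PARAMS"

# Line N of the file holds the parameter for array index N.
PARAM=$(sed -n "${INDEX}p" "$PARAMS")

echo "index $INDEX uses parameter $PARAM"
rm -f "$PARAMS"
```

An alternative with the same effect is to name the input files after the index, for instance input.$PBS_ARRAY_INDEX.inp (a hypothetical naming scheme), and pass that file to the program.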

Job array example: a job script that submits two jobs, each using one processor.

NOTE: The PBS directives specify the resources that will be used by EACH individual job, NOT all the jobs together.

#!/bin/sh
#PBS -V
#PBS -l select=1:ncpus=1:mpiprocs=1,walltime=48:00
#PBS -N Job_Array_Test
#PBS -j oe -o ja.^array_index^.pbs
#PBS -J 1-2

source /etc/profile.d/modules.sh
module add openmpi/openmpi-1.6.5_gcc-4.7.2
unset echo

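# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"
# Record which sub-job this is, in the job output and in a per-index file
echo "$PBS_ARRAY_INDEX"
echo "$PBS_ARRAY_ID $PBS_ARRAY_INDEX" >> "ja.$PBS_ARRAY_INDEX.out"
echo ' ' >> "ja.$PBS_ARRAY_INDEX.out"
# Run the program, appending its output to the per-index file
/opt/gridware/bioinformatics/bin/pi < pi.inp >> "ja.$PBS_ARRAY_INDEX.out"

exit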
/var/www/wiki/data/pages/howto/pbs-pro_job_submission_examples.txt · Last modified: 2015/09/17 12:31 by andyr