Job submission scripts are similar to shell scripts with PBS directives in comments.
They can be created on the cluster with a plain-text editor such as vi. If you wish to create them on your desktop and copy them over, ensure they contain no formatting and have Unix line endings. Microsoft Word, for example, would be a very bad choice of editor.
Start the script with #!/bin/sh if you need to use shell variables and operations. It is not essential for a successful job submission.
Output and error files (the #PBS -o and #PBS -e directives) should be valid paths to files, preferably in your scratch directory, otherwise the job may be killed. There should be a symbolic link called “scratch” in your home directory that points to your workspace on the Lustre file system, i.e., /lustre/SCRATCH5/users/yourname; if not, you may create it. If you do not specify output and error file names, they will be created in your home directory using the job name and number.
Job names (#PBS -N) longer than 15 characters do not work. If you do not give a job name, the submission script file name will be used as the job name.
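The two checks mentioned above (Unix line endings and the 15-character job name limit) can be sketched as small shell helpers. The function names strip_cr and job_name_ok are illustrative assumptions and are not part of PBS.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of PBS.

# strip_cr FILE: remove carriage returns so FILE has Unix line endings.
# The result is written back with cat so the file keeps its permissions.
strip_cr() {
    tmp="$1.tmp.$$"
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# job_name_ok NAME: succeed only if NAME is at most 15 characters long.
job_name_ok() {
    [ "${#1}" -le 15 ]
}
```

For example, strip_cr myjob.pbs could be run once after copying a script from a Windows desktop, and job_name_ok "$NAME" used before writing the #PBS -N line.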
On a cluster system, run the hello MPI executable on 3 nodes, using all 8 cores per node with 8 MPI processes per node (24 in total), for up to 5 hours of wall time:
#!/bin/bash
#PBS -e /lustre/SCRATCH5/users/username/stderr.out
#PBS -o /lustre/SCRATCH5/users/username/stdout.out
#PBS -l walltime=5:00:00
#PBS -l select=3:ncpus=8:mpiprocs=8
#PBS -V
source /etc/profile.d/modules.sh
module add openmpi/openmpi-1.6.5_gcc-4.7.2
cd $HOME/scratch
mpirun -np 24 /opt/gridware/bioinformatics/bin/hello
The job submission script above limits wall time to 5 hours; jobs that take longer will be killed by the scheduler. The line #PBS -V retains the environment variables of the shell from which the job is launched. The script also assumes that a symbolic link named scratch exists in the home directory and points to the user's directory under /lustre/SCRATCH5/users. It changes to that directory before launching the program to ensure optimal file access speed.
If the link does not exist, create it by changing to your home directory and running:
ln -s /lustre/SCRATCH5/users/your_username scratch
To verify, the listing from ls -al scratch should be similar to the following:
scratch -> /lustre/SCRATCH5/users/your_username
and you should further be able to change to that directory, create files, etc.
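The create-if-missing step can also be scripted. The following is a minimal sketch; the function name ensure_link is an assumption made for illustration, and the paths in the example call are the placeholders used above.

```shell
#!/bin/sh
# ensure_link TARGET LINK: create the symbolic link LINK -> TARGET
# unless something already exists at LINK.
# Both arguments are placeholders; substitute your own paths.
ensure_link() {
    target="$1"
    link="$2"
    if [ -L "$link" ] || [ -e "$link" ]; then
        echo "$link already exists"
    else
        ln -s "$target" "$link" && echo "created $link -> $target"
    fi
}

# Example call (commented out so that running this file changes nothing):
# ensure_link "/lustre/SCRATCH5/users/your_username" "$HOME/scratch"
```

Checking for an existing entry first avoids ln creating a nested link inside the scratch directory when the link is already in place.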
The number of nodes and cores is requested using the #PBS -l select directive in the job submission script.
Cluster partitions differ in the number of CPU cores per node.
To run a 24-core job on the Dell partition, for instance, only 2 nodes are needed:
#PBS -l select=2:ncpus=12:mpiprocs=12:jobtype=dell:group=nodetype
On the Nehalem partition, 3 nodes are needed to request the same number of CPU cores:
#PBS -l select=3:ncpus=8:mpiprocs=8:jobtype=nehalem:group=nodetype
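The node count in these directives is the requested core count divided by the cores per node, rounded up. A small sketch of that arithmetic follows; the function name select_line is hypothetical, and the jobtype and group values are simply copied from the two directives above.

```shell
#!/bin/sh
# select_line CORES CORES_PER_NODE JOBTYPE
# Prints a select directive with enough whole nodes to cover CORES.
select_line() {
    cores="$1"
    per_node="$2"
    jobtype="$3"
    # Integer ceiling of cores / per_node.
    nodes=$(( (cores + per_node - 1) / per_node ))
    echo "#PBS -l select=${nodes}:ncpus=${per_node}:mpiprocs=${per_node}:jobtype=${jobtype}:group=nodetype"
}

select_line 24 12 dell
select_line 24 8 nehalem
```

Because whole nodes are requested, a core count that is not a multiple of the cores per node is rounded up to the next full node.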
The number of distinct nodes allocated to the job can be obtained by:
NODES=`cat $PBS_NODEFILE | uniq | wc -l`
To determine the number of cores, include the following line in the submission script:
NP=`cat $PBS_NODEFILE | wc -l`
The mpirun command can use the PBS variable $PBS_NODEFILE, which contains the name of the file listing the allocated nodes:
mpirun -np $NP -machinefile $PBS_NODEFILE ~/bin/myprogram
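The node file counts can be wrapped in functions so that they are easy to check outside a job. This is a sketch under the assumption that the node file holds one host name per MPI process slot, which matches the core count when mpiprocs equals ncpus; the function names count_procs and count_nodes are illustrative.

```shell
#!/bin/sh
# count_procs FILE: number of lines (process slots) in the node file.
count_procs() {
    wc -l < "$1" | tr -d ' '
}

# count_nodes FILE: number of distinct host names in the node file.
count_nodes() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Inside a running PBS job the scheduler sets PBS_NODEFILE.
# Outside a job this block is skipped.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    NP=$(count_procs "$PBS_NODEFILE")
    NODES=$(count_nodes "$PBS_NODEFILE")
    echo "Running $NP processes on $NODES nodes"
    # mpirun -np "$NP" -machinefile "$PBS_NODEFILE" ~/bin/myprogram
fi
```

Using sort -u rather than uniq gives the same result for a grouped node file and also handles one in which host names are not adjacent.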
#!/bin/bash
#PBS -j oe
#PBS -l walltime=5:00:00
#PBS -l ncpus=24
#PBS -V
myexecutable arg1 arg2 arg3 -num_threads 24
To use the module system, add the following line to your submission script after the #PBS directives:
source /etc/profile.d/modules.sh
Then you will be able to load the modules you require to set up an environment:
module add openmpi/openmpi-1.6.5_gcc-4.7.2
The MODULEPATH variable, which controls where modules are searched for, is normally set by default and can be viewed from the command line.
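One way to inspect the search path is to print each MODULEPATH entry on its own line. This is only a sketch and not necessarily the command the site recommends; the function name list_module_paths is an assumption.

```shell
#!/bin/sh
# list_module_paths: print each directory in MODULEPATH on its own line.
# Prints an empty line if MODULEPATH is not set.
list_module_paths() {
    printf '%s\n' "${MODULEPATH:-}" | tr ':' '\n'
}

list_module_paths
# On a system with environment modules initialised, `module avail`
# lists the modules found under these directories.
```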
For Bioinformatics users, add the additional modules in your .profile as follows:
There is no limit to the amount of time that a PBS job can run. If no time limit is specified, a default of 12 hours is assigned. Remember to checkpoint long-running jobs.
To specify a time, include the following line in the submission script:
#PBS -l walltime=hhhh:mm:ss
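When a run time is known in seconds, it can be converted to the hours:minutes:seconds form used by the directive. A minimal sketch, with a hypothetical function name walltime_from_seconds:

```shell
#!/bin/sh
# walltime_from_seconds SECONDS: format a duration as h:mm:ss.
# Hours are not capped at 24, so 49 hours prints as 49:00:00.
walltime_from_seconds() {
    s="$1"
    printf '%d:%02d:%02d\n' $(( s / 3600 )) $(( (s % 3600) / 60 )) $(( s % 60 ))
}

# 18000 seconds is the 5-hour limit used in the MPI example.
echo "#PBS -l walltime=$(walltime_from_seconds 18000)"
```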
On cluster systems, if a job does not request all the cores on a node, it is possible that another job will share the same node. To prevent this, request exclusive use:
#PBS -l place=excl
To pass your login shell environment to the cluster job, either give the -V option to qsub on the command line, or add it via a #PBS directive to the submission script:
#PBS -V
On the CHPC cluster, Hyperthreading is disabled, except on the MIC nodes.
Use PBS Job Arrays. You can specify how many jobs to run by adding the directive
#PBS -J <range>
where range is X-Y:Z. X is the start, Y is the end of the range, and Z is the increment. For example, 2-10:2 indicates all the even jobs from 2 to 10, i.e., 2, 4, 6, 8 and 10.
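To check which indices a given range produces before submitting, the X-Y:Z rule can be expanded by hand in the shell. This is a sketch of the rule as described here, not a PBS tool; the function name expand_range is hypothetical, and a missing :Z is treated as an increment of 1.

```shell
#!/bin/sh
# expand_range X-Y[:Z]: print the array indices of a range on one line.
expand_range() {
    range="$1"
    start=${range%%-*}        # text before the first '-'
    rest=${range#*-}          # text after the first '-'
    end=${rest%%:*}           # text before ':' (or all of it)
    case "$rest" in
        *:*) step=${rest#*:} ;;
        *)   step=1 ;;
    esac
    i=$start
    out=""
    while [ "$i" -le "$end" ]; do
        out="$out${out:+ }$i"
        i=$(( i + step ))
    done
    echo "$out"
}

expand_range 2-10:2   # prints: 2 4 6 8 10
expand_range 1-2      # prints: 1 2
```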
PBS defines two environment variables:
| PBS_ARRAY_INDEX | job array index |
| PBS_ARRAY_ID | job array id |
These variables are also defined as attributes:
Based on the job array index, different input files can be used for each job in the job array.
NOTE: The PBS directives specify the resources that will be used by EACH individual job, NOT all the jobs together.
#!/bin/sh
#PBS -V
#PBS -l select=1:ncpus=1:mpiprocs=1,walltime=48:00
#PBS -N Job_Array_Test
#PBS -j oe -o ja.^array_index^.pbs
#PBS -J 1-2
source /etc/profile.d/modules.sh
module add openmpi/openmpi-1.6.5_gcc-4.7.2
unset echo
cd $PBS_O_WORKDIR
echo $PBS_ARRAY_INDEX
echo $PBS_ARRAY_ID $PBS_ARRAY_INDEX >> ja.$PBS_ARRAY_INDEX.out
echo ' ' >> ja.$PBS_ARRAY_INDEX.out
/opt/gridware/bioinformatics/bin/pi < pi.inp >> ja.$PBS_ARRAY_INDEX.out
exit
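The script above reads the same pi.inp in every sub-job. To give each sub-job its own input file, the file name can be derived from the array index. The following is a sketch; the pi.<index>.inp naming scheme and the function name input_for_index are assumptions made for illustration.

```shell
#!/bin/sh
# input_for_index INDEX: name of the input file for one sub-job.
# The pi.<index>.inp naming scheme is an assumption for illustration.
input_for_index() {
    echo "pi.$1.inp"
}

# PBS sets PBS_ARRAY_INDEX inside each sub-job; default to 1 elsewhere
# so that the snippet can be tried outside the scheduler.
INDEX="${PBS_ARRAY_INDEX:-1}"
INPUT=$(input_for_index "$INDEX")
echo "sub-job $INDEX reads $INPUT"
# /opt/gridware/bioinformatics/bin/pi < "$INPUT" >> "ja.$INDEX.out"
```

With #PBS -J 1-2, this would pair sub-job 1 with pi.1.inp and sub-job 2 with pi.2.inp.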