====== PBS-Pro Job Submission Examples ======

Job submission scripts are shell scripts with PBS directives embedded as specially formatted comments.

==== Please note ====

  - All job script files must be **plain text** files, as created by Unix programs such as ''nano'' or ''vi''. If you create them on your desktop and copy them over, make sure they contain no formatting and use Unix line endings. Microsoft Word, for example, would be a very bad choice of editor.
  - The first line should contain the hash-bang and shell path, e.g. ''#!/bin/bash'' or ''#!/bin/sh'', if you need to use shell variables and operations. It is not essential for a successful job submission.
  - The output and error file paths (''#PBS -o'' and ''#PBS -e'' directives) should be valid paths to files, preferably in your ''scratch'' directory, otherwise the job may be killed. There should be a symbolic link called "scratch" in your home directory that points to your workspace on the Lustre file system, i.e. ''/lustre/SCRATCH5/users/yourname''; if not, you may create it. If you do not specify output and error file names, they will be created in your home directory using the job name and number, e.g. ''myjob.o765432'' and ''myjob.e765432''.
  - Job names (''#PBS -N'') longer than 15 characters do not work. If you do not give a job name, the submission script file name will be used as the job name. A minimal skeleton script illustrating these points is shown below.
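
A minimal sketch of such a script, assuming placeholder values that you must replace with your own (the job name, output file paths and wall time below are illustrative only):

<code>
#!/bin/bash
#PBS -N short_name
#PBS -o /lustre/SCRATCH5/users/yourname/myjob.out
#PBS -e /lustre/SCRATCH5/users/yourname/myjob.err
#PBS -l walltime=1:00:00

# Work in the scratch area for fast file access
cd $HOME/scratch
echo "Job started on $(hostname) at $(date)"
</code>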

===== Basic MPI Example =====

On a cluster system, run the ''hello'' MPI executable on 3 nodes, using all 8 cores per node (8 MPI processes per node, 24 in total), for up to 5 hours of wall time:

<code>
#!/bin/bash
#PBS -e /lustre/SCRATCH5/users/username/stderr.out
#PBS -o /lustre/SCRATCH5/users/username/stdout.out
#PBS -l walltime=5:00:00
#PBS -l select=3:ncpus=8:mpiprocs=8
#PBS -V

source /etc/profile.d/modules.sh
module add openmpi/openmpi-1.6.5_gcc-4.7.2

cd $HOME/scratch

mpirun -np 24 /opt/gridware/bioinformatics/bin/hello
</code>

The job submission script above limits the wall time to 5 hours; jobs that take longer will be killed by the scheduler. The ''#PBS -V'' line retains the environment variables of the shell from which the job is launched. The script also assumes that a symbolic link exists in the home directory pointing to the user's area on ''/lustre/SCRATCH5/users'', and changes to that directory before launching the program to ensure optimal file access speed.

If the link does not exist, create it by changing to your home directory and running:

''ln -s /lustre/SCRATCH5/users/your_username scratch''

To verify, the listing ''ls -al scratch'' should show something similar to

''scratch -> /lustre/SCRATCH5/users/your_username''

and you should be able to change to that directory, create files, etc.


===== Standard Options and Abilities =====

==== How to specify the number of nodes and cores for a job ====

This is done with the ''#PBS -l select'' directive in the job submission script.
Cluster partitions differ in how many CPU cores there are per node:
  * Harpertown: 8 cores per node
  * Nehalem: 8 cores per node
  * Westmere: 12 cores per node
  * Dell: 12 cores per node

=== Selecting a cluster partition based on number of cores ===

To run a 24-core job on the Dell partition, for instance, only 2 nodes are needed:

''#PBS -l select=2:ncpus=12:mpiprocs=12:jobtype=dell:group=nodetype''

while 3 nodes are needed to request the same number of CPU cores on the Nehalem partition:

''#PBS -l select=3:ncpus=8:mpiprocs=8:jobtype=nehalem:group=nodetype''

==== Number of cluster nodes in a job ====

The number of nodes allocated to a job can be obtained with:

''NP=`cat $PBS_NODEFILE | uniq | wc -l`''

==== Number of cores (CPUs) in a job ====

To determine the total number of cores allocated to the job, include the following line in the submission script:

''NP=`cat $PBS_NODEFILE | wc -l`''

==== Determining the machinefile (list of nodes) ====

The mpirun command can use the PBS variable ''$PBS_NODEFILE'', which holds the name of the file containing the list of allocated nodes:

''mpirun -np $NP -machinefile $PBS_NODEFILE ~/bin/myprogram''
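
Putting these together, a minimal sketch of an MPI job script (the resource request and the program path ''~/bin/myprogram'' are placeholders):

<code>
#!/bin/bash
#PBS -l select=2:ncpus=12:mpiprocs=12
#PBS -l walltime=2:00:00
#PBS -V

# Total number of MPI processes: one line per core in the nodefile
NP=`cat $PBS_NODEFILE | wc -l`
# Number of distinct nodes allocated to the job
NNODES=`cat $PBS_NODEFILE | uniq | wc -l`
echo "Running $NP processes across $NNODES nodes"

cd $HOME/scratch
mpirun -np $NP -machinefile $PBS_NODEFILE ~/bin/myprogram
</code>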

=== Example: join error and standard output, run multithreaded on 24 cores for up to 5 hours, retain environment ===

<code>
#!/bin/bash
#PBS -j oe
#PBS -l walltime=5:00:00
#PBS -l ncpus=24
#PBS -V

myexecutable arg1 arg2 arg3 -num_threads 24
</code>
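
If the program is threaded with OpenMP rather than taking a thread-count argument, the usual approach is to set ''OMP_NUM_THREADS'' instead; a sketch, assuming ''myexecutable'' honours that variable:

<code>
# Assumption: myexecutable is an OpenMP program that reads OMP_NUM_THREADS
export OMP_NUM_THREADS=24
myexecutable arg1 arg2 arg3
</code>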

==== Using Modules ====

To use the module system, add the following line to your submission script after the ''#PBS'' lines:

''source /etc/profile.d/modules.sh''

You will then be able to load the modules your job requires to set up its environment:

''module add openmpi/openmpi-1.6.5_gcc-4.7.2''

The ''MODULEPATH'' variable is normally set by default, so the available modules can be listed from the command line with:

''module avail''
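
A short interactive example (the module name is the one used elsewhere on this page and may differ on your system):

<code>
module avail                                  # list everything visible on MODULEPATH
module add openmpi/openmpi-1.6.5_gcc-4.7.2    # load a module
module list                                   # confirm which modules are loaded
</code>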

Bioinformatics users should add the additional module path to their ''.profile'' as follows:

''export MODULEPATH=/opt/gridware/bioinformatics/modules:$MODULEPATH''

==== How long can a job run? ====

There is no limit on how long a PBS job can run. If no time limit is specified, a default of 12 hours is assigned. Remember to checkpoint long-running jobs.

To specify a time limit, include the following line in the submission script:

''#PBS -l walltime=hhhh:mm:ss''
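
For example, to request three days (72 hours) of wall time:

''#PBS -l walltime=72:00:00''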

==== Exclusive use of cluster nodes ====

On cluster systems, if a job does not request all the cores on a node, another job may be scheduled on the same node. To prevent this, request exclusive placement:

''#PBS -l place=excl''

==== How to export your current session's environment to the job being submitted ====

To pass your login shell environment to the cluster job, either submit with:

''qsub -V''

or add the equivalent ''#PBS'' directive to the submission script:

''#PBS -V''

==== Hyper-Threading ====

On the CHPC cluster, Hyper-Threading is disabled, except on the MIC nodes.

===== Job Arrays =====

=== Suppose I want to do a parameter study. How do I submit many jobs, each with a different parameter value? ===

Use PBS job arrays. You can specify how many jobs to run by adding the directive

  #PBS -J <range>

where the range is ''X-Y:Z''. //X// is the start of the range, //Y// is the end, and //Z// is the increment.

For example, ''2-10:2'' selects the even indices from 2 to 10, i.e. 2, 4, 6, 8 and 10.

PBS defines two environment variables:

|  ''PBS_ARRAY_INDEX''  |  job array index  |
|  ''PBS_ARRAY_ID''     |  job array id  |

These variables are also defined as job attributes:

|  ''array_index''  |
|  ''array_id''  |

Based on the job array index, a different input file can be used for each job in the array.
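
For instance, a sketch in which each subjob reads its own input file (the ''input.N.dat'' and ''myprogram'' names are illustrative only):

<code>
#PBS -J 1-10

cd $PBS_O_WORKDIR
# Subjob N reads input.N.dat and writes output.N.log
./myprogram < input.$PBS_ARRAY_INDEX.dat > output.$PBS_ARRAY_INDEX.log
</code>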

==== Job array example: a job script that submits two jobs, each using one processor ====

NOTE: The PBS directives specify the resources used by EACH individual job, NOT by all the jobs together.

<code>
#!/bin/sh
#PBS -V
#PBS -l select=1:ncpus=1:mpiprocs=1,walltime=48:00
#PBS -N Job_Array_Test
#PBS -j oe -o ja.^array_index^.pbs
#PBS -J 1-2

source /etc/profile.d/modules.sh
module add openmpi/openmpi-1.6.5_gcc-4.7.2

cd $PBS_O_WORKDIR
echo $PBS_ARRAY_INDEX
echo $PBS_ARRAY_ID $PBS_ARRAY_INDEX >> ja.$PBS_ARRAY_INDEX.out
echo ' ' >> ja.$PBS_ARRAY_INDEX.out
/opt/gridware/bioinformatics/bin/pi < pi.inp >> ja.$PBS_ARRAY_INDEX.out

exit
</code>
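
Submit the script with ''qsub'' as usual; PBS returns a single array job id and each subjob is identified by its index. A sketch, assuming the script is saved as ''job_array_test.sh'':

<code>
qsub job_array_test.sh
qstat -t    # -t also lists the individual subjobs of the array
</code>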
  
  
  
See also the [[quick:start|Quick Start Guide]].