Practical 3: MPI Part I

Job Script

To execute MPI programs you need to submit a job script to the scheduler. An example PBSPro job script called example.sh is in the student00 home directory; copy it to your current directory in the usual way:

cp ~/../student00/example.sh .

example.sh:

#!/bin/bash
#PBS -N example01
#PBS -l select=1:ncpus=24:mpiprocs=24
#PBS -P Wchpc
#PBS -q R522145
#PBS -W group_list=training
#PBS -l walltime=4:00:00
 
CWD=/mnt/lustre/users/$USER/example
 
 
 
cd $CWD
echo "Working directory is $CWD"
nproc=`cat $PBS_NODEFILE | wc -l`
echo "nproc is $nproc"
echo "Nodes used by this job:"
cat $PBS_NODEFILE
 
echo "Starting run..."
# Insert command to run here:
 
echo "... done."

Note that the options passed to the qsub command appear as shell comment lines that start with #PBS.
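As a quick way to see which options qsub will pick up from a script, the directive lines can be extracted and printed. This is a minimal sketch assuming bash with the standard grep and sed tools; list_pbs_directives is a hypothetical helper name, not a PBSPro command.

```bash
# Hypothetical helper (not part of PBSPro): print the scheduler
# options embedded in a job script, one per line.
list_pbs_directives() {
    # Keep only lines that begin with "#PBS", then strip that prefix
    # so that just the option text remains.
    grep '^#PBS' "$1" | sed 's/^#PBS[[:space:]]*//'
}
```

Running list_pbs_directives example.sh on the script above should print lines such as "-N example01" and "-q R522145".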

Note: the script uses example/ as the working directory. Create this before you run the script:

mkdir ~/lustre/example

or edit the script file to point to the prac directory.

NOTE: your script will fail if you don't use the Lustre file system.
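One way to catch this early is a guard before the run command. The sketch below assumes that Lustre is mounted under /mnt/lustre, as the CWD path in example.sh suggests, and that ~/lustre resolves to a directory beneath it; on_lustre is a hypothetical helper name.

```bash
# Hypothetical guard: succeed only if the given path lies under the
# assumed Lustre mount point /mnt/lustre.
on_lustre() {
    case "$1" in
        /mnt/lustre/*) return 0 ;;   # path is below the Lustre mount
        *)             return 1 ;;   # anything else, e.g. /home/...
    esac
}
```

Inside a job script it could be used as on_lustre "$(pwd -P)" || exit 1, where pwd -P prints the physical path with symbolic links resolved.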

In the example file the command to actually run your code is missing. Depending on which part of the prac you are attempting, this will differ. Insert the command in the indicated place.

To then submit the job script to the scheduler you simply use

qsub example.sh

The qsub command will read all its option parameters from the file.

Job Control

To check the status of your jobs in the Winter School queue:

qstat | grep R522145

This command lists all the jobs in the R522145 queue. Running jobs have an R in the status column; jobs that are queued and waiting have a Q.
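To get a quick tally instead of reading the full listing, the status letters can be counted. This sketch assumes the default qstat layout, in which the queue name is the last column and the status letter is the column before it; count_job_states is a hypothetical helper name.

```bash
# Hypothetical helper: read qstat-style text on stdin and count the
# jobs in the named queue by status letter.
# Assumes the queue is the last column and the status the one before it.
count_job_states() {
    awk -v queue="$1" '
        $NF == queue { n[$(NF-1)]++ }            # tally by status letter
        END { for (s in n) print s, n[s] }       # e.g. "R 2"
    ' | sort
}
```

It would be used as qstat | count_job_states R522145.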

Use the qdel command to remove a job from the queue. The syntax is

qdel jobid

where jobid is replaced by the number of the job (looks like 557809.sched01).
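If only the numeric part of a job identifier is needed, it can be split off with shell parameter expansion. This is a small sketch; job_number is a hypothetical helper name, and the identifier format is taken from the example above.

```bash
# Hypothetical helper: strip the server suffix from a job identifier
# such as 557809.sched01, leaving only the job number.
job_number() {
    # ${1%%.*} removes everything from the first dot onwards
    printf '%s\n' "${1%%.*}"
}
```

For example, job_number 557809.sched01 prints 557809.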

Task 1

Create a sub-directory MPI in your Lustre directory:

cd ~/lustre
mkdir MPI
cd MPI
pwd

The last command will display something like:

/home/student00/lustre/MPI

Now use nano to create a file called hello_mpi.c with the following contents:

/* C Example */
#include <stdio.h>
#include <mpi.h>
 
int main (int argc, char *argv[])
{
  int rank, size;
 
  MPI_Init (&argc, &argv);
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);
  MPI_Comm_size (MPI_COMM_WORLD, &size);
  printf( "Hello world from process %d of %d\n", rank, size );
  MPI_Finalize();
  return 0;
}

Compile this with the mpicc MPI C compiler:

mpicc -o hello_mpi.x hello_mpi.c

Those of you who prefer Fortran should create hello_mpi.f90 instead and compile it with the mpif90 compiler:

!  Fortran example
program hello
include 'mpif.h'
integer rank, size, ierror
 
call MPI_INIT(ierror)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror)
print*, 'node', rank, ': Hello world'
call MPI_FINALIZE(ierror)
end

Compile the Fortran example with:

mpif90 -o hello_mpi.x hello_mpi.f90

If you get an error message complaining that mpicc (or mpif90) is not found, then you forgot to load the necessary modules:

module unload gcc/6.1.0
module load chpc/openmpi/2.0.2/gcc-5.1.0
module list

The last command should display:

Currently Loaded Modulefiles:
  1) gcc/5.1.0                      2) chpc/openmpi/2.0.2/gcc-5.1.0
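Besides reading the module list, you can ask the shell directly whether the compiler wrapper is now on your PATH. A minimal sketch using the standard command -v builtin; have_tool is a hypothetical helper name.

```bash
# Hypothetical helper: succeed if the named program can be found
# on the current PATH, without printing anything.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}
```

For example, have_tool mpicc || echo "mpicc not found: load the modules first" prints the reminder only when the compiler is missing.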

Now create a job script file to run your first MPI program, and save it as hello.pbs:

#!/bin/bash
#PBS -N Hello
#PBS -l select=1:ncpus=24:mpiprocs=24
#PBS -P Wchpc
#PBS -q R522145
#PBS -W group_list=training
#PBS -l walltime=4:00:00
 
module load chpc/openmpi/2.0.2/gcc-5.1.0
 
# Replace NN with your student account number
cd /mnt/lustre/users/studentNN/MPI
CWD=`pwd`
echo "Working directory is $CWD"
nproc=`cat $PBS_NODEFILE | wc -l`
echo "nproc = $nproc"
echo "Nodes used by this job:"
cat $PBS_NODEFILE
 
echo "Starting run..."
echo "======="
 
mpirun -n $nproc ./hello_mpi.x
 
echo "======="
echo "... done."

Launch the job script with qsub:

qsub hello.pbs

Check its status with qstat:

qstat | grep student00

Replace the 00 above with your student account number!

Once your job status has changed from R to F or E, look for the output:

ls

The file you want to look at will be called something like Hello.o99999 where the number at the end will be the job ID number:

cat Hello.o12345

Replace 12345 with the job number!
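Since the job requests mpiprocs=24 on one node, the C version should print one greeting per process, 24 in total. The sketch below counts them; check_hello_output is a hypothetical helper name, and the pattern matches only the C program's message (the Fortran version prints a different line).

```bash
# Hypothetical check: does the output file contain one
# "Hello world from process ..." line per expected MPI process?
check_hello_output() {
    local file=$1 expected=$2 found
    # grep -c prints the number of matching lines
    found=$(grep -c '^Hello world from process' "$file")
    found=${found:-0}   # treat an unreadable file as zero matches
    if [ "$found" -eq "$expected" ]; then
        echo "OK: $found greetings"
    else
        echo "MISMATCH: $found greetings, expected $expected"
        return 1
    fi
}
```

It would be called as check_hello_output Hello.o12345 24, again with 12345 replaced by the job number.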

Carry On

Attempt the exercises in the slides.

/var/www/wiki/data/pages/workshops/hpcprac3.txt · Last modified: 2017/07/06 12:51 by kevin