GPU Nodes

The Lengau cluster at the CHPC includes 9 GPU compute nodes with a total of 24 Nvidia V100 GPU devices.

Policies

Access

Access to these GPU nodes is by PI application only, through the CHPC Helpdesk.

Allocation

Research programme allocations are depleted in proportion to the wallclock time and the number of GPUs (1, 2, or 4) requested by the job:

gpu_allocation_used = 40 * runtime * ngpus
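
As a worked example, assuming runtime is expressed in hours, a job that runs for 3 hours on 2 GPUs would consume 40 * 3 * 2 = 240 units of the research programme allocation.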

Usage

GPU applications on Lengau

Some pre-built applications have automated scripts that you can use to launch them:

GPU Job Scripts

GPU Queues

There are three queues available in PBSPro which access the GPU nodes:

Queue name   Max. CPUs   Max. GPUs   PBSPro options                  Comments
gpu_1        10          1           -q gpu_1 -l ncpus=10:ngpus=1    Access one GPU device only.
gpu_2        20          2           -q gpu_2 -l ncpus=20:ngpus=2    Access two GPU devices.
gpu_4        40          4           -q gpu_4 -l ncpus=40:ngpus=4    Access four GPU devices on NVLink nodes.
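
For illustration, the directives below request the four-GPU queue in a job script; the resource string matches the PBSPro options column above:

#PBS -q gpu_4
#PBS -l ncpus=40:ngpus=4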

GPU Queue Limits

The maximum wall clock time on all GPU queues is 12 hours.

#PBS -l walltime=12:00:00

If your job completes in less time, specify a shorter walltime: this gives the scheduler a better chance of starting your job sooner.

Interactive Job on a GPU Node

A single interactive session on a GPU node may be requested with:

qsub -I -q gpu_1 -P PRJT1234

NB: Replace PRJT1234 with your project number.

The default time for an interactive session is 1 hour.
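
If you need a longer session, standard PBSPro resource options can be added to the interactive request; the walltime below is only an illustration and is still subject to the 12-hour limit:

qsub -I -q gpu_1 -P PRJT1234 -l walltime=4:00:00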

Example Job Script

#!/bin/bash
#PBS -N nameyourjob
#PBS -q gpu_1
#PBS -l ncpus=10:ngpus=1
#PBS -P PRJT1234
#PBS -l walltime=4:00:00
#PBS -o /mnt/lustre/users/USERNAME/cuda_test/test1.out
#PBS -e /mnt/lustre/users/USERNAME/cuda_test/test1.err
#PBS -m abe
#PBS -M your.email@address
 
cd /mnt/lustre/users/USERNAME/cuda_test
 
echo
echo `date`: executing CUDA job on host ${HOSTNAME}
echo
 
# Run program
./hello_cuda
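
Assuming the script is saved as, say, gpu_job.pbs (the filename is only illustrative), submit it from the login node with:

qsub gpu_job.pbs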

Compiling GPU Code

The Nvidia V100 GPUs are programmed using the CUDA development tools.

To build CUDA code (a library or application) for the GPU nodes, load the appropriate CUDA module before compiling. The CUDA runtime is already installed on all GPU nodes and does not need to be loaded explicitly unless you require a different version.

The V100 GPUs have Volta architecture cores. CUDA applications built using CUDA Toolkit versions 2.1 through 8.0 are compatible with Volta as long as they are built to include PTX versions of their kernels. To test that PTX JIT is working for your application:

1. Download and install the latest driver from http://www.nvidia.com/drivers.
2. Set the environment variable CUDA_FORCE_PTX_JIT=1.
3. Launch your application.

When a CUDA application is started for the first time with this environment flag set, the CUDA driver JIT-compiles the PTX for each CUDA kernel used into native cubin code.

If your application runs correctly with this environment variable set, you have successfully verified Volta compatibility.

Note: Be sure to unset the CUDA_FORCE_PTX_JIT environment variable when you are done testing.
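
As a minimal sketch (the file name and contents are illustrative, not a CHPC-provided example), the hello_cuda program launched by the job script above could look like this:

// hello_cuda.cu - trivial kernel that prints a message from each GPU thread
#include <cstdio>

__global__ void hello_kernel()
{
    printf("Hello from block %d, thread %d\n", blockIdx.x, threadIdx.x);
}

int main()
{
    hello_kernel<<<2, 4>>>();    // launch 2 blocks of 4 threads
    cudaDeviceSynchronize();     // wait for the kernel (and its printf output) to finish
    return 0;
}

Assuming a CUDA 9.0 or newer toolkit module (which supports the Volta sm_70 target directly; the exact module name on Lengau varies, so check module avail first), it can be compiled with PTX embedded and then tested as described above:

module avail cuda                  # list the CUDA modules installed on the cluster
module load <your-cuda-module>     # replace with one of the listed modules
nvcc -gencode arch=compute_70,code=sm_70 \
     -gencode arch=compute_70,code=compute_70 \
     -o hello_cuda hello_cuda.cu

export CUDA_FORCE_PTX_JIT=1        # force JIT compilation from the embedded PTX
./hello_cuda                       # if this runs correctly, Volta compatibility is verified
unset CUDA_FORCE_PTX_JIT           # unset the flag when done testing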

Further Reading
