Upgraded: more GPUs added.
The Lengau cluster at the CHPC includes 9 GPU compute nodes with a total of 30 Nvidia V100 GPU devices: 6 gpu200n nodes with 3 GPUs each, and 3 gpu400n nodes with 4 GPUs each.
GPU Node | CPU Cores | GPU Devices | Interface |
---|---|---|---|
gpu2001 | 36 | 3× Nvidia V100 16GB | PCIe |
gpu2002 | 36 | 3× Nvidia V100 16GB | PCIe |
gpu2003 | 36 | 3× Nvidia V100 16GB | PCIe |
gpu2004 | 36 | 3× Nvidia V100 16GB | PCIe |
gpu2005 | 36 | 3× Nvidia V100 32GB | PCIe |
gpu2006 | 36 | 3× Nvidia V100 32GB | PCIe |
gpu4001 | 40 | 4× Nvidia V100 16GB | NVLink |
gpu4002 | 40 | 4× Nvidia V100 16GB | NVLink |
gpu4003 | 40 | 4× Nvidia V100 16GB | NVLink |
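On a GPU node itself (for example in an interactive session, described below), Nvidia's standard nvidia-smi utility can be used to confirm which GPU devices, and how much GPU memory, are visible to your job:

nvidia-smi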
Jobs that require 1, 2 or 3 GPUs can be allocated to any node, and will share the node with other jobs if they do not use all of the GPU devices on that node. Jobs that require 4 GPUs can only be allocated to gpu4* nodes and will have exclusive use of the node.
Access to the GPU nodes is by PI application only through the CHPC Helpdesk.
Research programme allocations will be depleted in proportion to the wallclock time and the number of GPUs (1, 2, 3 or 4) requested by the job:
gpu_allocation_used = 40 * runtime * ngpus
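As an illustration (assuming runtime is expressed in wallclock hours), a job that runs for 6 hours on 2 GPUs consumes 40 * 6 * 2 = 480 units of the research programme allocation.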
Some pre-built applications have automated scripts that you can use to launch them.
There are four queues available in PBSPro which access the GPU nodes:
Queue name | Max. CPUs | Max. GPUs | PBSPro options | Comments |
---|---|---|---|---|
gpu_1 | 9 | 1 | -q gpu_1 -l select=1:ncpus=9:ngpus=1 | Access one GPU device only per job. |
gpu_2 | 18 | 2 | -q gpu_2 -l select=1:ncpus=18:ngpus=2 | Access two GPU devices per job. |
gpu_3 | 36 | 3 | -q gpu_3 -l select=1:ncpus=36:ngpus=3 | Access three GPU devices per job. |
gpu_4 | 40 | 4 | -q gpu_4 -l select=1:ncpus=40:ngpus=4 | Access four GPU devices on NVLink nodes. |
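As a sketch, a batch job for the gpu_2 queue could be submitted directly from the command line with the options shown in the table (PRJT1234 and my_gpu_job.pbs are placeholders for your project number and job script):

qsub -q gpu_2 -P PRJT1234 -l select=1:ncpus=18:ngpus=2 my_gpu_job.pbs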
Note the ncpus parameter, which should be set in proportion to the number of GPU devices requested, as shown in the table.
The maximum wall clock time on all GPU queues is 12 hours.
#PBS -l walltime=12:00:00
It is better to specify a shorter walltime if your code executes in less time: this gives the scheduler a better chance of starting your job sooner.
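For instance, a job expected to finish well within the limit might request (the six-hour value is purely illustrative):

#PBS -l walltime=06:00:00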
A single interactive session may be requested on a GPU node with:
qsub -I -q gpu_1 -P PRJT1234
NB: Replace PRJT1234 with your project number.
The default time for an interactive session is 1 hour.
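The walltime, queue and GPU count for an interactive session can also be set explicitly; for example (a sketch only, adjust the values to your needs):

qsub -I -q gpu_2 -P PRJT1234 -l select=1:ncpus=18:ngpus=2 -l walltime=3:00:00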
An example batch job script:

#!/bin/bash
#PBS -N nameyourjob
#PBS -q gpu_1
#PBS -l select=1:ncpus=4:ngpus=1
#PBS -P PRJT1234
#PBS -l walltime=4:00:00
#PBS -m abe
#PBS -M your.email@address
cd /mnt/lustre/users/USERNAME/cuda_test
echo
echo `date`: executing CUDA job on host ${HOSTNAME}
echo
echo Available GPU devices: $CUDA_VISIBLE_DEVICES
echo
# Run program
./hello_cuda
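Assuming the script is saved as gpu_job.pbs (a placeholder name), it is submitted and monitored with the usual PBSPro commands:

qsub gpu_job.pbs
qstat -u $USER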
The Nvidia V100 GPUs are programmed using the CUDA development tools.
Building a CUDA code (library or application) for the GPU nodes requires loading the appropriate CUDA module before compiling. The current CUDA modules are:
chpc/cuda/11.2/PCIe/11.2
chpc/cuda/11.2/SXM2/11.2
chpc/cuda/11.5.1/PCIe/11.5.1
chpc/cuda/11.6/PCIe/11.6
chpc/cuda/11.6/SXM2/11.6
chpc/cuda/12.0/12.0
with version 12.0 being the most recent.
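As a minimal sketch, a CUDA source file (hello_cuda.cu is a placeholder name, matching the executable used in the example job script above) can be compiled after loading one of these modules:

module load chpc/cuda/12.0/12.0
nvcc -o hello_cuda hello_cuda.cu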
Note that the 11.x version modules are available in two types:

* the PCIe versions are for the PCIe-bus nodes (gpu200x);
* the SXM2 versions are for the SXM2 (NVLink) nodes (gpu400x).
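For example (using module names from the list above), a code intended for the gpu400x nodes via the gpu_4 queue would be built with an SXM2 module, while a code for the gpu200x nodes would use a PCIe module:

# NVLink/SXM2 nodes (gpu400x)
module load chpc/cuda/11.6/SXM2/11.6
# PCIe nodes (gpu200x)
module load chpc/cuda/11.6/PCIe/11.6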