Upgraded: more GPUs added.
The Lengau cluster at the CHPC includes 9 GPU compute nodes with a total of 30 Nvidia V100 GPU devices. There are 6 gpu200n nodes with 3 GPUs each, and 3 gpu400n nodes with 4 GPUs each.
GPU Node | CPU Cores | GPU Devices | Interface |
---|---|---|---|
gpu2001 | 36 | 3× Nvidia V100 16GB | PCIe |
gpu2002 | 36 | 3× Nvidia V100 16GB | PCIe |
gpu2003 | 36 | 3× Nvidia V100 16GB | PCIe |
gpu2004 | 36 | 3× Nvidia V100 16GB | PCIe |
gpu2005 | 36 | 3× Nvidia V100 32GB | PCIe |
gpu2006 | 36 | 3× Nvidia V100 32GB | PCIe |
gpu4001 | 40 | 4× Nvidia V100 16GB | NVlink |
gpu4002 | 40 | 4× Nvidia V100 16GB | NVlink |
gpu4003 | 40 | 4× Nvidia V100 16GB | NVlink |
Jobs that require 1, 2 or 3 GPUs can be allocated to any node, and will share the node if the job does not use all of the GPU devices on that node. Jobs that require 4 GPUs can only be allocated to gpu4* nodes, and will not share the node since they use all of its GPUs.
Principal Investigators apply for GPU access for their Research Programme members through the CHPC Helpdesk. RP members may not apply directly, and e-mailed applications will not be considered.
Research Programme allocations are charged according to the wall clock time and the number of GPUs (1, 2, 3 or 4) requested by the job:
gpu_allocation_used = 40 * runtime * ngpus
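For example, a job that runs for 4 hours using 2 GPUs consumes 40 × 4 × 2 = 320 units of the Research Programme's allocation.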
Some pre-built applications have automated scripts that you can use to launch them.
There are four queues available in PBSPro which access the GPU nodes:
Queue name | Max. CPUs | Max. GPUs | PBSPro options | Comments |
---|---|---|---|---|
gpu_1 | 9 | 1 | -q gpu_1 -l select=1:ncpus=9:ngpus=1 | Access one GPU device only per job. |
gpu_2 | 18 | 2 | -q gpu_2 -l select=1:ncpus=18:ngpus=2 | Access two GPU devices per job. |
gpu_3 | 36 | 3 | -q gpu_3 -l select=1:ncpus=36:ngpus=3 | Access three GPU devices per job. |
gpu_4 | 40 | 4 | -q gpu_4 -l select=1:ncpus=40:ngpus=4 | Access four GPU devices on NVLink nodes. |
Note that the ncpus values above are the maximum that may be requested; set ncpus to match the number of GPU devices your job actually needs.
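For example, a two-GPU job that only needs 10 CPU cores could request the following (a sketch; the core count is illustrative and must not exceed the gpu_2 maximum of 18):

#PBS -q gpu_2
#PBS -l select=1:ncpus=10:ngpus=2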
The maximum wall clock time on all GPU queues is 12 hours.
#PBS -l walltime=12:00:00
It is better to specify a shorter walltime if your code completes in less time: this gives the scheduler a better chance of starting your job sooner.
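For example, if your code reliably finishes in under three hours, a request like the following (the value is only illustrative) helps the scheduler place the job sooner:

#PBS -l walltime=03:00:00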
A single interactive session may be requested on a GPU node with:
qsub -I -q gpu_1 -P PRJT1234 -l select=1:ncpus=9:ngpus=1
NB: Replace PRJT1234
with your project number.
The default time for an interactive session is 1 hour.
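If the default hour is too short, a longer interactive session can be requested by adding a walltime to the same command (a sketch; four hours shown as an example, within the 12-hour queue limit):

qsub -I -q gpu_1 -P PRJT1234 -l select=1:ncpus=9:ngpus=1 -l walltime=4:00:00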
An example batch job script requesting a single GPU:

#!/bin/bash
#PBS -N nameyourjob
#PBS -q gpu_1
#PBS -l select=1:ncpus=4:ngpus=1
#PBS -P PRJT1234
#PBS -l walltime=4:00:00
#PBS -m abe
#PBS -M your.email@address
# Change to the working directory on the Lustre file system
cd /mnt/lustre/users/USERNAME/cuda_test
echo
echo `date`: executing CUDA job on host ${HOSTNAME}
echo
echo Available GPU devices: $CUDA_VISIBLE_DEVICES
echo
# Run program
./hello_cuda
As usual, replace PRJT1234
with your group's project name, your.email@address
with your email address, and USERNAME
with your cluster user name.
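Assuming the script above is saved as gpu_job.pbs (the file name is only an example), it is submitted with:

qsub gpu_job.pbs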
The Nvidia V100 GPUs are programmed using the CUDA development tools.
Building a CUDA code (library or application) for the GPU nodes requires loading the appropriate CUDA module before compiling. The current CUDA modules are:

chpc/cuda/11.2/PCIe/11.2
chpc/cuda/11.2/SXM2/11.2
chpc/cuda/11.5.1/PCIe/11.5.1
chpc/cuda/11.6/PCIe/11.6
chpc/cuda/11.6/SXM2/11.6
chpc/cuda/12.0/12.0

Version 12.0 is the most recent.
Note that the 11.x version modules are available in two types:

  * The PCIe version is for the PCIe bus nodes: gpu200x
  * The SXM2 version is for the SXM2 (NVLink) bus nodes: gpu400x
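As an illustration, a minimal sketch of building the hello_cuda program used in the job script above for the SXM2 (gpu400x) nodes, assuming the source file is named hello_cuda.cu:

module load chpc/cuda/11.6/SXM2/11.6
# sm_70 targets the V100 architecture
nvcc -arch=sm_70 -o hello_cuda hello_cuda.cu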