ANSYS/Fluent

LICENSING PROBLEMS

New problem, as of 11:00 on 25 March 2023: it is not possible to start the Ansys license server on login1, due to a library compatibility problem following the emergency shutdown. For now, the only available Ansys license server is chpclic1, and it works only for pre-R21.1 versions of the software.

Please bear with us while we work with Ansys on sorting out some teething problems introduced by the license system changes.

  • The pre-R21.1 versions also require communication on port 2325 of the license server. Unfortunately, this port is not reliable on login1, which may prevent pre-R21.1 versions of the software from checking out a license. As of August 2022 this appears to no longer be a problem, but if you do experience difficulties with older versions checking out a license from login1, revert to the license running on chpclic1.
  • From the CHPC's point of view, our work-around is to restart this license server when we detect this problem. This does not always work, and requires human intervention.
  • For users who need the older versions of the software, the work-around is to use the license on chpclic1 until this problem has been resolved. Please note that the license resources cfd_base and anshpc are not available on chpclic1, and should not be specified.

LICENSE RESOURCE REQUEST CHANGE

With the release of Ansys version 21.1, the structure of the license pool has changed dramatically. The old license resources aa_r_cfd and aa_r_hpc are no longer relevant, and requesting them will prevent your job from running. It is necessary to change these statements to request the new license resources cfd_base and anshpc, as per the information given below.

LICENSE SERVER CHANGE

We are in the process of retiring the existing license server chpclic1. The license has been moved to login1. Please change your .bashrc file and job scripts to point to the new license server, as per the example scripts below.
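
For reference, pointing at the new license server amounts to having the following two lines in your .bashrc and in your job scripts (these are the same settings used in the example scripts below):

export LM_LICENSE_FILE=1055@login1
export ANSYSLMD_LICENSE_FILE=1055@login1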

The CHPC has an installation of Ansys-CFD along with a limited license for academic use only. The license covers use of the Fluent and CFX solvers, as well as the IcemCFD meshing code. If you are a new Ansys user on Lengau, submit a helpdesk ticket, requesting access to the license.

Application Process

If you are a full-time student or staff member at an academic institution, you may request access to use Ansys-CFD on the CHPC cluster. Please go to the CHPC user database to register and request resources. Commercial use of Ansys software at the CHPC is also possible, but software license resources need to be negotiated directly with Ansys or their local agents. Remote license check-out has not been ruled out by Ansys, but once again this needs to be negotiated with the software vendor.

Installation

Ansys software versions have been installed under /mnt/lustre/apps/chpc/compmech/CFD/ansys_inc and /home/apps/chpc/compmech/ansys_inc, but all versions have been symbolically linked to /apps/chpc/compmech/CFD/ansys_inc, from where they may be accessed:

v160 -> /mnt/lustre/apps/chpc/compmech/CFD/ansys_inc/v160
v172 -> /mnt/lustre/apps/chpc/compmech/CFD/ansys_inc/v172
v180 -> /mnt/lustre/apps/chpc/compmech/CFD/ansys_inc/v180
v181 -> /mnt/lustre/apps/chpc/compmech/CFD/ansys_inc/v181
v182 -> /mnt/lustre/apps/chpc/compmech/CFD/ansys_inc/v182
v190 -> /mnt/lustre/apps/chpc/compmech/CFD/ansys_inc/v190
v191 -> /mnt/lustre/apps/chpc/compmech/CFD/ansys_inc/v191
v192 -> /mnt/lustre/apps/chpc/compmech/CFD/ansys_inc/v192
v194 -> /mnt/lustre/apps/chpc/compmech/CFD/ansys_inc/v194
v195 -> /home/apps/chpc/compmech/ansys_inc/v195
v212 -> /home/apps/chpc/compmech/ansys_inc/v212
v221 -> /home/apps/chpc/compmech/ansys_inc/v221
v222 -> /home/apps/chpc/compmech/ansys_inc/v222
v231 -> /home/apps/chpc/compmech/ansys_inc/v231
v232 -> /home/apps/chpc/compmech/ansys_inc/v232

Licensing

CHPC has academic licenses for Ansys-CFD. If you are a new Ansys user on Lengau, submit a helpdesk ticket requesting access to the license. There are 25 “solver” (cfd_base) licenses and 4096 “HPC” (anshpc) licenses. Please use the license resource management system to ensure that your job does not start if there are no licenses available for it.

License resource management system

If you request license resources (as in these example scripts), the scheduler will check for license availability before starting a job. If licenses are unavailable, the job will be held back until the necessary licenses become available. Although use of the license resource request is not mandatory, it is strongly recommended: if you do not use the license resource requests, the job will fail when no licenses are available. A single cfd_base license is required to start the solver, and includes up to 4 HPC licenses. Therefore you should request ($nproc-4) anshpc licenses. Do not request more than you need, as this will delay the start of your job.
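
For example, a 96-way parallel job (4 nodes of 24 cores each, as in the example scripts below) needs 1 cfd_base license and 96 - 4 = 92 anshpc licenses, so the resource requests would be:

#PBS -l select=4:ncpus=24:mpiprocs=24
#PBS -l cfd_base=1
#PBS -l anshpc=92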

Update. We previously published the following guidance: “The Fluent licenses are in general highly utilised. The consequence is that jobs may be held back due to unavailability of licenses. It is possible for the CHPC to forcefully apply measures that will ensure fair use. However, in order to avoid this situation, please stick to the following guidelines:

  • Each Ansys user shall submit no more than 2 PBS scripts (tying up two cfd_base) at any given time.
  • Given the constraint of the number of solver licenses, do your Fluent runs sequentially. Do not try to run more than two Fluent analyses at a time.
  • Take full advantage of the large number of cores available on Lengau to run each Fluent analysis faster. Without requesting special permission, you are entitled to use 240 cores for a Fluent run. Our testing has indicated very good parallel scaling down to 10000 grid cells per core (sometimes even less), which means that for any run over about 2 million cells, you should aim to use around 200 cores.
  • If you need to submit a series of jobs, do so with a dependence on previously submitted jobs. The syntax is as follows:
     qsub -W depend=afterany:123456 thisjob.pbs 

    where 123456 should be replaced with the number of the previously submitted job, and thisjob.pbs is simply the name of the new script that you are submitting. The afterany directive will make sure that the dependent job gets launched regardless of whether the running job has finished normally, crashed or been killed.

  • You can launch several fluent solver processes sequentially inside a single PBS script. Simply add in the necessary cd (change directory) and fluent 3d …. etc. lines.
  • If the progress of your work is being limited by the number of licenses available at the CHPC, consider moving some of the runs to open source software.”

The above information is now (as of June 2022) no longer applicable. For the last several months, the usage level of the Ansys license pool has been well below maximum capacity. Users are therefore encouraged to make more aggressive use of the resources, until further notice.
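
Independently of the license-pressure guidance quoted above, the technique of launching several Fluent solver processes sequentially inside a single PBS script remains useful. The sketch below is illustrative only: the directory and TUI file names are placeholders, and the fluent command line is the same as in the full example scripts further down.

cd /mnt/lustre/users/username/case1
fluent 3d -t$nproc -pinfiniband -ssh -cnf=$PBS_NODEFILE -g < case1_commands.txt > case1.out
cd /mnt/lustre/users/username/case2
fluent 3d -t$nproc -pinfiniband -ssh -cnf=$PBS_NODEFILE -g < case2_commands.txt > case2.out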

Running a Fluent Job

On the CHPC cluster, all simulations are submitted as jobs to the PBS Pro job scheduler, which will assign your job to the appropriate queue. Instructions and examples are given below; these are also elaborated upon further here.

Example job script:

runFluent.qsub
#!/bin/bash
##### The following line will request 4 (virtual) nodes, each with 24 cores running 24 mpi processes for
##### a total of 96-way parallel.  Specifying memory requirement is unlikely to be necessary, as the 
##### compute nodes have 128 GB each.
#PBS -l select=4:ncpus=24:mpiprocs=24:mem=32GB:nodetype=haswell_reg
#### Check for license availability.  If insufficient licenses are available, the job will be held back
####  until licenses become available.  
#PBS -l cfd_base=1
#PBS -l anshpc=92
## For your own benefit, try to estimate a realistic walltime request.  Over-estimating the 
## wallclock requirement interferes with efficient scheduling, will delay the launch of the job,
## and ties up more of your CPU-time allocation until the job has finished.
#PBS -q normal
#PBS -P myprojectcode
#PBS -l walltime=1:00:00
#PBS -o /mnt/lustre/users/username/FluentTesting/fluent.out
#PBS -e /mnt/lustre/users/username/FluentTesting/fluent.err
#PBS -m abe
#PBS -M username@email.co.za
##### Running commands
#### Put these commands in your .bashrc file as well, to ensure that the compute nodes
#### have the correct environment.  Ensure that any OpenFOAM-related environment
#### settings have been removed. 
####### PLEASE NOTE THAT THE LICENSE SERVER HAS NOW CHANGED; IT IS NOW login1
export LM_LICENSE_FILE=1055@login1
export ANSYSLMD_LICENSE_FILE=1055@login1
# Edit this next line to select the appropriate version. 
export PATH=/apps/chpc/compmech/CFD/ansys_inc/v221/fluent/bin:$PATH
export FLUENT_ARCH=lnamd64
#### explicitly set working directory and change to that.
export PBS_JOBDIR=/mnt/lustre/users/username/FluentTesting
cd $PBS_JOBDIR
nproc=`cat $PBS_NODEFILE | wc -l`
exe=fluent
$exe 3d -t$nproc -pinfiniband -ssh -cnf=$PBS_NODEFILE -g < fileContainingTUIcommands > run.out

There are two methods which can be used to submit a series of instructions to Fluent. In the above example, a file containing so-called “TUI” commands is passed to Fluent, either by the “<” redirection symbol, or with the “-i” command line option. There are two disadvantages to using this method:

  • It is not possible to simply record a journal file from the Fluent GUI, as these commands require the GUI to be open, and will not work with the “-g” command line option.
  • It is not possible to generate images during the computation. Instead, these have to be created interactively afterwards.
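
For reference, the fileContainingTUIcommands passed to Fluent in the script above might contain TUI commands along the following lines. This is only a rough sketch: the case name and iteration count are placeholders, and the exact TUI syntax varies between Fluent versions, so test interactively first.

/file/read-case myCase.cas.gz
/solve/initialize/initialize-flow
/solve/iterate 1000
/file/write-case-data myCaseFinal.cas.gz
exit
yes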

The second method allows the use of a recorded journal file and also supports “on the fly” generation of images. We previously made use of the virtual frame buffer “Xvfb” to enable this, but the frame buffer method has now been deprecated; simply add the command line options -gu -driver null to enable the generation of images. The following is an example of a PBS job script using this method:

runFluent.qsub
#!/bin/bash
##### The following line will request 4 (virtual) nodes, each with 24 cores running 24 mpi processes for
##### a total of 96-way parallel.
#PBS -l select=4:ncpus=24:mpiprocs=24:mem=32GB:nodetype=haswell_reg
#### License resource request.  
#PBS -l cfd_base=1
#PBS -l anshpc=92
## For your own benefit, try to estimate a realistic walltime request.  Over-estimating the 
## wallclock requirement interferes with efficient scheduling, will delay the launch of the job,
## and ties up more of your CPU-time allocation until the job has finished.
#PBS -q normal
#PBS -P myprojectcode
#PBS -l walltime=1:00:00
#PBS -o /mnt/lustre/users/username/FluentTesting/fluent.out
#PBS -e /mnt/lustre/users/username/FluentTesting/fluent.err
#PBS -m abe
#PBS -M username@email.co.za
##### Running commands
#### Put these commands in your .bashrc file as well, to ensure that the compute nodes
#### have the correct environment.  Ensure that any OpenFOAM-related environment
#### settings have been removed. 
####### PLEASE NOTE THAT THE LICENSE SERVER HAS NOW CHANGED; IT IS NOW login1
export LM_LICENSE_FILE=1055@login1
export ANSYSLMD_LICENSE_FILE=1055@login1
# Edit this next line to select the appropriate version.  
export PATH=/apps/chpc/compmech/CFD/ansys_inc/v221/fluent/bin:$PATH
export FLUENT_ARCH=lnamd64
#### explicitly set working directory and change to that.
export PBS_JOBDIR=/mnt/lustre/users/username/FluentTesting
cd $PBS_JOBDIR
nproc=`cat $PBS_NODEFILE | wc -l`
exe=fluent
$exe 3d -t$nproc -pinfiniband -ssh -cnf=$PBS_NODEFILE -gu -driver null -i journalFile.jou > run.out
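
A journalFile.jou for this second method might, for example, read a case, iterate and periodically write images. The sketch below is illustrative only: the file names are placeholders, and the picture-related TUI commands in particular differ between Fluent versions, so record and test a journal interactively before relying on it in batch.

/file/read-case-data myCase.cas.gz
; graphics objects and picture settings would be defined here, for example a contour
; of pressure, followed by /display/save-picture commands after each block of iterations
/solve/iterate 500
/file/write-case-data myCase-final.cas.gz
exit
yes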

Running Fluent on GPUs (experimental)

Within fairly strict limitations, it is now possible to run Fluent on Nvidia GPUs instead of CPUs. We have good news and bad news for you about this. The good news is that the performance is spectacularly good and can be regarded as game-changing. A single V100 card has more or less the same performance as 8 Lengau compute nodes with 192 cores. The bad news items are:

  • Only some physics models and solvers are supported, refer to the Ansys documentation for more information
  • The CHPC's GPU cluster is very small, consisting of only 30 Nvidia V100 cards, distributed over several compute nodes
  • Although multi-GPU running works well, the CHPC's GPU cluster does not currently allow multi-GPU runs that span multiple nodes. It is also only worth doing on nodes where the GPUs are connected through NVlink, rather than the PCIe bus.
  • At this stage, the CHPC is not opening up the GPU resources to Fluent users in general, but watch this space.

Putting together a Fluent GPU job script

The most important thing to bear in mind is that there should be one MPI rank for each GPU: no more and no fewer. The GPU nodes have plenty of CPU cores, so you may as well assign 10 CPU cores per GPU, or just 1; it does not matter. The entire job runs on the GPUs, although each GPU requires one CPU core to control it. The resource request line in the job script needs to specify ngpus in addition to the usual ncpus and mpiprocs. At this stage, set anshpc to 20 per GPU.

Please note that the walltime limit on the GPU queues is just 12 hours. The GPUs have limited amounts of memory, so if your job mysteriously crashes, the most probable cause is inadequate memory. Most of the cards have only 16GB, although there are some with 32GB. For this reason, do not use double precision unless you really need it.

There are separate GPU queues for 1, 2, 3 and 4 GPU jobs. Ensure that you use the correct one.

Once the job is running, find the hostname of the node where your job is running with qstat -n1 followed by the job number. Then ssh into that node and monitor the GPU activity with nvidia-smi, or, more usefully, with nvtop. There is a module for nvtop:

 module load chpc/compmech/nvtop/1.2.2

Alternatively, just give the full path to nvtop or set up an alias for it:

/apps/chpc/compmech/nvtop/bin/nvtop
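
Putting the monitoring steps above together, a typical sequence from the login node might look like this (the job number and GPU node name are placeholders):

qstat -n1 123456
ssh gpunode01
module load chpc/compmech/nvtop/1.2.2
nvtop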

Single GPU example script

runFluent1GPU.qsub
#!/bin/bash
#PBS -l select=1:ncpus=10:mpiprocs=1:ngpus=1
#PBS -l cfd_base=1
#PBS -l anshpc=20
#PBS -q gpu_1
#PBS -P MECH1234
#PBS -l walltime=02:00:00
#PBS -o /mnt/lustre/users/jblogs/FluentGPUcase/fluent_1gpu.out
#PBS -e /mnt/lustre/users/jblogs/FluentGPUcase/fluent_1gpu.err
export LM_LICENSE_FILE=1055@login1
export ANSYSLMD_LICENSE_FILE=1055@login1
# Edit this next line to select the appropriate version.
export PATH=/apps/chpc/compmech/CFD/ansys_inc/v231/fluent/bin:$PATH
export FLUENT_ARCH=lnamd64
#### explicitly set working directory and change to that.
export PBS_JOBDIR=/mnt/lustre/users/jblogs/FluentGPUcase
cd $PBS_JOBDIR
fluent 3d  -t1 -pinfiniband -ssh -cnf=$PBS_NODEFILE -gpuapp -gpgpu=1  -g < iterate.txt > run1gpu.out

Triple GPU example script

runFluent3GPU.qsub
#!/bin/bash
#PBS -l select=1:ncpus=10:mpiprocs=3:ngpus=3
#PBS -l cfd_base=1
#PBS -l anshpc=60
#PBS -q gpu_3
#PBS -P MECH1234
#PBS -l walltime=0:40:00
#PBS -o /mnt/lustre/users/jblogs/FluentGPUcase/fluent_3gpu.out
#PBS -e /mnt/lustre/users/jblogs/FluentGPUcase/fluent_3gpu.err
export LM_LICENSE_FILE=1055@login1
export ANSYSLMD_LICENSE_FILE=1055@login1
# Edit this next line to select the appropriate version.
export PATH=/apps/chpc/compmech/CFD/ansys_inc/v231/fluent/bin:$PATH
export FLUENT_ARCH=lnamd64
#### explicitly set working directory and change to that.
export PBS_JOBDIR=/mnt/lustre/users/jblogs/FluentGPUcase
cd $PBS_JOBDIR
fluent 3d  -t3 -pinfiniband -ssh -cnf=$PBS_NODEFILE -gpuapp -gpgpu=3  -g < iterate.txt > run3gpu.out

Ansys Fluent performance on NVidia-V100 GPUs

This graph compares performance between CPUs and GPUs for single- and double-precision runs.

If a GUI is required

Some tasks, such as setting up runs, meshing or post-processing may require a graphics-capable login. This is possible in a number of ways. Using a compute node for a task that requires graphics involves a little bit of trickery, but is really not that difficult.

Getting use of a compute node

Obtain exclusive use of a compute node by logging into Lengau according to your usual method, and obtaining an interactive session:

qsub -I -l select=1:ncpus=24:mpiprocs=24 -q smp -P MECH1234 -l walltime=4:00:00

Obviously, replace MECH1234 with the shortname of your particular Research Programme. Note down the name of the compute node that you have been given; let us use cnode0123 for this example. You can also use an interactive session like this to perform “service” tasks, such as archiving or compressing data files, which would be killed if attempted on the login node.
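
For example, compressing a directory of result files from within such an interactive session (the user name and paths are placeholders):

cd /mnt/lustre/users/username
tar -czf FluentTesting_archive.tar.gz FluentTesting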

Getting a graphics-capable session on a compute node

There are three ways of doing this:

  • X-forwarding by means of a VNC session
  • X-forwarding in two stages
  • Run a VNC session directly on your compute node. Instructions are here:

X-forwarding in two stages is really only a practical proposition if you are on a fast, low-latency connection into the SANReN network. Otherwise, get the VNC session first by following these instructions.

Double X-forwarding

From an X-windows capable workstation (in other words, from a Linux terminal command prompt, or an emulator on Windows that includes an X-server, such as MobaXterm, or a VNC session on one of the visualization nodes), log in to Lengau:

 ssh -X jblogs@lengau.chpc.ac.za 

Once logged in, do a second X-forwarding login to your assigned compute node:

 ssh -X cnode0123 

Alternatively, you can also request an interactive PBS session with X-forwarding:

 qsub -I -l select=1:ncpus=24:mpiprocs=24 -q smp -P MECH1234 -l walltime=4:00:00 -X 

X-forwarding from the VNC session

A normal broadband connection will probably be too slow to use the double X-forwarding method. In this case, first get the VNC desktop going, as described above, and open a terminal. From this terminal, log in to your assigned compute node:

 ssh -X cnode0123 

Set up the appropriate environment

export LM_LICENSE_FILE=1055@login1
export ANSYSLMD_LICENSE_FILE=1055@login1
export PATH=/apps/chpc/compmech/CFD/ansys_inc/v221/fluent/bin:$PATH
export FLUENT_ARCH=lnamd64

Run fluent

You can now simply start the program in the usual way, with the command

 fluent 3d -t24 -ssh 

Thanks to the magic of software rendering, you have access to the GUI and graphics capability of the interface.

Remote Solution Monitoring and Control

Starting with version 19.0 of the software, it is possible to use a GUI to connect to a Fluent process that is already running. This requires that Fluent be started with access to an X-server, so use a run command that contains the parameters -gu -driver null. Here is a minimalist example of such a script:

runFluentWith_flremote.qsub
#!/bin/bash
#PBS -l select=5:ncpus=24:mpiprocs=24:nodetype=haswell_reg
#PBS -q normal
#PBS -P MECH1234
#PBS -l walltime=12:00:00
#PBS -o /mnt/lustre/users/username/FluentTest/fluent.out
#PBS -e /mnt/lustre/users/username/FluentTest/fluent.err
#PBS -l cfd_base=1
#PBS -l anshpc=116
export LM_LICENSE_FILE=1055@login1
export ANSYSLMD_LICENSE_FILE=1055@login1
export PATH=$PATH:/apps/chpc/compmech/CFD/ansys_inc/v221/fluent/bin
export FLUENT_ARCH=lnamd64
cd /mnt/lustre/users/username/FluentTest
nproc=`cat $PBS_NODEFILE | wc -l`
fluent 3ddp -t$nproc -pinfiniband -ssh -mpi=intel -cnf=$PBS_NODEFILE -gu -driver null -i runCommands.txt | tee fluentrun.out

It is critical that the file containing the run instructions, in this case called runCommands.txt, has the following line:

server/start-server server-info.txt

This will create a file called server-info.txt, which contains the hostname of the master node, as well as a port number which the remote client will need to connect to.
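
A minimal runCommands.txt might then look like the sketch below. The case name and iteration count are placeholders; the essential part for remote monitoring is the server/start-server line:

/file/read-case-data myCase.cas.gz
server/start-server server-info.txt
/solve/iterate 5000
/file/write-case-data myCase-final.cas.gz
exit
yes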

On the viz node (you have a TurboVNC session open, right?), get a terminal, change directory to where your Fluent run is, and issue the following command:

/opt/VirtualGL/bin/vglrun /apps/chpc/compmech/CFD/ansys_inc/v190/fluent/bin/flremote &

The Fluent Remote Visualization Client will start up. Provide the appropriate Server Info Filename and you will be able to connect to your Fluent process.

Different methods of uploading a simulation

Build and test locally, upload .cas file

The “standard” process assumes that the user already has a local license for the software.

  • Mesh and pre-process the simulation as usual for a local simulation.
  • Test it locally to ensure that everything works properly. Be cautious about absolute path file names.
  • Compress the case file, either with gzip or by saving it as a .cas.gz file.
  • Upload to CHPC using either scp or rsync. The advantage of rsync is that the transfer can be made persistent, to prevent network communication glitches from killing the file transfer.
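
For example, a resumable rsync transfer from your workstation could look like this (the user name and target directory are placeholders; the -P option keeps partially transferred files so that an interrupted transfer can simply be restarted):

rsync -avP myCase.cas.gz username@lengau.chpc.ac.za:/mnt/lustre/users/username/FluentTesting/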

Build and test locally, upload geometry and script files only, mesh and pre-process remotely

If your simulation files are too large, or your internet connection too slow, consider transferring geometry and script files only. This will require careful scripting and testing, but is certainly practical.

  • There are two methods available for meshing on the CHPC system. Either work with IcemCFD or use the built-in T-Grid based meshing in Fluent itself. Neither ANSYS-Mesh nor Gambit is available on the CHPC system.
  • If using the internal Fluent meshing, it will be necessary to transfer the surface grid and a file containing the necessary Fluent meshing and job set up instructions.
  • If using IcemCFD, transfer the Icem .prj, .tin, .fbc and .blk (if using hexa) files, along with a recorded Icem script file for generating the mesh and exporting it in Fluent format. Watch out for absolute path names in the script file. Run IcemCFD with the -batch -script options to create the mesh (see the sketch after this list). A more comprehensive Fluent script will be required to import the mesh and pre-process the case. Test locally!
  • If your internet connection is too slow to permit easy case uploading, it will also be far too slow for downloading the results files. Consider generating post-processing images “on the fly”, or alternatively exporting only surface data on completion of the simulation.
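
A sketch of the IcemCFD batch step mentioned above (the replay-script name is a placeholder, and it is assumed that the Icem launcher icemcfd from the Ansys installation is on your PATH):

 icemcfd -batch -script makeMesh.rpl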

General tips and advice

  • Give some thought to the resources being requested. Partitioning a simulation too finely will not necessarily speed it up as expected. Although our tests indicate that Fluent scales well down to as few as 15 000 cells per core, please give some thought to license usage. The Fluent license on the cluster is a shared resource, and using too many of the available anshpc (parallel) licenses may delay the launch of your job, or delay others. Refer to the graphs below to get a better quantitative indication of scaling. Commercial users should also take into account that the best performance per node is achieved by using the full 24 cores per node, but performance per core benefits substantially from using fewer than 24.
  • A request for a smaller number of cores may result in the job launching earlier, resulting in reduced turn-around time, even if the job takes longer to run.
  • Monitoring convergence of batch jobs can be painful but necessary.
  • Monitor files (such as cd or cl files) can be plotted with gnuplot even if no Fluent GUI is available. On a slow connection, consider using gnuplot with set term dumb to get funky 1970s-style ASCII graphics (see the example after this list).
  • If you need to submit a large number of small jobs, when doing a parametric study, for example, please use Job Arrays. Refer to the PBS-Pro guide at http://wiki.chpc.ac.za/quick:pbspro for guidance on how to set this up.
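
As an example of the ASCII-graphics approach mentioned in the monitoring bullet above (the monitor file name and column layout are placeholders; check the format of your own monitor files first):

 gnuplot -e "set term dumb; plot 'cd-history.out' using 1:2 with lines"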
