====== Running ARW / WRF at the CHPC ======
There are several versions of WRF, built with different combinations of compiler and MPI implementation, installed on the filesystem in ''/apps/chpc/earth/''.  The latest version is WRF-4.1.1, built with the Intel compiler.  Tests have indicated a very large benefit from using the Intel compiler and MPI rather than the Gnu compiler with OpenMPI or MPICH.  MPICH versions need the mpirun argument ''-iface ib0'' to force them to use the Infiniband network.  Please note that it is essential to set an unlimited stack size for the Intel-compiled version, as done in the script below.  To set up an appropriate environment, "source" the setWRF file in the required directory with a command of the following form: ''. /apps/chpc/earth/WRF-3.8-impi/setWRF''.  This command should be placed in the PBS-Pro job submission script.  Users need to develop their own workflows, but it is also practical to execute the pre-processing steps ''geogrid.exe, ungrib.exe, metgrid.exe and real.exe'' on a single node in an interactive session.  Simply give the command ''qsub -I -q smp -P <AAAA0000>'', where <AAAA0000> should be replaced with **your** project code, to obtain an interactive session.  Do not try to run these pre-processing steps from the login shell, as the shared login node cannot sustain a high workload.  The real.exe pre-processing step may run into memory constraints for large cases.  In that case, run real.exe in parallel over the requested number of nodes, but with only one process per node, as per the example script.
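The following is a minimal sketch of such an interactive pre-processing session.  The project code, the run directory and the choice of the WRF-3.8-impi environment are placeholders only, and the WPS executables are assumed to have been linked or copied into the run directory in the usual way.

<file>
# Request an interactive session (replace AAAA0000 with your own project code)
qsub -I -q smp -P AAAA0000

# Once the session starts, set up the WRF environment and an unlimited stack size
. /apps/chpc/earth/WRF-3.8-impi/setWRF
ulimit -s unlimited

# Run the pre-processing steps from your own run directory (path is an example only)
cd /home/username/scratch/WPS
./geogrid.exe
./ungrib.exe
./metgrid.exe
</file>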
  
==== OpenMP ====
WRF-4.0 and WRF-4.1.1 have been installed with support for OpenMP.  It is therefore possible to run with the same total number of cores, but fewer MPI processes.  By default, the environment variable **OMP_NUM_THREADS** is set to 1 in the setWRF script.  Testing on Lengau has confirmed that there are substantial performance benefits to be obtained from using OpenMP.  Benchmark results are given below, but it appears to be close to optimal to use 6 MPI ranks per node, with 4 OpenMP threads per MPI rank.  If you want to experiment with OpenMP, set this variable in your job script **after** sourcing the setWRF script.  Although the WRF-4 / gcc-8.3.0 / mpich-3.3 installation also supports OpenMP, performance testing indicates that this version does not benefit from using OpenMP.  The version compiled with the PGI compiler is competitive with the Intel version when using MPI only, but also does not benefit from adding OpenMP.
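As a rough illustration of this hybrid layout, the fragment below asks for 6 MPI ranks per node and 4 OpenMP threads per rank on 4 nodes.  The node count, resource request and output file name are assumptions for the sketch only and should be adapted to your own job script.

<file>
#PBS -l select=4:ncpus=24:mpiprocs=6   ## example request: 4 nodes, 6 MPI ranks per node
export WRFDIR=/apps/chpc/earth/WRF-4.1.1-pnc-impi
. $WRFDIR/setWRF
ulimit -s unlimited
# Set the thread count *after* sourcing setWRF, which defaults OMP_NUM_THREADS to 1
export OMP_NUM_THREADS=4
export OMP_STACKSIZE=2G
# 4 nodes x 6 ranks per node = 24 MPI processes, each running 4 OpenMP threads
exe=$WRFDIR/WRF/run/wrf.exe
mpirun -np 24 -machinefile $PBS_NODEFILE $exe &> wrf.out
</file>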
  
==== WRF, Parallel NetCDF and I/O Quilting ====
<file>
# ...
#PBS -m abe
#PBS -M username@unseenuniversity.ac.za
### Source the WRF-4.1.1 environment:
export WRFDIR=/apps/chpc/earth/WRF-4.1.1-pnc-impi
. $WRFDIR/setWRF
# Set the stack size unlimited for the Intel compiler
# ...
nnodes=`cat hosts | wc -l`
# Run real.exe with one process per node
exe=$WRFDIR/WRF/run/real.exe
mpirun -np $nnodes -machinefile hosts $exe &> real.out
# Run wrf.exe with the full number of processes
exe=$WRFDIR/WRF/run/wrf.exe
mpirun -np $nproc -machinefile $PBS_NODEFILE $exe &> wrf.out
</file>
<file>
# ...
#PBS -m abe
#PBS -M username@unseenuniversity.ac.za
### Source the WRF-4.1.1 environment with parallel NetCDF:
export WRFDIR=/apps/chpc/earth/WRF-4.1.1-pnc-impi
. $WRFDIR/setWRF
# Set the stack size unlimited for the Intel compiler
# ...
export PBS_JOBDIR=/home/username/scratch/WRFV3_test/run
cd $PBS_JOBDIR
exe=$WRFDIR/WRF/run/wrf.exe
# Clear and re-set the Lustre striping for the job directory.  For the Lustre configuration
# used by CHPC, a stripe count of 12 should work well.
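# The striping commands themselves are not shown in this excerpt; a minimal sketch of
# what they could look like (assumed, not necessarily identical to the original script):
#   lfs setstripe -d $PBS_JOBDIR      # remove the existing default striping of the directory
#   lfs setstripe -c 12 $PBS_JOBDIR   # set a default stripe count of 12 for new files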
# ...
export OMP_STACKSIZE=2G
### Source the appropriate environment script
. /apps/chpc/earth/WRF-4.1.1-pnc-impi/setWRF
export PBSJOBDIR=/home/userid/lustre/WRFrun/wrf4.out
cd $PBSJOBDIR
# ...
</file>