OpenFOAM is an open-source Computational Fluid Dynamics (CFD) package: a C++ toolbox for developing customized numerical solvers and pre-/post-processing utilities for the solution of continuum mechanics problems. The code is typically memory-bandwidth limited, so the number of memory channels matters more than sheer core count.
The benchmark used for the code models wind flow in a metropolitan area, with a total of 80 million cells.
Developers Site: http://www.openfoam.com/
HPCAC Best Practices: http://www.hpcadvisorycouncil.com/pdf/OpenFOAM_Best_Practices.pdf
Install Guide: http://openfoamwiki.net/index.php/Installation/Linux/OpenFOAM-2.3.1/CentOS_SL_RHEL
Install repo dependencies:
sudo yum groupinstall 'Development Tools'
sudo yum install zlib-devel readline-devel ncurses-devel texinfo gstreamer-plugins-base-devel libXext-devel libGLU-devel libXt-devel
sudo yum install libXrender-devel libXinerama-devel libpng-devel libXrandr-devel libXi-devel libXft-devel libXcursor-devel
Download the source:
mkdir OpenFOAM
cd OpenFOAM
wget "http://downloads.sourceforge.net/foam/OpenFOAM-2.4.0.tgz?use_mirror=mesh" -O OpenFOAM-2.4.0.tgz
tar -xf OpenFOAM-2.4.0.tgz
wget "http://downloads.sourceforge.net/foam/ThirdParty-2.4.0.tgz?use_mirror=mesh" -O ThirdParty-2.4.0.tgz
tar -xf ThirdParty-2.4.0.tgz
Compile the code:
cd OpenFOAM-2.4.0
source etc/bashrc
export MAKEFLAGS='-j<N>'
./Allwmake
Download the benchmark here: SimpleBenchMarkLarge.tar.gz
cd ~
tar -xf SimpleBenchMarkLarge.tar.gz
cd SimpleBenchMarkLarge
Open the decomposition file and set the number of subdomains to the number of CPU cores:
vim system/decomposeParDict
numberOfSubdomains <N>;
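For scripted runs, the same change can be made without opening an editor. The following is a minimal sketch, assuming the dictionary already contains a numberOfSubdomains entry with an integer value on a single line; the helper name set_subdomains is hypothetical and not part of OpenFOAM.

```shell
# Hypothetical helper (not an OpenFOAM tool): set numberOfSubdomains in place.
# Usage: set_subdomains <N> [path-to-decomposeParDict]
set_subdomains() {
    n="$1"
    dict="${2:-system/decomposeParDict}"
    # Rewrite only the integer value on the numberOfSubdomains line.
    # A backup of the original file is kept with a .bak suffix.
    sed -i.bak -E "s/^([[:space:]]*numberOfSubdomains[[:space:]]+)[0-9]+;/\1${n};/" "$dict"
}
```

For example, `set_subdomains 24` run from the case directory sets the entry to 24 and leaves the rest of the file untouched.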
Run the domain decomposition with:
decomposePar -force > decompose.out
Run the benchmark with:
mpirun -np <N> -hostfile <HF> simpleFoam -parallel | tee simple.out
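Once the run has finished, the elapsed time can be read from the log. OpenFOAM solvers print a line of the form "ExecutionTime = ... s  ClockTime = ... s" after each iteration, so the last such line in simple.out gives the total. A minimal sketch, with last_clock_time as a hypothetical helper name:

```shell
# Hypothetical helper: print the final ClockTime (in seconds) from a solver log.
# Usage: last_clock_time [logfile]
last_clock_time() {
    log="${1:-simple.out}"
    awk '
        /^ExecutionTime/ {
            # Find the ClockTime label; its value sits two fields later ("ClockTime = 123 s").
            for (i = 1; i <= NF; i++) if ($i == "ClockTime") t = $(i + 2)
        }
        END { if (t != "") print t }
    ' "$log"
}
```

Dividing the printed value by 60 gives the wall time in minutes, which is the figure quoted for benchmark results.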
By default, the code is set up to run for 100 iterations. Changing this value is not recommended; however, the runtime of the benchmark can be altered by changing the endTime variable in the system/controlDict file. The total memory usage of this benchmark is approximately 80 GB.
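If a different run length is needed, endTime can be changed from the command line. This is a sketch under the assumption that controlDict holds a numeric endTime entry on its own line; set_end_time is a hypothetical helper name. The pattern is anchored to the start of the line so that an entry such as "stopAt endTime;" is not modified.

```shell
# Hypothetical helper: set the endTime value in a controlDict.
# Usage: set_end_time <value> [path-to-controlDict]
set_end_time() {
    end="$1"
    dict="${2:-system/controlDict}"
    # Match only lines that begin with endTime followed by a number.
    # A backup of the original file is kept with a .bak suffix.
    sed -i.bak -E "s/^([[:space:]]*endTime[[:space:]]+)[0-9.]+;/\1${end};/" "$dict"
}
```

For example, `set_end_time 200` run from the case directory requests the 200-iteration variant of the benchmark.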
The Dell C4130, with 24 Haswell cores, completed the 100-iteration benchmark in 41.2 minutes (200 iterations took 79.5 minutes).
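The two timings can be turned into per-iteration figures with a little arithmetic. The sketch below uses only the 41.2 and 79.5 minute values quoted above; the function name benchmark_summary is hypothetical, and the "fixed overhead" line rests on the assumption that every iteration costs the same, which the source does not state.

```shell
# Hypothetical helper: derive per-iteration cost from the two quoted timings.
benchmark_summary() {
    awk 'BEGIN {
        t100 = 41.2   # minutes for 100 iterations (from the text)
        t200 = 79.5   # minutes for 200 iterations (from the text)
        printf "avg s/iter (100 iterations): %.2f\n", t100 * 60 / 100
        printf "avg s/iter (200 iterations): %.2f\n", t200 * 60 / 200
        # Cost of the second block of 100 iterations only.
        printf "incremental s/iter: %.2f\n", (t200 - t100) * 60 / 100
        # Assumption: constant per-iteration cost, so the remainder is startup.
        printf "implied fixed overhead (min): %.1f\n", t100 - (t200 - t100)
    }'
}
benchmark_summary
```

This gives 24.72 s and 23.85 s per iteration on average, 22.98 s per iteration for the second hundred, and roughly 2.9 minutes that may be attributable to startup under the stated assumption.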