NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Based on Charm++ parallel objects, NAMD scales to hundreds of cores for typical simulations and beyond 500,000 cores for the largest simulations. NAMD uses the popular molecular graphics program VMD for simulation setup and trajectory analysis, but is also file-compatible with AMBER, CHARMM, and X-PLOR. NAMD is distributed free of charge with source code. You can build NAMD yourself or download binaries for a wide variety of platforms. Our tutorials show you how to use NAMD and VMD for biomolecular modeling.
Developers Site http://www.ks.uiuc.edu/Research/namd/
cd ~
mkdir NAMD
cd NAMD
mkdir code bench
cd code
The code is shipped as a compiled binary, which has proven sufficient for testing. Two versions are available below: one for x86 processors and one for CUDA-enabled GPUs.
Download the two tars into the current directory.
tar -xf NAMD_2.10_Linux-x86_64-ibverbs-smp.tar.gz
tar -xf NAMD_2.10_Linux-x86_64-ibverbs-smp-CUDA.tar.gz
Before beginning the benchmark, ensure that the correct version of the code is sourced, i.e. that the desired build's namd2 binary comes first on your PATH.
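One way to do this is to prepend the build's directory to PATH; a minimal sketch, assuming the x86 tarball above was extracted into ~/NAMD/code (use the -CUDA directory for the GPU build instead):

```shell
# Put the desired NAMD build first on PATH (directory name taken from
# the x86 tarball above; swap in the -CUDA directory for GPU runs).
export PATH="$HOME/NAMD/code/NAMD_2.10_Linux-x86_64-ibverbs-smp:$PATH"
```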
There are two benchmarks set up for NAMD, both of which work with the CUDA and x86 versions.
The smaller test case is the apoa1 benchmark, with 2500 timesteps and 250 step output frequency: apoa1.tar.gz
The larger test case is the STMV benchmark, with 1500 timesteps and 500 step output frequency: stmv.tar.gz
NOTE: the benchmark performs a one-off FFT setup based on the number of cores used, so run each test twice and take the timing from the second run to exclude this overhead.
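The run-twice pattern can be wrapped in a small helper; a sketch using a hypothetical run_bench function that discards the warm-up run and reports the wall time of the second:

```shell
# run_bench CMD [ARGS...] -- run the command twice and print the wall
# time (whole seconds) of the second run, so the one-off FFT setup
# cost from the first run is excluded. Helper name is illustrative.
run_bench() {
    "$@" > /dev/null 2>&1          # warm-up run (FFT setup happens here)
    local start end
    start=$(date +%s)
    "$@" > /dev/null 2>&1          # timed run
    end=$(date +%s)
    echo $((end - start))
}
```

Usage would then be, e.g., `run_bench namd2 +idlepoll +p24 apoa1.namd`.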
To run the benchmark for x86 CPUs, use the following command:
namd2 +idlepoll +p<N> <CONFIGFILE>
Note: the config file has the extension .namd.
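Filling in the template for a concrete run, a sketch assembling the apoa1 command for 24 cores (the core count used in the C4130 result below; the config path is illustrative):

```shell
# Build the x86 run command for the apoa1 benchmark on 24 cores.
# The relative config path assumes apoa1.tar.gz was extracted here.
ncores=24
cfg=apoa1/apoa1.namd
cmd="namd2 +idlepoll +p${ncores} ${cfg}"
echo "$cmd"
```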
A Dell C4130 with 24 Haswell cores completed the apoa1 test in 124 seconds, and the STMV test in 794 seconds.
To run the benchmark for CUDA GPUs, use the following command:
namd2 +idlepoll +p<N> +devices <0,...,n> <CONFIGFILE>
Note: tests have shown that 4-6 CPU threads per GPU are optimal for servicing the GPUs effectively.
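Putting the thread-per-GPU guidance together with the +devices list, a sketch assuming 6 CPU threads per GPU (within the 4-6 range above), GPUs numbered 0..n-1, and an illustrative config path:

```shell
# Derive a GPU run command: 6 CPU threads per GPU, devices listed as
# a comma-separated range 0..ngpus-1. Adjust ngpus for your node.
ngpus=8
threads=$((ngpus * 6))
devices=$(seq -s, 0 $((ngpus - 1)))
gpucmd="namd2 +idlepoll +p${threads} +devices ${devices} stmv.namd"
echo "$gpucmd"
```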
A Dell C4130 with 4 Nvidia K80s (8 GPUs) produced the results below:
! GPUs ! apoa1 (seconds) ! STMV (seconds)