Dell Cluster

Overview

The Dell cluster consists of 240 Dell C6100 nodes (12 cores, two sockets, Intel Xeon 2.93 GHz, 36 GB RAM, QDR ConnectX-2 InfiniBand), which are linked into the pre-existing cluster Ethernet network and InfiniBand fabric.

These nodes are:

  • cnode-6-1 to cnode-6-72
  • cnode-7-1 to cnode-7-72
  • cnode-8-1 to cnode-8-72
  • cnode-9-1 to cnode-9-23
  • dell-login01

and run a single operating system image, which at the time of installation is based on the CentOS 5.7 distribution of Linux (a rebuild of Red Hat Enterprise Linux 5.7). Features of this OS distribution include:

  • GCC 4.1.2
  • glibc 2.5
  • kernel 2.6.18-274.3.1.el5
  • OFED 1.5.3

Because of these differences, code development on the Dell nodes is handled differently from the rest of the Tsessebe system as it currently operates. However, all the Dell nodes see the same /opt/gridware NFS and scratch Lustre filesystems as the older nodes.
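To confirm which environment a shell is actually running in, the versions listed above can be printed and compared. The following is a minimal sketch that assumes only standard utilities (uname, getconf, gcc); the function name show_versions is purely illustrative and is not part of the cluster setup.

```shell
# Print the OS release, kernel, C library and GCC versions of the current node.
# show_versions is a hypothetical helper name used only for this illustration.
show_versions() {
    if [ -f /etc/redhat-release ]; then
        echo "release: $(cat /etc/redhat-release)"
    else
        echo "release: /etc/redhat-release not present"
    fi
    echo "kernel: $(uname -r)"
    # getconf GNU_LIBC_VERSION is only available on glibc-based systems.
    echo "libc: $(getconf GNU_LIBC_VERSION 2>/dev/null || echo unknown)"
    if command -v gcc >/dev/null 2>&1; then
        echo "gcc: $(gcc -dumpversion)"
    else
        echo "gcc: not found"
    fi
}

show_versions
```

Running this on dell-login01 and on an older login node should make the differences between the two environments visible.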

Modifications to shell initialization scripts

The following modifications should be made to the shell initialization scripts of new users; current users will need to make these changes individually. These changes are effective only on the Dell nodes (to be exact, on any node which has a file /etc/redhat-release) and will have no effect on the older nodes.

~/.bashrc

# Added for new Dell nodes:
if [ -f /etc/redhat-release ]; then
	. /etc/bashrc
	# Initialize CHPC modules:
	. /opt/gridware/modules-3.2.7/modules.sh
	module load dell/default-environment
fi

~/.profile

# Added for new Dell nodes:
if [ -f /etc/redhat-release ]; then
	. ~/.bashrc
fi

These changes initialize the environment so that Dell-specific software is loaded into the search paths. The new module, dell/default-environment, automatically loads the pre-existing module inteltools (providing Intel compilers 12.0 and Intel MPI 4.0.1) and makes additional environment variable settings.

Please note that these changes are essential for proper operation of an account using the Dell compute nodes.
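After logging in again, it may be worth checking that the compilers and MPI wrappers named in this guide are actually on the search path (running module list should also show the loaded modules). The following is a small sketch of such a check; check_tools is a hypothetical helper name, not something provided on the cluster.

```shell
# Report whether each named command is on the PATH.
# Returns non-zero if any of them is missing.
# check_tools is a hypothetical helper name used only for this illustration.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: $(command -v "$tool")"
        else
            echo "$tool: NOT FOUND"
            missing=1
        fi
    done
    return "$missing"
}

# Compilers and MPI wrappers referred to in this guide.
check_tools icc ifort mpicc mpif90 \
    || echo "Environment looks incomplete; re-check ~/.bashrc and ~/.profile"
```

If any tool is reported as missing on a Dell node, the most likely cause is that the sections above were not added, or that a later customisation overwrote PATH.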

Custom changes to ~/.bashrc or ~/.profile are still possible, but care should be taken not to interfere with the initial environment set up above. Customisations should therefore respect the following conditions:

  • customisations should appear in the files after the above sections
  • customisations should not overwrite settings to PATH or LD_LIBRARY_PATH but should append or prepend to them instead.
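As an illustration of the second condition, a customisation placed after the Dell section might look like the sketch below. The directories $HOME/bin and $HOME/lib are hypothetical examples and should be replaced with real ones.

```shell
# Hypothetical personal directories; substitute real ones.
MY_BIN="$HOME/bin"
MY_LIB="$HOME/lib"

# Prepend to PATH, keeping everything the module has already set.
export PATH="$MY_BIN:$PATH"

# Append to LD_LIBRARY_PATH. The ${VAR:+...} form avoids a stray leading colon
# (an empty entry, which means "current directory") when the variable is empty.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$MY_LIB"

# Avoid this form, which would discard the paths set up by the module:
# export PATH="$HOME/bin"
```

Prepending gives the personal directory priority over system tools of the same name, while appending leaves the module's libraries first in the search order.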

Code Development

To ensure compatibility of user code with the Dell nodes, it is recommended that binaries intended to run on them are recompiled in the target environment. To this end, one node has been configured as a new login node: dell-login01. Users should log in to dell-login01 from one of the existing login nodes (login01, login02) using ssh, as in:

ssh dell-login01

Standard Linux utilities and the GCC compilers are available on this node, and the Intel compilers 12.0 and Intel MPI are already loaded into the environment by the dell/default-environment module. It is recommended to compile code using the Intel compilers (icc and ifort) for best performance. For example:

icc -O3 -ip -xHOST -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread prog.c -o prog

where the Intel MKL is also linked dynamically (this is also recommended). For MPI code, the above becomes:

mpicc -O3 -ip -xHOST -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread prog.c -o prog

For Fortran, use ifort or mpif90 for serial or MPI codes, respectively.
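Because MKL is linked dynamically, the resulting binary depends on the MKL shared libraries being found at run time. One way to check this is with ldd, as in the following sketch; mkl_deps is a hypothetical helper name, and prog is the binary built by the commands above.

```shell
# List the MKL shared libraries a binary depends on.
# Returns non-zero if none are listed or if any dependency is unresolved.
# mkl_deps is a hypothetical helper name used only for this illustration.
mkl_deps() {
    deps=$(ldd "$1" 2>/dev/null)
    if echo "$deps" | grep -q 'not found'; then
        # Show the unresolved libraries and report failure.
        echo "$deps" | grep 'not found'
        return 1
    fi
    echo "$deps" | grep -i 'libmkl'
}

# Only run the check if the binary has actually been built.
if [ -x ./prog ]; then
    mkl_deps ./prog || echo "prog: MKL libraries missing or unresolved"
fi
```

An unresolved library here usually points to LD_LIBRARY_PATH having been overwritten rather than extended.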

Note that the -xHOST flag optimises the binary for the type of CPU on which the compilation takes place. Since dell-login01 is identical in hardware to the Dell compute nodes, that is appropriate here. In practice, the same binary should be compatible with the CPUs on other nodes as long as they are Intel Westmere or Nehalem CPUs (but not Harpertown, which has a different microarchitecture). However, the program may still fail on other node types because of library incompatibility if the Linux distribution differs.
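A rough way to judge whether a given node could run such a binary is to inspect its CPU flags. The sketch below assumes that -xHOST on the Dell nodes results in code using up to SSE4.2 instructions, which Westmere and Nehalem support and Harpertown does not; that assumption is not stated in this guide and should be verified against the compiler documentation. has_cpu_flag is a hypothetical helper name.

```shell
# Return success if the given CPU flag appears in a cpuinfo-style file.
# has_cpu_flag is a hypothetical helper; the optional second argument defaults
# to /proc/cpuinfo and exists mainly so the function can be tried on a sample file.
has_cpu_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" 2>/dev/null | grep -qw "$1"
}

# Assumption: binaries built with -xHOST on the Dell nodes need SSE4.2.
if has_cpu_flag sse4_2; then
    echo "SSE4.2 present: CPU should be able to run -xHOST binaries from the Dell nodes"
else
    echo "SSE4.2 absent: rebuild on this node type instead"
fi
```

This only addresses the instruction set; it says nothing about the library incompatibilities mentioned above.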

Last modified: 2012/03/02 17:10 by kevin