====== CMC User guide ======

This guide is intended for CHPC staff and special users.

===== Overview =====

The CMC cluster is built using Dell servers. The system consists of a PowerEdge M1000e chassis with 16 M6220 compute nodes. Each compute node has 12 cores and 36 GB of memory. There are 12 management nodes (a mix of PowerEdge R330, R430 and R630 servers), which include the NFS and Lustre management servers. All the servers are interconnected with QDR 40 Gb/s InfiniBand.

===== Logging In =====

Domain names are not yet configured on the CHPC network, so logging in is done using IP addresses: \\
**login01 [10.128.24.153]** \\
**login02 [10.128.24.133]** \\

<code>ssh username@10.128.24.153</code>

It is advisable to change your default password with the ''passwd'' command as soon as possible.
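
For example, after your first login (the prompts shown below are typical and may differ slightly on this system):

<code>
login01:~$ passwd
Changing password for user username.
Current password:
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
</code>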

===== Shared Filesystems =====

The new cluster provides both NFS and Lustre filesystems over InfiniBand:

^ Mount point           ^  File System  ^  Size   ^
| ''/home''             | NFS           | 2.5 TB  |
| ''/mnt/lustre/users'' | Lustre        | 9.8 TB  |
| ''/apps''             | NFS           | 1.5 TB  |

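To confirm that these filesystems are mounted and to check available space, standard tools can be used, for example:

<code>
df -h /home /mnt/lustre/users /apps
</code>
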
===== Software =====

Software resides in ''/apps'', an NFS filesystem mounted on all nodes:

^ ''/apps/''...   ^ Description                                                    ^ Comment      ^
| ''chpc/''       | Application codes supported by the CHPC                        | (See below)  |
| ''compilers/''  | Compilers, other programming languages and development tools  |              |
| ''libs/''       | Libraries                                                      |              |
| ''scripts/''    | Modules and other environment setup scripts                    |              |
| ''tools/''      | Miscellaneous software tools                                   |              |
| ''user/''       | Codes installed by the special user research programmes       |              |

====Application Codes: Scientific Domains====

^ ''/apps/chpc/''...  ^ Scientific Domain         ^
| ''astro/''          | Astrophysics & Cosmology  |
| ''bio/''            | Bioinformatics            |
| ''chem/''           | Chemistry                 |
| ''compmech/''       | Mechanics                 |
| ''cs/''             | Computer Science          |
| ''earth/''          | Earth                     |
| ''image/''          | Image Processing          |
| ''material/''       | Material Science          |
| ''phys/''           | Physics                   |
| ''space/''          | Space                     |

**NB:** Not all applications that exist on ''lengau'' exist on the CMC. Users are expected to install the codes they are evaluating or testing in the appropriate directory.

====Modules====

CHPC uses the [[http://modules.sourceforge.net/|GNU modules]] utility, which manipulates your environment, to provide access to the supported software in ''/apps/''.

Each of the major CHPC applications has a modulefile that sets, unsets, appends to, or prepends to environment variables such as $PATH, $LD_LIBRARY_PATH, $INCLUDE and $MANPATH for the specific application. Each modulefile also sets functions or aliases for use with the application. You need only invoke a single command to configure the application/programming environment properly. The general format of this command is:
  module load <module_name>
where <module_name> is the name of the module to load. The ''module'' command also supports Tab-key completion of its parameters.

For a list of available modules:
  module avail
The module command may be abbreviated and optionally given a search term, e.g.:
  module ava chpc/open
To see a synopsis of a particular modulefile's operations:
  module help <module_name>
To see currently loaded modules:
  module list
To remove a module:
  module unload <module_name>
To remove all modules:
  module purge

To search for a module name or part of a name:
  module-search partname
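
As an illustration, a typical session to find and load an MPI library might look like the following (the module names shown are the GCC/OpenMPI combination recommended in the Compilers section below):

<code>
module avail
module-search openmpi
module add gcc/5.1.0
module add chpc/openmpi/1.8.8/gcc-5.1.0
module list
</code>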

After upgrades of software in ''/apps/'', new modulefiles are created to reflect the changes made to the environment variables.

**Disclaimer:** //Codes in ''/apps/user/'' are not supported by the CHPC, and the TE for each research programme is required to create the appropriate module file or startup script.//

===== Compilers =====

Supported compilers for C, C++ and Fortran are found in ''/apps/compilers'', along with interpreters for programming languages such as Python.

For MPI programmes, the appropriate library and ''mpi*'' compiler wrapper scripts are also available.

====GNU Compiler Collection====

The default gcc compiler is 6.1.0:
<code>
login2:~$ which gcc
/cm/local/apps/gcc/6.1.0/bin/gcc
login2:~$ gcc --version
gcc (GCC) 6.1.0
</code>

To use any other version of gcc you need to remove 6.1.0 from all paths with
<code>
module purge
</code>
**before** loading any other modules.

The recommended combination of compiler and MPI library is GCC 5.1.0 with OpenMPI 1.8.8, accessed by loading //both// modules:
<code>
module purge
module add gcc/5.1.0
module add chpc/openmpi/1.8.8/gcc-5.1.0
</code>
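
With these modules loaded, the MPI compiler wrappers are on your ''$PATH''. As a minimal sketch (the source file name here is only a placeholder), an MPI code can be compiled and given a quick local test run with:

<code>
mpicc -O2 -o hello_mpi hello_mpi.c
mpirun -np 4 ./hello_mpi
</code>

Production runs should be submitted through the scheduler, as described below.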

===== Scheduler =====

The CHPC cluster uses PBS Pro as its job scheduler. With the exception of //interactive// jobs, all jobs are submitted to a batch queuing system and only execute when the requested resources become available. All batch jobs are queued according to priority. A user's priority is not static: the CHPC uses the "Fairshare" facility of PBS Pro to modify priority based on activity. This is done to ensure the finite resources of the CHPC cluster are shared fairly amongst all users.

====Queues====

The ''workq'' queue is no longer to be used.

The available queues are:

^ Queue Name  ^ Max. cores  ^ Min. cores  ^  Max. jobs  ^^  Max. time  ^  Notes  ^  Access  ^
^ :::         ^  per job   ^^  in queue   ^  running    ^  hrs        ^ :::     ^ :::      ^
| serial      |  12 |   1 |  ??? |  ??? |  48 | For single-node non-parallel jobs     |                 |
| smp         |  12 |   1 |   20 |   10 |  96 | For single-node parallel jobs         |                 |
^ normal      ^  48 ^  24 ^   20 ^   10 ^  48 ^ The standard queue for parallel jobs  ^                 ^
| large       |  72 |  48 |   10 |    2 |  96 | For large parallel runs               | //Restricted//  |
| bigmem      |  72 |  48 |   10 |    2 |  48 | For large-memory parallel runs        | //Restricted//  |
| test        |  12 |   1 |    1 |    1 |   3 | Normal nodes, for testing only        |                 |

===Notes===

  * A standard compute node has 12 cores and 36 GiB of memory (RAM).
  * Additional restrictions:

^  Queue Name  ^  Max. total simultaneous running cores  ^
| **normal**   |  **48**                                 |
| large        |  72                                     |

====PBS Pro commands====

| ''qstat''  | View queued jobs.                        |
| ''qsub''   | Submit a job to the scheduler.           |
| ''qdel''   | Delete one of your jobs from the queue.  |
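
Typical usage of these commands (the script name and job ID below are only illustrative):

<code>
qsub myjob.pbs     # submit a job script; prints a job ID such as 1234.login01
qstat              # list your queued and running jobs
qstat -f 1234      # show the full details of a specific job
qdel 1234          # delete job 1234 from the queue
</code>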

====Job script parameters====

Parameters for any job submission are specified as ''#PBS'' comments in the job script file or as options to the ''qsub'' command. The essential options for the CHPC cluster include:

<code>
 -l select=10:ncpus=12:mpiprocs=12
</code>

which sets the size of the job in number of processors:

| ''select=N''    | number of nodes needed                        |
| ''ncpus=N''     | number of cores //per node//                  |
| ''mpiprocs=N''  | number of MPI ranks (processes) //per node//  |

<code>
 -l walltime=4:00:00
</code>

which sets the total expected wall clock time in hours:minutes:seconds, and

<code>
 -q normal
</code>

which specifies the queue. The job size and wall clock time must be within the limits imposed on the queue used (see the queue table above). A complete example script is given below.
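
Putting these options together, a minimal example job script might look like the following (the job name, script name and executable are placeholders; adjust the resource requests to suit your own code):

<code>
#!/bin/bash
#PBS -N myjob
#PBS -q normal
#PBS -l select=2:ncpus=12:mpiprocs=12
#PBS -l walltime=4:00:00

# Start in the directory the job was submitted from
cd $PBS_O_WORKDIR

# Set up the recommended compiler and MPI environment
module purge
module add gcc/5.1.0
module add chpc/openmpi/1.8.8/gcc-5.1.0

# 2 nodes x 12 MPI ranks per node = 24 processes
mpirun -np 24 ./my_mpi_program > myjob.out 2> myjob.err
</code>

Submit it with ''qsub myjob.pbs''.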

===Restricted queues===

The ''large'' and ''bigmem'' queues are restricted to users who have a need for them. If you are granted access to these queues, you should specify that you are a member of the ''largeq'' or ''bigmemq'' group. For example:

<code>
#PBS -q large
#PBS -W group_list=largeq
</code>

or

<code>
#PBS -q bigmem
#PBS -W group_list=bigmemq
</code>
  