Bioinformatics at the CHPC

Welcome to the bioinformatics at the CHPC wiki page! This page describes the basic procedures for getting your programs running at the CHPC, rather than how to do any particular bioinformatics analysis. If anything is unclear, please hover your mouse over the superscripts!1) For the most part we will assume you have at least a little familiarity with Linux. Much of this information is available elsewhere in the CHPC's wiki (probably in more detail), but here we try to have everything accessible in one place for the bioinformatics community.

The Bioinformatics Service Platform (BSP) has recently obtained its own domain and website at http://bsp.ac.za/. We also host Globus endpoints at chpcbio#bio.chpc.ac.za and chpcbio#globus.chpc.ac.za.

Web Portal Access

Web-based access to the CHPC cluster is available via a Galaxy2) web interface at http://galaxy.chpc.ac.za/. Another workflow-based system, Chipster3), runs on a dedicated VM rather than on the cluster and is available at http://chipster.chpc.ac.za/.

To transfer files inward using GridFTP, the http://globus.org/ system can be used; it is accessible via our endpoint named chpcbio#globus.chpc.ac.za. Use the same credentials that you use to log in via ssh.

Command Line Access

Various open-source packages have been pre-installed at the CHPC. For the moment they are on the Sun cluster, but where appropriate they will be ported to the other architectures. First one must apply for resources to gain access to the cluster. Once your registration has been approved, Linux and OS X users can simply open a terminal and connect to the server via ssh using a command of the form4):

localuser@my_linux:~ $ ssh username@sun.chpc.ac.za
Last login: Tue Jan 28 14:05:35 2014 from 10.128.23.235
username@login01:~ $

where username is the username you are assigned upon registration. Windows users can download the PuTTY client5). Once connected users can: use the modules system to get access to bioinformatics programs; create job scripts using editors such as vim6) or nano7); and finally submit and monitor their jobs.

Using Modules

For now, a quick and simple way of getting access to the bioinformatics software is the module command. First of all, ensure that the line:

export MODULEPATH=/opt/gridware/bioinformatics/modules:$MODULEPATH

exists in your ~/.profile file8). Then running:

username@login01:~ $ module avail

will present you with the various modules available on the system and you should see something like:

------------------------ /opt/gridware/bioinformatics/modules -------------------------
R/2.15.2                    bowtie/0.12.9               gcc_4.7.2_libs
R/2.15.3                    bowtie/1.0.0                lam/7.1.4
R/3.0.0                     bowtie2/2.1.0               latex/texlive_2012
R/default                   clustal/clustal-omega-1.1.0 mpich2/1.5
beagle/beagle_lib-r1090     clustal/clustalw-2.1        perl/5.16.3
beagle/default              clustal/clustalw-MPI-1.82   python/2.7.3
beast/beast-1.7.2           cufflinks/2.0.2             samtools/0.1.18
beast/default               cufflinks/2.1.1             tophat/2.0.8b
bioperl/1.6.1               emboss/6.5.7                velvet/1.2.08

------------------- /opt/gridware/modules-3.2.7/Modules/3.2.7/CHPC --------------------
amber/12(default)               intel2012
clustertools                    inteltools
dell/default-environment        mvapich2/1.8-gnu
dell/moab                       mvapich2/1.8-r5668
dell/openmpi/intel/1.4.4        netcdf/gnu-4.1.2
dell/torque/2.5.12              netcdf/intel-4.1.2
dlpoly/2.20-impi                openfst/1.3.3-gnu
dlpoly/2.20-steve-impi          openfst/1.3.3-intel
dlpoly/3.07-impi                openmpi/openmpi-1.6.1-gnu
dlpoly/3.09-iompi               openmpi/openmpi-1.6.1-gnu.bak
espresso/3.1.2                  openmpi/openmpi-1.6.1-intel
fftw/3.3.2-intel                openmpi/openmpi-1.6.1-intel.bak
g09                             sapt/2008
gcc/4.6.3                       sunstudio
gcc/4.7.2                       tau
intel                           zlib/1.2.7

Now, to make use of tophat, say, one can type:

username@login01:~ $ module add tophat/2.0.8b

This sets the appropriate environment variables (usually as simple as adding a directory to the search path). Notice that there are often several versions of a package available, e.g. R versions 2.15.2, 2.15.3 and 3.0.0. The module system lets you choose which version you would like to use with a command such as module add R/2.15.3. Note: it is generally better to be specific about which version you want rather than relying on the default. Running:

username@login01:~ $ module list

will show which modules have been loaded. Whereas:

username@login01:~ $ module del modulename

will unload a module. And finally:

username@login01:~ $ module show modulename

will show what module modulename actually does.
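
Putting this together, a typical session on the login node might look something like the following (the module names are taken from the listing above; the MODULEPATH line is only needed if it is not already set in your ~/.profile):

username@login01:~ $ export MODULEPATH=/opt/gridware/bioinformatics/modules:$MODULEPATH
username@login01:~ $ module add R/3.0.0 samtools/0.1.18
username@login01:~ $ module list
username@login01:~ $ module show samtools/0.1.18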

Create Job Scripts

Next one must create a job script such as the one below:

my_job.qsub
#!/bin/bash
#PBS -l select=1:ncpus=12:mpiprocs=12:jobtype=dell,place=excl
#PBS -l walltime=10:00:00
#PBS -q workq
#PBS -V
#PBS -o /export/home/username/scratch5/NGS_data/stdout.txt
#PBS -e /export/home/username/scratch5/NGS_data/stderr.txt
#PBS -N TophatEcoli
#PBS -M myemailaddress@someplace.com
#PBS -m b
 
source /etc/profile.d/modules.sh
MODULEPATH=/opt/gridware/bioinformatics/modules:${MODULEPATH}
module add tophat/2.0.9
 
NP=`cat ${PBS_NODEFILE} | wc -l`
 
EXE="tophat"
ARGS="--num-threads ${NP} someindex reads1 reads2 -o output_dir"
 
cd /export/home/username/scratch5/NGS_data/
${EXE} ${ARGS}

Note that username should be replaced with your actual username… More details on the job script file can be found in our PBS quickstart guide.
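
Before committing to a long batch run it can be useful to test your commands in a short interactive job first; a minimal sketch (the resource values here are only examples, adjust them to your needs):

username@login01:~ $ qsub -I -q workq -l select=1:ncpus=4:jobtype=dell -l walltime=01:00:00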

Submit Job Script

Finally submit your job using:

username@login01:~ $ qsub my_job.qsub
 
13614.chpcmoab0
username@login01:~ $

where 13614.chpcmoab0 is the job ID that is returned.

Monitor jobs

Jobs can then be monitored/controlled in several ways:

qstat

Check the status of pending and running jobs:
username@login01:~ $ qstat -u username
 
chpcmoab01:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
13614.chpcmoab0 username workq    TophatEcol 17546   1   12   --  01:00 R 00:00
username@login01:~ $
Check the status of a particular job:
username@login01:~ $ qstat -f 13614.chpcmoab01
Job Id: 13614.chpcmoab01
    Job_Name = TophatEcoli
    Job_Owner = username@login01
    resources_used.cpupercent = 0
    resources_used.cput = 00:00:00
    resources_used.mem = 16796kb
    resources_used.ncpus = 12
    resources_used.vmem = 166064kb
    resources_used.walltime = 00:02:32
    job_state = R
    queue = workq
    server = chpcmoab01
    Checkpoint = u
    ctime = Tue Jan 28 13:15:41 2014
    Error_Path = /export/home/username/scratch5/NGS_data/stderr.txt
    exec_host = cnode-9-34/2*12
    exec_vnode = (cnode-9-34:ncpus=12)
    Hold_Types = n
    interactive = True
    Join_Path = n
    Keep_Files = n
    Mail_Points = a
    mtime = Tue Jan 28 13:15:42 2014
    Output_Path = /export/home/username/scratch5/NGS_data/stdout.txt
    Priority = 0
    qtime = Tue Jan 28 13:15:41 2014
    Rerunable = False
    Resource_List.ncpus = 12
    Resource_List.nodect = 1
    Resource_List.place = free
    Resource_List.select = 1:ncpus=12:jobtype=dell
    Resource_List.walltime = 20:00:00
    stime = Tue Jan 28 13:15:42 2014
    session_id = 16154
    jobdir = /export/home/username/scratch5/NGS_data
    substate = 42
    Variable_List = PBS_O_SYSTEM=Linux,PBS_O_SHELL=/bin/bash,
        PBS_O_HOME=/export/home/user,PBS_O_LOGNAME=username,
        PBS_O_WORKDIR=/export/home/username/scratch5/NGS_data,
        PBS_O_LANG=en_US.UTF-8,
        PBS_O_PATH=/opt/gridware/bioinformatics/emacs/emacs-24.3/bin:/export/h
        ome/username/local/bin:/usr/lib64/qt-3.3/bin:/opt/pbs/default/sbin/:/op
        t/pbs/default/bin/:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin,
        PBS_O_MAIL=/var/spool/mail/username,PBS_O_QUEUE=workq,
        PBS_O_HOST=login01
    comment = Job run at Tue Jan 28 at 13:15 on (cnode-9-34:ncpus=12)
    etime = Tue Jan 28 13:15:41 2014
    Submit_arguments = -I -l select=1:ncpus=12:mpiprocs=12:jobtype=dell,
        place=free -N TophatEcoli -l walltime=20:00:00
    project = _pbs_project_default
 
username@login01:~ $
Cancel a job:
username@login01:~ $ qdel 13614.chpcmoab01
username@login01:~ $

Basic examples

Blast

Running Blast on the M9000

One thing to note is that the main Lustre scratch cannot be used on the M9000, so jobs must be run in the user's home directory (or a sub-directory of it), or in an M9000-local scratch directory requested via the helpdesk.

Job script

Your job script will look something like this9):

my_job.qsub
#! /bin/bash
#PBS -l select=1:ncpus=128:mpiprocs=128:jobtype=spark
#PBS -l place=free
#PBS -l walltime=06:00:00
#PBS -q spark
#PBS -o /export/home/username/blastjob/stdout.txt
#PBS -e /export/home/username/blastjob/stderr.txt
#PBS -M youremail@address.com
#PBS -m be
#PBS -N m9000_blast
 
# NOTE: The M9000 has its own scratch space separate from main Lustre storage 
# So run in your home, or a subdir of home, or request via helpdesk that a
# scratch directory be created for you on the m9000, eg. in ''/scratch/work/username''
 
cd /export/home/username/blastjob
NP=`cat $PBS_NODEFILE | wc -l`
 
EXE="/opt/gridware/bioinformatics/m9000/ncbi-blast-2.2.24/bin/blastx"
ARGS="-db /scratch/work/bioinfo/BLASTDB/nr -query my_seqs.fasta -evalue 0.001 -num_alignments 20 -outfmt 5 -num_threads ${NP} > my_results.xml"
 
$EXE $ARGS

Of course one should set the parameters as required (setting a small -evalue is recommended, as is limiting the number of alignments). Blast2GO users should remember to set -outfmt to 5 for XML output. Note that one should also select the correct executable (EXE) and database (-db): blastx, blastn and blastp are available for the former, while nr and nt are available for the latter.
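
For example, a nucleotide search against nt might replace the last few lines of the script with something like the following sketch. This assumes a blastn binary sits alongside blastx in the same ncbi-blast-2.2.24/bin directory and that the nt database lives next to nr; check both paths before using them:

EXE="/opt/gridware/bioinformatics/m9000/ncbi-blast-2.2.24/bin/blastn"
ARGS="-db /scratch/work/bioinfo/BLASTDB/nt -query my_seqs.fasta -evalue 0.001 -num_alignments 20 -outfmt 5 -num_threads ${NP}"
$EXE $ARGS > my_results.xml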

Submit your job

Finally submit your job using:

user@login01:~ $ qsub my_job.qsub

Running Blast on sun cluster

Big thanks to Peter van Heusden for developing this script.

sun_blast.sh
#!/bin/bash                                                                                                                                                                                                                                   
 
WORKDIR="/export/home/${USER}/scratch5/blast_proj"
INPUT_FASTA=${WORKDIR}/data_set.fa.gz
BLAST_E_VAL="1e-3"
BLAST_DB="/lustre/SCRATCH5/groups/bioinfo/DBs/BLAST/nr"
JOBTYPE=nehalem
THREADS=8
BLAST_HOURS=0
BLAST_MINUTES=30
ID_FMT="%01d"
SPLIT_PREFIX="sub_set"
MAIL_ADDRESS="youremail@somewhere.ac.za"
 
zcat ${INPUT_FASTA} | csplit -z -f ${WORKDIR}/${SPLIT_PREFIX} -b "${ID_FMT}.split.fasta" - '/^>/' '{*}'
 
NUM_PARTS=$(ls ${WORKDIR}/${SPLIT_PREFIX}*.split.fasta | wc -l)
START=0
END=$(expr $NUM_PARTS - 1)
 
TMPSCRIPT=thejob.sh
# note: make a distinction between variables set by the containing script (e.g. WORKDIR) and                                                                                                                                                  
# ones set in the script (e.g. INDEX). The ones set in the script need to be escaped out                                                                                                                                                      
cat >${TMPSCRIPT} << END                                                                                                                                                                                                                      
#!/bin/bash                                                                                                                                                                                                                                   
 
#PBS -l select=1:ncpus=${THREADS}:jobtype=${JOBTYPE}                                                                                                                                                                                          
#PBS -l place=excl:group=nodetype                                                                                                                                                                                                             
#PBS -l walltime=${BLAST_HOURS}:${BLAST_MINUTES}:00                                                                                                                                                                                           
#PBS -q workq                                                                                                                                                                                                                                 
#PBS -m ae                                                                                                                                                                                                                                    
#PBS -M ${MAIL_ADDRESS}                                                                                                                                                                                                                       
 
. /etc/profile.d/modules.sh                                                                                                                                                                                                                   
module add blast/2.2.29+                                                                                                                                                                                                                      
 
INDEX="${WORKDIR}/${SPLIT_PREFIX}\${PBS_ARRAY_INDEX}"                                                                                                                                                                                         
INFILE="\${INDEX}.split.fasta"                                                                                                                                                                                                                
OUTFILE="\${INDEX}.blastx.xml"                                                                                                                                                                                                                
 
cd ${WORKDIR}                                                                                                                                                                                                                                 
blastx -num_threads ${THREADS} -evalue ${BLAST_E_VAL} -db ${BLAST_DB} -outfmt 5 -query \${INFILE} -out \${OUTFILE}
END                                                                                                                                                                                                                                           
 
BLAST_JOBID=$(qsub -N sunblast -J ${START}-${END} ${TMPSCRIPT} | cut -d. -f1)
echo "submitted: ${BLAST_JOBID}"
 
rm ${TMPSCRIPT}
 
cat >${TMPSCRIPT} << END                                                                                                                                                                                                                      
#!/bin/bash                                                                                                                                                                                                                                   
 
#PBS -l select=1:ncpus=1:jobtype=${JOBTYPE}                                                                                                                                                                                                   
#PBS -l place=free:group=nodetype                                                                                                                                                                                                             
#PBS -l walltime=1:00:00                                                                                                                                                                                                                      
#PBS -q workq                                                                                                                                                                                                                                 
#PBS -m ae                                                                                                                                                                                                                                    
#PBS -M ${MAIL_ADDRESS}                                                                                                                                                                                                                       
#PBS -W depend=afterok:${BLAST_JOBID}                                                                                                                                                                                                         
 
cd ${WORKDIR}                                                                                                                                                                                                                                 
tar jcf blast-xml-output.tar.bz2 *.blastx.xml                                                                                                                                                                                                 
END                                                                                                                                                                                                                                           
 
qsub -N tarblast ${TMPSCRIPT}
 
rm ${TMPSCRIPT}

This script is designed to be run from the login node: it creates the job scripts themselves and submits them. There are a number of things to notice:

  1. The use of heredocs. These allow us to embed the scripts that are to be run inside another script: the text between “cat >${TMPSCRIPT} << END” and “END” is written into the file ${TMPSCRIPT}.
  2. The use of job-arrays – these allow us to submit multiple independent jobs as sub-jobs of one larger script. The line “BLAST_JOBID=$(qsub -N sunblast -J ${START}-${END} ${TMPSCRIPT} | cut -d. -f1)” does multiple things:
    • It submits a job-array with the -J option, which takes a STARTing number and an ENDing number. The END value in turn comes from the line “NUM_PARTS=$(ls ${WORKDIR}/${SPLIT_PREFIX}*.split.fasta | wc -l)”, which counts the number of sub-fasta files created by the csplit10) command.
    • The “cut -d. -f1” grabs the job identifier (the part before the first “.”) returned by the scheduler when the job is submitted; it is assigned to the variable BLAST_JOBID.
    • Note that job-arrays set the environment variable PBS_ARRAY_INDEX, which is used here to construct both the blast input file name and the blast output file name.
    • Another important aspect of the job array is that the walltime parameter should be the longest time you would expect any single sub-job to run. In this case we have divided a fasta file into many smaller fasta files, one per sequence. If your original sequences have widely differing lengths it may pay to divide them differently, perhaps so that the sub-fastas end up with similar sizes (see the sketch after this list).
  3. The use of job dependencies. We see it in the second heredoc in the line “#PBS -W depend=afterok:${BLAST_JOBID}”. This tells the scheduler to run the second job only after the job with ID ${BLAST_JOBID} has finished successfully, i.e. it will not run if there are problems with the first job.
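
As mentioned in the last point under item 2, splitting one sequence per file is not ideal when sequence lengths vary widely. A rough sketch of a size-balanced alternative to the csplit line, using awk (the 5 MB threshold is arbitrary; the output names follow the same ${SPLIT_PREFIX}N.split.fasta convention, so the rest of the script stays unchanged):

# start a new chunk at the next sequence header once ~5 MB has accumulated
zcat ${INPUT_FASTA} | awk -v maxbytes=5000000 -v prefix="${WORKDIR}/${SPLIT_PREFIX}" '
    /^>/ && bytes >= maxbytes { close(out); part++; bytes = 0 }
    { out = sprintf("%s%d.split.fasta", prefix, part); print > out; bytes += length($0) + 1 }'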

Blast2Go

A local instance of Blast2GO is available at the CHPC. It is accessible from outside the CHPC; however, it does require you to set up some port forwarding.

Port Forwarding

This is accomplished by setting up port forwarding in your SSH session. On Windows this is usually done in PuTTY, and on Unix/OS X it is done on the command line. Note: this connection must stay open for as long as you wish to use the CHPC's Blast2GO database.

PuTTY

When setting up the SSH connection you should go to Connection → SSH → Tunnels. Add 3306 for Source port and 10.128.15.90:3306 for Destination, then click Add.

You should then save your session (so that you don't have to fill this information in every time) and connect (using your normal CHPC login details).

SSH

Your normal ssh command will change to look more like this:

localuser@my_linux:~ $ ssh username@sun.chpc.ac.za -L 3306:10.128.15.90:3306
Last login: Tue Jan 28 14:05:35 2014 from 10.128.23.235
username@login01:~ $

blast2go Configuration

First you should go to the blast2go website and start blast2go as normal by clicking on the please click here link11).

A small file will download; run it, and Blast2GO will then download the rest of the application.

Once blast2go is running go to: Tools → General Settings → DataAccess Settings.

Then set:

  • Own Database
  • DB Name: b2gdbFeb2014
  • DB Host: localhost
  • DB User: b2guser
  • DB Password: blast4it

and click OK. You will then see in the bottom window/tab that it has connected to the database12), and this should also be confirmed right at the bottom in the status message13).

Finally, you can test that everything is working as expected by clicking on the white arrow in the green circle and confirming that you get the GO graph.
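
If the connection fails, you can check the tunnel itself from another local terminal, assuming you have a MySQL command-line client installed; the credentials are the same as those entered above:

localuser@my_linux:~ $ mysql -h 127.0.0.1 -P 3306 -u b2guser -pblast4it b2gdbFeb2014 -e "SELECT 1;"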

Gromacs

If you would like to try running gromacs on the gpu please take a look at this.

The job script that follows is for running an MPI-compiled version of gromacs 4.6.1 on the nehalem nodes. There are many different versions of gromacs; to see what is available, try:

user@login01:~ $ module avail

The following example is for working with one of the “_nehalem” gromacs modules. Note that it is quite important to use the correct version, as the input data changes between versions…
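
Since the full module listing is long, it can help to filter it for gromacs; module writes its listing to standard error, so redirect that into grep:

user@login01:~ $ module avail 2>&1 | grep -i gromacs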

Job script

gromacs_nehalem.qsub
#!/bin/bash
#PBS -l select=10:ncpus=8:mpiprocs=8:jobtype=nehalem,place=excl
#PBS -l walltime=00:40:00
#PBS -q workq
#PBS -M user@someinstitution.ac.za
#PBS -m be
#PBS -V
#PBS -e /lustre/SCRATCH5/users/USERNAME/gromacs_data/std_err.txt
#PBS -o /lustre/SCRATCH5/users/USERNAME/gromacs_data/std_out.txt
#PBS -N GROMACS_JOB
 
MODULEPATH=/opt/gridware/bioinformatics/modules:$MODULEPATH
source /etc/profile.d/modules.sh
 
#######module add
module add gromacs/4.6.1_nehalem
 
export OMP_NUM_THREADS=1
 
NP=`cat ${PBS_NODEFILE} | wc -l`
 
EXE="mdrun_mpi"
ARGS="-s XXX -deffnm YYYY"
 
cd /lustre/SCRATCH5/users/USERNAME/gromacs_data
mpirun -np ${NP} -machinefile ${PBS_NODEFILE} ${EXE} ${ARGS}
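
Here -s expects the binary run input (.tpr) file produced by grompp, and -deffnm sets the default prefix for the output files, so a filled-in version of the two placeholder lines might look like this (the file names are purely hypothetical):

EXE="mdrun_mpi"
ARGS="-s my_system.tpr -deffnm my_run"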

Submit your job

Finally submit your job using:

user@login01:~ $ qsub gromacs_nehalem.qsub

bowtie

Things to note about this script: bowtie currently does not run across multiple nodes, so using anything other than select=1 will result in compute resources being wasted14).

Job script

Then your job script, called bowtie_script.qsub, will look something like this:

bowtie_script.qsub
#! /bin/bash
#PBS -l select=1:ncpus=12:mpiprocs=12
#PBS -l place=excl
#PBS -l walltime=06:00:00
#PBS -q workq
#PBS -o /export/home/username/scratch5/some_reads/stdout.txt
#PBS -e /export/home/username/scratch5/some_reads/stderr.txt
#PBS -M youremail@address.com
#PBS -m be
#PBS -N bowtiejob
 
##################
MODULEPATH=/opt/gridware/bioinformatics/modules:$MODULEPATH
source /etc/profile.d/modules.sh
 
#######module add
module add bowtie2/2.2.2
 
NP=`cat ${PBS_NODEFILE} | wc -l`
 
EXE="bowtie2"
 
forward_reads="A_reads1.fq,B_reads_1.fq"
reverse_reads="A_reads1.fq,B_reads_1.fq"
output_file="piggy_hits.sam"
ARGS="sscrofa --shmem --threads ${NP} --sam -q -1 ${forward_reads} -2 ${reverse_reads} ${output_file}"
 
cd /export/home/username/scratch5/some_reads
${EXE} ${ARGS}

Note: username should contain your actual user name!

Submit your job

Finally submit your job using:

user@login01:~ $ qsub bowtie_script.qsub

NAMD2

If you would like to try running namd2 on the GPU please take a look at this.

The job script that follows is for running NAMD over the InfiniBand interconnect. Note that this does not use MPI, so the script is somewhat different from other scripts you may see here.

Job script

namd.qsub
#!/bin/bash
#PBS -l select=10:ncpus=12:mpiprocs=12
#PBS -l place=excl
#PBS -l walltime=00:05:00
#PBS -q workq
#PBS -o /export/home/username/scratch5/namd2/stdout.txt
#PBS -e /export/home/username/scratch5/namd2/stderr.txt
#PBS -m ae
#PBS -M youremail@address.com
#PBS -N NAMD_bench
 
. /etc/profile.d/modules.sh
MODULEPATH=/opt/gridware/bioinformatics/modules:${MODULEPATH}
module add NAMD/2.10_ibverbs
 
cd /export/home/${USER}/scratch5/namd2
 
pbspro_namd apoa1.namd

Submit your job

Finally submit your job using:

user@login01:~ $ qsub namd.qsub

bowtie

Things to note about this script: bowtie currently does not run across multiple nodes, so using anything other than select=1 will result in compute resources being wasted15).

Job script

Then your job script, called bowtie_script.qsub, will look something like this:

bowtie_script.qsub
#! /bin/bash
#PBS -l select=1:ncpus=12:mpiprocs=12
#PBS -l place=excl
#PBS -l walltime=06:00:00
#PBS -q workq
#PBS -o /lustre/SCRATCH5/users/username/some_reads/stdout.txt
#PBS -e /lustre/SCRATCH5/users/username/some_reads/stderr.txt
#PBS -M youremail@address.com
#PBS -m be
#PBS -N bowtiejob
 
##################
MODULEPATH=/opt/gridware/bioinformatics/modules:$MODULEPATH
source /etc/profile.d/modules.sh
 
#######module add
module add bowtie2/2.2.2
 
NP=`cat ${PBS_NODEFILE} | wc -l`
 
EXE="bowtie2"
 
forward_reads="A_reads1.fq,B_reads_1.fq"
reverse_reads="A_reads1.fq,B_reads_1.fq"
output_file="piggy_hits.sam"
ARGS="sscrofa --shmem --threads ${NP} --sam -q -1 ${forward_reads} -2 ${reverse_reads} ${output_file}"
 
cd /lustre/SCRATCH5/users/username/some_reads
${EXE} ${ARGS}

Note: username should contain your actual user name!

Submit your job

Finally submit your job using:

user@login01:~ $ qsub bowtie_script.qsub

R/bioconductor

pbdR example

Job scripts

pbdtest.qsub
#!/bin/bash
#PBS -l select=2:ncpus=8:mpiprocs=8:jobtype=nehalem,place=excl
#PBS -l walltime=00:01:00
#PBS -q workq
#PBS -M YOUREMAILADDRESS
#PBS -m be
#PBS -V
#PBS -e /lustre/SCRATCH5/users/USERNAME/pbdR_test/std_err.txt
#PBS -o /lustre/SCRATCH5/users/USERNAME/pbdR_test/std_out.txt
#PBS -N PBDR_TEST
 
MODULEPATH=/opt/gridware/bioinformatics/modules:$MODULEPATH
source /etc/profile.d/modules.sh
module add R/3.2.0
 
NP=`cat ${PBS_NODEFILE} | wc -l`
 
cd /lustre/SCRATCH5/users/USERNAME/pbdR_test/
mpirun -np ${NP} -machinefile ${PBS_NODEFILE} Rscript test_script.R

Note: USERNAME should contain your actual user name!

test_script.R
library(pbdMPI, quiet=TRUE)
init()
my.rank <- comm.rank()
comm.print(my.rank, all.rank=TRUE)
 
finalize()

Submit your job

Finally submit your job using:

user@login01:~ $ qsub pbdtest.qsub

tophat

tuxedo

biopython

velvet

SOAP

Advanced examples

Databases

Databases are accessible on the cluster in the

/lustre/SCRATCH5/groups/bioinfo/DBs

directory. Alternatively they are also mirrored on the bio machine.
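
For example, to see what is available and find the path to pass to a BLAST -db argument (the BLAST sub-directory shown matches the database path used in the sun cluster blast example above):

username@login01:~ $ ls /lustre/SCRATCH5/groups/bioinfo/DBs
username@login01:~ $ ls /lustre/SCRATCH5/groups/bioinfo/DBs/BLAST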

Support

Please contact us to: request software updates/installs; download big datasets; get advice on the best way to run your analysis; or to tell us what is/isn't working!

1) Because they might just give you some hints ;-)
4) Note that localuser@my_linux:~ $ is not part of the command
5) Here is the getting started with putty guide
8) it is not included by default
9) Note you can click on the tab my_job.qsub to download this if you wish to use it as a template. Or you can just copy and paste…
10) csplit is a very useful tool – google it!
11) You may want to change the memory specifications here if you have lots of sequences
12) Open database connection to database 'b2gdbFeb2014' on 'localhost' as 'b2guser', with…
13) Connected to own database: localhost: b2gdbFeb2014
14) , 15) Both because it will only run on a single node, and telling a process to use more threads than it has cores usually results in inefficiencies.