Job submission on the QB3 cluster

Overview

Before running jobs on the QB3 cluster, take some time to read the QB3 wiki (https://salilab.org/qb3cluster/) and familiarize yourself with the cluster's infrastructure. Understanding how the cluster is set up, including how many nodes are allocated to each lab and which MPI architecture is most suitable for your software, lets you tune your simulation setup and maximize throughput. Used to your advantage, the QB3 cluster is a valuable resource for running all kinds of simulations.

Instructions on accessing the QB3 wiki are available in the document /usr/share/doc/WIKI-ACCESS after you log into chef.compbio.ucsf.edu or sous.compbio.ucsf.edu.

Submitting a Job

If you have already compiled your software of choice and/or know the path to the executable, you can check out some sample submission scripts.

Five nodes (40 cores in total) are allocated exclusively to the Jacobson lab: iq103, iq104, iq105, iq106, and iq107.
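For orientation, here is a minimal sketch of what a submission script can look like; the executable path, input file, and job name are placeholders, and the actual sample submission scripts should be preferred as templates.

#!/bin/sh
#
#$ -S /bin/sh
#$ -cwd                        # run in the directory the job was submitted from
#$ -V                          # export the submission environment to the job
#$ -j y                        # merge stdout and stderr into one file
#$ -l arch=linux-x64           # request 64-bit nodes (see the ccd notes for 32-bit)
#$ -q lab.q,long.q             # try the lab queue first, overflow into long.q
#$ -N my_test_job

# placeholder executable and input -- point these at your own program
/netapp/home/$USER/bin/myprog input.dat > my_test_job.log

Submit the script with qsub and monitor it with qstat.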

Data Storage

A quota of 4 GB on /netapp/home is enforced for each user.
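To check how close you are to the quota, plain du on your home directory is enough:

# total size of your home directory on /netapp
du -sh /netapp/home/$USER

# largest subdirectories, to find candidates for cleanup
du -sk /netapp/home/$USER/* | sort -n | tail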

Resources for Software Compilation

Currently the following nodes in the cluster allow you to compile C++ and F77/F90 code:

optint1 and optint2 (Opteron)
xeonint (32-bit Xeon, x86)

To gain permission to use these nodes, contact Joshua Baker-LePain (jlb@salilab.org) so he can grant you Interactive Access (https://salilab.org/qb3cluster/Interactive_access).
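Once interactive access is granted, compile by logging in to one of the build nodes; the compiler names below are only an assumption, so use whichever C++ and Fortran compilers are actually installed there (e.g. the Intel compilers mentioned in the ccd section).

# log in to a 64-bit build node (optint1/optint2); use xeonint for 32-bit builds
ssh optint1

# hypothetical compile commands -- substitute the compilers installed on the node
g++ -O2 -o myprog myprog.cpp
gfortran -O2 -o myfort myfort.f90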

ccd (Jacobson Lab's in-house Brownian dynamics software)

Software Compilation
You can compile the software on the QB3 cluster, but since ccd is a relatively small program consisting of a single executable (with no other binary files) and has already been compiled on the Jacobson lab server, copying the executable to your directory on the QB3 cluster is sufficient. If you do this, make sure the bit version of the Intel compiler used to build the ccd executable matches the MPI architecture specified in your submission script; otherwise your simulation will sit in the queue and never run! When compiling software yourself, keep in mind that most of the MPI architectures on QB3 are customized for 64-bit software.
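A quick way to confirm whether an existing executable is a 32-bit or 64-bit build is the file command:

# a 64-bit build reports "ELF 64-bit LSB executable, x86-64 ...";
# a 32-bit build reports "ELF 32-bit LSB executable, Intel 80386 ..."
file /netapp/home/lilipeng/ccd/ccd
file /netapp/home/lilipeng/ccd_mcclendon/ccd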

Parallel
As of May 2012, Chris McClendon has compiled a multi-threaded 64-bit version of ccd on the Jacobson lab clusters. The executable has been copied to /netapp/home/lilipeng/ccd_mcclendon/ccd. As noted in the sample submission scripts, the architecture to specify for the 64-bit version of ccd is arch=lx24-amd64.

Sample simulations exist under: /netapp/home/lilipeng/ccd/jobs/multi-threaded/monica_testcases
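As a rough sketch only (the ccd command-line syntax and the input name ccd.inp are placeholders; treat the sample jobs in the directory above as the authoritative templates), a submission script for the 64-bit build looks something like this:

#!/bin/sh
#
#$ -S /bin/sh
#$ -cwd
#$ -j y
#$ -l arch=lx24-amd64          # must match the 64-bit (multi-threaded) ccd build
#$ -q lab.q,long.q
#$ -N ccd_mt_test

# "ccd.inp" is a placeholder input file name
/netapp/home/lilipeng/ccd_mcclendon/ccd ccd.inp > ccd_mt_test.log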

Serial
The long-term goal is to develop a robust multi-threaded version of ccd, and as of May 2012 we are transitioning from the single-threaded to the multi-threaded version. For now, you still have the option of running ccd simulations with the single-threaded version. Lili compiled a single-threaded 32-bit version of ccd, which has been copied to /netapp/home/lilipeng/ccd/ccd. As noted in the sample submission scripts, the architecture to specify for the 32-bit version of ccd is arch=lx24-x86.

Sample simulations exist under: /netapp/home/lilipeng/ccd/jobs/single-threaded
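The single-threaded case differs only in the executable path and the architecture request; for a one-off job the architecture can also be given on the qsub command line (the script name below is hypothetical):

# request a 32-bit node for the single-threaded build at submission time
qsub -l arch=lx24-x86 -q lab.q,long.q submit_ccd_serial.sh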

GROMACS

Software Compilation
Any version of GROMACS needs to be compiled directly on the QB3 cluster.

Parallel - GROMACS only
Two versions of GROMACS are currently compiled on the cluster, in the following directories:

GROMACS 4.0.7: /netapp/home/pwassam/local/gromacs-4.0.7/bin
GROMACS 4.5.5: /netapp/home/pwassam/local/gromacs-4.5.5/bin

Single- and double-precision builds are invoked via the _s and _d suffixes on each binary. A sample submission script for running parallel jobs with GROMACS 4.5.5 is available in the sample submission scripts.
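As a hedged sketch of a parallel GROMACS 4.5.5 job: the parallel environment name is borrowed from the Desmond example below, the input file topol.tpr is a placeholder, and the mpirun invocation is an assumption, so cross-check against the actual sample submission scripts.

#!/bin/sh
#
#$ -S /bin/sh
#$ -cwd
#$ -j y
#$ -pe pe_mpich2_onehost 8     # 8 slots on one host, as in the Desmond example below
#$ -l arch=linux-x64
#$ -q lab.q,long.q
#$ -N gmx_test

GMXBIN=/netapp/home/pwassam/local/gromacs-4.5.5/bin

# _d = double precision, _s = single precision (see the suffix note above);
# the mpirun launcher here is an assumption
mpirun -np $NSLOTS $GMXBIN/mdrun_d -s topol.tpr -deffnm gmx_test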

Parallel - GROMACS with PLUMED
Single- and double-precision builds of GROMACS have been compiled with the PLUMED 1.3 plug-in for free-energy calculations:

GROMACS 4.0.7: /netapp/home/pwassam/gromacs/gromacs-4.0.7-plumed/bin
GROMACS 4.5.5: /netapp/home/pwassam/gromacs/gromacs-4.5.5-plumed/bin

The documentation is provided in /netapp/home/lilipeng/plumed-1.3.

A sample submission script for running parallel jobs using GROMACS 4.5.5 with PLUMED 1.3 is available in sample submission scripts.
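The PLUMED-patched mdrun additionally takes a -plumed flag pointing to the PLUMED input file; a minimal variant of the mdrun line from the sketch above (plumed.dat is a placeholder name):

# PLUMED-patched GROMACS 4.5.5; -plumed selects the bias/free-energy input
GMXBIN=/netapp/home/pwassam/gromacs/gromacs-4.5.5-plumed/bin
mpirun -np $NSLOTS $GMXBIN/mdrun_d -s topol.tpr -plumed plumed.dat -deffnm gmx_plumed_test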

Protein Loop Optimization Program (PLOP)

PLOP (v25.6) is available in the /netapp/home/ck/plop directory. To run a PLOP job directly on the command line, use:

/netapp/home/ck/plop/plop plop_job.con

To submit to the queue instead, use the script

/netapp/home/ck/qb3_qsub.pl 

Please copy this script and edit it for your needs.

To submit many PLOP jobs at the same time, place all the PLOP control files and input files in a single directory and submit an array job by adapting

/netapp/home/fwallrapp/bin/arrayJob.con
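For illustration only (arrayJob.con above is the template actually maintained for this), an array job over a directory of PLOP control files can be sketched roughly as follows, assuming one *.con file per task:

#!/bin/sh
#
#$ -S /bin/sh
#$ -cwd
#$ -j y
#$ -t 1-20                     # one task per control file; set to the number of *.con files
#$ -q lab.q,long.q
#$ -N plop_array

# pick the control file for this task (assumes a stable alphabetical listing)
CON=`ls *.con | sed -n "${SGE_TASK_ID}p"`

/netapp/home/ck/plop/plop $CON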

Schrodinger Software

Suite 2010 (update 1) is available at /netapp/home/pwassam/schrodinger/Suite2010u1

Suite 2012 is available at /netapp/home/pwassam/schrodinger/Suite2012


Please Note!!

The jobs below are run as array jobs. With the SGE flag "-t", I am telling the cluster that I want to run 15 different IFD jobs, i.e. 15 different ligands docked to the same receptor. The job file is set up so that each ligand name corresponds to a single task (e.g. A04 = task 1), so all the ligands of interest for one particular system can be run from the same job file. If you were instead interested in one ligand and many proteins, you could set this up the other way around.


Note: we submit array jobs to QB3 because submitting a large number of individual docking jobs is hard on the scheduler and wastes its resources.

Running Glide/IFD on chef:

#!/bin/tcsh
#
#$ -V
#$ -o /netapp/home/klexa/HSA/Morena_structures/qout
#$ -e /netapp/home/klexa/HSA/Morena_structures/qout
#$ -j y
#$ -r n
#$ -t 1-15
#$ -l arch=linux-x64
#$ -q lab.q,long.q
##$ -l h_rt=200:00:00

cd /netapp/home/klexa/HSA/Morena_structures

setenv LM_LICENSE_FILE "27000@169.230.126.31"

setenv SCHRODINGER /netapp/home/pwassam/schrodinger/Suite2010u1

setenv LD_LIBRARY_PATH /netapp/home/pwassam/schrodinger/32bitlib


set tasks = (A04 A07_S A09 A10_R A10_S A11_R A13 A14 A15 A16 A17_R A17_S A18 A19_R A20_R)

set input="$tasks[$SGE_TASK_ID]"

date

hostname

$SCHRODINGER/ifd IF_1N5U_site1_${input}.inp -WAIT


The same type of job using the 2012 suite:

#!/bin/tcsh
#
#$ -V
#$ -o /netapp/home/klexa/HSA/A_1E7A/qout
#$ -e /netapp/home/klexa/HSA/A_1E7A/qout
#$ -j y
#$ -r n
# array job: one task per ligand in the tasks list below
#$ -t 1-9
#$ -l arch=linux-x64
#$ -q lab.q,long.q
##$ -l h_rt=200:00:00

cd /netapp/home/klexa/HSA/A_1E7A/

setenv LM_LICENSE_FILE "27000@169.230.126.31"

setenv SCHRODINGER /netapp/home/pwassam/schrodinger/Suite2012

set tasks=( acrivastin alprenolol amoxicillin antipyrine atenolol bumetanide caffeine camptothecin captopril )

set input="$tasks[$SGE_TASK_ID]"

date

hostname

$SCHRODINGER/ifd IF_1e7a_site1_${input}.inp -OVERWRITE -WAIT


Running Desmond on chef:

This is a sample script for running a replica exchange job with Desmond on chef. Here we are requesting 8 processors on the same node. If you want to use the 2010 suite instead of 2012, you will need to alter some of the command-line options (e.g. -c becomes -cfg in the 2010 submission).

#!/bin/sh
#
#$ -S /bin/sh
#$ -cwd
#$ -V
#$ -R yes
#$ -j y
#$ -pe pe_mpich2_onehost 8
#$ -l arch=linux-x64
##$ -l mem_free=1G
#$ -l h_rt=336:00:00
#$ -N dentigerumycin_remd
#$ -q lab.q,long.q

# Assume only one msj file per directory, get INPUTNAME from it

INPUTNAME=`ls *.msj | sed -e 's/\.msj//'`

# Get JOBNAME from the SGE job name (#$ -N line above)

JOBNAME=$JOB_NAME

export SCHRODINGER=/netapp/home/pwassam/schrodinger/Suite2012

export LM_LICENSE_FILE=27000@169.230.126.31

$SCHRODINGER/utilities/multisim -i $INPUTNAME.cms -m $INPUTNAME.msj -c $INPUTNAME.cfg -cpu '2 2 2' -mode umbrella -verbose -WAIT -TMPDIR tmp -JOBNAME $JOBNAME -o $JOBNAME-out.cms

# Compress the checkpoint file

gzip $JOBNAME.cpt

# This prints out CPU and memory usage information to the log
# It is useful for correctly setting -l mem_free above

TASK_ID=$SGE_TASK_ID

if [ "$TASK_ID" = "undefined" ] ; then TASK_ID="1" ; fi

qstat -j $JOB_ID | grep -E "^usage +$TASK_ID"
