Advanced qsub and submit

Advanced qsub

You can use #PBS directives in the script you submit via qsub. These directives are the same as the command-line options to qsub. For example, instead of passing the -V option on the qsub command line, you can include the line #PBS -V in your script. See below for more examples.

The default walltime for batch jobs is 100 days. A job that is still running when its walltime expires is killed, so a job that has not set its own walltime is killed after 100 days. Setting a walltime requires using qsub and does not work with submit.
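
As a minimal sketch of requesting a shorter walltime on the qsub command line, the helper below builds a value in the same days:hours:minutes:seconds form as the walltime=1:0:0:0 example in the script further down. The function name walltime_from_hours and the script name my_job.sh are hypothetical, and the command is only printed, not run.

```shell
#!/bin/sh
# Hypothetical helper: turn a number of whole hours into the
# days:hours:minutes:seconds form used by "-l walltime=1:0:0:0".
walltime_from_hours() {
    hours=$1
    echo "$((hours / 24)):$((hours % 24)):0:0"
}

# Example: show the command that would request 36 hours for a job script.
# my_job.sh is a placeholder name.
echo "qsub -l walltime=$(walltime_from_hours 36) my_job.sh"
```

The same value can be placed in the script itself as a #PBS -l walltime=... directive.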

Jobs are not restarted if there is a node failure.  Also, any reservations are removed from a node if that node reboots.

#!/bin/sh

# Set PBS Directives
# Lines starting with "#PBS" that appear before any shell commands are
# interpreted as command-line arguments to qsub.
# Don't put any commands before the #PBS options or the options will not work.
#
#PBS -V    # Export all environment variables from the qsub command's environment to the batch job.
#PBS -l pmem=16gb,pvmem=16gb       # Amount of memory needed by each process (ppn) in the job.
#PBS -d /lustre/aoc/observers/nm-4386 # Working directory (PBS_O_WORKDIR)
#PBS -m bea                 # Send mail on begin, end and abort

# Because these start with "##PBS", they are not read by qsub.
##PBS -l mem="16gb"    # physmem used by job. Ignored if NUM_NODES > 1. Won't kill job.
##PBS -l pmem="16gb"    # physmem used by any process. Won't kill job.
##PBS -l vmem="16gb"    # physmem + virtmem used by job. Kills job if exceeded.
##PBS -l pvmem="16gb"    # physmem + virtmem used by any process. Kills job if exceeded.

##PBS -l nodes=1:ppn=1       # default is 1 core on 1 node
##PBS -M nm-4386@nrao.edu      # default is submitter
##PBS -W umask=0117          # default is 0077
##PBS -l walltime=1:0:0:0      # default is 100 days.  This sets it to 1 day

# casa's python requires a DISPLAY for matplotlib, so create a virtual X server
xvfb-run -d casa --nogui -c /lustre/aoc/observers/nm-4386/run_casa.py
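
To check which directives in a script are live, the rules in the comments above can be sketched as a small filter: lines starting with "#PBS" count until the first shell command, and "##PBS" lines are skipped. This is an illustration only, not how qsub itself is implemented; the function name pbs_directives is hypothetical, and stripping the trailing "# comment" is an assumption based on the script above.

```shell
#!/bin/sh
# Illustrative only: list the directives in a job script the way the
# comments in the script describe them. Not the actual qsub parser.
pbs_directives() {
    awk '
        /^#PBS[ \t]/ {                 # a live directive
            sub(/^#PBS[ \t]+/, "")     # drop the "#PBS" prefix
            sub(/[ \t]+#.*$/, "")      # drop a trailing "# comment" (assumption)
            print
            next
        }
        /^#/ || /^[ \t]*$/ { next }    # "##PBS", other comments, blank lines
        { exit }                       # first shell command ends the directives
    ' "$1"
}
```

Running pbs_directives on a job script prints one option per line, which makes it easy to spot a directive that was accidentally left commented out or placed after a command.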

Parallel Batch Jobs

The procedure for submitting parallel batch jobs is very similar to submitting serial jobs. There are two differences: the ppn qsub option is set to something other than 1, and casa is launched through mpicasa.

The qsub option ppn specifies the number of cores per node requested by the job.  If this option is not set, it defaults to 1.  It is used in conjunction with the -l nodes option.  For example, to request one node with 8 cores you would type -l nodes=1:ppn=8.

The scheduler creates a file listing the nodes and cores assigned to the job. The location of this file is stored in the environment variable PBS_NODEFILE. Passing this file to mpicasa tells it which nodes to run on.

#!/bin/sh

# Set PBS Directives
# Lines starting with "#PBS" that appear before any shell commands are
# interpreted as command-line arguments to qsub.
# Don't put any commands before the #PBS options or the options will not work.
#
#PBS -V    # Export all environment variables from the qsub command's environment to the batch job.
#PBS -l pmem=16gb,pvmem=16gb       # Amount of memory needed by each process (ppn) in the job.
#PBS -d /lustre/aoc/observers/nm-4386 # Working directory (PBS_O_WORKDIR)
#PBS -l nodes=1:ppn=8

CASAPATH=/home/casa/packages/RHEL6/release/current

xvfb-run -d mpicasa -machinefile $PBS_NODEFILE $CASAPATH/bin/casa --nogui -c /lustre/aoc/observers/nm-4386/run_mpicasa.py
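
When debugging a parallel job it can help to see what the scheduler actually assigned. The sketch below summarizes a node file, assuming the usual layout of one hostname per line, repeated once for each core assigned on that node. The function name summarize_nodefile is hypothetical.

```shell
#!/bin/sh
# Summarize a node file: cores per node, then totals.
# Assumes one hostname per line, repeated once per assigned core.
summarize_nodefile() {
    sort "$1" | uniq -c | awk '
        { printf "%s %d\n", $2, $1; cores += $1 }
        END { printf "total: %d nodes, %d cores\n", NR, cores }
    '
}

# Inside a batch job this would be called as:
#   summarize_nodefile "$PBS_NODEFILE"
```

Placing the call before the mpicasa line records the allocation in the job's output, so the request can be compared with what the job received.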

For more information on setting memory requests, see the Memory Options section of the documentation.

Advanced submit

cluster.req

# This is a config file for submitting jobs to the cluster scheduler.
# The COMMAND is expected to be a script or binary.
# This config file happens to be for running casa.

#
# These are required
#
WORK_DIR="/lustre/aoc/observers/nm-4386"
COMMAND="/lustre/aoc/observers/nm-4386/run_casa.sh"
# Please use at least one of MEM, PMEM, VMEM, PVMEM.  Or use MEMORY.
#MEM="16gb"    # physmem used by job. Ignored if NUM_NODES > 1. Won't kill job.
PMEM="16gb"    # physmem used by any process. Won't kill job.
#VMEM="16gb"    # physmem + virtmem used by job. Kills job if exceeded.
PVMEM="16gb"    # physmem + virtmem used by any process. Kills job if exceeded.
#MEMORY="16gb"    # sets MEM and PMEM (Deprecated).

#
# These are optional
#
#NUM_NODES="2"      # default is 1
#NUM_CORES="4"      # default is 1
#MAILTO="nm-4386"   # default is the user submitting the job
#QUEUE="batch"      # default is the batch queue
#STDOUT="my_out"    # file relative to WORK_DIR.  default is no output
#STDERR="my_err"    # file relative to WORK_DIR.  default is no output
#UMASK="0117"       # default is 0077

# MAIL_OPTIONS:
#   n  no mail will be sent.
#   a  mail is sent when the job is aborted by the batch system.
#   b  mail is sent when the job begins execution.
#   e  mail is sent when the job terminates.
MAIL_OPTIONS="abe"   # default is "n", so no email is sent

# JOB_NAME: <= 15 non-whitespace characters.  First character alphabetic.
#JOB_NAME="testing1"   # default is _qsub.
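
Because cluster.req uses plain shell variable assignments, it can be read by any shell script. The sketch below sources such a file and prints the qsub resource list it appears to correspond to. It is an illustration, not the implementation of submit: the function name req_to_resources is hypothetical, and the mapping of NUM_NODES, NUM_CORES, PMEM and PVMEM to nodes, ppn, pmem and pvmem is an assumption drawn from the matching comments in the qsub script and in cluster.req.

```shell
#!/bin/sh
# Illustrative only: read a cluster.req-style file and print a qsub
# resource list. The variable-to-resource mapping is an assumption.
req_to_resources() (
    # The body runs in a subshell so sourced variables do not leak out.
    NUM_NODES=1 NUM_CORES=1 PMEM= PVMEM=   # defaults stated in cluster.req
    . "$1" || exit 1
    res="nodes=${NUM_NODES}:ppn=${NUM_CORES}"
    [ -n "$PMEM" ] && res="$res,pmem=$PMEM"
    [ -n "$PVMEM" ] && res="$res,pvmem=$PVMEM"
    echo "$res"
)

# Usage (give a path containing a slash so the file is not looked up in PATH):
#   req_to_resources ./cluster.req
```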