Interacting with Sun Grid Engine (SGE)
Scope
Metworx manages jobs submitted to the compute grid via Sun Grid Engine (SGE). This page demonstrates commands for interacting with SGE that can be executed at a shell prompt or passed from R using the function system().
In this article you will see how to:
- Submit a job to run using qsub
- Monitor running jobs using qstat
- Delete running jobs using qdel
- Decrease the priority of pending jobs using qalter
- Recognize when a job or queue is having an issue
For background on the Metworx compute grid and its structure, please refer to the Grid Computing Intro.
qsub
qsub is used to submit a job to a queue. Pass the path to an executable script to qsub and it will create a job in the queue to run that script.
$ qsub my_script.sh
qsub has a number of important arguments, for example -pe orte NUMBER-OF-CORES for reserving multiple cores to run your job in parallel. Call man qsub for more details.
"Unable to run job" Warning Message
By default, a Metworx workflow starts with no compute nodes active, and it autoscales by launching compute nodes as jobs are submitted to the queue. If no compute nodes are up when you call qsub, you will see the following message:
Unable to run job: warning: <your-user-name's> job is not allowed to run in any queue
Your job <number> ("<model-name>") has been submitted
Exiting.
This simply means that the job can’t run immediately because it has to wait for a worker node. You can run qstat -f (described below) to verify that it is in the queue.
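One way to confirm that a submitted job is waiting rather than lost is to look up its state in the qstat output. The sketch below assumes the job line layout shown in this article (job ID in the first column, state in the fifth). The name job_state is a hypothetical helper, not an SGE command, and the sample text stands in for real qstat output.

```shell
#!/bin/bash
# job_state is a hypothetical helper, not part of SGE.
# It reads qstat output on stdin and prints the state column
# (for example r, qw, or Eqw) of the given job ID.
# It prints nothing if the job ID is not listed.
job_state() {
    awk -v id="$1" '$1 == id { print $5; exit }'
}

# Sample job lines standing in for a real call to qstat
sample='562 0.55500 Run001 user0001 r 11/19/2014 11:59:15 1
571 0.55500 Run010 user0001 qw 11/19/2014 12:01:15 1'

# Look up the state of job 571 in the sample
printf '%s\n' "$sample" | job_state 571
```

On a live workflow, the same helper could be fed from the real command, for example qstat | job_state 571.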
qsub Script Template
For submitting NONMEM jobs to the grid, MetrumRG recommends using bbr or PsN. However, for many other jobs, it is most effective to use a shell script template.
In this example, we show you how to run an R script on the grid. First, create a script on your workflow disk (let's call it submit.sh) and copy the following into it:
#!/bin/bash
#$ -cwd
#$ -V
#$ -o qsub-job-$JOB_ID.out
#$ -e qsub-job-$JOB_ID.err
#$ -pe orte NUMBER-OF-CORES
Rscript YOUR-SCRIPT.R
Since this is only a template, you will need to update the following:
- Change YOUR-SCRIPT.R in the template to the path to the R script you want to run, relative to the location of this script.
- Change NUMBER-OF-CORES in the template to the number of cores (or "slots" in SGE terminology) that you want to reserve for your job. Note that this only reserves the space on the grid. If you want your job to actually use multiple cores, you will have to write code to do that. In the case of an R script, this likely means using the future package, mclapply, or something similar.
Next, run chmod +x submit.sh to make sure your script is executable.
Lastly, navigate to the directory containing your new script and call qsub submit.sh.
Please note that this pattern, and the template script submit.sh, are not specific to R. You can use them to launch any script on the grid by replacing the line Rscript YOUR-SCRIPT.R with whatever command you need to run your script on the command line.
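If you reuse the template often, filling it in by hand can be replaced with a small generator. The following is a minimal sketch, assuming the template shown above. write_submit_script is a hypothetical helper name, and python3 my_analysis.py is only a placeholder command. The demo writes into a temporary directory so that it does not touch your project files.

```shell
#!/bin/bash
# write_submit_script is a hypothetical helper, not part of SGE.
# Usage: write_submit_script OUTPUT-FILE NUMBER-OF-CORES COMMAND...
write_submit_script() {
    local out="$1" cores="$2"
    shift 2
    # The quoted heredoc delimiter keeps $JOB_ID literal, so this shell
    # does not expand it while writing the file.
    cat > "$out" <<'EOF'
#!/bin/bash
#$ -cwd
#$ -V
#$ -o qsub-job-$JOB_ID.out
#$ -e qsub-job-$JOB_ID.err
EOF
    printf '#$ -pe orte %s\n' "$cores" >> "$out"
    # The remaining arguments become the command line the job runs
    printf '%s\n' "$*" >> "$out"
    chmod +x "$out"
}

# Demo: a 4-slot job running a placeholder Python script
demo_dir=$(mktemp -d)
write_submit_script "$demo_dir/submit.sh" 4 python3 my_analysis.py
cat "$demo_dir/submit.sh"
```

The generated file can then be submitted with qsub in the usual way.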
qstat
The qstat command shows you the status of jobs currently queued or running on the grid (the -f flag shows more informative output).
$ qstat -f
queuename qtype resv/used/tot. load_avg arch states
--------------------------------------------------------------------------
all.q@ip-10-128-20-32.ec2 BIP 0/1/8 0.04 linux-x64
562 0.55500 Run001 user0001 r 11/19/2014 11:59:15 1
This call shows that user0001 has one job (Run001) running with job ID 562, and it is currently using one of the 8 slots (0/1/8) available on this compute node. The number of available slots depends on the number of vCPUs that you've configured for your worker nodes.
The r shows that the job is currently running. A qw in that spot indicates the job is queued and waiting.
Adding the -f flag gives you more information about the compute nodes available and the jobs running on them. See Monitoring Parallelization for an example of using qstat -f to monitor NONMEM jobs on the grid.
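The resv/used/tot. column can also be read programmatically. The sketch below assumes the qstat -f layout shown in this article, where queue instance lines start with a name containing "@" and the third field holds the slot counts. slot_usage is a hypothetical helper name, and the sample text stands in for real output.

```shell
#!/bin/bash
# slot_usage is a hypothetical helper, not part of SGE.
# It reads qstat -f output on stdin and, for each queue instance
# (a line whose first field contains "@"), prints:
#   queue-instance-name used-slots total-slots
slot_usage() {
    awk '$1 ~ /@/ { split($3, s, "/"); print $1, s[2], s[3] }'
}

# Sample output standing in for a real call to qstat -f
sample='queuename qtype resv/used/tot. load_avg arch states
all.q@ip-10-128-20-32.ec2 BIP 0/1/8 0.04 linux-x64
562 0.55500 Run001 user0001 r 11/19/2014 11:59:15 1'

printf '%s\n' "$sample" | slot_usage
```

For the sample above this prints the node name followed by 1 and 8, matching the "one of the 8 slots" reading of 0/1/8.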
qdel
The qdel command deletes jobs from the grid. It can be called on either queued or currently running jobs.
You can pass your username to delete all of your jobs.
qdel -u <your username>
You can also pass a job ID to delete a specific job. Use qstat to find your job ID:
$ qstat -f
queuename qtype resv/used/tot. load_avg arch states
--------------------------------------------------------------------------
all.q@ip-10-128-20-32.ec2 BIP 0/1/8 0.04 linux-x64
562 0.55500 Run001 user0001 r 11/19/2014 11:59:15 1
The job ID for Run001 is 562. Simply run qdel 562 to delete only that job.
You may find it necessary to delete a list of jobs, such as bootstrap jobs that were submitted in error, while other jobs are running. One way is to loop over the job IDs you want to delete in R. Let's assume the job IDs go from 600 to 1600.
In R you can "shell out" to run commands that you would run on the terminal via the system() function:
# Delete each job ID from 600 to 1600 by shelling out to qdel
for (i in 600:1600) {
  system(paste('qdel', i))
}
The block of code above will delete job IDs 600-1600 from within RStudio. You could also substitute a vector of job IDs if they were not consecutive.
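The same loop can be written directly at the shell prompt. The following is a minimal sketch, not a Metworx-provided tool: delete_job_range is a hypothetical helper name, and the QDEL variable is an assumption added here so the loop can be tried as a dry run (QDEL=echo) before anything is actually deleted.

```shell
#!/bin/bash
# delete_job_range is a hypothetical helper, not part of SGE.
# Usage: delete_job_range FIRST-JOB-ID LAST-JOB-ID
# It calls qdel once per job ID, like the R loop above.
# Set QDEL=echo for a dry run that only prints the job IDs.
delete_job_range() {
    local id
    for id in $(seq "$1" "$2"); do
        "${QDEL:-qdel}" "$id"
    done
}

# Dry run over a small range: prints 600 through 603, deletes nothing
QDEL=echo delete_job_range 600 603
```

Once the printed IDs look right, calling delete_job_range 600 1600 without QDEL=echo would pass each ID to the real qdel.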
qalter
The qalter command alters a running or pending job. For example, suppose you submit 500 bootstrap runs that load up your workflow, and then you need to submit additional runs for a different project or model before your bootstrap runs have completed. Since the SGE system uses a first-in-first-out approach, your additional runs won't start until your bootstrap runs are complete.
In most cases, you would prefer the one-off model runs to take priority over the bootstraps. In this case, you can use qalter to decrease the priority of the pending bootstrap runs so the one-off model fills the next available slot. Let's take a look at some qstat -f output below.
$ qstat -f
queuename qtype resv/used/tot. load_avg arch states
-------------------------------------------------------------------
all.q@ip-10-128- BIP 0/8/8 0.06 linux-x64
562 0.55500 Run001 user0001 r 11/19/2014 11:59:15 1
563 0.55500 Run002 user0001 r 11/19/2014 11:59:20 1
564 0.55500 Run003 user0001 r 11/19/2014 11:59:23 1
565 0.55500 Run004 user0001 r 11/19/2014 12:00:15 1
566 0.55500 Run005 user0001 r 11/19/2014 12:00:18 1
567 0.55500 Run006 user0001 r 11/19/2014 12:00:26 1
568 0.55500 Run007 user0001 r 11/19/2014 12:01:15 1
569 0.55500 Run008 user0001 r 11/19/2014 12:02:15 1
####################################################
- PENDING JOBS - PENDING JOBS - PENDING JOBS
####################################################
571 0.55500 Run010 user0001 qw 11/19/2014 12:01:15 1
572 0.55500 Run011 user0001 qw 11/19/2014 12:02:15 1
573 0.55500 Run012 user0001 qw 11/19/2014 12:03:15 1
574 0.55500 Run013 user0001 qw 11/19/2014 12:01:15 1
580 0.55500 Run035 user0001 qw 11/19/2014 12:03:15 1
Job ID 580 and the remaining four bootstrap jobs are pending (not assigned to a machine). We want to decrease the priority of job IDs 571-574 so that job ID 580 will run before the remaining pending bootstrap jobs. The priority is shown as the decimal number in the qstat output above. Currently all jobs have the same priority. We will decrease the priority of job IDs 571-574 using the qalter command. Again, in this example, we loop in R and call out to qalter via the system() function:
# Lower the priority of job IDs 571 to 574 by shelling out to qalter
for (i in 571:574) {
  system(paste('qalter -p -10', i))
}
The value following -p in the above code block sets the priority. You can supply a number between -1023 and 0. By passing -10, you decrease the priority of those jobs. If you look at the qstat -f output again, it looks as follows.
$ qstat -f
queuename qtype resv/used/tot. load_avg arch states
-----------------------------------------------------------------------
all.q@ip-10-128- BIP 0/8/8 0.06 linux-x64
562 0.55500 Run001 user0001 r 11/19/2014 11:59:15 1
563 0.55500 Run002 user0001 r 11/19/2014 11:59:20 1
564 0.55500 Run003 user0001 r 11/19/2014 11:59:23 1
565 0.55500 Run004 user0001 r 11/19/2014 12:00:15 1
566 0.55500 Run005 user0001 r 11/19/2014 12:00:18 1
567 0.55500 Run006 user0001 r 11/19/2014 12:00:26 1
568 0.55500 Run007 user0001 r 11/19/2014 12:01:15 1
569 0.55500 Run008 user0001 r 11/19/2014 12:02:15 1
####################################################
- PENDING JOBS - PENDING JOBS - PENDING JOBS
####################################################
580 0.55500 Run035 user0001 qw 11/19/2014 12:03:15 1
571 0.55100 Run010 user0001 qw 11/19/2014 12:01:15 1
572 0.55100 Run011 user0001 qw 11/19/2014 12:02:15 1
573 0.55100 Run012 user0001 qw 11/19/2014 12:03:15 1
574 0.55100 Run013 user0001 qw 11/19/2014 12:01:15 1
The above output indicates that job ID 580 is the next job to be executed when a core is available.
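The priority change can also be made from the shell instead of R. The sketch below is illustrative only: lower_priority is a hypothetical helper name, and the QALTER variable is an assumption added here to allow a dry run. The helper rejects values outside the -1023 to 0 range described above before calling qalter -p.

```shell
#!/bin/bash
# lower_priority is a hypothetical helper, not part of SGE.
# Usage: lower_priority PRIORITY JOB-ID...
# Set QALTER=echo for a dry run that only prints the qalter arguments.
lower_priority() {
    local prio="$1" id
    shift
    # Accept only the range this article describes (-1023 to 0)
    if [ "$prio" -lt -1023 ] || [ "$prio" -gt 0 ]; then
        echo "priority must be between -1023 and 0" >&2
        return 1
    fi
    for id in "$@"; do
        "${QALTER:-qalter}" -p "$prio" "$id"
    done
}

# Dry run for the four pending bootstrap jobs from the example
QALTER=echo lower_priority -10 571 572 573 574
```

The dry run prints one line of qalter arguments per job, such as -p -10 571. Dropping QALTER=echo would run the real command.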
Troubleshooting
Sometimes you may not know why a NONMEM job is pending in the queue. One example is with parallel runs, when more cores are requested than are available. The qstat output below demonstrates this situation. (Note: this only happens when using -pe smp instead of -pe orte.)
queuename qtype resv/used/tot. load_avg arch states
-----------------------------------------------------------------
all.q@ip-10-128- BIP 0/0/8 0.06 linux-x64
####################################################
- PENDING JOBS - PENDING JOBS - PENDING JOBS
####################################################
590 0.60383 Run100 user0001 qw 11/19/2014 12:03:15 12
591 0.60383 Run101 user0001 Eqw 11/19/2014 12:03:15 1
The 12 at the end of the line for job ID 590 indicates this job needs 12 compute cores. Since there are only eight available, it will pend indefinitely. You should stop this job using qdel and resubmit it with the correct number of cores.
The Eqw state of job ID 591 means SGE failed when it tried to schedule the job. This job will not run either, and you should also stop it using qdel. In this case, there was likely an error with the job being run. Check your code or control stream for any obvious errors and then resubmit the job.
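Both situations can be spotted automatically by scanning qstat -f output. The following is a rough sketch, assuming the column layout shown in this article (state in the fifth field and requested slots in the eighth field of a job line, slot counts in the third field of a queue instance line). stuck_jobs is a hypothetical helper name. The slot check is skipped when no nodes are listed, because a workflow with no nodes up would otherwise flag every waiting job.

```shell
#!/bin/bash
# stuck_jobs is a hypothetical helper, not part of SGE.
# It reads qstat -f output on stdin and reports:
#   - jobs in the Eqw state
#   - waiting (qw) jobs that request more slots than the largest node has
stuck_jobs() {
    awk '
        $1 ~ /@/ {                       # queue instance line
            split($3, s, "/")
            if (s[3] + 0 > max) max = s[3] + 0
        }
        $1 ~ /^[0-9]+$/ {                # job line
            n++; id[n] = $1; st[n] = $5; want[n] = $8
        }
        END {
            for (i = 1; i <= n; i++) {
                if (st[i] == "Eqw")
                    print id[i], "is in error state (Eqw)"
                else if (st[i] == "qw" && max > 0 && want[i] + 0 > max)
                    print id[i], "requests", want[i], "slots but the largest node has", max
            }
        }'
}

# Sample output standing in for a real call to qstat -f
sample='queuename qtype resv/used/tot. load_avg arch states
all.q@ip-10-128- BIP 0/0/8 0.06 linux-x64
590 0.60383 Run100 user0001 qw 11/19/2014 12:03:15 12
591 0.60383 Run101 user0001 Eqw 11/19/2014 12:03:15 1'

printf '%s\n' "$sample" | stuck_jobs
```

For the sample, the helper flags job 590 for requesting 12 slots on an 8-slot node and job 591 for being in Eqw. On a live workflow it could be run as qstat -f | stuck_jobs.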
On rare occasions, the qstat output will indicate that a given node is in alarm state (a), error state (E), unreachable (U), suspended (S), or disabled (d). Some examples of this are shown below.
queuename qtype resv/used/tot. load_avg arch states
---------------------------------------------------------------------------
all.q@ip-10-128-20-32.ec2 BIP 0/0/3 0.06 lx24-amd64 E
queuename qtype resv/used/tot. load_avg arch states
--------------------------------------------------------------------------
all.q@ip-10-128-22-180.ec2 BIP 0/0/3 0.06 lx24-amd64 U
In the above output, all.q@ip-10-128-20-32.ec2 is in error state and all.q@ip-10-128-22-180.ec2 is unreachable. If errors persist with nodes that are marked E, U, d, or S after you have followed this troubleshooting guidance, you should contact the Metworx help desk. If this does occur, you may need to delete the existing workflow and start a new one.
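When a workflow has many nodes, it can help to list only those that report a state flag. This sketch assumes the qstat -f layout shown above, where the states column is the sixth field and is empty for a healthy node. node_problems is a hypothetical helper name, and the third hostname in the sample is made up to represent a healthy node.

```shell
#!/bin/bash
# node_problems is a hypothetical helper, not part of SGE.
# It reads qstat -f output on stdin and prints the name and state
# flags of every queue instance whose states column is not empty.
node_problems() {
    awk '$1 ~ /@/ && NF >= 6 { print $1, $6 }'
}

# Sample output standing in for a real call to qstat -f.
# The last hostname is invented to show a node with no state flag.
sample='queuename qtype resv/used/tot. load_avg arch states
all.q@ip-10-128-20-32.ec2 BIP 0/0/3 0.06 lx24-amd64 E
all.q@ip-10-128-22-180.ec2 BIP 0/0/3 0.06 lx24-amd64 U
all.q@ip-10-128-20-40.ec2 BIP 0/0/3 0.06 lx24-amd64'

printf '%s\n' "$sample" | node_problems
```

For the sample, only the first two nodes are listed, with E and U respectively. Including that list when contacting the help desk may speed up diagnosis.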
Additional Learning
Additional information on all SGE commands is found in the man pages for each command. You can access these from the system command prompt by typing man <command>. For example, to learn more about qstat, you could type man qstat:
$ man qstat
QSTAT(1) Grid Engine User Commands QSTAT(1)
NAME
qstat - show the status of Grid Engine jobs and queues
SYNTAX
qstat [-ext] [-f] [-F [resource_name,...]] [-g c|d|t[+]] [-help] [-j [job_list]] [-l
resource=val,...] [-ne] [-pe pe_name,...] [-ncb] [-pri] [-q wc_queue_list] [-qs
a|c|d|o|s|u|A|C|D|E|S] [-r] [-s {r|p|s|z|hu|ho|hs|hd|hj|ha|h|a}[+]] [-t] [-U user,...] [-u
user,...] [-urg] [-xml]
DESCRIPTION
qstat shows the current status of the available Grid Engine queues and the jobs associated with
the queues. Selection options allow you to get information about specific jobs, queues or users.
If multiple selections are done, a queue is only displayed if all selection criteria for a queue
instance are met. Without any option qstat will display only a list of jobs, with no queue status
information.
...