NONMEM via PsN


Scope

Perl-speaks-NONMEM (PsN) is a collection of Perl modules and programs that help develop non-linear mixed effect models using NONMEM. Its functionality ranges from simple tasks, such as extracting parameter estimates from output files and subsetting or resampling data files, to advanced, computer-intensive statistical methods. More information is available on GitHub. Here we provide a NONMEM example (in a linked zip file) and use it to illustrate common PsN commands and to give directions for running models in parallel. Finally, we briefly demonstrate PsNR, an R package that comes pre-installed on the Metworx 21.08 (and newer) blueprints.

PsN Examples

Example PsN/Pirana data and models are available in this nonmem-examples.zip file. Use RStudio to upload it to your /data directory, where it is unzipped automatically. Example command-line PsN calls are shown below. Run them from the directory created when the zip file is uploaded: in your terminal, change to the directory containing the model files.

Execute (run) a model

Using the default NONMEM version:

execute run105.mod

Using a specific NONMEM version and installation - NONMEM 7.4 installed via nmqual:

execute -nmqual -nm_version='nm74_gf' run105.mod

Bootstrap

Using 16 samples on 4 threads (and the default NONMEM version):

bootstrap -samples=16 -threads=4 -dir=bs_1 run105.mod

Stepwise Covariate Modeling (SCM)

To perform SCM, you need an scm file that specifies which model to run and which parameters to vary. For more information, see the user guide in the nonmem-examples.zip file or visit PsN’s website. The run105.scm file reads as follows:

model=scm105.mod
search_direction=forward
p_forward=0.05
continuous_covariates=WT

[test_relations]
V=WT
[valid_states]
continuous = 1,5
categorical = 1,5

Run the SCM with:

scm run105.scm

Parallelization

To run NONMEM in parallel, you need to specify a parallelization file (called a parafile). This file specifies the number of compute nodes to use, among other parameters. You must provide the path to the parafile by including the argument -parafile=path/to/<parafile>.pnm.

If you don't already have a parafile, you can use the 105.pnm file from nonmem-examples.zip, or create an empty file (typically called <run>.pnm with <run> replaced by your model name) in the same directory as your model file, and copy the following into it:

$GENERAL
NODES=4 PARSE_TYPE=2 TIMEOUTI=100 TIMEOUT=10 PARAPRINT=0 TRANSFER_TYPE=1
;SINGLE node: NODES=1
;MULTI node: NODES>1
;WORKER node: NODES=0
; parse_num=number of subjects to give to each node
; parse_type=0, give each node parse_num subjects
; parse_type=1, evenly distribute numbers of subjects among available nodes
; parse_type=2, load balance among nodes
; parse_type=3, assign subjects to nodes based on idranges
; parse_type=4, load balance among nodes, taking into account loading time.  Will assess ideal number of nodes.
; If loading time too costly, will eventually revert to single CPU mode.
; timeouti=seconds to wait for node to start.  if not started in time, deassign node, and give its load to next worker, until next iteration
; timeout=minutes to wait for node to complete.  if not completed by then, deassign node, and have manager complete it.
; paraprint=1  print to console the parallel computing process.  Can be modified at run-time with ctrl-B toggle.
; But parallel.log always records parallelization progress.
; transfer_type=0 for file transfer, 1 for mpi

;THE EXCLUDE/INCLUDE may be used to selectively use certain nodes, out of a large list.
;$EXCLUDE 5-7 ; exclude nodes 5-7
;$EXCLUDE ALL 
;$INCLUDE 1,4-6

$COMMANDS ;each node gets a command line, used to launch the node session
1: /usr/local/mpich3/bin/mpiexec -wdir "$PWD" -n 1 ./nonmem $*
2:-wdir "$PWD" -n 3 ./nonmem -wnf

$DIRECTORIES
1:NONE ; FIRST DIRECTORY IS THE COMMON DIRECTORY
2-[nodes]:worker{#-1} ; NEXT SET ARE THE WORKER directories

;$IDRANGES ; USED IF PARSE_TYPE=3
;1:1,50
;2:51,100

This .pnm file parallelizes the model across 4 cores. To submit the job to the Sun Grid Engine (SGE), the PsN command must also specify the number of cores to parallelize across, in this case -pe orte 4. To change the number of cores for a parallel run, you need to change both the .pnm file and the PsN call. For example, to increase from 4 to 8 cores, use -pe orte 8 in the PsN command and update the .pnm file: on Line 2, change NODES=4 to NODES=8, and on Line 26, change -n 3 to -n 7 (i.e., one less than the number of cores you specified in Line 2).
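
The two parafile changes described above can be scripted so that they stay in sync. The following is a minimal sketch, not part of PsN or NONMEM: the function name set_pnm_cores is hypothetical, and it assumes a parafile laid out like the one shown here, with a single line starting with NODES= and one worker command line ending in -wnf that carries the -n count.

```shell
# Hypothetical helper: write a copy of a parafile adjusted to a new core count.
# Assumes one line starting with "NODES=<n>" and one worker command line
# (the line containing -wnf) with "-n <n-1>", as in the example parafile.
set_pnm_cores() {
  src="$1"      # existing parafile, e.g. 105.pnm
  cores="$2"    # total number of cores, e.g. 8
  dest="$3"     # new parafile to write
  workers=$((cores - 1))   # worker count is one less than NODES
  sed -E \
    -e "s/^NODES=[0-9]+/NODES=${cores}/" \
    -e "/-wnf/ s/-n [0-9]+/-n ${workers}/" \
    "$src" > "$dest"
}
```

For example, set_pnm_cores 105.pnm 8 105_8cores.pnm would write a parafile for 8 cores (the output file name is made up); the PsN call still needs its own -pe orte 8 to match.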

Testing different numbers of cores

Unfortunately, there is no easy way to know the ideal number of cores for a given NONMEM model. It is dependent on many factors, for example:

  • The data

    • How many observations? How many subjects?
    • How much noise? Does the model fit the data well?
  • Model complexity

    • Number of compartments, number of parameters estimated
    • Number of random effects
    • “Stiffness” of differential equations, non-linear systems
    • Estimation method
  • How urgently do you need it?

    • There are often diminishing speed gains from adding more and more cores, but sometimes it is worth paying for the extra cores if you're on a tight deadline.

The best way to find the ideal number of cores for your model is to test it with different numbers of cores. You can easily do this by setting MAXEVAL to something small (between 10 and 100 is typically good, depending on how long each iteration takes). Then make copies of your modified model file and run them with different numbers of cores.
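
The copy-and-modify step can be sketched as a small shell function. This is an illustration only: the function name make_test_copies and the <run>_<n>cores.mod naming scheme are hypothetical, and it assumes the estimation record in your model file spells the option out as MAXEVAL=<number>.

```shell
# Hypothetical helper: create short-running copies of a model file, one per
# core count to test. Assumes the model file contains "MAXEVAL=<number>".
make_test_copies() {
  model="$1"      # e.g. run105.mod
  maxeval="$2"    # small MAXEVAL value for the test runs, e.g. 10
  shift 2         # remaining arguments are the core counts to test
  base="${model%.mod}"
  for cores in "$@"; do
    sed -E "s/MAXEVAL=[0-9]+/MAXEVAL=${maxeval}/" "$model" \
      > "${base}_${cores}cores.mod"
  done
}
```

Calling make_test_copies run105.mod 10 2 4 8 would write run105_2cores.mod, run105_4cores.mod and run105_8cores.mod, each with MAXEVAL=10. Each copy still needs a parafile and PsN call set up for its core count.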

An example of testing for the optimal number of cores, with helper scripts, is provided: testingcorespsn.zip. This example is also demonstrated with bbr in Running NONMEM in Parallel: bbr Tips and Tricks.
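
Once the test runs have finished, their run times can be compared directly. The sketch below is not taken from testingcorespsn.zip; it assumes you have collected the timings yourself into a plain text file with one "cores seconds" pair per line, fewest cores first, and the function name speedup_table is hypothetical.

```shell
# Hypothetical helper: read "cores seconds" pairs (one per line, fewest cores
# first) and print, per row: cores, speed-up relative to the first row, and
# efficiency (speed-up divided by the increase in core count).
speedup_table() {
  awk 'NR == 1 { base_cores = $1; base_time = $2 }
       { speedup = base_time / $2
         efficiency = speedup / ($1 / base_cores)
         printf "%d %.2f %.2f\n", $1, speedup, efficiency }' "$1"
}
```

An efficiency that drops sharply from one row to the next suggests the extra cores are mostly paying for parallelization overhead.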

Some things to remember when testing:

  • Speed-up is not linear

    • Twice as many cores is usually, at best, between 1.5- and 1.8-times faster.
  • Over-parallelizing (too many cores) will cause slowdown

    • Remember: there is overhead to parallelizing, primarily caused by passing around data. See the Parallel Computing Intro for further explanation.
  • Avoid spreading a single model over multiple nodes

    • Multiply out your number of cores per node and cores per model to get the models to fit nicely on your compute nodes.
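
The last point is simple integer arithmetic. As a sketch, with hypothetical function names and node sizes chosen only for illustration:

```shell
# Hypothetical helper: how many models fit on one compute node without any
# model straddling two nodes (integer division discards the remainder).
models_per_node() {
  cores_per_node="$1"
  cores_per_model="$2"
  echo $(( $cores_per_node / $cores_per_model ))
}

# Hypothetical helper: cores on a node left idle by that packing.
idle_cores_per_node() {
  cores_per_node="$1"
  cores_per_model="$2"
  echo $(( $cores_per_node % $cores_per_model ))
}
```

On a 16-core node, 4 cores per model packs 4 models with no idle cores, whereas 6 cores per model packs only 2 models and leaves 4 cores idle.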

Using the Grid

You can run a model on the grid with the arguments -run_on_sge and -sge_prepend_flags. The value of -sge_prepend_flags is passed on to the grid submission; in the call below, -pe orte 4 requests 4 cores, which must match the number of cores in your parafile.

execute -run_on_sge -parafile=105.pnm -sge_prepend_flags='-pe orte 4 -V' run105.mod
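
Because the core count appears both in the parafile and in the PsN call, it can help to derive one from the other. The following sketch prints (but does not run) an execute call whose -pe orte value is read from the NODES setting of the parafile. The function name grid_execute_cmd is hypothetical, and it assumes the parafile has a line starting with NODES=<number>, as in the example above.

```shell
# Hypothetical helper: print an execute call whose "-pe orte" slot count
# matches the NODES value found in the parafile. Nothing is submitted.
grid_execute_cmd() {
  parafile="$1"   # e.g. 105.pnm
  model="$2"      # e.g. run105.mod
  nodes=$(sed -n -E 's/^NODES=([0-9]+).*/\1/p' "$parafile" | head -n 1)
  printf "execute -run_on_sge -parafile=%s -sge_prepend_flags='-pe orte %s -V' %s\n" \
    "$parafile" "$nodes" "$model"
}
```

Review the printed command before running it, since the helper does not check that the parafile's worker count is consistent with NODES.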

Bootstrap NONMEM jobs

Bootstrap using 20 samples, running 20 models simultaneously across compute nodes:

bootstrap -samples=20 -threads=20 -dir=bs_3 -run_on_sge -sge_prepend_flags='-V' run105.mod

PsNR

PsNR is an R package that lets you take advantage of the PsN option -rplots, which generates a set of predefined plots in an R Markdown file. PsNR and its dependencies come pre-installed only on the Metworx 21.08 (and newer) blueprints, though you can install them manually on older blueprints. Note that this argument comes after the model file in the call. The available -rplots settings are as follows:

  • -rplots < 0 means the R-script is not generated
  • -rplots = 0 (default) means the R-script is generated but not run
  • -rplots = 1 means basic R plots are generated
  • -rplots = 2 means basic and extended R plots are generated

Examples

To create the plots, specify in the model which data PsN should capture.

Execute

For the execute example we use the run105_table.mod, which captures the following data:

$TABLE ID TIME DV IPRED IRES IWRES Y NOPRINT ONEHEADER FILE=sdtab105
$TABLE ID CL V KA ETA1 ETA2 NOPRINT NOAPPEND ONEHEADER FILE=patab105
$TABLE ID SEX ETN NOPRINT NOAPPEND ONEHEADER FILE=catab105
$TABLE ID WT NOPRINT NOAPPEND ONEHEADER FILE=cotab105

To run the model and generate basic R plots:

execute run105_table.mod -rplots=1

Output:

0ITERATION NO.:   54    OBJECTIVE VALUE:   2604.00137466228        NO. OF FUNC. EVALS.:  80
 CUMULATIVE NO. OF FUNC. EVALS.:      717
 NPARAMETR:  2.3130E+00  7.8144E+01  4.6205E+02 -7.9657E-02  4.1173E+00  5.5869E-01  1.9924E-02  1.4961E-01
 PARAMETER:  2.4538E-01  3.3599E+00  3.9331E+00 -3.9828E-01  4.1173E-01 -4.8216E-01 -3.6006E-01 -4.5157E-02
 GRADIENT:   1.3162E-01 -8.3486E-02 -7.9655E-01 -4.9272E-03 -6.7705E-02 -5.1738E-02 -4.1889E-04  9.4174E-03
 Elapsed estimation  time in seconds:     3.72
 Elapsed covariance  time in seconds:     0.83
 Elapsed postprocess time in seconds:     0.02
 Elapsed finaloutput time in seconds:     0.05
Done with nonmem execution
F:1 .. 
Running PsN_execute_plots.Rmd...

execute done

Bootstrap

bootstrap -samples=5 -threads=5 run105_table.mod -rplots=1

SCM

scm run105.scm -rplots=1