R Package Management
Managing R packages (and their dependencies) can be challenging and often frustrating. Packages today are highly interdependent, which makes it increasingly difficult to manage a set of packages in a transparent and robust way. To address this, MetrumRG has developed a suite of tools that make R package management simple and reproducible, both within a single project and across multiple projects. This article gives an overview of how to manage packages in R effectively using these tools.
Metrum Package Network (MPN)
Modern approaches to running analyses with R increasingly rely on contributed packages that are not part of base R. To ensure a reproducible analysis environment, all project collaborators must work with the same versions of these packages. It's also important to future-proof your projects: whether you run the code tomorrow or in a year's time, the output should be the same, and your code should not be broken by changes in later package versions. Achieving this means installing the same versions of all R packages every time, a challenging task given the quantity, diversity, and interdependence of available packages. The Metrum Package Network, or MPN, was developed by MetrumRG to provide a repository of stable, curated snapshots of R packages (from CRAN and other repositories). By installing packages from a specific MPN snapshot, rather than directly from CRAN, you ensure that you always get the same version of a given package. Furthermore, if you want to update your packages mid-project, you can simply switch to a newer snapshot, which has been curated so that all packages in it work well together.
To help each collaborator on a project quickly identify and install all the packages relevant to that project, MetrumRG developed a command line tool called pkgr. This tool lets you define the project's package environment with a focus on two vital components of pharmacometric analysis: reproducibility and auditability. With pkgr, you write a single configuration file, pkgr.yml, where you specify:
- The required top level packages (pkgr and MPN manage all of the dependencies)
- The specific MPN snapshot you want to install packages from (and any other repositories you may need)
- Any customizations you need for package installation (for example, you can opt to install specific packages from specific repositories)
- The R version for which the packages should be installed
The packages and any customizations described in the pkgr.yml can then be installed directly from the command line. Additionally, once you’ve created the pkgr.yml, all collaborators can use it to quickly and easily configure their R environment to match your specifications.
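As an illustration of the configuration described above, here is a minimal sketch of what a pkgr.yml might look like. The package names, the snapshot date in the MPN URL, and the library path are assumptions chosen only for this example, and the key names follow the pkgr documentation as best we understand it, so check them against the version of pkgr you have installed. The sketch does not show every available option.

```yaml
# Hypothetical pkgr.yml; package names, snapshot date, and paths are illustrative only
Version: 1

# Top level packages; dependencies are resolved automatically
Packages:
  - dplyr
  - ggplot2
  - data.table

# Repositories to install from; order matters
Repos:
  - MPN: "https://mpn.metworx.com/snapshots/stable/2020-09-20"  # assumed snapshot date
  - CRAN: "https://cran.rstudio.com"

# Library the packages are installed into (assumed project-local path)
Lib: "renv/library"

# Per-package customizations, e.g. take one package from a specific repository
Customizations:
  Packages:
    - data.table:
        Repo: CRAN
```

With a file like this in the project directory, running pkgr plan from the command line should report what would be installed, and pkgr install should perform the installation.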
To allow different projects to use different, potentially conflicting package versions, projects should be isolated from one another. MetrumRG recommends using renv (alongside pkgr) to ensure each project and its packages are independent of whatever is installed system-wide or in another project. renv can also export the version and source of every package in your project (via a JSON file), allowing external collaborators or regulatory agencies who do not use this suite of package management tools to re-create the project package set.