pkgr For R Package Development
pkgr, used in conjunction with renv and optionally MPN, can be very useful for R package development. Most Metrum-developed packages use this paradigm. You can see an example of this in the bbr repo, which contains development environment setup instructions and a pkgr.yml file.
Motivation
When developing an R package, it is important to pay attention to the versions of your dependencies (i.e. the packages you import, typically by listing them in your DESCRIPTION file under Imports, Depends, or Suggests). Using pkgr and renv allows you to control which versions of your dependencies you are pulling in, and keeps them isolated from any other installations of those same packages elsewhere on your system.
Setup
This section takes you step-by-step through setting up pkgr and renv in your package. If you are a new developer working on a package that has pkgr and renv already set up in the repo, you can skip steps 5 and 9 because there should already be a pkgr.yml file in the repo and your *ignore files should already be set up correctly.
1. Clone the repository of the package.
2. Make sure pkgr is installed. If you are using Metworx, it will already be installed.
3. Make sure you have a global installation of renv. You can check by opening an R console and typing packageVersion("renv"). You must have at least version 0.8.3-4. If you are on Metworx, this will already be installed. If you do not have it, you can install it from CRAN with install.packages("renv"), from an R session that is outside of an activated renv project.
4. Start an R session in your project folder (click your .Rproj file if you have one) and then run renv::init(bare = TRUE) in the R console to initialize renv in the project. This will do several things, including modifying your .Rprofile and creating an renv/ directory.
5. Add a pkgr.yml file to the top-level directory of your project. See "Setting up your pkgr.yml file" below for details. Add ^pkgr.yml$ to your .Rbuildignore file. Note: do not add pkgr.yml to your .gitignore. You do want this file checked into version control.
6. In your terminal, in the top-level directory of your project, run pkgr plan to preview the packages that will be installed. You should also see an installation directory containing renv/library/ (you can run pkgr plan | grep 'Library path' to filter to only that relevant line).
7. If everything looks right, run pkgr install in the same terminal. Your packages will begin installing. If you are on Metworx, we recommend using a Workflow with at least 8 vCPUs to take advantage of pkgr's parallelism. If you have fewer vCPUs and are installing a non-trivial number of packages, this could take a while.
8. Once pkgr has finished, restart your R session. In the R console, enter .libPaths(). You should see the same path you saw above, but this time as an absolute path. This verifies that you are using the isolated package installations that we have just set up.
9. As a final step, we recommend adding .Rprofile, renv/, and renv.lock to your top-level .gitignore file. This will make it easier for other developers to use this same process on this repo in the future. renv/ and renv.lock should also be in your .Rbuildignore, but renv::init() should have already added them. If you'd like, you can check to be sure.
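The ignore-file edits in steps 5 and 9 can be scripted so that they are safe to re-run. The following is a minimal sketch, not part of pkgr or renv: the helper name add_line_once is made up for illustration, and the script assumes it is run from the top-level directory of the project.

```shell
#!/bin/sh
# add_line_once is a hypothetical helper (not provided by pkgr or renv).
# It appends a line to a file only if that exact line is not already there.
add_line_once() {
  file=$1
  line=$2
  touch "$file"
  # If the file does not end with a newline, add one before appending.
  if [ -n "$(tail -c1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  # -x matches whole lines, -F treats the pattern as a literal string.
  grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Step 5: keep pkgr.yml out of the built package (it stays in git).
add_line_once .Rbuildignore '^pkgr.yml$'

# Step 9: keep the renv project files out of version control.
add_line_once .gitignore '.Rprofile'
add_line_once .gitignore 'renv/'
add_line_once .gitignore 'renv.lock'
```

Because each entry is only appended when missing, running the script a second time leaves both files unchanged.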
Setting up your pkgr.yml file
A complete example pkgr.yml for package development might look like this:
Version: 1
Descriptions:
  - DESCRIPTION
Packages:
  - devtools
  - usethis
  - pkgdown
Repos:
  - CRAN: https://cran.rstudio.com
  - MPN: https://mpn.metworx.com/snapshots/stable/2022-02-11 # used for mrgval
Lockfile:
  Type: renv
Each part of this is explained below.
Lockfile
We will start at the bottom, because this is the essential section for using pkgr and renv together. You are required to specify either Lockfile or Library to tell pkgr where to install packages. Since we are using renv to isolate our package environment, you only have to include the following, and renv will tell pkgr where to install packages.
Lockfile:
  Type: renv
Note that the library path used will change (and force you to re-install packages) if you switch to a different version of R or a different operating system. This is a good thing, and will avoid other weird errors that could occur if you switched and did not re-install your packages.
Version
This is the version of the pkgr.yml configuration file, and is required in every pkgr.yml. At this point, it should always say Version: 1.
Descriptions
This is the key element for package development. Setting the following will tell pkgr to pull out all dependencies listed in the DESCRIPTION file for your package.
Descriptions:
  - DESCRIPTION
This means that you don't have to explicitly list all of the packages you want in the Packages: section. Simply update your DESCRIPTION file as you normally would, and pkgr will pick up any changes.
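To see which dependencies a DESCRIPTION file currently declares, a small shell helper can print its dependency fields. This is only a sketch for inspection: list_dep_fields is a made-up name, the demo DESCRIPTION is hypothetical, and the fields shown (Depends, Imports, Suggests) are the ones named in the Motivation section, which is not a statement about exactly which fields pkgr installs from.

```shell
#!/bin/sh
# list_dep_fields is a hypothetical helper (not part of pkgr).
# It prints the Depends, Imports, and Suggests fields of a DESCRIPTION
# file, including their indented continuation lines.
list_dep_fields() {
  awk '
    /^(Depends|Imports|Suggests):/ { in_field = 1; print; next }
    /^[^ \t]/                      { in_field = 0 }
    in_field                       { print }
  ' "$1"
}

# Hypothetical DESCRIPTION, created in a temp directory for the demo only.
demo_dir=$(mktemp -d)
cat > "$demo_dir/DESCRIPTION" <<'EOF'
Package: examplepkg
Version: 0.0.1
Imports:
    dplyr,
    rlang
Suggests:
    testthat
License: MIT
EOF

list_dep_fields "$demo_dir/DESCRIPTION"
```

In a real project you would call list_dep_fields DESCRIPTION from the top-level directory and compare the result against the output of pkgr plan.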
Packages
If you have listed your DESCRIPTION file (described in the previous section) then this section only needs to contain packages that you would like to use for development purposes that are not in your DESCRIPTION file (i.e. are not formal dependencies of your package).
Packages:
  - devtools
  - usethis
  - pkgdown
devtools is a prime example of this kind of package, but there are others you may consider (for example, pkgdown for building a documentation site, or usethis for setting up package scaffolding).
Repos
What you put in this section is up to developer discretion. There are two options:
- Pull packages from CRAN (or another repository that is always updated with the most recent versions of packages)
- Pull packages from MPN (or another repository that is a stable unchanging snapshot of packages)
Pros and cons
The advantage of pulling from CRAN is that you can run pkgr install --update frequently and you will always have the most recent versions of your dependencies. This will allow you to catch any breaking changes as soon as possible. The downside is that it forces the developer to deal with those breaking changes as soon as they appear. Metrum typically uses this strategy.
The advantage of using something like MPN is that you will always know what versions of your dependencies you are developing with. This provides more stability while developing, and allows you to conscientiously update to a newer snapshot when you are ready to test the package with updated dependencies. The downside is that there may be breaking changes in your dependencies that you will not catch until you have updated your snapshot, and users may experience the results of these before you notice them (if the users are pulling other packages from CRAN).
Other considerations
You will see the example above has both CRAN and MPN, with CRAN listed first:
Repos:
  - CRAN: https://cran.rstudio.com
  - MPN: https://mpn.metworx.com/snapshots/stable/2022-02-11 # used for mrgval
This will make pkgr first look on CRAN for all packages, and then look on MPN for any packages it didn't find on CRAN. This is useful if you have dependencies that are not on CRAN.
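The lookup order described above can be pictured with a toy model. The sketch below is not pkgr's implementation: the functions repo_contents and resolve_repo are hypothetical, and the package listings for each repository are made up purely to show the first-match-wins behavior.

```shell
#!/bin/sh
# Toy model of first-match-wins repository lookup. This is NOT pkgr's
# code, and the repository contents below are invented for illustration.

# Hypothetical package listing for each repository.
repo_contents() {
  case $1 in
    CRAN) echo "dplyr devtools usethis" ;;
    MPN)  echo "dplyr devtools mrgval" ;;
  esac
}

# Print the first repository, in the order given, that lists the package.
resolve_repo() {
  pkg=$1
  shift
  for repo in "$@"; do
    for candidate in $(repo_contents "$repo"); do
      if [ "$candidate" = "$pkg" ]; then
        echo "$repo"
        return 0
      fi
    done
  done
  echo "not found"
  return 1
}

# Repositories are passed in the same order as in the Repos: section.
resolve_repo dplyr CRAN MPN    # on both, so the first repo listed is used
resolve_repo mrgval CRAN MPN   # only on the second repo in this toy data
```

Swapping the argument order (MPN before CRAN) changes the answer for dplyr, which is why the order of entries under Repos: matters.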
Another option is to use the Customizations: section to pin the version of specific packages by telling pkgr to download them from a stable snapshot repo, even if the rest of your dependencies are getting pulled from CRAN.
Customizations:
  Packages:
    - dplyr:
        Repo: MPN
There may be situations where you want to do this because of a known breaking change that you have not yet dealt with. However, you should take this path with care. In general, having a strict (forward-looking) version constraint on your dependencies is not an ideal situation long term.