pkgr For R Package Development
pkgr, used in conjunction with renv and optionally MPN, can be very useful for R package development. Most Metrum-developed packages use this paradigm. You can see an example of this in the bbr repo, which contains development environment setup instructions and a pkgr.yml file.
Motivation
When developing an R package, it is important to pay attention to the versions of your dependencies (i.e. the packages you import, typically by listing them in your DESCRIPTION file under Imports, Depends, or Suggests). Using pkgr and renv allows you to control which versions of your dependencies you are pulling in, and keeps them isolated from any other installations of those same packages elsewhere on your system.
Setup
This section takes you step-by-step through setting up pkgr and renv in your package. If you are a new developer working on a package that has pkgr and renv already set up in the repo, you can skip steps 5 and 9 because there should already be a pkgr.yml file in the repo and your *ignore files should already be set up correctly.
1. Clone the repository of the package.
2. Make sure pkgr is installed. If you are using Metworx, it will already be installed.
3. Make sure you have a global installation of renv. You can check by opening an R console and typing packageVersion("renv"). You must have at least version 0.8.3-4. If you are on Metworx, this will already be installed. If you do not have it, you can install it from CRAN with install.packages("renv"), from an R session that is outside of an activated renv project.
4. Start an R session in your project folder (click your .Rproj file if you have one) and then run renv::init(bare = TRUE) in the R console to initialize renv in the project. This will do several things, including modifying your .Rprofile and creating an renv/ directory.
5. Add a pkgr.yml file to the top-level directory of your project. See "Setting up your pkgr.yml file" below for details. Add ^pkgr.yml$ to your .Rbuildignore file. Note: do not add pkgr.yml to your .gitignore. You do want this file checked into version control.
6. In your terminal, in the top-level directory of your project, run pkgr plan to preview the packages that will be installed. You should also see an installation directory containing renv/library/ (you can run pkgr plan | grep 'Library path' to filter to only that relevant line).
7. If everything looks right, run pkgr install in the same terminal. Your packages will begin installing. If you are on Metworx, we recommend using a Workflow with at least 8 vCPUs to take advantage of pkgr's parallelism. If you have fewer vCPUs and you are installing a non-trivial number of packages, this could take a while.
8. Once pkgr has finished, restart your R session. In the R console, enter .libPaths(). You should see the same path you saw above, but this time as an absolute path. This verifies that you are using the isolated package installations that we have just set up.
9. As a final step, we recommend adding .Rprofile, renv/, and renv.lock to your top-level .gitignore file. This will make it easier for other developers to use this same process on this repo in the future. renv/ and renv.lock should also be in your .Rbuildignore, but renv::init() should have already added them. If you'd like, you can check to be sure.
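The ignore-file edits described in the setup steps can be scripted so that re-running them never duplicates an entry. The following is a minimal sketch, not part of pkgr or renv: PROJECT_DIR and add_line are illustrative names, and the script assumes it is run from (or pointed at) the top-level directory of the package.

```shell
# Sketch only: PROJECT_DIR and add_line are hypothetical names chosen for this example.
PROJECT_DIR="${PROJECT_DIR:-.}"

# Append a line to a file only if that exact line is not already present.
add_line() {
  file="$1"
  line="$2"
  touch "$file"
  # Nothing to do if the exact line is already there (-F: literal, -x: whole line).
  if grep -qxF -e "$line" "$file"; then
    return 0
  fi
  # Make sure existing content ends with a newline before appending.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  printf '%s\n' "$line" >> "$file"
}

# pkgr.yml stays in version control but is excluded from the built package.
add_line "$PROJECT_DIR/.Rbuildignore" '^pkgr.yml$'

# Files created by renv are local to each developer's machine.
add_line "$PROJECT_DIR/.gitignore" '.Rprofile'
add_line "$PROJECT_DIR/.gitignore" 'renv/'
add_line "$PROJECT_DIR/.gitignore" 'renv.lock'
```

Because add_line checks for an exact whole-line match first, the script is safe to run again after renv::init() or another developer has already added some of these entries.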
Setting up your pkgr.yml file
A complete example pkgr.yml for package development might look like this:
Version: 1
Descriptions:
  - DESCRIPTION
Packages:
  - devtools
  - usethis
  - pkgdown
Repos:
  - CRAN: https://cran.rstudio.com
  - MPN: https://mpn.metworx.com/snapshots/stable/2022-02-11 # used for mrgval
Lockfile:
  Type: renv

Each part of this is explained below.
Lockfile
We will start at the bottom, because this is the essential section for using pkgr and renv together. You are required to specify either Lockfile or Library to tell pkgr where to install packages. Since we are using renv to isolate our package environment, you only have to include the following, and renv will tell pkgr where to install packages.
Lockfile:
  Type: renv

Note that the library path used will change (and force you to re-install packages) if you switch to a different version of R or a different operating system. This is a good thing, and will avoid other weird errors that could occur if you switched and did not re-install your packages.
Version
This is the version of the pkgr.yml configuration file, and is required in every pkgr.yml. At this point, it should always say Version: 1.
Descriptions
This is the key element for package development. Setting the following will tell pkgr to pull out all dependencies listed in the DESCRIPTION file for your package.
Descriptions:
  - DESCRIPTION

This means that you don't have to explicitly list all of the packages you want in the Packages: section. Simply update your DESCRIPTION file as you normally would, and pkgr will pick up any changes.
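To get a rough preview of which package names a DESCRIPTION file contributes, the dependency fields can be listed from the shell. This is only an approximation for illustration, not pkgr's own parser: list_deps and DEMO_DIR are hypothetical names, the sample DESCRIPTION is made up, and only the Depends, Imports, and Suggests fields are read.

```shell
# Hypothetical sample DESCRIPTION, written to a temp directory for this demonstration.
DEMO_DIR="$(mktemp -d)"
cat > "$DEMO_DIR/DESCRIPTION" <<'EOF'
Package: examplepkg
Version: 0.0.1
Depends:
    R (>= 3.5)
Imports:
    dplyr (>= 1.0.0),
    rlang
Suggests:
    testthat,
    knitr
EOF

# Print one package name per line from the Depends, Imports, and Suggests fields.
list_deps() {
  awk '
    /^[A-Za-z][^ \t:]*:/ {
      # A new field starts at column 1; remember whether it is a dependency field.
      keep = ($0 ~ /^(Depends|Imports|Suggests):/)
      sub(/^[^:]*:/, "")
    }
    keep {
      n = split($0, parts, ",")
      for (i = 1; i <= n; i++) {
        name = parts[i]
        sub(/\(.*/, "", name)      # drop version constraints such as (>= 1.0.0)
        gsub(/[ \t]/, "", name)    # drop surrounding whitespace
        if (name != "" && name != "R") print name
      }
    }
  ' "$1"
}

list_deps "$DEMO_DIR/DESCRIPTION"
```

In a real project you would call list_deps DESCRIPTION from the package root and compare the names against the output of pkgr plan.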
Packages
If you have listed your DESCRIPTION file (described in the previous section) then this section only needs to contain packages that you would like to use for development purposes that are not in your DESCRIPTION file (i.e. are not formal dependencies of your package).
Packages:
  - devtools
  - usethis
  - pkgdown

devtools is a prime example of this kind of package, but there are others you may consider (for example, pkgdown for building a documentation site, or usethis for setting up package scaffolding).
Repos
What you put in this section is up to the developer's discretion. There are two options:
- Pull packages from CRAN (or another repository that is always updated with the most recent versions of packages)
- Pull packages from MPN (or another repository that is a stable unchanging snapshot of packages)
Pros and cons
The advantage of pulling from CRAN is that you can run pkgr install --update frequently and you will always have the most recent versions of your dependencies. This will allow you to catch any breaking changes as soon as possible. The downside is that it forces the developer to deal with those breaking changes as soon as they appear. Metrum typically uses this strategy.
The advantage of using something like MPN is that you will always know what versions of your dependencies you are developing with. This provides more stability while developing, and allows you to conscientiously update to a newer snapshot when you are ready to test the package with updated dependencies. The downside is that there may be breaking changes in your dependencies that you will not catch until you have updated your snapshot, and users may experience the results of these before you notice them (if the users are pulling other packages from CRAN).
Other considerations
You will see the example above has both CRAN and MPN, with CRAN listed first:
Repos:
  - CRAN: https://cran.rstudio.com
  - MPN: https://mpn.metworx.com/snapshots/stable/2022-02-11 # used for mrgval

This will make pkgr first look on CRAN for all packages, and then look on MPN for any packages it didn't find on CRAN. This is useful if you have dependencies that are not on CRAN.
Another option is to use the Customizations: section to pin the version of specific packages by telling pkgr to download them from a stable snapshot repo, even if the rest of your dependencies are getting pulled from CRAN.
Customizations:
  Packages:
    - dplyr:
        Repo: MPN

There may be situations where you want to do this because of a known breaking change that you have not yet dealt with. However, you should take this path with care. In general, having a strict (forward-looking) version constraint on your dependencies is not an ideal situation long term.
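Putting the pieces of this page together, a CRAN-first configuration that pins a single package to the MPN snapshot might be written out as follows. This is a sketch assembled from the fragments above rather than a file taken from a real project: OUT_DIR is an illustrative variable, and the file is written to a temporary directory by default so that an existing pkgr.yml is not overwritten.

```shell
# Sketch only: OUT_DIR is a hypothetical variable; set it to your project root to write there.
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"

# CRAN is listed first, so it is searched first; dplyr alone is pinned to the MPN snapshot.
cat > "$OUT_DIR/pkgr.yml" <<'EOF'
Version: 1
Descriptions:
  - DESCRIPTION
Packages:
  - devtools
Repos:
  - CRAN: https://cran.rstudio.com
  - MPN: https://mpn.metworx.com/snapshots/stable/2022-02-11
Customizations:
  Packages:
    - dplyr:
        Repo: MPN
Lockfile:
  Type: renv
EOF

echo "wrote $OUT_DIR/pkgr.yml"
```

After copying a file like this into the project, running pkgr plan is a reasonable way to confirm which repository each package will actually be pulled from before installing anything.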