Rightsizing Workflows

When launching your Metworx workflow, you can choose between many different combinations of CPUs and RAM for your head (master) and compute nodes. This page provides guidance on how to size and monitor your workflows to ensure they are configured appropriately.


General guidance

For less computationally intensive activities on the master node (e.g., data assembly, postprocessing results, or submitting jobs to the grid), two vCPUs should be sufficient.

For heavier processing on the master node, four vCPUs should be sufficient.

In general, 16 GB of RAM should be sufficient unless you're running large simulations or working with many datasets that are hundreds of megabytes to gigabytes in size.

Another consideration is the installation of packages at project setup. Because pkgr can use multiple cores, we suggest using a 4- or 8-core workflow when you need to install a large number of packages (such as when setting up your environment for a new project or to perform QC).
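The rules of thumb above can be written down as a small helper. This is only an illustrative sketch in Python: the function name, its parameters, and the HeadNodeSuggestion type are hypothetical and are not part of Metworx. It also picks 4 vCPUs for bulk package installs, which is the lower end of the 4- or 8-core range suggested above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadNodeSuggestion:
    """Hypothetical container for a sizing suggestion (not a Metworx type)."""
    vcpus: int
    sixteen_gb_ram_sufficient: bool


def suggest_head_node(
    heavy_processing: bool = False,
    bulk_package_install: bool = False,
    large_data_or_simulations: bool = False,
) -> HeadNodeSuggestion:
    """Encode the general guidance for sizing the head (master) node.

    - Light work (data assembly, postprocessing, job submission): 2 vCPUs.
    - Heavy processing or installing many packages with pkgr: 4 vCPUs
      (8 is also a reasonable choice for large installs).
    - 16 GB of RAM unless running large simulations or handling many
      large datasets; the guidance gives no specific figure beyond that.
    """
    vcpus = 4 if (heavy_processing or bulk_package_install) else 2
    return HeadNodeSuggestion(
        vcpus=vcpus,
        sixteen_gb_ram_sufficient=not large_data_or_simulations,
    )
```

For example, suggest_head_node(bulk_package_install=True) suggests 4 vCPUs with 16 GB of RAM, which matches the project-setup case described above.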

Monitoring your workflow

You have a couple of options to monitor your workflows and confirm they are configured appropriately.

Starting with the 21.08 Metworx series, you can use Grafana dashboards to see key metrics related to the master node and the grid. You can learn more about Grafana dashboards here.

Grafana Dashboard

Alternatively, you can check the RStudio Admin dashboard available at https://<YOURWORKFLOW-ID>.metworx.com/rstudio/admin. This dashboard shows historical RAM and CPU usage, which can help you identify whether you need additional RAM or CPU.

If you are on Metworx version 21.08 or newer, we recommend that you use the Grafana dashboards to monitor your workflows. However, if you are on Metworx version 20.12 or older, the RStudio Admin dashboard is still a very useful tool.
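For a quick spot check between dashboard visits, you can also look at current CPU load and memory use from a Python session on the head node. The sketch below is not a Metworx feature. It assumes a Linux head node where /proc/meminfo is available, and all function names are hypothetical. It reports only a point-in-time snapshot, so the dashboards remain the better tool for historical usage.

```python
import os


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_used_fraction(meminfo: dict[str, int]) -> float:
    """Fraction of RAM in use, based on MemAvailable and MemTotal."""
    return 1.0 - meminfo["MemAvailable"] / meminfo["MemTotal"]


def load_per_core(load_1min: float, cores: int) -> float:
    """1-minute load average divided by the number of cores."""
    return load_1min / cores


def snapshot() -> dict:
    """Read current values from the operating system (Linux only)."""
    with open("/proc/meminfo") as fh:
        meminfo = parse_meminfo(fh.read())
    cores = os.cpu_count() or 1
    return {
        "cores": cores,
        "load_per_core_1min": load_per_core(os.getloadavg()[0], cores),
        "memory_used_fraction": memory_used_fraction(meminfo),
    }
```

Calling print(snapshot()) on the head node gives a rough indication. A load per core that stays near or above 1, or a memory fraction that stays close to 1, may suggest the workflow is undersized, but confirm the trend in the dashboards before resizing.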


Video tutorial